AWK & SED

SED

sed stands for Stream Editor. It's stream oriented: in Linux, many commands read a stream of input and write a stream of output. For example, in ls -l | grep "text", the output of ls -l is streamed as input to grep.
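A tiny sketch of that streaming idea. It uses printf instead of ls -l so the input is predictable; the three sample lines are made up for illustration:

```shell
# printf writes three lines to stdout; the pipe streams them into grep's stdin.
# grep passes through only the lines that contain "text".
printf 'some text here\nnothing to see\nmore text\n' | grep "text"
```

Only the first and third lines come out the other end of the pipe.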

sed is basically an on-the-fly automation tool to edit, search, filter, etc. your text files.

AWK

AWK is kinda a programming language of its own for Unix-based OSes. It can do almost everything that a normal shell script can do.

Working of AWK and SED

Both awk and sed basically work line by line, like line editors. Think of a line editor as kinda like nvim or vim, but driven from the terminal on the fly, one short command at a time. One good example of a line editor is ed. For example, ed file_name opens file_name under ed, and then you can do things like p (displays the current line), a line number (jumps to that line and displays it), make modifications, etc. These modifications happen in ed's buffer and reach the file you loaded only when you write them out with w. Example below:

[Image: ed line editor]
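As a rough, non-interactive sketch of an ed session: the commands that you would normally type at the ed prompt are fed in on stdin. The file name ed_demo.txt and its contents are made up, and the block assumes ed is installed (it is skipped otherwise):

```shell
# Create a small sample file (hypothetical name and contents).
printf 'first line\nsecond line\nthird line\n' > ed_demo.txt

# Feed ed its commands on stdin; -s suppresses the byte counts.
#   2p             go to line 2 and print it
#   s/second/2nd/  substitute on the current line (line 2)
#   w              write the buffer back to the file
#   q              quit
if command -v ed >/dev/null 2>&1; then
  printf '2p\ns/second/2nd/\nw\nq\n' | ed -s ed_demo.txt
  cat ed_demo.txt
else
  echo "ed is not installed on this system"
fi
```

Unlike sed, the change here really lands in the file, because of the explicit w.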

Working of SED

In both sed and awk, each instruction has two parts: a pattern and a procedure.
pattern - mostly a regular expression that decides which input lines match.
procedure - the set of actions/instructions applied to the input lines that the pattern matches.
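To make the pattern/procedure split concrete, here is a small sketch that runs the same pattern through both tools. The file name people.txt and the replacement text TN are made up for illustration:

```shell
# Hypothetical sample data.
printf 'John Doe, Erode, Tamilnadu\nLittle John, Salem, Tamilnadu\nManikandan Arjunan, Chennai, Tamilnadu\n' > people.txt

# sed: the pattern /Salem/ selects lines; the procedure s/Tamilnadu/TN/
# runs only on those lines. All other lines pass through unchanged.
sed '/Salem/s/Tamilnadu/TN/' people.txt

# awk: the pattern /Salem/ selects lines; the procedure { print $1 }
# runs only on those lines. Non-matching lines produce no output.
awk '/Salem/ { print $1 }' people.txt
```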

sed can be invoked in two ways: the procedure (instructions) can either be written on the command line, or be written in a file and loaded using sed -f procedure_file_name file.txt
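A minimal sketch of the two ways of invoking sed side by side. The names cities.txt and rules.sed are arbitrary, chosen just for this example:

```shell
# Hypothetical input file.
printf 'CH is hot\nMA is old\n' > cities.txt

# Way 1: the procedure is given directly on the command line.
sed 's/CH/Chennai/' cities.txt

# Way 2: put the same procedure in a script file...
printf 's/CH/Chennai/\n' > rules.sed
# ...and load it with -f. The output is the same as above.
sed -f rules.sed cities.txt
```

The script file starts to pay off once you have more than a couple of instructions to keep around.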

Basic example: sed 's/CH/Chennai/' file.txt. This command substitutes (s) the first CH on each line of file.txt with Chennai.
Remember, this won't change/edit the file; it just prints the result on stdout.
If you really want to save the output of sed, you can stream it to an output file with sed command > output.txt. The redirection operator (>) empties the whole file (output.txt) before writing to it, so don't redirect back into the file sed is still reading.
You can also combine multiple commands, using either ; (sed 's/CH/Chennai/; s/MA/Mas/' file.txt) or -e (sed -e 's/CH/Chennai/' -e 's/MA/Mas/' file.txt).
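Here is a short sketch of saving the output and chaining commands. The file name file_demo.txt and its contents are assumptions for illustration:

```shell
# Hypothetical input file.
printf 'CH and MA\nMA only\n' > file_demo.txt

# Save the result instead of printing it; > empties output.txt first.
sed 's/CH/Chennai/' file_demo.txt > output.txt
cat output.txt

# The original file is untouched.
cat file_demo.txt

# Two commands, two equivalent spellings.
sed 's/CH/Chennai/; s/MA/Mas/' file_demo.txt
sed -e 's/CH/Chennai/' -e 's/MA/Mas/' file_demo.txt
```

Both of the last two commands print "Chennai and Mas" followed by "Mas only".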

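You can use the -n flag to stop sed from printing every line automatically; combined with the p flag on the substitution, only the affected lines are displayed on stdout. For example, sed 's/CH/Chennai/' file.txt substitutes the first CH on each line with Chennai (add the g flag, s/CH/Chennai/g, to replace every occurrence on the line), but it also outputs the entire file content. With sed -n 's/CH/Chennai/p' file.txt, only the changed lines are displayed. -n on its own, without p, prints nothing.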
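A quick sketch of how -n behaves in practice. One detail worth checking for yourself: -n only silences sed's automatic printing, so the substitution needs the p flag for the changed lines to show up. The file n_demo.txt is a made-up sample:

```shell
# Hypothetical input file: two lines contain CH, one does not.
printf 'CH one\nno match here\nCH two CH\n' > n_demo.txt

# Without -n: every line is printed, changed or not.
sed 's/CH/Chennai/' n_demo.txt

# With -n and the p flag: only lines where a substitution happened are printed.
sed -n 's/CH/Chennai/p' n_demo.txt

# With -n but no p: nothing is printed at all.
sed -n 's/CH/Chennai/' n_demo.txt

# Adding g replaces every CH on a line, not just the first one.
sed -n 's/CH/Chennai/gp' n_demo.txt
```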

Working of AWK

The syntax and working are almost the same as sed: awk -f script_file file.txt or awk 'procedure/instruction' file.txt. As you can see, the syntax is nearly identical to sed; the only difference is in how the procedure/instructions are executed.

For example, there is an instruction called print. awk reads the text line by line, and print outputs whatever you ask for from the current line:
$0 - prints the whole current line
$1 - prints the first word (field) alone
$2 - prints the second word (field) alone, and so on
By default, any run of spaces or tabs in a line is treated as the delimiter (it will not be output). Example: awk '{ print $1 }' file.txt
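A small sketch of the field variables in action. The file fields.txt is hypothetical, and its second line deliberately mixes several spaces and a tab to show that a whole run of whitespace counts as one delimiter:

```shell
# Hypothetical sample file with mixed spaces and a tab.
printf 'one two three\nfour    five\tsix\n' > fields.txt

# Each whole line, exactly as it was read.
awk '{ print $0 }' fields.txt

# First field of each line: one, four.
awk '{ print $1 }' fields.txt

# Second field of each line: two, five.
awk '{ print $2 }' fields.txt
```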
In awk, the print statement must be enclosed in braces. Think of awk as more like a query statement, while sed is kinda like an update (which will not really update the file) applied to the output. The following command matches any line that contains MA and prints the first word of that line: awk '/MA/ { print $1 }' file.txt
If you don't provide { print $1 }, it will print the entire line for every line matching MA.
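A sketch of the pattern with and without a procedure, plus the same instruction loaded from a script file with -f. The names codes.txt and pick.awk, and the sample rows, are made up for illustration:

```shell
# Hypothetical sample file: two lines contain MA, one does not.
printf 'Madras MA old\nChennai CH new\nMadurai MA south\n' > codes.txt

# Pattern plus procedure: first word of every line containing MA.
awk '/MA/ { print $1 }' codes.txt

# Pattern only: the whole matching line is printed.
awk '/MA/' codes.txt

# Same instruction stored in a script file and loaded with -f.
printf '/MA/ { print $1 }\n' > pick.awk
awk -f pick.awk codes.txt
```

Matching is case-sensitive, so the "Ma" in Madras does not count; those lines match because of the MA in the second column.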
You can also use the -F flag to specify your own delimiter. For example, consider the text file below:
Manikandan Arjunan, Chennai, Tamilnadu
John Doe, Erode, Tamilnadu
Little John, Salem, Tamilnadu
If you run awk -F, '{ print $1 }' file.txt, then instead of printing Manikandan, John, Little (because print $1 means print the 1st word, which is space separated by default), you get Manikandan Arjunan, John Doe, Little John, since we manually set the delimiter to a comma.
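The same comparison as a runnable sketch, using the three sample lines from above saved under the made-up name names.txt. The last two commands go one step beyond the text: with a bare comma as delimiter the second field keeps its leading space, and a two-character delimiter of comma plus space is one way to avoid that.

```shell
# The sample data from the text, saved under a hypothetical file name.
printf 'Manikandan Arjunan, Chennai, Tamilnadu\nJohn Doe, Erode, Tamilnadu\nLittle John, Salem, Tamilnadu\n' > names.txt

# Default delimiter (whitespace): Manikandan, John, Little.
awk '{ print $1 }' names.txt

# Comma as delimiter: Manikandan Arjunan, John Doe, Little John.
awk -F, '{ print $1 }' names.txt

# Second field with a bare comma keeps the space after the comma.
awk -F, '{ print $2 }' names.txt

# Using comma plus space as the delimiter drops that leading space.
awk -F', ' '{ print $2 }' names.txt
```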