This article is part of the ongoing Awk Tutorial Examples series. Awk has several powerful built-in variables. There are two types of built-in variables in Awk.
- Variables that define values which can be changed, such as the field separator (FS) and the record separator (RS).
- Variables that are used for processing and reporting, such as the number of records (NR) and the number of fields (NF).
1. Awk FS Example: Input Field Separator Variable
By default, Awk reads and parses each input line using whitespace as the separator and sets the variables $1, $2, and so on. The Awk FS variable is used to set the field separator for each record. FS can be set to any single character or regular expression. You can set the input field separator using one of the following two options:
- Using the -F command line option.
- Setting FS like a normal variable, typically in the BEGIN block.
Syntax:

$ awk -F 'FS' 'commands' inputfilename
(or)
$ awk 'BEGIN{FS="FS";}'
- Awk FS can be any single character or regular expression that you want to use as the input field separator.
- Awk FS can be changed any number of times; it retains its value until it is explicitly changed. If you want to change the field separator, it is best to change it before you read the line, so that the change takes effect on the line you read (see the short sketch after this list).
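A minimal sketch of why this timing matters, using a one-line input piped from echo (the behavior shown is that of GNU awk; some older awk implementations differ). When FS is assigned inside the main block, the current line has already been split using the old separator, so $2 is empty; assigning FS in the BEGIN block makes it apply to the first line as well.

$ echo "a:b:c" | awk '{FS=":"; print $2;}'          # prints an empty line: line 1 was already split on whitespace
$ echo "a:b:c" | awk 'BEGIN{FS=":";} {print $2;}'
b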
Here is an awk FS example that reads the /etc/passwd file, which uses ":" as the field delimiter.
$ cat etc_passwd.awk
BEGIN{
	FS=":";
	print "Name\tUserID\tGroupID\tHomeDirectory";
}
{
	print $1"\t"$3"\t"$4"\t"$6;
}
END {
	print NR,"Records Processed";
}
$ awk -f etc_passwd.awk /etc/passwd
Name	UserID	GroupID	HomeDirectory
gnats	41	41	/var/lib/gnats
libuuid	100	101	/var/lib/libuuid
syslog	101	102	/home/syslog
hplip	103	7	/var/run/hplip
avahi	105	111	/var/run/avahi-daemon
saned	110	116	/home/saned
pulse	111	117	/var/run/pulse
gdm	112	119	/var/lib/gdm
8 Records Processed
2. Awk OFS Example: Output Field Separator Variable
Awk OFS is the output equivalent of the awk FS variable. By default, awk OFS is a single space character. Following is an awk OFS example.
$ awk -F':' '{print $3,$4;}' /etc/passwd
41 41
100 101
101 102
103 7
105 111
110 116
111 117
112 119
The comma in the print statement joins the two parameters with the value of awk OFS, which is a single space by default. So the awk OFS value will be inserted between the fields in the output, as shown below.
$ awk -F':' 'BEGIN{OFS="=";} {print $3,$4;}' /etc/passwd
41=41
100=101
101=102
103=7
105=111
110=116
111=117
112=119
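Note that OFS is only inserted where print receives separate arguments (the commas) or when the record is rebuilt. A small sketch of both cases, assuming the same /etc/passwd input:

$ awk -F':' 'BEGIN{OFS="=";} {print $3 $4;}' /etc/passwd     # no comma: fields are concatenated directly, OFS is not used
$ awk -F':' 'BEGIN{OFS="=";} {$1=$1; print;}' /etc/passwd    # assigning to a field rebuilds $0, replacing every ":" with "="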
3. Awk RS Example: Record Separator Variable
Awk RS defines what a record (a "line") is. By default, RS is the newline character, so awk reads input line by line.
Suppose student marks are stored in a file, where each record is separated by a double newline and each field is separated by a single newline character.
$ cat student.txt
Jones
2143
78
84
77

Gondrol
2321
56
58
45

RinRao
2122
38
37
65

Edwin
2537
78
67
45

Dayan
2415
30
47
20
The Awk script below prints the student name and roll number from the above input file.
$ cat student.awk
BEGIN {
	RS="\n\n";
	FS="\n";
}
{
	print $1,$2;
}

$ awk -f student.awk student.txt
Jones 2143
Gondrol 2321
RinRao 2122
Edwin 2537
Dayan 2415
The script student.awk reads each student's details as a single record, because awk RS has been assigned the double-newline character, and each line within a record is a field, since FS is the newline character.
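Note: multi-character RS values such as "\n\n" are a gawk extension; several readers report below that this does not work with Mac OS or AIX awk, which honors only the first character of RS. A more portable sketch is awk's paragraph mode, where an empty RS treats one or more blank lines as the record separator:

$ awk 'BEGIN{RS=""; FS="\n";} {print $1,$2;}' student.txt
Jones 2143
Gondrol 2321
RinRao 2122
Edwin 2537
Dayan 2415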
4. Awk ORS Example: Output Record Separator Variable
Awk ORS is the output equivalent of RS. Each record in the output will be printed with this delimiter. Following is an awk ORS example:
$ awk 'BEGIN{ORS="=";} {print;}' student-marks
Jones 2143 78 84 77=Gondrol 2321 56 58 45=RinRao 2122 38 37 65=Edwin 2537 78 67 45=Dayan 2415 30 47 20=
In the above script, each record of the student-marks file is terminated with the character "=" in the output.
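ORS is often combined with a print of selected fields. For example, a small sketch that joins all the student names into a single comma-separated line (note that ORS is also printed after the last record):

$ awk 'BEGIN{ORS=",";} {print $1;}' student-marks
Jones,Gondrol,RinRao,Edwin,Dayan,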
5. Awk NR Example: Number of Records Variable
Awk NR gives the number of records processed so far, i.e., the current line number. In the following awk NR example, NR holds the line number in the main block, and in the END section NR gives the total number of records in the file.
$ awk '{print "Processing Record - ",NR;}END {print NR, "Students Records are processed";}' student-marks
Processing Record - 1
Processing Record - 2
Processing Record - 3
Processing Record - 4
Processing Record - 5
5 Students Records are processed
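Since NR is available while each line is being processed, it also works as a plain pattern for selecting a range of lines. A small sketch, using the student-marks file shown in the ORS example above, that prints only records 2 through 4:

$ awk 'NR>=2 && NR<=4' student-marks
Gondrol 2321 56 58 45
RinRao 2122 38 37 65
Edwin 2537 78 67 45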
6. Awk NF Example: Number of Fields in a Record
Awk NF gives the total number of fields in a record. Awk NF is very useful for validating whether all the fields exist in a record.
In the student-marks file below, the Test3 score is missing for two students.
$ cat student-marks
Jones 2143 78 84 77
Gondrol 2321 56 58 45
RinRao 2122 38 37
Edwin 2537 78 67 45
Dayan 2415 30 47
The following Awk script prints the record (line) number and the number of fields in that record, which makes it very simple to find out where the Test3 score is missing.
$ awk '{print NR,"->",NF}' student-marks
1 -> 5
2 -> 5
3 -> 4
4 -> 5
5 -> 4
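Instead of scanning the counts by eye, NF can be used directly as a condition to pull out only the incomplete records. A small sketch, assuming a complete record has 5 fields:

$ awk 'NF < 5 {print NR, ":", $0;}' student-marks
3 : RinRao 2122 38 37
5 : Dayan 2415 30 47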
7. Awk FILENAME Example: Name of the Current Input File
The FILENAME variable gives the name of the file currently being read. Awk can accept any number of input files to process.
$ awk '{print FILENAME}' student-marks
student-marks
student-marks
student-marks
student-marks
student-marks
In the above example, it prints the FILENAME, i.e. student-marks, for each record of the input file.
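Printing FILENAME for every record is rarely what you want; a common sketch is to print it only once per file by testing FNR (covered in the next section), which is 1 on the first line of each input file. The second file here is the bookdetails file used in the FNR example below:

$ awk 'FNR==1 {print "Reading:", FILENAME;}' student-marks bookdetails
Reading: student-marks
Reading: bookdetails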
8. Awk FNR Example: Number of Records Relative to the Current Input File
When awk reads from multiple input files, the awk NR variable gives the cumulative record count across all the input files, while awk FNR gives the record number within the current input file.
$ awk '{print FILENAME, FNR;}' student-marks bookdetails
student-marks 1
student-marks 2
student-marks 3
student-marks 4
student-marks 5
bookdetails 1
bookdetails 2
bookdetails 3
bookdetails 4
bookdetails 5
In the above example, if you use awk NR instead of awk FNR, the records of the bookdetails file will be numbered 6 through 10.
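The FNR==NR comparison also gives the classic two-file idiom: the condition is true only while the first file is being read, so you can load that file into an array and then look values up while reading the second file. A minimal sketch with placeholder file names file1 and file2 (not files from this article):

$ awk 'FNR==NR {seen[$1]; next} $1 in seen' file1 file2
# prints the lines of file2 whose first field also appears as the first field of some line in file1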
Recommended Reading
Sed and Awk 101 Hacks, by Ramesh Natarajan. I spend several hours a day in a UNIX / Linux environment dealing with text files (data, config, and log files). I use Sed and Awk for all my text manipulation work. Based on my Sed and Awk experience, I've written the Sed and Awk 101 Hacks eBook, which contains 101 practical examples on various advanced features of Sed and Awk that will enhance your UNIX / Linux life. Even if you've been using Sed and Awk for several years and have not read this book, please do yourself a favor and read it. You'll be amazed by the capabilities of the Sed and Awk utilities.
Comments on this entry are closed.
Thanks for posting a good article.
Hi Ramesh
I tried example 3, but it gives the file as-is in the output, not as expected. Is there anything else I need to do?
to Guru
because there is an error in example 3
line
FS"\n";
should be
FS="\n";
Example 3 is still not producing the desired output. The output produced is the same as the input.
@Frank Huang,
Thanks for pointing it out. I've corrected it. Just FYI, for me both FS"\n" and FS="\n" worked. I'm using GNU awk.
@Guru, @Arthur,
Please try again with FS="\n" and see if it works. If not, can you let me know what version of Awk you are using?
The Best description about Built-in Variables of awk.
In brief, this is a useful website for anyone who wants to study; I am interested in its subjects, especially Linux.
I have the following which works fine on AIX but does not on Linux using Gawk
echo "a:b:c" | awk '{FS=":"; print $2}'
On AIX it produces
b
as expected
On Linux it produces a blank line
???
I have a 100+ line awk script which makes a lot of use of FS so I need to be able to change it from : to a space to a % within the script rather than simply use -F
Your blog is good
thanks
Good article
Helped me a lot 🙂
Ramesh,
Still, example 3 is not giving the expected result (shown in your example). The output is that each field is still on a new line, with a space after the end of the field (before the \n on each line).
My environment is AIX 6.1 (not sure of the awk version; awk --version | head -1 did not give any result). I have even tried with nawk but the output did not change.
Can you please help me solve this problem. These statements run fine on their own but I’d like to combine them into one. I can’t seem to get it to work.
# First statement
egrep -w 'Deny TCP|Deny UDP' $FW_LOG | awk '{print $1 " " $2 " " $3}' >> $OUTFILE
#Output
TCP 109.75.171.98 in
TCP 210.128.108.48 in
===================
# Second statement
echo 109.75.171.98 | geo
echo 210.128.108.48 | geo
#Output
Japan
United Kingdom
What I’d like to do is combine the 2 commands above so the output looks like this –
TCP 109.75.171.98 in Japan
TCP 210.128.108.48 in United Kingdom
this is what I have so far –
egrep -w 'Deny TCP' IPs.txt | awk '{cmd="geo "$2;cmd | getline rslt;close(cmd);print $2" "rslt}' >> $OUTPUT
also, the way the geo script works is that it accepts input on stdin, for example
echo 111.222.333.444 | geo
it doesn't seem to work like geo 111.222.333.444, so maybe that's why it doesn't work.
Jim,
the “awk” command has built-in pattern matching similar to “grep”, so you should be able to do what you want with this command line: (JETS)
awk '/Deny (UD|TC)P/{print $1,$2,$3}' $FW_LOG >> $OUTFILE
In sample 3, the RS="\n\n" does not work in my Mac terminal. It seems awk only retrieves the first character as the Record Separator.
However, if I set RS="" (empty string), it works.
I tested other cases; it seems RS="" makes awk treat multiple newline marks as the separator, kind of like "\n{2,}", though RS does not support regex.
Maybe it is because of the version of awk?
Hello Guys,
can you please let me know the meaning of awk '/Started in/{print $NF}' server.log
log entry is like Started in 09m:03sec:20ms
I want to know here what is meaning o
Thanks,
Vineesh
@Vineesh
I believe you would have found your answer by now 😛 anyways..
it just searches for the string "Started in" and when a match is found prints the last column value.
i love this website 🙂
colin,
this is the syntax for Linux
echo "a:b:c" | awk -F":" '{ print $2}'
or
echo "a:b:c" | awk 'BEGIN {FS=":"};{ print $2}'
Hi,
What does this mean?
gsub(/[/^/~]/," ",instring);
Hi,
I am new to Unix; can someone please explain what the below command does?
day=`cal $month $year | awk 'NF != 0{ last = $0 }; END{ print last }' | awk '{ print $NF }'`
Hi,
In example 3, why are we using FS="\n"? Without it the answer is still the same.
I think RS="\n\n" is sufficient; please clear my confusion.
thanx in advance 🙂
Jim,
an old but interesting question. You didn’t provide input but it looks like $1 would be Deny so I started at $2, and I don’t have geo but you may have to put the full path if it’s another script.
awk '/Deny (UD|TC)P/{printf "%s %s %s ",$2,$3,$4;cmd="echo " $3 "|./geo";cmd|getline $result;print $result}' $FW_LOG >> $OUTFILE
@Shweta: While going through the above document, I also had the same doubt.
NOTE: "Awk RS defines a line; Awk reads line by line by default."
I'll share my views with the examples below,
=====
Eg., 3.1
cat student1.txt | awk 'BEGIN{RS="\n\n";FS="\n"}{print $1,$2;}END{print "Done";}'
Jones 2143
Gondrol 2321
RinRao 2122
Edwin 2537
Dayan 2415
Done
Eg., 3.2
cat student1.txt | awk 'BEGIN{RS="\n\n"}{print $1,$2;}END{print "Done";}'
Jones 2143
Gondrol 2321
RinRao 2122
Edwin 2537
Dayan 2415
Done
Eg., 3.3
cat student1.txt | awk 'BEGIN{RS="\n\n";FS="\n\n"}{print $1,$2;}END{print "Done";}'
Jones
2143
78
84
77
Gondrol
2321
56
58
45
RinRao
2122
38
37
65
Edwin
2537
78
67
45
Dayan
2415
30
47
20
Done
======
I also tried with FS="\n\n\n\n", using \n any number of times inside FS, but got the same output as Eg. 3.3.
Records are always separated by "\n" newline characters; obviously every line in a file is a record, including blank lines. This is the default.
When FS is a single character, then the newline character always serves as a field separator, in addition to whatever value FS may have. Leading and trailing newlines in a file are ignored.
Hence, according to examples 3.1, 3.2, and 3.3 above, whether we use FS="\n" or not, we get the desired output (as expected).
Cheers,
Hi ,
It's regarding example 3. As we have mentioned FS="\n" and RS="\n\n", the newline characters of RS are also considered FS characters and affect how $1, $2, ... are assigned,
i.e
a
b
c
d
here there are two \n after b, so these are also counted as field separators by FS. So finally $1=a, $2=b, $3 and $4 are null, $5=c and $6=d.
I think this clarifies why we are not getting the desired output.
Hi,
I want to separate the fields in a file using '~'.
I am using FS="'~'", but it's not working.
This is actually very useful!!
Many thanks!!
Good pages to know about AWK
Hi All,
I have two files A and B files
>cat A
empid empname deptid
1 bond 10
2 james 11
> cat B
deptid deptname
10 IT
11 HR
From the above two files, I want a third file C which should contain output like the below:
>cat C
empid empname deptname
1 bond IT
2 james HR
Please help me with how to get the above output in Unix shell scripting.
I have 2 files of the same data, 2nd file is more recent with some values changed.
File one contains
City,174,533,1,11,0,1,0,48,m,,1200174568,1200176241
City,627,34,1,ronaldo,0,1777,0,0,m,primus,1200119084,1200120025
File two contains
City,14,153,1,11,0,1,0,485,m,,1200174568,1200176241
City,627,34,1,ronaldo,0,1777,0,0,m,primus,1200119084,1200120025
City,748,332,1,janana,0,1811,0,14,m,,1200169533,1200171129
The files are CSV. The fields that are not updated are $0, $9, $11, $12; the others can change. What I want to do is compare the 2 files: search file 2 for $11, then look in file 1 to see if it is there, and if any of the data in $1, $2, $3, $4, $5, $6, $7, $8, or $10 has changed, output that record to another file. Also, if file 2 has a record that does not show up in file 1, print that to the output file as well.
So the output would look like this
City,14,153,1,11,0,1,0,485,m,,1200174568,1200176241
City,748,332,1,janana,0,1811,0,14,m,,1200169533,1200171129
seeing how in line 1, $1, $2, and $8 changed
line 2 is a new entry not in file 1
I hope this makes sense, and thanks in advance for any help someone can give me on this; it's driving me bonkers.
This is a very nice post. Could you please explain the awk code below? I have never been a programmer.
[root@spacelab1 ~]# fuser -cu /apps01
/apps01: 859ce(oracle) 882ce(oracle) 1156ce(oracle) 2847ce(oracle) 2882ce(oracle) 3034ce(oracle) 3290ce(oracle) 3560ce(oracle) 6104ce(oracle)
7663ce(oracle) 8260ce(oracle) 8312ce(oracle) 8547ce(oracle) 8549ce(oracle) 8551ce(oracle) 8555ce(oracle) 8557ce(oracle) 8559ce(oracle) 8561ce(oracle) 8563ce
[root@spacelab1 ~]# fuser -cu /apps01 2>/dev/null | awk '{ for (i=1; i<=NF; i++) print $i }' | more
859
882
1156
3034
3290
3560
6104
Thanks Much!
-Karn
Dear All, sample 3, the RS="\n\n" does not work on my AIX box.
Regards,
Sams
Dear Shiva,
This is a solution for your query. It may not be the most apt solution, but I put in my effort to get it, and it is working fine.
grep -i "bond" a | awk 'BEGIN {print "empid","empname"}{print $1,"\t",$2}' >> aa
cat aa
empid empname
1 bond
grep -i "it" b | awk 'BEGIN {print "DEPTNAME"}{print $2}' >>bb
cat bb
DEPTNAME
IT
awk 'NR==FNR{_[NR]=$0;next}{print $1,$2,_[FNR]}' bb aa
empid empname DEPTNAME
1 bond IT
Thank you shiva for posting such a query. This query is similar to the pl/sql query.
How to delete a particular line from a file using AWK?
For example, my input is given below (I want to delete a particular line); in my case, line no. 3 should be deleted.
Sample Input:
“Unix is multitasking and multiuser system
Unix is open source
Unix contain lot of commands
Unix is my favourite subject”
so I want to get below output
“Unix is multitasking and multiuser system
Unix is open source
Unix is my favourite subject”
An awk one-liner can do the job for Shiva.
awk 'BEGIN {while (getline < "b") {arr[$1]=$2}} { print $1 ,$2, arr[$3] }' a
empid empname deptname
1 bond IT
2 james HR
3 john RnD
First, read file b into an awk associative array with arr[$1]=$2 in the BEGIN block. This becomes your look-up table for the main program, which prints from file a: $1, $2 and arr[$3] instead of just $3.
awk can open another file b, read each line using getline, and populate the array arr before reading from file a. You can add close("b") in the BEGIN block for good programming practice.
Hi,
I have a CDR file with fields separated by '|'. Each line is a single CDR. Some CDRs have 54 fields and some have 41 fields.
I want to get only the CDRs with 41 fields.
I can count the lines by number of fields with the command below, but cannot find the lines themselves. Can you please help?
zcat push.cdr.20160216.105825.gz | awk -F '|' '{print NF}' | sort -r | uniq -c
zcat x.gz | awk -F'|' 'NF==41'
-F'|' sets the field separator variable FS so that fields are delimited by the pipe.
Then, use NF==41 as the awk pattern/condition; matching lines are printed by default.
To print specific fields you can use
zcat x.gz | awk -F'|' 'NF==41 { print $1 $2 }' # where $1 and $2 are the fields you want printed.
Awk will read the unzipped file from the output of the zcat command and split the lines one by one with the pipe as delimiter and assign the number of fields to NF, automatically. Whenever the condition NF==41 is met, it will print the output line by default. If you specify the print command, with specific field variables, in this case the first and second variable $1 and $2 respectively, columns one and two will be printed.
Example 3 is working for me as given below:
# cat student.awk
BEGIN {
RS = "";
FS = "\n";
}
{
print $1,$2;
}
#awk -f student.awk student.txt
Jones 2143
Gondrol 2321
RinRao 2122
Edwin 2537
Dayan 2415
Hands down the best article on awk variables I’ve come across…Keep up the good work !!!