This article is part of the ongoing Awk Tutorial and Examples series. Like other programming languages, Awk has both user-defined variables and built-in variables.
In this article, let us review how to define and use user-defined awk variables.
- An awk variable name must begin with a letter or an underscore, and may continue with letters, digits, or underscores.
- Keywords cannot be used as awk variable names.
- Awk variables do not need to be declared, unlike in many other programming languages; a variable comes into existence the first time it is used.
- It is good practice to initialize awk variables in the BEGIN section, which is executed only once, before any input is read.
- There are no data types to declare in Awk. Whether an awk variable is treated as a number or as a string depends on the context in which it is used.
Now let us review a few simple examples to learn how to use user-defined awk variables.
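The rules above can be seen in a small sketch. The variable names count, label, x and y are made up for this illustration and are not part of the examples in this article.

```shell
# Run a BEGIN-only awk program, so no input file is needed.
result=$(awk 'BEGIN {
    count = count + 1      # count was never declared; unset, it acts as 0 here
    label = "item" count   # string context: the number 1 is converted to "1"
    x = "3"                # x holds a string ...
    y = x + 4              # ... which is converted to a number by the addition
    print label, y
}')
echo "$result"
```

The same variable can therefore act as a number in one expression and as a string in the next, without any declaration.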
Awk Example 1: Billing for Books
In this example, the input file bookdetails.txt contains records with four fields: item number, book name, quantity, and rate per book.
$ cat bookdetails.txt
1 Linux-programming 2 450
2 Advanced-Linux 3 300
3 Computer-Networks 4 400
4 OOAD&UML 3 450
5 Java2 5 200
The following Awk script reads and processes the above bookdetails.txt file and generates a report that displays the amount for each book sold and the total amount for all the books sold.
So far we have seen Awk read its commands from the command line, but Awk can also read commands from a file using the -f option.
Syntax: $ awk -f script-filename inputfilename
Now our Awk script for billing calculation for books is given below.
$ cat book-calculation.awk
BEGIN {
    total=0;
}
{
    itemno=$1;
    book=$2;
    bookamount=$3*$4;
    total=total+bookamount;
    print itemno," ", book,"\t","$"bookamount;
}
END {
    print "Total Amount = $"total;
}
In the above script,
- The Awk BEGIN section initializes the variable total. itemno, total, book and bookamount are user-defined awk variables.
- In the Awk action section, the quantity multiplied by the rate ($3*$4) is stored in the variable bookamount, and each bookamount is added to total.
- Finally, in the Awk END section, total holds the total amount for all the books.
Now execute the book-calculation.awk script to generate the report that displays the amount for each book and the total amount, as shown below.
$ awk -f book-calculation.awk bookdetails.txt
1 Linux-programming $900
2 Advanced-Linux $900
3 Computer-Networks $1600
4 OOAD&UML $1350
5 Java2 $1000
Total Amount = $5750
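The print statement in the script separates its comma-separated arguments with the output field separator, so the columns line up only roughly. As an optional variation, not part of the original script, the same user-defined variables can be printed with printf for fixed-width columns. The column widths below are arbitrary choices, and only two of the book records are fed in to keep the sketch short.

```shell
# Feed two sample records to awk on standard input.
report=$(printf '%s\n' \
    '1 Linux-programming 2 450' \
    '2 Advanced-Linux 3 300' |
awk '
{
    bookamount = $3 * $4                 # quantity * rate
    total += bookamount                  # same as total = total + bookamount
    printf "%-3d %-20s $%d\n", $1, $2, bookamount
}
END { printf "Total Amount = $%d\n", total }')
echo "$report"
```

The `-` flag in `%-3d` and `%-20s` left-justifies each value within its column.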
Awk Example 2: Student Mark Calculation
In this example, create an input file “student-marks.txt” with the following content: student name, roll number, Test1 score, Test2 score and Test3 score.
$ cat student-marks.txt
Jones 2143 78 84 77
Gondrol 2321 56 58 45
RinRao 2122 38 37 65
Edwin 2537 78 67 45
Dayan 2415 30 47 20
The following Awk script calculates and reports the average marks of each student, along with the average of the Test1, Test2 and Test3 scores.
$ cat student.awk
BEGIN {
    test1=0;
    test2=0;
    test3=0;
    print "Name\tRollNo\t Average Score";
}
{
    total=$3+$4+$5;
    test1=test1+$3;
    test2=test2+$4;
    test3=test3+$5;
    print $1"\t"$2"\t",total/3;
}
END{
    print "Average of Test1="test1/NR;
    print "Average of Test2="test2/NR;
    print "Average of Test3="test3/NR;
}
In the above Awk script,
- In the Awk BEGIN section, the running totals test1, test2 and test3 are initialized to zero. test1, test2, test3 and total are user-defined awk variables.
- In the Awk action section, $3, $4 and $5 are the Test1, Test2 and Test3 scores respectively. The variable total is the sum of the three test scores for each student. The awk variables test1, test2 and test3 accumulate the total score of each corresponding test across all students.
- In the Awk END section, dividing each test total by the total number of records (i.e. students) gives the average score. NR is an Awk built-in variable that holds the number of records read so far, so in the END section it is the total number of records in the input.
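The division by NR in the END section can be sketched on its own as follows. The NR > 0 guard and the "No records read" message are additions for this illustration; the original script does not include them, and only two of the student records are used here.

```shell
# The awk program is kept in a shell variable so it can be run twice.
prog='
{ test1 += $3 }                  # accumulate the Test1 column
END {
    if (NR > 0)                  # guard against dividing by zero on empty input
        print "Average of Test1=" test1 / NR
    else
        print "No records read"
}'

average=$(printf '%s\n' 'Jones 2143 78 84 77' 'Gondrol 2321 56 58 45' | awk "$prog")
empty=$(awk "$prog" < /dev/null)
echo "$average"
echo "$empty"
```

With two records read, NR is 2 in the END section, so the first run divides 134 by 2.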
Awk Example 3: HTML Report for Student Details
In the above two examples, we have seen awk variables that hold numbers as their values. This example shows an awk script that generates an HTML report of the students' names and roll numbers.
$ cat string.awk
BEGIN{
    title="AWK";
    print "<html>\n<title>"title"</title><body bgcolor=\"#ffffff\">\n<table border=1><th colspan=2 align=centre>Student Details</th>";
}
{
    name=$1;
    rollno=$2;
    print "<tr><td>"name"</td><td>"rollno"</td></tr>";
}
END {
    print "</table></body>\n</html>";
}
Use the same student-marks.txt input file that we created in the above example.
$ awk -f string.awk student-marks.txt
<html>
<title>AWK</title><body bgcolor="#ffffff">
<table border=1><th colspan=2 align=centre>Student Details</th>
<tr><td>Jones</td><td>2143</td></tr>
<tr><td>Gondrol</td><td>2321</td></tr>
<tr><td>RinRao</td><td>2122</td></tr>
<tr><td>Edwin</td><td>2537</td></tr>
<tr><td>Dayan</td><td>2415</td></tr>
</table></body>
</html>
We can save the above output to a file and open it in a browser, which displays the following HTML table. In the above script, the variables name and rollno are treated as strings, because they are used in a string context (concatenation with the HTML tags).
Student Details | |
---|---|
Jones | 2143 |
Gondrol | 2321 |
RinRao | 2122 |
Edwin | 2537 |
Dayan | 2415 |
Recommended Reading
Sed and Awk 101 Hacks, by Ramesh Natarajan. I spend several hours a day in a UNIX / Linux environment dealing with text files (data, config, and log files). I use Sed and Awk for all my text manipulation work. Based on my Sed and Awk experience, I’ve written the Sed and Awk 101 Hacks eBook, which contains 101 practical examples on various advanced features of Sed and Awk that will enhance your UNIX / Linux life. Even if you’ve been using Sed and Awk for several years and have not read this book, please do yourself a favor and read it. You’ll be amazed by the capabilities of the Sed and Awk utilities.
hi,
Can the AWK command work on data that is comma separated or for that matter any other special character(say pipe)?
Anything can be used as the separator by assigning that value to the FS variable, either in the BEGIN block or on the command line:
awk -F\| '….' ## assignment on the command line
awk 'BEGIN {FS = "|"} …' ## assignment in the BEGIN block
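Both forms can be tried on a single sample line. This is a minimal sketch; the sample records and the shell variable names csv_field and pipe_field are made up for the illustration.

```shell
# Comma-separated input: -F sets FS from the command line.
csv_field=$(printf 'Jones,2143,78\n' | awk -F, '{ print $2 }')

# Pipe-separated input: FS is assigned in the BEGIN block.
pipe_field=$(printf 'Jones|2143|78\n' | awk 'BEGIN { FS = "|" } { print $2 }')

echo "$csv_field $pipe_field"
```

In both cases $2 is the second field as split by the chosen separator.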
Thanks Chris.
You can even work with data files with no delimiters by using substring function.
Eg.
name=substring($0,1,18);
id=substring($0,19,5);
,etc.
where $0 refers to the whole line.
$0 refers to the whole line of input data.
There is no substring function; the function is substr().
Thanks Chris. Sorry for the slip up.
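With the corrected function name, the fixed-width idea from this exchange can be sketched as below. The column widths 18 and 5 come from the comment above; the sample record and the trimming of padding with sub() are assumptions added for the illustration.

```shell
# Build one fixed-width record: an 18-character name column, then a 5-character id column.
record=$(printf '%-18s%-5s\n' 'Jones' '2143')

parsed=$(printf '%s\n' "$record" | awk '{
    name = substr($0, 1, 18)     # characters 1-18 of the whole line
    id   = substr($0, 19, 5)     # characters 19-23 of the whole line
    sub(/ +$/, "", name)         # strip the trailing padding
    sub(/ +$/, "", id)
    print name ":" id
}')
echo "$parsed"
```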
awwww(k), people should really be using perl(*)
(*) Or other appropriate, but modern, interpreter. python, whatever pleases you.
I want to sort and remove a huge data file with respect to delimiter.
boo.
“If you have an awk script that you don’t want anyone to be able to understand, just rewrite it in perl”
— Ed Morton in comp.unix.shell
You don’t use awk to sort a file; use the correct tool, sort.
Thanks a ton as usual! Learning a lot of cool stuff from the geek stuff 🙂
How to analyse data using multiple files in awk
Ankit, what data do you want to analyse and what information do you want from it?
No i’ve not got any specific project. I just want to try that out and hence asked for the same.
what does awk -v do?
man awk:
-v var=val
Assign the value val to the variable var, before execution of
the program begins. Such variable values are available to the
BEGIN block of an AWK program.
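A short sketch of -v in use, with made-up variable names (name, min) and sample data: the first command reads the variable in the BEGIN block, the second uses it as a threshold in a pattern.

```shell
# The -v assignment is already visible in the BEGIN block.
greeting=$(awk -v name="Jones" 'BEGIN { print "Hello, " name }')

# The -v value is used as a numeric threshold while reading records.
passed=$(printf '%s\n' 'Jones 78' 'Dayan 30' 'Edwin 67' |
    awk -v min=50 '$2 >= min { print $1 }')

echo "$greeting"
echo "$passed"
```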
i disagree with the commenter who said that one should right away use perl/python to solve their problems. I love perl/python but that commenter obviously has little or limited idea of what happens over the command-line which sysadmins interact with every day every minute across AIX, Linux, Solaris, Freebsd and etc. DO NOT right away solve and decide to use perl/python. Understand first what is available with awk/sed/m4/ex/ed then scale up to python/perl when needed..
Useful for me to know what is AWK..
Hi,
I have a question related to awk-sorting of data files that are located in different folders.
I have got multiple data files (>500 in number) all with the same name ‘file.dat’ but located under different folders (e.g. ~/amit/folder1/file.dat, ~/amit/folder2/file.dat, etc.). Each of these files are essentially one-columned and have the same number of entries (that is, rows). What I want to do is to collect all these files serially and create a new file ‘fileout.dat’ which has file.dat from folder 1 followed by file.dat from folder 2, etc. Had this been a small finite number of files, I could have simply used ‘cat ~/amit/folder/file.dat ~/amit/folder2/file.dat > fileout.dat’ but this is out of the question with more than 500 files to deal with.
Any ideas on how to get this done?
Cheers,
Amit
Thanks a lot.. Very useful article
Use
cat ~/amit/folder*/file.dat > fileout.dat
Use
find / -type f -name file.dat | xargs cat > fileout.dat
If you want to restrict your search use -mindepth -maxdepth option with a find command
in my script, i have a variable var1="Example.001". in my awk, i want to be able to send output within awk. In my "if" statement, i check for a condition "if (column1=="ok"); then print $0 >> "/tmp/script.$var1.results". This appears to not work. Does anyone know how to do what i am attempting? what i have to do is actually type in the full path/filename, (print $0 >> "/tmp/script.Example.001.results") which makes my script(s) not quite as easy to manage. i have a bunch of scripts that have the same commands. my $var1 never expands.. even if i include ticks around it.. any help is appreciated.
tia
Ahh.. i have found a solution!
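The commenter does not say what the solution was. One common approach, sketched here as an assumption, is to build the file name in the shell and hand it to awk with -v, because a shell variable inside a single-quoted awk program is never expanded. The scratch directory from mktemp and the sample input lines are made up for the illustration; the original comment writes under /tmp directly.

```shell
var1="Example.001"
outdir=$(mktemp -d)                    # scratch directory for the demo output

printf '%s\n' 'ok first' 'bad second' 'ok third' |
awk -v out="$outdir/script.$var1.results" '
    $1 == "ok" { print $0 >> out }     # out is an awk variable holding the file name
'

saved=$(cat "$outdir/script.$var1.results")
echo "$saved"
rm -rf "$outdir"                       # clean up the scratch directory
```

The double-quoted -v argument is expanded by the shell before awk starts, so awk only ever sees the finished file name.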