Grep and AWK


GREP

  

grep searches the named input files (or standard input if no files are named, or if the file name - is given) for lines containing a match to the given pattern. By default, it prints the matching lines.

Examples

$ grep 'fred' /etc/passwd

 This command searches for all occurrences of the text string 'fred' within the "/etc/passwd" file. It will find and print (on the screen) all of the lines in this file that contain the text string 'fred', including lines that contain usernames like "fred" - and also "alfred".

-----

$ grep '^fred' /etc/passwd

This command also searches for the text string 'fred' within the "/etc/passwd" file, but the caret character (^) anchors the match to the beginning of the line, so the "f" in "fred" must be in the first column of the record. With this more advanced search, a user named "alfred" would not be matched, because that line begins with the letter "a".
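To see the difference the caret makes without touching the real /etc/passwd, the two searches can be tried on a small stand-in file. This is a minimal sketch; the two sample records and the variable names are made up for illustration.

```shell
# Hypothetical sample data standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
alfred:x:1001:1001::/home/alfred:/bin/sh
fred:x:1002:1002::/home/fred:/bin/sh
EOF

# Unanchored: both lines contain the string "fred" (-c counts matching lines).
unanchored=$(grep -c 'fred' "$sample")

# Anchored: only the line that starts with "fred" matches.
anchored=$(grep '^fred' "$sample")

echo "unanchored matches: $unanchored"
echo "anchored match: $anchored"
rm -f "$sample"
```

The unanchored search counts two lines, while the anchored one returns only the record for fred.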

 -----

$ grep 'joe' *

This command searches for all occurrences of the text string 'joe' within every (non-hidden) file in the current directory. The shell expands * into the list of file names before grep runs.

When passing a regular expression to grep, we must be careful: the shell processes its own metacharacters on the command line before grep ever sees the pattern.

For example, to find every line that contains a dollar sign ($) in a file, we cannot just type

$ grep $ filename

To the shell, '$' introduces a variable reference, so an unquoted '$' followed by a name would be replaced by that variable's value (or by nothing, if the variable is unset) before grep runs. A lone '$' does reach grep unchanged, but the command still does not do what we want, and neither do these:

$ grep '$' filename

$ grep \$ filename

The quotes or the backslash turn off the dollar sign's special meaning for the shell, but the dollar sign also has a special meaning in regular expressions. Recall from the section "Regular Expression" that it matches the end of a line, so each of these commands matches every line in the file. The correct way to do this is:

 

$ grep '\$' filename
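The two levels of quoting can be checked on a throwaway file. The sketch below is only an illustration: the file contents and variable names are invented, and the last command uses grep's -F option (treat the pattern as a fixed string rather than a regular expression) as an alternative to escaping.

```shell
# Hypothetical two-line test file; only the first line contains a dollar sign.
prices=$(mktemp)
printf '%s\n' 'cost: $5' 'cost: free' > "$prices"

# Unescaped in the regex: $ is the end-of-line anchor, so every line matches.
all_lines=$(grep -c '$' "$prices")

# Single quotes protect the pattern from the shell; the backslash protects $ from grep.
dollar_lines=$(grep '\$' "$prices")

# With -F the pattern is a literal string, so no backslash is needed.
fixed_lines=$(grep -F '$' "$prices")

echo "lines matched by '\$' as an anchor: $all_lines"
echo "lines with a literal dollar sign: $dollar_lines"
rm -f "$prices"
```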


The wildcard character "." matches any single character. Therefore,

$ grep 'eur.' fft.c

would find lines in the file fft.c that contain eureka, or amateur or chauffeur followed by at least one more character on the same line.

A bracket expression matches any one of the characters placed inside the square brackets.

$ grep '[cm]an' fft.c

would find lines containing the sequence can or man, but would not match sequences like ran or and.
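A short word list makes the behavior of "." and of square brackets easy to check. This is a sketch with an invented word list (one word per line) in place of fft.c. Note that "." must match an actual character, so a word ending in "eur" at the very end of a line is not matched.

```shell
# Hypothetical word list, one word per line.
words=$(mktemp)
printf '%s\n' 'eureka' 'amateur' 'amateurs' 'can' 'man' 'ran' > "$words"

# "eur." needs some character after "eur": matches eureka and amateurs,
# but not amateur, where "eur" is followed by the end of the line.
dot_matches=$(grep 'eur.' "$words")

# "[cm]an" matches "can" or "man", but not "ran".
bracket_matches=$(grep '[cm]an' "$words")

echo "$dot_matches"
echo "$bracket_matches"
rm -f "$words"
```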

  

The options for grep:

grep also accepts a number of options. We will look at -c, -h, -i, -l and -n.

-c Display only a count of the matching lines for each file.

$ grep -c 'the' t* | more

testfile:5
testfile10:2
testfile2:1

-h Do not print file names. When more than one file is given on the command line, grep normally precedes each matching line with the name of the file containing it; -h suppresses that prefix.

-i Ignore the case of letters.

-l List only the names of the files that contain a matching line.

$ grep -l 'the' t*

teste.c
testefile3
testfile

-n Precede each matching line with its line number in the file.
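The five options can be compared side by side on two small files. The following is a minimal sketch; the directory, the file names a.txt and b.txt, and their contents are all assumptions made for the demonstration.

```shell
# Hypothetical scratch directory with two small files.
dir=$(mktemp -d)
printf '%s\n' 'The cat' 'saw the dog' 'and ran' > "$dir/a.txt"
printf '%s\n' 'no match' > "$dir/b.txt"
cd "$dir" || exit 1

count=$(grep -c 'the' a.txt b.txt)   # one count per file, including zero counts
nocase=$(grep -c -i 'the' a.txt)     # -i also counts the line starting with "The"
names=$(grep -l 'the' a.txt b.txt)   # only names of files with at least one match
numbered=$(grep -n 'the' a.txt)      # matching line prefixed with its line number
bare=$(grep -h 'the' a.txt b.txt)    # matching line without the file name prefix

printf '%s\n' "$count" "$nocase" "$names" "$numbered" "$bare"

cd - > /dev/null || exit 1
rm -rf "$dir"
```

With more than one file on the command line, -c prints a count for every file, including files with no matching lines.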

   

Links

Searching with the Grep Search Engine

AWK

 

The awk utility interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs with just a few lines of code. AWK is a pattern-scanning and processing language.

When you run awk, you specify an awk program that tells awk what to do. The program consists of a series of rules. (It may also contain function definitions.) Each rule specifies one pattern to search for and one action to perform when that pattern is found.

Syntactically, a rule consists of a pattern followed by an action. The action is enclosed in curly braces to separate it from the pattern. Rules are usually separated by newlines. Therefore, an awk program looks like this:

 

pattern { action }
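As a concrete instance of the pattern { action } form, here is a small sketch with two rules separated by a newline. The input lines and the printed labels are invented for illustration; $0 is awk's name for the whole current input line.

```shell
# Two rules: each pattern is a regular expression between slashes,
# and each action runs only for input lines that match its pattern.
result=$(printf '%s\n' 'fred 10' 'joe 20' 'alfred 30' |
  awk '/^fred/ { print "starts with fred: " $0 }
       /joe/   { print "contains joe: " $0 }')

echo "$result"
```

The line for alfred matches neither pattern, so no action runs for it and nothing is printed.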

 

How to Run awk Programs

  There are several ways to run an awk program. If the program is short, it is easiest to include it in the command that runs awk, like this:

 

awk 'program' input-file1 input-file2 ...

 

where program consists of a series of patterns and actions, as described above. When the program is long, it is usually more convenient to put it in a file and run it with a command like this:

 

awk -f program-file input-file1 input-file2 ...

 

Consider the first form again:

awk 'program' input-file1 input-file2 ...

This command format instructs the shell, or command interpreter, to start awk and use the program to process records in the input file(s). The single quotes around program keep the shell from interpreting any awk characters as special shell characters. They also cause the shell to treat all of program as a single argument for awk, and they allow program to be more than one line long.
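The effect of the single quotes can be sketched with a program that spans several lines and uses $1, which awk reads as the first field of the current record. The input text and the variable name are assumptions for illustration. Inside single quotes the shell leaves $1 alone, so awk receives it intact.

```shell
# The whole program, newlines included, reaches awk as one argument.
first_fields=$(printf '%s\n' 'alpha beta' 'gamma delta' |
  awk '{
    # Inside single quotes the shell does not expand $1,
    # so awk sees it and prints the first field of each record.
    print $1
  }')

echo "$first_fields"
```

Had the program been placed in double quotes, the shell would have replaced $1 with its own first positional parameter before awk started.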

 

Running awk without Input Files

 

You can also run awk without any input files. If you type the command line:

 

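awk 'program'

awk applies the program to standard input, which usually means whatever you type at the terminal. This continues until you signal end-of-file by typing Control-d. In the following session, the first line of text is typed by the user and the second is printed by awk:

$ awk '{ print }'
Now is the time for all good men
Now is the time for all good men
Control-d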
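Because awk simply reads standard input when no file is named, the same session can be reproduced non-interactively by piping text into it. This is a small sketch; the pipe stands in for typing at the terminal, and the end of the piped data plays the role of Control-d.

```shell
# With no input file named, awk reads whatever arrives on standard input.
echoed=$(printf '%s\n' 'Now is the time for all good men' | awk '{ print }')

echo "$echoed"
```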

 

Running Long Programs

Sometimes your awk programs can be very long. In this case it is more convenient to put the program into a separate file. To tell awk to use that file for its program, you type:

 

$ awk -f source-file input-file1 input-file2 ...

 

The `-f' instructs the awk utility to get the awk program from the file source-file. Any file name can be used for source-file. For example, you could put the program:

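BEGIN { print "Don't Panic!" }

into the file `advice'. Then this command:

$ awk -f advice

prints the message Don't Panic! and exits. Because the program contains only a BEGIN rule, awk does not read any input.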
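The advice example can be run end to end as follows. This is a sketch that writes the program file into a temporary directory (an assumption, so that nothing is left behind) and then runs it with -f.

```shell
# Write the one-line awk program into a file named advice.
workdir=$(mktemp -d)
cat > "$workdir/advice" <<'EOF'
BEGIN { print "Don't Panic!" }
EOF

# -f tells awk to read its program from the file instead of the command line.
message=$(awk -f "$workdir/advice")

echo "$message"
rm -rf "$workdir"
```

Keeping the program in a file also avoids shell quoting problems with the apostrophe in "Don't", which would otherwise end a single-quoted program on the command line.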

 

 

 


This page was last modified September 26, 2004
wmorales@pcc.edu