Tuesday, May 17, 2011

Grep, Awk, Sed


Grep

Features:
1. The ability to parse lines based on text and/or RegExes
2. Post-processor - commonly filters the output of other programs via a pipe
3. Searches case-sensitively, by default
4. Searches for the text anywhere on the line


1. grep 'linux' grep1.txt
2. grep -i 'linux' grep1.txt - case-insensitive search
3. grep '^linux' grep1.txt - uses '^' anchor to anchor searches at the beginning of lines
4. grep -i '^linux' grep1.txt
5. grep -i 'linux$' grep1.txt - uses '$' anchor to anchor searches at the end of lines

Note: Anchors are RegEx characters (meta-characters). They're used to match at the beginning and end of lines
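The searches above can be tried on a throwaway file. This is only a sketch: the four sample lines are made up and stand in for 'grep1.txt', whose real contents are not shown here. 'grep -c' is used to count matching lines instead of printing them.

```shell
# Hypothetical sample data standing in for grep1.txt
sample=$(mktemp)
printf '%s\n' 'linux is free' 'I run Linux' 'Linux rocks' 'we like linux' > "$sample"

# Case-sensitive, anywhere on the line: only lowercase 'linux' matches
plain=$(grep -c 'linux' "$sample")

# Case-insensitive: 'linux' and 'Linux' both match
nocase=$(grep -ci 'linux' "$sample")

# '^' anchors the match at the beginning of the line
at_start=$(grep -c '^linux' "$sample")

# '$' anchors the match at the end of the line (case-insensitive here)
at_end=$(grep -ci 'linux$' "$sample")

echo "$plain $nocase $at_start $at_end"   # prints: 2 4 1 2
rm -f "$sample"
```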

6. grep '[0-9]' grep1.txt - returns lines containing at least 1 digit
7. grep '[a-z]' grep1.txt - returns lines containing at least 1 lowercase letter


8. rpm -qa | grep grep - searches the package database for programs named 'grep'

9. rpm -qa | grep -i xorg | wc -l - returns the number of packages with 'xorg' in their names

10. grep sshd messages
11. grep -v sshd messages - performs an inverted search (all but 'sshd' entries will be returned)
12. grep -v sshd messages | grep -v gconfd - returns all entries except those containing 'sshd' or 'gconfd'
13. grep -C 2 sshd messages - returns 2 lines of context above and below each matching line
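A small sketch of the inverted and context searches. The log lines below are invented for illustration and are not taken from a real 'messages' file; '-C 1' is used instead of '-C 2' only because the sample is so short.

```shell
# Hypothetical log standing in for 'messages'
log=$(mktemp)
printf '%s\n' \
  'May 17 10:00:01 host cron[1]: job started' \
  'May 17 10:00:02 host sshd[2]: Accepted password' \
  'May 17 10:00:03 host gconfd[3]: starting' \
  'May 17 10:00:04 host kernel: eth0 up' \
  'May 17 10:00:05 host cron[4]: job finished' > "$log"

# Inverted search: count the lines that do NOT contain 'sshd'
no_sshd=$(grep -vc sshd "$log")

# Chained inverted searches: neither 'sshd' nor 'gconfd'
neither=$(grep -v sshd "$log" | grep -vc gconfd)

# Context: the matching line plus 1 line above and 1 line below
context=$(grep -C 1 sshd "$log" | wc -l | tr -d ' ')

echo "$no_sshd $neither $context"   # prints: 4 3 3
rm -f "$log"
```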

Note: Most, if not all, Linux programs log linearly, which means one line after another, from the earliest entry to the most recent

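Note: Use single or double quotes to specify RegExes, so that the shell does not interpret the meta-characters
Also, use 'egrep' (equivalent to 'grep -E') when extended RegExes such as '+', '?' or '|' are being used. Plain 'grep' already handles basic RegExes such as anchors and character classes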
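To illustrate where the extended syntax matters, here is a sketch using '|' (alternation). With plain 'grep' the '|' is treated as a literal character, while 'grep -E' (what 'egrep' runs) treats it as "or". The sample lines are made up.

```shell
# Hypothetical sample lines
log=$(mktemp)
printf '%s\n' 'sshd: login' 'gconfd: start' 'kernel: eth0 up' > "$log"

# Basic RegEx: '|' is literal, so no line matches
# ('|| true' keeps the script going, since grep exits non-zero on no match)
basic=$(grep -c 'sshd|gconfd' "$log" || true)

# Extended RegEx: '|' means "either side", so two lines match
extended=$(grep -Ec 'sshd|gconfd' "$log")

echo "$basic $extended"   # prints: 0 2
rm -f "$log"
```

One design note: 'grep -Ev "sshd|gconfd"' gives the same result as chaining two 'grep -v' commands, in a single process.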

Awk

Features:
1. Field/Column processor
2. Supports egrep-compatible (POSIX) RegExes
3. Can return full lines like grep
4. Awk runs 3 steps:
a. BEGIN - optional
b. Body, where the main action(s) take place
c. END - optional
5. Multiple body actions can be executed by separating them using semicolons. e.g. '{ print $1; print $2 }'
6. Awk auto-loops through the input stream line by line, regardless of the source of the stream. e.g. STDIN, Pipe, File
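The 3 steps can be sketched as follows. The input lines and the 'sum' variable are made up for illustration. BEGIN runs once before any input is read, the body runs once per input line, and END runs once after the last line.

```shell
result=$(printf '%s\n' 'alpha 1' 'beta 2' 'gamma 3' |
  awk 'BEGIN { print "start" }           # runs once, before the first line
       { sum += $2; print $1 }           # two body actions, run for every line
       END { print "total", sum }')      # runs once, after the last line

echo "$result"
# prints:
# start
# alpha
# beta
# gamma
# total 6
```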


Usage:
1. awk '/optional_match/ { action }' file_name | Pipe
2. awk '{ print $1 }' grep1.txt

Note: Use single quotes with awk, to avoid shell interpolation of awk's variables
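A quick sketch of why the quoting matters. Inside double quotes the shell replaces '$1' with its own first positional parameter before awk ever sees the program. 'set --' is used here only to make sure that parameter is empty for the demonstration.

```shell
set --   # clear the shell's positional parameters, so the shell's $1 is empty

# Single quotes: awk receives { print $1 } and prints the first field
single=$(printf 'a b\n' | awk '{ print $1 }')

# Double quotes: the shell expands $1 to nothing, awk receives { print  }
# and prints the whole line instead
double=$(printf 'a b\n' | awk "{ print $1 }")

echo "$single"   # prints: a
echo "$double"   # prints: a b
```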

3. awk '{ print $1,$2 }' grep1.txt

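Note: The default input field separator is whitespace (any run of spaces and/or tabs). The default output field separator is a single space, which is what the comma in 'print $1,$2' inserts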
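A short sketch of the separator behaviour, using made-up input. 'OFS' is awk's built-in output field separator variable.

```shell
# Runs of spaces and tabs all count as one input separator;
# the comma in print puts the output separator (a single space) between fields
with_comma=$(printf 'one   two\tthree\n' | awk '{ print $1,$2 }')

# Without the comma the two fields are simply concatenated
no_comma=$(printf 'one two\n' | awk '{ print $1 $2 }')

# Setting OFS in BEGIN changes what the comma inserts
custom=$(printf 'one two\n' | awk 'BEGIN { OFS="-" } { print $1,$2 }')

echo "$with_comma"   # prints: one two
echo "$no_comma"     # prints: onetwo
echo "$custom"       # prints: one-two
```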

4. awk '/linux/ { print } ' grep1.txt - this will print ALL lines containing 'linux'

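5. awk '{ if ($2 ~ /Linux/) print}' grep1.txt - this will print lines whose 2nd field matches 'Linux'

6. awk '{ if ($2 ~ /8/) print }' /var/log/messages - this will print the entire line for log items whose 2nd field (the day of the month in syslog-style logs) contains '8', so the 18th and 28th match as well as the 8th. Use '$2 == 8' to match only the 8th

7. awk '{ print $3 }' /var/log/messages | awk -F: '{ print $1}' - this will print the hour of each log item, by taking the 3rd field (the timestamp) and splitting it on ':'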
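The field tests above can be sketched on a made-up log. This assumes the traditional syslog layout (month, day, time, host, ...), which is not guaranteed for every '/var/log/messages'. It shows the difference between a RegEx match and an exact comparison on the day field, and the two-stage extraction of the hour.

```shell
# Hypothetical syslog-style lines
log=$(mktemp)
printf '%s\n' \
  'May  8 09:15:01 host sshd[1]: a' \
  'May 18 10:20:02 host cron[2]: b' \
  'May 17 11:30:03 host sshd[3]: c' > "$log"

# RegEx match on field 2: both '8' and '18' contain an 8
fuzzy=$(awk '$2 ~ /8/ { n++ } END { print n+0 }' "$log")

# Exact comparison on field 2: only the 8th
exact=$(awk '$2 == 8 { n++ } END { print n+0 }' "$log")

# Hour of each entry: take field 3, then split it on ':' with -F
hours=$(awk '{ print $3 }' "$log" | awk -F: '{ print $1 }')

echo "$fuzzy $exact"   # prints: 2 1
echo "$hours"          # prints 09, 10 and 11 on separate lines
rm -f "$log"
```

The same hour extraction can be done in one awk process with the built-in split(): awk '{ split($3, t, ":"); print t[1] }'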

Sed - Stream Editor

Features:
1. Facilitates automated text editing
2. Supports RegExes (POSIX)
3. Like Awk, supports scripting using the '-f' option, which reads instructions from a script file
4. Supports input via: STDIN, pipe, file
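A minimal sketch of scripting with a sed script file, which is read with the lowercase '-f' option. The script contents and input are made up. The script holds one instruction per line.

```shell
# Hypothetical sed script: delete blank lines, then replace 'linux' with 'unix'
script=$(mktemp)
printf '%s\n' '/^$/d' 's/linux/unix/' > "$script"

# Input arrives via a pipe here; a file name after the script works the same way
out=$(printf 'linux\n\nmore linux\n' | sed -f "$script")

echo "$out"
# prints:
# unix
# more unix
rm -f "$script"
```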

Usage:
1. sed [options] 'instruction[s]' file[s]
2. sed -n '1p' grep1.txt - prints the first line of the file
3. sed -n '1,5p' grep1.txt - prints the first 5 lines of the file
4. sed -n '$p' grep1.txt - prints the last line of the file
5. sed -n '1,3!p' grep1.txt - prints ALL but lines 1-3
6. sed -n '/linux/p' grep1.txt - prints lines with 'linux'
7. sed -e '/^$/d' grep1.txt - prints the document with blank lines removed (the file itself is not changed)
8. sed -e '/^$/d' grep1.txt > sed1.txt - removes blank lines from the contents of 'grep1.txt' and writes the result to 'sed1.txt'
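The printing and deleting commands above can be sketched on a throwaway file. The six sample lines are invented and stand in for 'grep1.txt'. The last check shows that the input file still has all of its lines after the delete.

```shell
# Hypothetical sample file with two blank lines
f=$(mktemp)
printf '%s\n' 'one' '' 'two linux' 'three' '' 'four' > "$f"

first=$(sed -n '1p' "$f")                           # first line
last=$(sed -n '$p' "$f")                            # last line
rest=$(sed -n '1,3!p' "$f" | wc -l | tr -d ' ')     # everything except lines 1-3
kept=$(sed -e '/^$/d' "$f" | wc -l | tr -d ' ')     # lines left after dropping blanks
on_disk=$(wc -l < "$f" | tr -d ' ')                 # the file itself is unchanged

echo "$first $last $rest $kept $on_disk"   # prints: one four 3 4 6
rm -f "$f"
```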

9. sed -ne 's/search/replace/p' sed1.txt - prints only the lines on which a substitution was made
10. sed -ne 's/linux/unix/p' sed1.txt
11. sed -i.bak -e 's/3/4/' sed1.txt - this backs up the original file as 'sed1.txt.bak' and edits 'sed1.txt' in place, replacing the first '3' on each line with '4'

Note: Generally, to create new files, redirect sed's output to a file, instead of letting it go to STDOUT (the terminal)
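A sketch of substitution and in-place editing, run in a temporary directory with made-up file contents. Note that '-i' is not part of POSIX sed; the attached-suffix form '-i.bak' used here is accepted by both GNU and BSD sed, but that is worth checking on other systems.

```shell
# Hypothetical working copy of sed1.txt in a scratch directory
dir=$(mktemp -d)
printf '%s\n' 'linux 3' 'bsd 3 3' 'plain' > "$dir/sed1.txt"

# -n plus the 'p' flag: print only the lines where the substitution happened
changed=$(sed -ne 's/linux/unix/p' "$dir/sed1.txt")

# Edit in place, keeping the original as sed1.txt.bak
# (without a 'g' flag only the first '3' on each line is replaced)
sed -i.bak -e 's/3/4/' "$dir/sed1.txt"

edited=$(sed -n '2p' "$dir/sed1.txt")
backup=$(sed -n '2p' "$dir/sed1.txt.bak")

echo "$changed"   # prints: unix 3
echo "$edited"    # prints: bsd 4 3
echo "$backup"    # prints: bsd 3 3
rm -rf "$dir"
```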

Note: Sed applies each instruction to each line
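As a small illustration of the note above, two instructions given with separate '-e' options are both applied, in order, to every input line. The input is made up.

```shell
out=$(printf 'linux 3\nplain 3\n' | sed -e 's/linux/unix/' -e 's/3/4/')

echo "$out"
# prints:
# unix 4
# plain 4
```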