Part III.

pipes and commands


Chapter 1. I/O redirection
One of the powers of the Unix command line is the use of input/output redirection and
pipes.

This chapter explains redirection of input, output and error streams.


1.1. stdin, stdout, and stderr


The bash shell has three basic streams; it takes input from stdin (stream 0), it sends output
to stdout (stream 1) and it sends error messages to stderr (stream 2).

The drawing below has a graphical interpretation of these three streams.

stdin (0) --> bash --> stdout (1)
                   --> stderr (2)

The keyboard often serves as stdin, whereas stdout and stderr both go to the display. This
can be confusing to new Linux users because there is no obvious way to recognize stdout
from stderr. Experienced users know that separating output from errors can be very useful.
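For example, a single command can write to both streams at once. A minimal sketch (the filename nosuchfile is hypothetical, /etc/hostname is assumed to exist and the exact error text varies per system): the listing arrives via stdout and the complaint via stderr, yet both end up intermixed on the same display.

$ ls /etc/hostname nosuchfile
ls: cannot access nosuchfile: No such file or directory
/etc/hostname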

The next sections will explain how to redirect these streams.

1.2. output redirection


1.2.1. > stdout
stdout can be redirected with a greater than sign. While scanning the line, the shell will
see the > sign and will clear the file.

The > notation is in fact the abbreviation of 1> (stdout being referred to as stream 1).

40
I/O redirection

[paul@RHELv4u3 ~]$ echo It is cold today!
It is cold today!
[paul@RHELv4u3 ~]$ echo It is cold today! > winter.txt
[paul@RHELv4u3 ~]$ cat winter.txt
It is cold today!
[paul@RHELv4u3 ~]$
Note that the bash shell effectively removes the redirection from the command line before
argument 0 is executed. This means that in the case of this command:
echo hello > greetings.txt

the shell only counts two arguments (echo = argument 0, hello = argument 1). The
redirection is removed before the argument counting takes place.

1.2.2. output file is erased


While scanning the line, the shell will see the > sign and will clear the file! Since this
happens before resolving argument 0, this means that even when the command fails, the
file will have been cleared!
[paul@RHELv4u3 ~]$ cat winter.txt
It is cold today!
[paul@RHELv4u3 ~]$ zcho It is cold today! > winter.txt
-bash: zcho: command not found
[paul@RHELv4u3 ~]$ cat winter.txt
[paul@RHELv4u3 ~]$

1.2.3. noclobber
Erasing a file while using > can be prevented by setting the noclobber option.
[paul@RHELv4u3 ~]$ cat winter.txt
It is cold today!
[paul@RHELv4u3 ~]$ set -o noclobber
[paul@RHELv4u3 ~]$ echo It is cold today! > winter.txt
-bash: winter.txt: cannot overwrite existing file
[paul@RHELv4u3 ~]$ set +o noclobber
[paul@RHELv4u3 ~]$

1.2.4. overruling noclobber


The noclobber option can be overruled with >|.
[paul@RHELv4u3 ~]$ set -o noclobber
[paul@RHELv4u3 ~]$ echo It is cold today! > winter.txt
-bash: winter.txt: cannot overwrite existing file
[paul@RHELv4u3 ~]$ echo It is very cold today! >| winter.txt
[paul@RHELv4u3 ~]$ cat winter.txt
It is very cold today!
[paul@RHELv4u3 ~]$


1.2.5. >> append


Use >> to append output to a file.
[paul@RHELv4u3 ~]$ echo It is cold today! > winter.txt
[paul@RHELv4u3 ~]$ cat winter.txt
It is cold today!
[paul@RHELv4u3 ~]$ echo Where is the summer ? >> winter.txt
[paul@RHELv4u3 ~]$ cat winter.txt
It is cold today!
Where is the summer ?
[paul@RHELv4u3 ~]$

1.3. error redirection


1.3.1. 2> stderr
Redirecting stderr is done with 2>. This can be very useful to prevent error messages from
cluttering your screen.

The screenshot below shows redirection of stdout to a file, and stderr to /dev/null. Writing
1> is the same as >.
[paul@RHELv4u3 ~]$ find / > allfiles.txt 2> /dev/null
[paul@RHELv4u3 ~]$

1.3.2. 2>&1
To redirect both stdout and stderr to the same file, use 2>&1.
[paul@RHELv4u3 ~]$ find / > allfiles_and_errors.txt 2>&1
[paul@RHELv4u3 ~]$

Note that the order of redirections is significant. For example, the command
ls > dirlist 2>&1

directs both standard output (file descriptor 1) and standard error (file descriptor 2) to the
file dirlist, while the command
ls 2>&1 > dirlist


directs only the standard output to file dirlist, because the standard error made a copy of the
standard output before the standard output was redirected to dirlist.
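A minimal demonstration of this difference, using a hypothetical missing file (the exact error text depends on the system): with the first ordering the error message lands inside dirlist, with the second ordering it still appears on the terminal.

$ ls nosuchfile > dirlist 2>&1
$ cat dirlist
ls: cannot access nosuchfile: No such file or directory
$ ls nosuchfile 2>&1 > dirlist
ls: cannot access nosuchfile: No such file or directory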

1.4. output redirection and pipes


By default you cannot grep inside stderr when using pipes on the command line, because
only stdout is passed.
paul@debian7:~$ rm file42 file33 file1201 | grep file42
rm: cannot remove ‘file42’: No such file or directory
rm: cannot remove ‘file33’: No such file or directory
rm: cannot remove ‘file1201’: No such file or directory

With 2>&1 you can force stderr to go to stdout. This enables the next command in the pipe
to act on both streams.
paul@debian7:~$ rm file42 file33 file1201 2>&1 | grep file42
rm: cannot remove ‘file42’: No such file or directory

You cannot use both 1>&2 and 2>&1 to switch stdout and stderr.
paul@debian7:~$ rm file42 file33 file1201 2>&1 1>&2 | grep file42
rm: cannot remove ‘file42’: No such file or directory
paul@debian7:~$ echo file42 2>&1 1>&2 | sed 's/file42/FILE42/'
FILE42

You need a third stream to switch stdout and stderr after a pipe symbol.
paul@debian7:~$ echo file42 3>&1 1>&2 2>&3 | sed 's/file42/FILE42/'
file42
paul@debian7:~$ rm file42 3>&1 1>&2 2>&3 | sed 's/file42/FILE42/'
rm: cannot remove ‘FILE42’: No such file or directory

1.5. joining stdout and stderr


The &> construction will put both stdout and stderr in one stream (to a file).
paul@debian7:~$ rm file42 &> out_and_err
paul@debian7:~$ cat out_and_err
rm: cannot remove ‘file42’: No such file or directory
paul@debian7:~$ echo file42 &> out_and_err
paul@debian7:~$ cat out_and_err
file42
paul@debian7:~$

1.6. input redirection


1.6.1. < stdin
Redirecting stdin is done with < (short for 0<).


[paul@RHEL4b ~]$ cat < text.txt
one
two
[paul@RHEL4b ~]$ tr 'onetw' 'ONEZZ' < text.txt
ONE
ZZO
[paul@RHEL4b ~]$

1.6.2. << here document


The here document (sometimes called here-is-document) is a way to append input until a
certain sequence (usually EOF) is encountered. The EOF marker can be typed literally or
can be called with Ctrl-D.

[paul@RHEL4b ~]$ cat <<EOF > text.txt
> one
> two
> EOF
[paul@RHEL4b ~]$ cat text.txt
one
two
[paul@RHEL4b ~]$ cat <<brol > text.txt
> brel
> brol
[paul@RHEL4b ~]$ cat text.txt
brel
[paul@RHEL4b ~]$

1.6.3. <<< here string


The here string can be used to directly pass strings to a command. The result is the same as
using echo string | command (but you have one less process running).
paul@ubu1110~$ base64 <<< linux-training.be
bGludXgtdHJhaW5pbmcuYmUK
paul@ubu1110~$ base64 -d <<< bGludXgtdHJhaW5pbmcuYmUK
linux-training.be

See rfc 3548 for more information about base64.
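The equivalence with echo can be sketched as follows; both commands feed the same six bytes (hello plus a trailing newline) to wc.

$ echo hello | wc -c
6
$ wc -c <<< hello
6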

1.7. confusing redirection


The shell will scan the whole line before applying redirection. The following command line
is very readable and is correct.
cat winter.txt > snow.txt 2> errors.txt

This one is also correct, but less readable.


2> errors.txt cat winter.txt > snow.txt


Even this will be understood perfectly by the shell.


< winter.txt > snow.txt 2> errors.txt cat

1.8. quick file clear


So what is the quickest way to clear a file ?
>foo

And what is the quickest way to clear a file when the noclobber option is set ?
>|bar


1.9. practice: input/output redirection


1. Activate the noclobber shell option.

2. Verify that noclobber is active by repeating an ls on /etc/ with redirected output to a file.

3. When listing all shell options, which character represents the noclobber option ?

4. Deactivate the noclobber option.

5. Make sure you have two shells open on the same computer. Create an empty tailing.txt
file. Then type tail -f tailing.txt. Use the second shell to append a line of text to that file.
Verify that the first shell displays this line.

6. Create a file that contains the names of five people. Use cat and output redirection to
create the file and use a here document to end the input.


1.10. solution: input/output redirection


1. Activate the noclobber shell option.
set -o noclobber
set -C

2. Verify that noclobber is active by repeating an ls on /etc/ with redirected output to a file.
ls /etc > etc.txt
ls /etc > etc.txt (should not work)

3. When listing all shell options, which character represents the noclobber option ?
echo $- (noclobber is visible as C)

4. Deactivate the noclobber option.


set +o noclobber

5. Make sure you have two shells open on the same computer. Create an empty tailing.txt
file. Then type tail -f tailing.txt. Use the second shell to append a line of text to that file.
Verify that the first shell displays this line.
paul@deb503:~$ > tailing.txt
paul@deb503:~$ tail -f tailing.txt
hello world

in the other shell:


paul@deb503:~$ echo hello >> tailing.txt
paul@deb503:~$ echo world >> tailing.txt

6. Create a file that contains the names of five people. Use cat and output redirection to
create the file and use a here document to end the input.
paul@deb503:~$ cat > tennis.txt << ace
> Justine Henin
> Venus Williams
> Serena Williams
> Martina Hingis
> Kim Clijsters
> ace
paul@deb503:~$ cat tennis.txt
Justine Henin
Venus Williams
Serena Williams
Martina Hingis
Kim Clijsters
paul@deb503:~$

Chapter 2. filters
Commands that are created to be used with a pipe are often called filters. These filters are
very small programs that do one specific thing very efficiently. They can be used as building
blocks.

This chapter will introduce you to the most common filters. The combination of simple
commands and filters in a long pipe allows you to design elegant solutions.
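As a small preview, and assuming a hypothetical file names.txt with one name per line, the pipe below counts how many different names the file contains; sort, uniq and wc are all covered in this chapter.

sort names.txt | uniq | wc -l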


2.1. cat
When between two pipes, the cat command does nothing (except putting stdin on stdout).
[paul@RHEL4b pipes]$ tac count.txt | cat | cat | cat | cat | cat
five
four
three
two
one
[paul@RHEL4b pipes]$

2.2. tee
Writing long pipes in Unix is fun, but sometimes you may want intermediate results. This
is where tee comes in handy. The tee filter puts stdin on stdout and also into a file. So tee is
almost the same as cat, except that it has two identical outputs.
[paul@RHEL4b pipes]$ tac count.txt | tee temp.txt | tac
one
two
three
four
five
[paul@RHEL4b pipes]$ cat temp.txt
five
four
three
two
one
[paul@RHEL4b pipes]$

2.3. grep
The grep filter is famous among Unix users. The most common use of grep is to filter lines
of text containing (or not containing) a certain string.
[paul@RHEL4b pipes]$ cat tennis.txt
Amelie Mauresmo, Fra
Kim Clijsters, BEL
Justine Henin, Bel
Serena Williams, usa
Venus Williams, USA
[paul@RHEL4b pipes]$ cat tennis.txt | grep Williams
Serena Williams, usa
Venus Williams, USA

You can write this without the cat.


[paul@RHEL4b pipes]$ grep Williams tennis.txt
Serena Williams, usa
Venus Williams, USA

One of the most useful options of grep is grep -i which filters in a case insensitive way.
[paul@RHEL4b pipes]$ grep Bel tennis.txt
Justine Henin, Bel
[paul@RHEL4b pipes]$ grep -i Bel tennis.txt
Kim Clijsters, BEL
Justine Henin, Bel
[paul@RHEL4b pipes]$

Another very useful option is grep -v which outputs lines not matching the string.


[paul@RHEL4b pipes]$ grep -v Fra tennis.txt


Kim Clijsters, BEL
Justine Henin, Bel
Serena Williams, usa
Venus Williams, USA
[paul@RHEL4b pipes]$

And of course, both options can be combined to filter all lines not containing a case
insensitive string.
[paul@RHEL4b pipes]$ grep -vi usa tennis.txt
Amelie Mauresmo, Fra
Kim Clijsters, BEL
Justine Henin, Bel
[paul@RHEL4b pipes]$

With grep -A1 one line after the result is also displayed.
paul@debian5:~/pipes$ grep -A1 Henin tennis.txt
Justine Henin, Bel
Serena Williams, usa

With grep -B1 one line before the result is also displayed.
paul@debian5:~/pipes$ grep -B1 Henin tennis.txt
Kim Clijsters, BEL
Justine Henin, Bel

With grep -C1 (context) one line before and one after are also displayed. All three options
(A,B, and C) can display any number of lines (using e.g. A2, B4 or C20).
paul@debian5:~/pipes$ grep -C1 Henin tennis.txt
Kim Clijsters, BEL
Justine Henin, Bel
Serena Williams, usa

2.4. cut
The cut filter can select columns from files, depending on a delimiter or a count of bytes.
The screenshot below uses cut to filter for the username and userid in the /etc/passwd file.
It uses the colon as a delimiter, and selects fields 1 and 3.
[paul@RHEL4b pipes]$ cut -d: -f1,3 /etc/passwd | tail -4
Figo:510
Pfaff:511
Harry:516
Hermione:517
[paul@RHEL4b pipes]$

When using a space as the delimiter for cut, you have to quote the space.
[paul@RHEL4b pipes]$ cut -d" " -f1 tennis.txt
Amelie
Kim
Justine
Serena
Venus
[paul@RHEL4b pipes]$

This example uses cut to display the second to the seventh character of /etc/passwd.


[paul@RHEL4b pipes]$ cut -c2-7 /etc/passwd | tail -4
igo:x:
faff:x
arry:x
ermion
[paul@RHEL4b pipes]$

2.5. tr
You can translate characters with tr. The screenshot shows the translation of all occurrences
of e to E.
[paul@RHEL4b pipes]$ cat tennis.txt | tr 'e' 'E'
AmEliE MaurEsmo, Fra
Kim ClijstErs, BEL
JustinE HEnin, BEl
SErEna Williams, usa
VEnus Williams, USA

Here we set all letters to uppercase by defining two ranges.


[paul@RHEL4b pipes]$ cat tennis.txt | tr 'a-z' 'A-Z'
AMELIE MAURESMO, FRA
KIM CLIJSTERS, BEL
JUSTINE HENIN, BEL
SERENA WILLIAMS, USA
VENUS WILLIAMS, USA
[paul@RHEL4b pipes]$

Here we translate all newlines to spaces.


[paul@RHEL4b pipes]$ cat count.txt
one
two
three
four
five
[paul@RHEL4b pipes]$ cat count.txt | tr '\n' ' '
one two three four five
[paul@RHEL4b pipes]$

The tr -s filter can also be used to squeeze multiple occurrences of a character to one.
[paul@RHEL4b pipes]$ cat spaces.txt
one    two      three   four   five  six
[paul@RHEL4b pipes]$ cat spaces.txt | tr -s ' '
one two three four five six
[paul@RHEL4b pipes]$

You can also use tr to 'encrypt' texts with rot13.


[paul@RHEL4b pipes]$ cat count.txt | tr 'a-z' 'nopqrstuvwxyzabcdefghijklm'
bar
gjb
guerr
sbhe
svir
[paul@RHEL4b pipes]$ cat count.txt | tr 'a-z' 'n-za-m'
bar
gjb
guerr
sbhe
svir
[paul@RHEL4b pipes]$

This last example uses tr -d to delete characters.


paul@debian5:~/pipes$ cat tennis.txt | tr -d e
Amli Maursmo, Fra
Kim Clijstrs, BEL
Justin Hnin, Bl
Srna Williams, usa
Vnus Williams, USA

2.6. wc
Counting words, lines and characters is easy with wc.
[paul@RHEL4b pipes]$ wc tennis.txt
5 15 100 tennis.txt
[paul@RHEL4b pipes]$ wc -l tennis.txt
5 tennis.txt
[paul@RHEL4b pipes]$ wc -w tennis.txt
15 tennis.txt
[paul@RHEL4b pipes]$ wc -c tennis.txt
100 tennis.txt
[paul@RHEL4b pipes]$

2.7. sort
The sort filter will default to an alphabetical sort.
paul@debian5:~/pipes$ cat music.txt
Queen
Brel
Led Zeppelin
Abba
paul@debian5:~/pipes$ sort music.txt
Abba
Brel
Led Zeppelin
Queen

But the sort filter has many options to tweak its usage. This example shows sorting different
columns (column 1 or column 2).
[paul@RHEL4b pipes]$ sort -k1 country.txt
Belgium, Brussels, 10
France, Paris, 60
Germany, Berlin, 100
Iran, Teheran, 70
Italy, Rome, 50
[paul@RHEL4b pipes]$ sort -k2 country.txt
Germany, Berlin, 100
Belgium, Brussels, 10
France, Paris, 60
Italy, Rome, 50
Iran, Teheran, 70

The screenshot below shows the difference between an alphabetical sort and a numerical
sort (both on the third column).
[paul@RHEL4b pipes]$ sort -k3 country.txt
Belgium, Brussels, 10
Germany, Berlin, 100
Italy, Rome, 50
France, Paris, 60
Iran, Teheran, 70
[paul@RHEL4b pipes]$ sort -n -k3 country.txt
Belgium, Brussels, 10


Italy, Rome, 50
France, Paris, 60
Iran, Teheran, 70
Germany, Berlin, 100

2.8. uniq
With uniq you can remove duplicates from a sorted list.
paul@debian5:~/pipes$ cat music.txt
Queen
Brel
Queen
Abba
paul@debian5:~/pipes$ sort music.txt
Abba
Brel
Queen
Queen
paul@debian5:~/pipes$ sort music.txt |uniq
Abba
Brel
Queen

uniq can also count occurrences with the -c option.


paul@debian5:~/pipes$ sort music.txt |uniq -c
1 Abba
1 Brel
2 Queen


2.9. comm
Comparing streams (or files) can be done with the comm tool. By default comm will output
three columns. In this example, Abba, Cure and Queen are in both lists, Bowie and Sweet
are only in the first file, Turner is only in the second.
paul@debian5:~/pipes$ cat > list1.txt
Abba
Bowie
Cure
Queen
Sweet
paul@debian5:~/pipes$ cat > list2.txt
Abba
Cure
Queen
Turner
paul@debian5:~/pipes$ comm list1.txt list2.txt
                Abba
Bowie
                Cure
                Queen
Sweet
        Turner
The output of comm can be easier to read when outputting only a single column. The digits
point out which output columns should not be displayed.

paul@debian5:~/pipes$ comm -12 list1.txt list2.txt
Abba
Cure
Queen
paul@debian5:~/pipes$ comm -13 list1.txt list2.txt
Turner
paul@debian5:~/pipes$ comm -23 list1.txt list2.txt
Bowie
Sweet

2.10. od
European humans like to work with ascii characters, but computers store files in bytes. The
example below creates a simple file, and then uses od to show the contents of the file in
hexadecimal bytes.

paul@laika:~/test$ cat > text.txt
abcdefg
1234567
paul@laika:~/test$ od -t x1 text.txt
0000000 61 62 63 64 65 66 67 0a 31 32 33 34 35 36 37 0a
0000020

The same file can also be displayed in octal bytes.

paul@laika:~/test$ od -b text.txt
0000000 141 142 143 144 145 146 147 012 061 062 063 064 065 066 067 012
0000020

And here is the file in ascii (or backslashed) characters.

paul@laika:~/test$ od -c text.txt

0000000 a b c d e f g \n 1 2 3 4 5 6 7 \n
0000020


2.11. sed
The stream editor sed can perform editing functions in the stream, using regular
expressions.

paul@debian5:~/pipes$ echo level5 | sed 's/5/42/'
level42
paul@debian5:~/pipes$ echo level5 | sed 's/level/jump/'
jump5

Add g for global replacements (all occurrences of the string per line).

paul@debian5:~/pipes$ echo level5 level7 | sed 's/level/jump/'
jump5 level7
paul@debian5:~/pipes$ echo level5 level7 | sed 's/level/jump/g'
jump5 jump7

With d you can remove lines containing a certain string from a stream.

paul@debian5:~/test42$ cat tennis.txt
Venus Williams, USA
Martina Hingis, SUI
Justine Henin, BE
Serena williams, USA
Kim Clijsters, BE
Yanina Wickmayer, BE
paul@debian5:~/test42$ cat tennis.txt | sed '/BE/d'
Venus Williams, USA
Martina Hingis, SUI
Serena williams, USA

2.12. pipe examples


2.12.1. who | wc
How many users are logged on to this system ?
[paul@RHEL4b pipes]$ who
root     tty1         Jul 25 10:50
paul     pts/0        Jul 25 09:29 (laika)
Harry    pts/1        Jul 25 12:26 (barry)
paul     pts/2        Jul 25 12:26 (pasha)
[paul@RHEL4b pipes]$ who | wc -l
4

2.12.2. who | cut | sort


Display a sorted list of logged on users.
[paul@RHEL4b pipes]$ who | cut -d' ' -f1 | sort
Harry
paul
paul
root

Display a sorted list of logged on users, but every user only once.

[paul@RHEL4b pipes]$ who | cut -d' ' -f1 | sort | uniq
Harry
paul
root


2.12.3. grep | cut


Display a list of all bash user accounts on this computer. User accounts are explained in
detail later.
paul@debian5:~$ grep bash /etc/passwd
root:x:0:0:root:/root:/bin/bash
paul:x:1000:1000:paul,,,:/home/paul:/bin/bash
serena:x:1001:1001::/home/serena:/bin/bash
paul@debian5:~$ grep bash /etc/passwd | cut -d: -f1
root
paul
serena

2.13. practice: filters


1. Put a sorted list of all bash users in bashusers.txt.

2. Put a sorted list of all logged on users in onlineusers.txt.

3. Make a list of all filenames in /etc that contain the string conf in their filename.

4. Make a sorted list of all files in /etc that contain the case insensitive string conf in their
filename.

5. Look at the output of /sbin/ifconfig. Write a line that displays only the IP address and the
subnet mask.

6. Write a line that removes all non-letters from a stream.

7. Write a line that receives a text file, and outputs all words on a separate line.

8. Write a spell checker on the command line. (There may be a dictionary in /usr/share/
dict/ .)

Chapter 3. basic Unix tools
This chapter introduces commands to find or locate files and to compress files, together
with other common tools that were not discussed before. While the tools discussed here are
technically not considered filters, they can be used in pipes.
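As a quick illustration, the sketch below combines find from this chapter with filters from the previous chapter to count the .conf files under /etc (the error redirection hides directories you may not be allowed to read):

find /etc -name "*.conf" 2>/dev/null | wc -l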


3.1. find
The find command can be very useful at the start of a pipe to search for files. Here are some
examples. You might want to add 2>/dev/null to the command lines to avoid cluttering your
screen with error messages.

Find all files in /etc and put the list in etcfiles.txt


find /etc > etcfiles.txt

Find all files of the entire system and put the list in allfiles.txt

find / > allfiles.txt

Find files that end in .conf in the current directory (and all subdirs).
find . -name "*.conf"

Find files of type file (not directory, pipe or etc.) that end in .conf.
find . -type f -name "*.conf"

Find files of type directory that end in .bak .


find /data -type d -name "*.bak"

Find files that are newer than file42.txt


find . -newer file42.txt

Find can also execute another command on every file found. This example will look for
*.odf files and copy them to /backup/.
find /data -name "*.odf" -exec cp {} /backup/ \;

Find can also execute, after your confirmation, another command on every file found. This
example will remove *.odf files if you approve of it for every file found.
find /data -name "*.odf" -ok rm {} \;

3.2. locate
The locate tool is very different from find in that it uses an index to locate files. This is a
lot faster than traversing all the directories, but it also means that it is always outdated. If
the index does not exist yet, then you have to create it (as root on Red Hat Enterprise Linux)
with the updatedb command.
[paul@RHEL4b ~]$ locate Samba


warning: locate: could not open database: /var/lib/slocate/slocate.db:...
warning: You need to run the 'updatedb' command (as root) to create th...
Please have a look at /etc/updatedb.conf to enable the daily cron job.
[paul@RHEL4b ~]$ updatedb
fatal error: updatedb: You are not authorized to create a default sloc...
[paul@RHEL4b ~]$ su
Password:
[root@RHEL4b ~]# updatedb
[root@RHEL4b ~]#

Most Linux distributions will schedule the updatedb to run once every day.
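On many distributions this shows up as a small script in /etc/cron.daily/; the exact script name differs (mlocate, locate or slocate depending on the implementation), so treat the check below as a sketch.

ls /etc/cron.daily/ | grep -i locate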

3.3. date
The date command can display the date, time, time zone and more.
paul@rhel55 ~$ date
Sat Apr 17 12:44:30 CEST 2010

A date string can be customized to display the format of your choice. Check the man page
for more options.
paul@rhel55 ~$ date +'%A %d-%m-%Y'
Saturday 17-04-2010

Time on any Unix is calculated in number of seconds since the epoch (the first second being
the first second of the first of January 1970, UTC). Use date +%s to display Unix time in seconds.
paul@rhel55 ~$ date +%s
1271501080

When will this seconds counter reach two thousand million ?


paul@rhel55 ~$ date -d '1970-01-01 + 2000000000 seconds'
Wed May 18 04:33:20 CEST 2033

3.4. cal
The cal command displays the current month, with the current day highlighted.
paul@rhel55 ~$ cal
     April 2010
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30

You can select any month in the past or the future.


paul@rhel55 ~$ cal 2 1970
   February 1970
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28


3.5. sleep
The sleep command is sometimes used in scripts to wait a number of seconds. This example
shows a five second sleep.
paul@rhel55 ~$ sleep 5
paul@rhel55 ~$

3.6. time
The time command can display how long it takes to execute a command. The date command
takes only a little time.
paul@rhel55 ~$ time date
Sat Apr 17 13:08:27 CEST 2010

real 0m0.014s
user 0m0.008s
sys 0m0.006s
The sleep 5 command takes five real seconds to execute, but consumes little cpu time.
paul@rhel55 ~$ time sleep 5

real 0m5.018s
user 0m0.005s
sys 0m0.011s
This bzip2 command compresses a file and uses a lot of cpu time.
paul@rhel55 ~$ time bzip2 text.txt

real 0m2.368s
user 0m0.847s
sys 0m0.539s


3.7. gzip - gunzip


Users never have enough disk space, so compression comes in handy. The gzip command
can make files take up less space.
paul@rhel55 ~$ ls -lh text.txt
-rw-rw-r-- 1 paul paul 6.4M Apr 17 13:11 text.txt
paul@rhel55 ~$ gzip text.txt
paul@rhel55 ~$ ls -lh text.txt.gz
-rw-rw-r-- 1 paul paul 760K Apr 17 13:11 text.txt.gz
You can get the original back with gunzip.
paul@rhel55 ~$ gunzip text.txt.gz
paul@rhel55 ~$ ls -lh text.txt
-rw-rw-r-- 1 paul paul 6.4M Apr 17 13:11 text.txt

3.8. zcat - zmore


Text files that are compressed with gzip can be viewed with zcat and zmore.
paul@rhel55 ~$ head -4 text.txt
/
/opt
/opt/VBoxGuestAdditions-3.1.6
/opt/VBoxGuestAdditions-3.1.6/routines.sh
paul@rhel55 ~$ gzip text.txt
paul@rhel55 ~$ zcat text.txt.gz | head -4
/
/opt
/opt/VBoxGuestAdditions-3.1.6
/opt/VBoxGuestAdditions-3.1.6/routines.sh

3.9. bzip2 - bunzip2


Files can also be compressed with bzip2 which takes a little more time than gzip, but
compresses better.
paul@rhel55 ~$ bzip2 text.txt
paul@rhel55 ~$ ls -lh text.txt.bz2
-rw-rw-r-- 1 paul paul 569K Apr 17 13:11 text.txt.bz2
Files can be uncompressed again with bunzip2.
paul@rhel55 ~$ bunzip2 text.txt.bz2
paul@rhel55 ~$ ls -lh text.txt
-rw-rw-r-- 1 paul paul 6.4M Apr 17 13:11 text.txt

3.10. bzcat - bzmore


And in the same way bzcat and bzmore can display files compressed with bzip2.
paul@rhel55 ~$ bzip2 text.txt
paul@rhel55 ~$ bzcat text.txt.bz2 | head -4
/
/opt
/opt/VBoxGuestAdditions-3.1.6
/opt/VBoxGuestAdditions-3.1.6/routines.sh


3.11. practice: basic Unix tools


1. Explain the difference between these two commands. This question is very important.
If you don't know the answer, then look back at the shell chapter.
find /data -name "*.txt"

find /data -name *.txt

2. Explain the difference between these two statements. Will they both work when there
are 200 .odf files in /data ? How about when there are 2 million .odf files ?
find /data -name "*.odf" > data_odf.txt

find /data/*.odf > data_odf.txt

3. Write a find command that finds all files created after January 30th 2010.

4. Write a find command that finds all *.odf files created in September 2009.

5. Count the number of *.conf files in /etc and all its subdirs.

6. Here are two commands that do the same thing: copy *.odf files to /backup/. What
would be a reason to replace the first command with the second ? Again, this is an
important question.
cp -r /data/*.odf /backup/

find /data -name "*.odf" -exec cp {} /backup/ \;

7. Create a file called loctest.txt. Can you find this file with locate ? Why not ? How do
you make locate find this file ?

8. Use find and -exec to rename all .htm files to .html.

9. Issue the date command. Now display the date in YYYY/MM/DD format.

10. Issue the cal command. Display a calendar of 1582 and 1752. Notice anything special ?

Chapter 4. regular expressions
Regular expressions are a very powerful tool in Linux. They can be used with a variety of
programs like bash, vi, rename, grep, sed, and more.

This chapter introduces you to the basics of regular expressions.


4.1. regex versions


There are three different versions of regular expression syntax:
BRE: Basic Regular Expressions
ERE: Extended Regular Expressions
PCRE: Perl Compatible Regular Expressions

Depending on the tool being used, one or more of these syntaxes can be used.

For example the grep tool has the -E option to force a string to be read as ERE while -G
forces BRE and -P forces PCRE.

Note that grep also has -F to force the string to be read literally.

The sed tool also has options to choose a regex syntax.

Read the manual of the tools you use!
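As a small sketch of these switches, reuse the list file from a later example in this chapter (it contains the two lines Tania and Laura): read as an ERE, a|i means 'a or i' and matches both lines, while the same string read as a BRE or as a fixed string treats the pipe literally and matches nothing.

$ grep -E 'a|i' list
Tania
Laura
$ grep -G 'a|i' list
$ grep -F 'a|i' list
$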


4.2. grep
4.2.1. print lines matching a pattern
grep is a popular Linux tool to search for lines that match a certain pattern. Below are some
examples of the simplest regular expressions.

This is the contents of the test file. This file contains three lines (or three newline
characters).
paul@rhel65:~$ cat names
Tania
Laura
Valentina

When grepping for a single character, only the lines containing that character are returned.
paul@rhel65:~$ grep u names
Laura
paul@rhel65:~$ grep e names
Valentina
paul@rhel65:~$ grep i names
Tania
Valentina

The pattern matching in this example should be very straightforward; if the given character
occurs on a line, then grep will return that line.

4.2.2. concatenating characters


Two concatenated characters will only match a line where they occur together in the same order.

This example demonstrates that ia will match Tania but not Valentina and in will match
Valentina but not Tania.
paul@rhel65:~$ grep a names
Tania
Laura
Valentina
paul@rhel65:~$ grep ia names
Tania
paul@rhel65:~$ grep in names
Valentina
paul@rhel65:~$

4.2.3. one or the other


PCRE and ERE both use the pipe symbol to signify OR. In this example we grep for lines
containing the letter i or the letter a.


paul@debian7:~$ cat list
Tania
Laura
paul@debian7:~$ grep -E 'i|a' list
Tania
Laura

Note that we use the -E switch of grep to force interpretation of our string as an ERE.

We need to escape the pipe symbol in a BRE to get the same logical OR.
paul@debian7:~$ grep -G 'i|a' list
paul@debian7:~$ grep -G 'i\|a' list
Tania
Laura

4.2.4. one or more


The * signifies zero, one or more occurrences of the previous and the + signifies one or more
of the previous.
paul@debian7:~$ cat list2
ll
lol
lool
loool
paul@debian7:~$ grep -E 'o*' list2
ll
lol
lool
loool
paul@debian7:~$ grep -E 'o+' list2
lol
lool
loool
paul@debian7:~$

4.2.5. match the end of a string


For the following examples, we will use this file.
paul@debian7:~$ cat names
Tania
Laura
Valentina
Fleur
Floor

The two examples below show how to use the dollar character to match the end of a string.
paul@debian7:~$ grep a$ names
Tania
Laura
Valentina
paul@debian7:~$ grep r$ names
Fleur
Floor

4.2.6. match the start of a string


The caret character (^) will match a string at the start of a line.

Given the same file as above, here are two examples.


paul@debian7:~$ grep ^Val names
Valentina
paul@debian7:~$ grep ^F names
Fleur
Floor

Both the dollar sign and the little hat are called anchors in a regex.

4.2.7. separating words


Regular expressions use a \b sequence to reference a word separator. Take for example this
file:
paul@debian7:~$ cat text
The governer is governing.
The winter is over.
Can you get over there?

Simply grepping for over will give too many results.


paul@debian7:~$ grep over text
The governer is governing.
The winter is over.
Can you get over there?

Surrounding the searched word with spaces is not a good solution (because other characters
can be word separators). The screenshot below shows how to use \b to find only the searched
word:
paul@debian7:~$ grep '\bover\b' text
The winter is over.
Can you get over there?
paul@debian7:~$

Note that grep also has a -w option to grep for words.


paul@debian7:~$ cat text
The governer is governing.
The winter is over.
Can you get over there?
paul@debian7:~$ grep -w over text
The winter is over.
Can you get over there?
paul@debian7:~$

4.2.8. grep features


Sometimes it is easier to combine a simple regex with grep options than it is to write a
more complex regex. These options were discussed before:
grep -i
grep -v
grep -w
grep -A5


grep -B5
grep -C5

4.2.9. preventing shell expansion of a regex


The dollar sign is a special character, both for the regex and also for the shell (remember
variables and embedded shells). Therefore it is advised to always quote the regex; this
prevents shell expansion.
paul@debian7:~$ grep 'r$' names
Fleur
Floor


4.3. rename
4.3.1. the rename command
On Debian Linux the /usr/bin/rename command is a link to /usr/bin/prename installed by
the perl package.
paul@pi ~ $ dpkg -S $(readlink -f $(which rename))
perl: /usr/bin/prename

Red Hat derived systems do not install the same rename command, so this section does not
describe rename on Red Hat (unless you copy the perl script manually).

There is often confusion on the internet about the rename command because solutions
that work fine in Debian (and Ubuntu, xubuntu, Mint, ...) cannot be used in Red Hat
(and CentOS, Fedora, ...).

4.3.2. perl
The rename command is actually a perl script that uses perl regular expressions. The
complete manual for these can be found by typing perldoc perlrequick (after installing
perldoc).
root@pi:~# aptitude install perl-doc
The following NEW packages will be installed:
  perl-doc
0 packages upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 8,170 kB of archives. After unpacking 13.2 MB will be used.
Get: 1 http://mirrordirector.raspbian.org/raspbian/ wheezy/main perl-do...
Fetched 8,170 kB in 19s (412 kB/s)
Selecting previously unselected package perl-doc.
(Reading database ... 67121 files and directories currently installed.)
Unpacking perl-doc (from .../perl-doc_5.14.2-21+rpi2_all.deb) ...
Adding 'diversion of /usr/bin/perldoc to /usr/bin/perldoc.stub by perl-doc'
Processing triggers for man-db ...
Setting up perl-doc (5.14.2-21+rpi2) ...

root@pi:~# perldoc perlrequick

4.3.3. well known syntax


The most common use of the rename command is to search for filenames matching a certain
string and to replace this string with another string.

This is often presented as s/string/other string/ as seen in this example:


paul@pi ~ $ ls
abc       allfiles.TXT  bllfiles.TXT  Scratch   tennis2.TXT
abc.conf  backup        cllfiles.TXT  temp.TXT  tennis.TXT
paul@pi ~ $ rename 's/TXT/text/' *
paul@pi ~ $ ls
abc       allfiles.text  bllfiles.text  Scratch    tennis2.text
abc.conf  backup         cllfiles.text  temp.text  tennis.text

And here is another example that uses rename with the well known syntax to change the
extensions of the same files once more:
paul@pi ~ $ ls
abc       allfiles.text  bllfiles.text  Scratch    tennis2.text
abc.conf  backup         cllfiles.text  temp.text  tennis.text
paul@pi ~ $ rename 's/text/txt/' *.text
paul@pi ~ $ ls
abc       allfiles.txt  bllfiles.txt  Scratch   tennis2.txt
abc.conf  backup        cllfiles.txt  temp.txt  tennis.txt
paul@pi ~ $

These two examples appear to work because the strings we used only exist at the end of the
filename. Remember that file extensions have no meaning in the bash shell.

The next example shows what can go wrong with this syntax.
paul@pi ~ $ touch atxt.txt

paul@pi ~ $ rename 's/txt/problem/' atxt.txt

paul@pi ~ $ ls
abc       allfiles.txt  backup        cllfiles.txt  temp.txt     tennis.txt
abc.conf  aproblem.txt  bllfiles.txt  Scratch       tennis2.txt
paul@pi ~ $

Only the first occurrence of the searched string is replaced.

4.3.4. a global replace


The syntax used in the previous example can be described as s/regex/replacement/. This is
simple and straightforward: you enter a regex between the first two slashes and a
replacement string between the last two.

This example expands this syntax only a little, by adding a modifier.


paul@pi ~ $ rename -n 's/TXT/txt/g' aTXT.TXT
aTXT.TXT renamed as atxt.txt

paul@pi ~ $


The syntax we use now can be described as s/regex/replacement/g where s signifies substitute
and g stands for global.

Note that this example used the -n switch to show what is being done (instead of actually
renaming the file).

4.3.5. case insensitive replace


Another modifier that can be useful is i. This example shows how to replace a case
insensitive string with another string.
paul@debian7:~/files$ ls
file1.text  file2.TEXT  file3.txt
paul@debian7:~/files$ rename 's/.text/.txt/i' *
paul@debian7:~/files$ ls
file1.txt  file2.txt  file3.txt
paul@debian7:~/files$

4.3.6. renaming extensions


Command line Linux has no knowledge of MS-DOS like extensions, but many end users
and graphical applications do use them.

Here is an example on how to use rename to only rename the file extension. It uses the
dollar sign to mark the ending of the filename.
paul@pi ~ $ ls *.txt
allfiles.txt bllfiles.txt cllfiles.txt really.txt.txt temp.txt tennis.txt
paul@pi ~ $ rename 's/.txt$/.TXT/' *.txt
paul@pi ~ $ ls *.TXT
allfiles.TXT bllfiles.TXT cllfiles.TXT
really.txt.TXT temp.TXT tennis.TXT
paul@pi ~ $
Note that the dollar sign in the regex means at the end. Without the dollar sign this
command would fail on the really.txt.txt file.

4.4. sed
4.4.1. stream editor
The stream editor or short sed uses regex for stream editing.

In this example sed is used to replace a string.


echo Sunday | sed 's/Sun/Mon/'
Monday

The slashes can be replaced by a couple of other characters, which can be handy in some
cases to improve readability.
echo Sunday | sed 's:Sun:Mon:'
Monday
echo Sunday | sed 's_Sun_Mon_'
Monday
echo Sunday | sed 's|Sun|Mon|'
Monday

4.4.2. interactive editor


While sed is meant to be used in a stream, it can also be used interactively on a file.
paul@debian7:~/files$ echo Sunday > today
paul@debian7:~/files$ cat today
Sunday
paul@debian7:~/files$ sed -i 's/Sun/Mon/' today
paul@debian7:~/files$ cat today
Monday

4.4.3. simple back referencing


The ampersand character can be used to reference the searched (and found) string. In
this example the ampersand is used to double the occurrence of the found string.
echo Sunday | sed 's/Sun/&&/'
SunSunday
echo Sunday | sed 's/day/&&/'
Sundayday

4.4.4. back referencing


Parentheses (often called round brackets) are used to group sections of the regex so they can
later be referenced. Consider this simple example:
paul@debian7:~$ echo Sunday | sed 's_\(Sun\)_\1ny_'
Sunnyday
paul@debian7:~$ echo Sunday | sed 's_\(Sun\)_\1ny \1_'
Sunny Sunday

4.4.5. a dot for any character


In a regex a simple dot can signify any single character.


paul@debian7:~$ echo 2014-04-01 | sed 's/....-..-../YYYY-MM-DD/'
YYYY-MM-DD
paul@debian7:~$ echo abcd-ef-gh | sed 's/....-..-../YYYY-MM-DD/'
YYYY-MM-DD

4.4.6. multiple back referencing


When more than one pair of parentheses is used, each of them can be referenced separately
by consecutive numbers.
paul@debian7:~$ echo 2014-04-01 | sed 's/\(....\)-\(..\)-\(..\)/\1+\2+\3/'
2014+04+01
paul@debian7:~$ echo 2014-04-01 | sed 's/\(....\)-\(..\)-\(..\)/\3:\2:\1/'
01:04:2014

This feature is called grouping.

4.4.7. white space


The \s can refer to white space such as a space or a tab.

This example looks for white spaces (\s) globally and replaces them with 1 space.
paul@debian7:~$ echo -e 'today\tis\twarm'
today   is      warm
paul@debian7:~$ echo -e 'today\tis\twarm' | sed 's_\s_ _g'
today is warm

4.4.8. optional occurrence


A question mark signifies that the previous is optional.

The example below searches for three consecutive letter o, but the third o is optional.
paul@debian7:~$ cat list2
ll
lol
lool
loool
paul@debian7:~$ grep -E 'ooo?' list2
lool
loool
paul@debian7:~$ cat list2 | sed 's/ooo\?/A/'
ll
lol
lAl
lAl

4.4.9. exactly n times


You can demand an exact number of times the previous has to occur.

This example wants exactly three o's.


paul@debian7:~$ cat list2
ll
lol
lool
loool
paul@debian7:~$ grep -E 'o{3}' list2
loool
paul@debian7:~$ cat list2 | sed 's/o\{3\}/A/'
ll
lol
lool
lAl
paul@debian7:~$

4.4.10. between n and m times


And here we demand a minimum of 2 and a maximum of 3 times.
paul@debian7:~$ cat list2
ll
lol
lool
loool
paul@debian7:~$ grep -E 'o{2,3}' list2
lool
loool
paul@debian7:~$ grep 'o\{2,3\}' list2
lool
loool
paul@debian7:~$ cat list2 | sed 's/o\{2,3\}/A/'
ll
lol
lAl
lAl
paul@debian7:~$

4.5. bash history


The bash shell can also interpret some regular expressions.

This example shows how to manipulate the exclamation mark history feature of the bash
shell.
paul@debian7:~$ mkdir hist
paul@debian7:~$ cd hist/
paul@debian7:~/hist$ touch file1 file2 file3
paul@debian7:~/hist$ ls -l file1
-rw-r--r-- 1 paul paul 0 Apr 15 22:07 file1
paul@debian7:~/hist$ !l
ls -l file1
-rw-r--r-- 1 paul paul 0 Apr 15 22:07 file1
paul@debian7:~/hist$ !l:s/1/3
ls -l file3
-rw-r--r-- 1 paul paul 0 Apr 15 22:07 file3
paul@debian7:~/hist$

This also works with the history numbers in bash.


paul@debian7:~/hist$ history 6
 2089  mkdir hist
 2090  cd hist/
 2091  touch file1 file2 file3
 2092  ls -l file1
 2093  ls -l file3
 2094  history 6
paul@debian7:~/hist$ !2092
ls -l file1
-rw-r--r-- 1 paul paul 0 Apr 15 22:07 file1
paul@debian7:~/hist$ !2092:s/1/2
ls -l file2
-rw-r--r-- 1 paul paul 0 Apr 15 22:07 file2
paul@debian7:~/hist$
