Linuxfun Part 2-38-77
I/O redirection
[Figure: the bash shell with its three standard streams: stdin (0), stdout (1) and stderr (2).]
The keyboard often serves as stdin, whereas stdout and stderr both go to the display. This
can be confusing to new Linux users because there is no obvious way to tell stdout apart
from stderr. Experienced users know that separating output from errors can be very useful.
The > notation is in fact the abbreviation of 1> (stdout being referred to as stream 1).
the shell only counts two arguments (echo = argument 0, hello = argument 1). The
redirection is removed before the argument counting takes place.
1.2.3. noclobber
Erasing a file while using > can be prevented by setting the noclobber option.
[paul@RHELv4u3 ~]$ cat winter.txt
It is cold today!
[paul@RHELv4u3 ~]$ set -o noclobber
[paul@RHELv4u3 ~]$ echo It is cold today! > winter.txt
-bash: winter.txt: cannot overwrite existing file
[paul@RHELv4u3 ~]$ set +o noclobber
[paul@RHELv4u3 ~]$
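The noclobber behaviour above can be tried out safely in a scratch directory. The following is a minimal sketch, not part of the original text: the directory comes from mktemp, the variable names are made up for illustration, and the failing overwrite runs in a subshell so that the refused redirection cannot stop the script.

```shell
#!/bin/sh
# Sketch: noclobber refuses '>' on an existing file, '>|' overrides it.
workdir=$(mktemp -d)                 # scratch directory (assumption)
file="$workdir/winter.txt"
echo "It is cold today!" > "$file"

set -o noclobber
# The subshell keeps a refused redirection from ending the script.
if ( echo "overwritten" > "$file" ) 2>/dev/null; then
    plain_result="overwritten"
else
    plain_result="refused"
fi

# '>|' forces the overwrite even though noclobber is still set.
echo "forced" >| "$file"
forced_content=$(cat "$file")
set +o noclobber

echo "plain  > : $plain_result"
echo "forced >|: $forced_content"
rm -r "$workdir"
```

The same >| operator is what clears a protected file quickly, since it needs no command in front of it.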
The screenshot below shows redirection of stdout to a file, and stderr to /dev/null. Writing
1> is the same as >.
[paul@RHELv4u3 ~]$ find / > allfiles.txt 2> /dev/null
[paul@RHELv4u3 ~]$
1.3.2. 2>&1
To redirect both stdout and stderr to the same file, use 2>&1.
[paul@RHELv4u3 ~]$ find / > allfiles_and_errors.txt 2>&1
[paul@RHELv4u3 ~]$
Note that the order of redirections is significant. For example, the command
ls > dirlist 2>&1
directs both standard output (file descriptor 1) and standard error (file descriptor 2) to the
file dirlist, while the command
ls 2>&1 > dirlist
directs only the standard output to file dirlist, because the standard error made a copy of the
standard output before the standard output was redirected to dirlist.
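The effect of the redirection order can be checked with a command that only produces an error. This is a small sketch under assumptions: the path in missing is hypothetical and must not exist, and $(...) stands in for the terminal so the escaped error message can be observed.

```shell
#!/bin/sh
# Sketch: the order of '>' and '2>&1' decides where stderr ends up.
workdir=$(mktemp -d)
missing="$workdir/no_such_file"      # hypothetical path that does not exist

# stdout goes to the file first, then stderr copies stdout: error lands in the file.
ls "$missing" > "$workdir/both.txt" 2>&1

# stderr copies the current stdout first, then stdout goes to the file:
# the error follows the old stdout (captured by $(...)), the file stays empty.
leaked=$(ls "$missing" 2>&1 > "$workdir/stdout_only.txt")

if [ -s "$workdir/both.txt" ]; then first="error in file"; else first="file empty"; fi
if [ -s "$workdir/stdout_only.txt" ]; then second="error in file"; else second="file empty"; fi

echo "ls > file 2>&1 : $first"
echo "ls 2>&1 > file : $second"
rm -r "$workdir"
```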
With 2>&1 you can force stderr to go to stdout. This enables the next command in the pipe
to act on both streams.
paul@debian7:~$ rm file42 file33 file1201 2>&1 | grep file42
rm: cannot remove ‘file42’: No such file or directory
You cannot use both 1>&2 and 2>&1 to switch stdout and stderr.
paul@debian7:~$ rm file42 file33 file1201 2>&1 1>&2 | grep file42
rm: cannot remove ‘file42’: No such file or directory
paul@debian7:~$ echo file42 2>&1 1>&2 | sed 's/file42/FILE42/'
FILE42
You need a third stream to switch stdout and stderr after a pipe symbol.
paul@debian7:~$ echo file42 3>&1 1>&2 2>&3 | sed 's/file42/FILE42/'
file42
paul@debian7:~$ rm file42 3>&1 1>&2 2>&3 | sed 's/file42/FILE42/'
rm: cannot remove ‘FILE42’: No such file or directory
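The three-stream swap can be verified without touching any files. In this sketch the emit function is a hypothetical helper that writes one line to each stream; it is not from the original text. Closing file descriptor 3 at the end (3>&-) is optional tidying.

```shell
#!/bin/sh
# Hypothetical helper: one line on stdout, one line on stderr.
emit() {
    echo "to-stdout"
    echo "to-stderr" >&2
}

# Without the swap, $(...) captures stdout only.
normal=$(emit 2>/dev/null)

# With the swap: fd 3 remembers stdout, stdout moves to stderr, and stderr
# moves to the remembered stdout. The outer 2>/dev/null discards what is
# on stderr after the swap (the former stdout line).
swapped=$( { emit 3>&1 1>&2 2>&3 3>&-; } 2>/dev/null )

echo "normal : $normal"
echo "swapped: $swapped"
```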
[paul@RHEL4b ~]$ cat < text.txt
one
two
[paul@RHEL4b ~]$ tr 'onetw' 'ONEZZ' < text.txt
ONE
ZZO
[paul@RHEL4b ~]$
And what is the quickest way to clear a file when the noclobber option is set ?
>|bar
2. Verify that noclobber is active by repeating an ls on /etc/ with redirected output to a file.
3. When listing all shell options, which character represents the noclobber option ?
5. Make sure you have two shells open on the same computer. Create an empty tailing.txt
file. Then type tail -f tailing.txt. Use the second shell to append a line of text to that file.
Verify that the first shell displays this line.
6. Create a file that contains the names of five people. Use cat and output redirection to
create the file and use a here document to end the input.
2. Verify that noclobber is active by repeating an ls on /etc/ with redirected output to a file.
ls /etc > etc.txt
ls /etc > etc.txt (should not work)
4. When listing all shell options, which character represents the noclobber option ?
echo $- (noclobber is visible as C)
6. Make sure you have two shells open on the same computer. Create an empty tailing.txt
file. Then type tail -f tailing.txt. Use the second shell to append a line of text to that file.
Verify that the first shell displays this line.
paul@deb503:~$ > tailing.txt
paul@deb503:~$ tail -f tailing.txt
hello world
7. Create a file that contains the names of five people. Use cat and output redirection to
create the file and use a here document to end the input.
paul@deb503:~$ cat > tennis.txt << ace
> Justine Henin
> Venus Williams
> Serena Williams
> Martina Hingis
> Kim Clijsters
> ace
paul@deb503:~$ cat tennis.txt
Justine Henin
Venus Williams
Serena Williams
Martina Hingis
Kim Clijsters
paul@deb503:~$
Chapter 2. filters
Commands that are created to be used with a pipe are often called filters. These filters are
very small programs that do one specific thing very efficiently. They can be used as building
blocks.
This chapter will introduce you to the most common filters. The combination of simple
commands and filters in a long pipe allows you to design elegant solutions.
2.1. cat
When between two pipes, the cat command does nothing (except putting stdin on stdout).
[paul@RHEL4b pipes]$ tac count.txt | cat | cat | cat | cat | cat
five
four
three
two
one
[paul@RHEL4b pipes]$
2.2. tee
Writing long pipes in Unix is fun, but sometimes you may want intermediate results. This
is where tee comes in handy. The tee filter puts stdin on stdout and also into a file. So tee is
almost the same as cat, except that it has two identical outputs.
[paul@RHEL4b pipes]$ tac count.txt | tee temp.txt | tac
one
two
three
four
five
[paul@RHEL4b pipes]$
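To see that tee really keeps an intermediate copy, the sketch below sends three lines through tee and then through tr. The temporary file from mktemp and the variable names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: tee saves the stream halfway down the pipe.
tmpfile=$(mktemp)                    # plays the role of temp.txt

# tee writes the lowercase lines to $tmpfile and passes them on to tr.
final=$(printf 'one\ntwo\nthree\n' | tee "$tmpfile" | tr 'a-z' 'A-Z')
saved=$(cat "$tmpfile")

echo "end of the pipe:"; echo "$final"
echo "saved by tee:";    echo "$saved"
rm "$tmpfile"
```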
2.3. grep
The grep filter is famous among Unix users. The most common use of grep is to filter lines
of text containing (or not containing) a certain string.
[paul@RHEL4b pipes]$ cat tennis.txt
Amelie Mauresmo, Fra
Kim Clijsters, BEL
Justine Henin, Bel
Serena Williams, usa
Venus Williams, USA
[paul@RHEL4b pipes]$ cat tennis.txt | grep Williams
Serena Williams, usa
Venus Williams, USA
One of the most useful options of grep is grep -i which filters in a case insensitive way.
[paul@RHEL4b pipes]$ grep Bel tennis.txt
Justine Henin, Bel
[paul@RHEL4b pipes]$ grep -i Bel tennis.txt
Kim Clijsters, BEL
Justine Henin, Bel
[paul@RHEL4b pipes]$
Another very useful option is grep -v which outputs lines not matching the string.
And of course, both options can be combined to filter all lines not containing a case
insensitive string.
[paul@RHEL4b pipes]$ grep -vi usa tennis.txt
Amelie Mauresmo, Fra
Kim Clijsters, BEL
Justine Henin, Bel
[paul@RHEL4b pipes]$
With grep -A1 one line after the result is also displayed.
paul@debian5:~/pipes$ grep -A1 Henin tennis.txt
Justine Henin, Bel
Serena Williams, usa
With grep -B1 one line before the result is also displayed.
paul@debian5:~/pipes$ grep -B1 Henin tennis.txt
Kim Clijsters, BEL
Justine Henin, Bel
With grep -C1 (context) one line before and one after are also displayed. All three options
(A,B, and C) can display any number of lines (using e.g. A2, B4 or C20).
paul@debian5:~/pipes$ grep -C1 Henin tennis.txt
Kim Clijsters, BEL
Justine Henin, Bel
Serena Williams, usa
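The grep options discussed in this section can be combined in a short script. This sketch recreates the tennis.txt contents in a temporary file (the file location is an assumption) and uses grep -c to count matching lines. The -A option is an extension offered by GNU and BSD grep rather than a POSIX feature.

```shell
#!/bin/sh
# Sketch: -i, -v and -A1 on the tennis list.
tennis=$(mktemp)
cat > "$tennis" << 'EOF'
Amelie Mauresmo, Fra
Kim Clijsters, BEL
Justine Henin, Bel
Serena Williams, usa
Venus Williams, USA
EOF

exact=$(grep -c Bel "$tennis")        # case sensitive count
nocase=$(grep -ci Bel "$tennis")      # case insensitive count
not_usa=$(grep -vi usa "$tennis")     # lines without usa/USA
after=$(grep -A1 Henin "$tennis")     # match plus one line after it

echo "Bel: $exact, Bel (any case): $nocase"
echo "$not_usa"
echo "$after"
rm "$tennis"
```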
2.4. cut
The cut filter can select columns from files, depending on a delimiter or a count of bytes.
The screenshot below uses cut to filter for the username and userid in the /etc/passwd file.
It uses the colon as a delimiter, and selects fields 1 and 3.
[paul@RHEL4b pipes]$ cut -d: -f1,3 /etc/passwd | tail -4
Figo:510
Pfaff:511
Harry:516
Hermione:517
[paul@RHEL4b pipes]$
When using a space as the delimiter for cut, you have to quote the space.
[paul@RHEL4b pipes]$ cut -d" " -f1 tennis.txt
Amelie
Kim
Justine
Serena
Venus
[paul@RHEL4b pipes]$
This example uses cut to display the second to the seventh character of /etc/passwd.
[paul@RHEL4b pipes]$
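Selecting a range of characters is done with cut -c. The sketch below applies it to a single sample line instead of the real /etc/passwd; the sample entry is an assumption, chosen to look like a typical root entry.

```shell
#!/bin/sh
# Sketch: cut by character position and by field.
line='root:x:0:0:root:/root:/bin/bash'   # assumed sample passwd entry

chars=$(printf '%s\n' "$line" | cut -c2-7)        # characters 2 to 7
fields=$(printf '%s\n' "$line" | cut -d: -f1,3)   # username and userid

echo "characters 2-7: $chars"
echo "fields 1 and 3: $fields"
```

On a real system the same selection would be written as cut -c2-7 /etc/passwd.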
2.5. tr
You can translate characters with tr. The screenshot shows the translation of all occurrences
of e to E.
[paul@RHEL4b pipes]$ cat tennis.txt | tr 'e' 'E'
AmEliE MaurEsmo, Fra
Kim ClijstErs, BEL
JustinE HEnin, BEl
SErEna Williams, usa
VEnus Williams, USA
[paul@RHEL4b pipes]$
The tr -s filter can also be used to squeeze multiple occurrences of a character to one.
[paul@RHEL4b pipes]$ cat spaces.txt
one two
three four five six
[paul@RHEL4b pipes]$
[paul@RHEL4b pipes]$ cat count.txt | tr 'a-z' 'n-za-m'
bar
gjb
guerr
sbhe
svir
[paul@RHEL4b pipes]$
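Both uses of tr shown above, squeezing repeated characters and shifting the alphabet by thirteen positions, can be reproduced on inline text. This is a sketch with made-up input strings; applying the same shift twice restores the original text.

```shell
#!/bin/sh
# Sketch: squeeze repeated spaces, then shift letters by 13 places.
squeezed=$(printf 'one    two     three\n' | tr -s ' ')

shifted=$(printf 'one two three\n' | tr 'a-z' 'n-za-m')
restored=$(printf '%s\n' "$shifted" | tr 'a-z' 'n-za-m')

echo "squeezed: $squeezed"
echo "shifted : $shifted"
echo "restored: $restored"
```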
2.6. wc
Counting words, lines and characters is easy with wc.
[paul@RHEL4b pipes]$ wc tennis.txt
5 15 100 tennis.txt
[paul@RHEL4b pipes]$ wc -l tennis.txt
5 tennis.txt
[paul@RHEL4b pipes]$ wc -w tennis.txt
15 tennis.txt
[paul@RHEL4b pipes]$ wc -c tennis.txt
100 tennis.txt
[paul@RHEL4b pipes]$
2.7. sort
The sort filter will default to an alphabetical sort.
paul@debian5:~/pipes$ cat music.txt
Queen
Brel
Led Zeppelin
Abba
paul@debian5:~/pipes$ sort music.txt
Abba
Brel
Led Zeppelin
Queen
But the sort filter has many options to tweak its usage. This example shows sorting different
columns (column 1 or column 2).
[paul@RHEL4b pipes]$ sort -k1 country.txt
Belgium, Brussels, 10
France, Paris, 60
Germany, Berlin, 100
Iran, Teheran, 70
Italy, Rome, 50
[paul@RHEL4b pipes]$ sort -k2 country.txt
Germany, Berlin, 100
Belgium, Brussels, 10
France, Paris, 60
Italy, Rome, 50
Iran, Teheran, 70
The screenshot below shows the difference between an alphabetical sort and a numerical
sort (both on the third column).
[paul@RHEL4b pipes]$ sort -k3 country.txt
Belgium, Brussels, 10
Germany, Berlin, 100
Italy, Rome, 50
France, Paris, 60
Iran, Teheran, 70
[paul@RHEL4b pipes]$ sort -n -k3 country.txt
Belgium, Brussels, 10
Italy, Rome, 50
France, Paris, 60
Iran, Teheran, 70
Germany, Berlin, 100
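The difference between the two sorts on the third column can be checked by looking at the last line of each result. This sketch recreates country.txt in a temporary file (an assumption) and pins the locale to C so that the alphabetical order is reproducible.

```shell
#!/bin/sh
# Sketch: alphabetical versus numerical sort on column 3.
country=$(mktemp)
cat > "$country" << 'EOF'
Belgium, Brussels, 10
France, Paris, 60
Germany, Berlin, 100
Iran, Teheran, 70
Italy, Rome, 50
EOF

# Alphabetically "100" sorts before "50", so 70 ends up last.
alpha_last=$(LC_ALL=C sort -k3 "$country" | tail -n 1)
# Numerically 100 is the largest value, so it ends up last.
num_last=$(LC_ALL=C sort -n -k3 "$country" | tail -n 1)

echo "last line, alphabetical: $alpha_last"
echo "last line, numerical   : $num_last"
rm "$country"
```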
2.8. uniq
With uniq you can remove duplicates from a sorted list.
paul@debian5:~/pipes$ cat music.txt
Queen
Brel
Queen
Abba
paul@debian5:~/pipes$ sort music.txt
Abba
Brel
Queen
Queen
paul@debian5:~/pipes$ sort music.txt |uniq
Abba
Brel
Queen
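The reason the list has to be sorted first is that uniq only compares neighbouring lines. A short sketch with the same four names makes this visible; the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: uniq only removes duplicates that are adjacent.
music=$(printf 'Queen\nBrel\nQueen\nAbba\n')

# Unsorted: the two Queen lines are not neighbours, so both survive.
unsorted_count=$(printf '%s\n' "$music" | uniq | wc -l | tr -d ' ')
# Sorted: the duplicates are adjacent and collapse into one line.
sorted_count=$(printf '%s\n' "$music" | sort | uniq | wc -l | tr -d ' ')

echo "lines after uniq alone : $unsorted_count"
echo "lines after sort | uniq: $sorted_count"
```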
2.9. comm
Comparing streams (or files) can be done with comm. By default comm will output
three columns: lines only in the first file, lines only in the second file, and lines in both.
In this example, Abba, Cure and Queen are in both lists, Bowie and Sweet
are only in the first file, Turner is only in the second.
paul@debian5:~/pipes$ cat > list1.txt
Abba
Bowie
Cure
Queen
Sweet
paul@debian5:~/pipes$ cat > list2.txt
Abba
Cure
Queen
Turner
paul@debian5:~/pipes$ comm list1.txt list2.txt
                Abba
Bowie
                Cure
                Queen
Sweet
        Turner
The output of comm can be easier to read when outputting only a single column. The digits
point out which output columns should not be displayed.
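Suppressing columns works by listing the column numbers to hide after the dash. The sketch below rebuilds the two lists in a scratch directory (an assumption) and shows each single column in turn. comm expects sorted input, so the locale is pinned to C.

```shell
#!/bin/sh
# Sketch: comm with two of the three columns suppressed.
workdir=$(mktemp -d)
printf 'Abba\nBowie\nCure\nQueen\nSweet\n' > "$workdir/list1.txt"
printf 'Abba\nCure\nQueen\nTurner\n'       > "$workdir/list2.txt"

# -12 hides columns 1 and 2: only lines present in both files remain.
both=$(LC_ALL=C comm -12 "$workdir/list1.txt" "$workdir/list2.txt")
# -23 hides columns 2 and 3: only lines unique to the first file remain.
only_first=$(LC_ALL=C comm -23 "$workdir/list1.txt" "$workdir/list2.txt")
# -13 hides columns 1 and 3: only lines unique to the second file remain.
only_second=$(LC_ALL=C comm -13 "$workdir/list1.txt" "$workdir/list2.txt")

echo "in both      :"; echo "$both"
echo "only in list1:"; echo "$only_first"
echo "only in list2:"; echo "$only_second"
rm -r "$workdir"
```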
2.10. od
European humans like to work with ascii characters, but computers store files in bytes. The
examples below use od to show the contents of a simple file as octal bytes (od -b) and as
ascii (or backslashed) characters (od -c).
paul@laika:~/test$ od -b text.txt
0000000 141 142 143 144 145 146 147 012 061 062 063 064 065 066 067 012
0000020
paul@laika:~/test$ od -c text.txt
0000000 a b c d e f g \n 1 2 3 4 5 6 7 \n
0000020
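For a hexadecimal view of the same bytes, od accepts -t x1. This sketch feeds od the two lines that the output above corresponds to, using printf instead of a file; -An drops the offset column, and the unquoted echo is only there to normalise the spacing of the output.

```shell
#!/bin/sh
# Sketch: the same sixteen bytes as hexadecimal values.
raw=$(printf 'abcdefg\n1234567\n' | od -An -t x1)

# Unquoted expansion collapses od's padding into single spaces.
hex=$(echo $raw)

echo "$hex"
```

The octal value 141 shown by od -b and the hexadecimal value 61 both denote the letter a.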
2.11. sed
The stream editor sed can perform editing functions in the stream, using regular
expressions.
Add g for global replacements (all occurrences of the string per line).
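The effect of the g flag is easiest to see on a line that contains the search string twice. The input string in this sketch is made up for illustration.

```shell
#!/bin/sh
# Sketch: sed substitution with and without the g flag.
input='level5 level7'

first_only=$(echo "$input" | sed 's/level/jump/')    # first occurrence only
all=$(echo "$input" | sed 's/level/jump/g')          # every occurrence

echo "without g: $first_only"
echo "with g   : $all"
```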
Display a sorted list of logged on users, but every user only once.
[paul@RHEL4b pipes]$ who | cut -d' ' -f1 | sort | uniq
Harry
paul
root
3. Make a list of all filenames in /etc that contain the string conf in their filename.
4. Make a sorted list of all files in /etc that contain the case insensitive string conf in their
filename.
5. Look at the output of /sbin/ifconfig. Write a line that displays only ip address and the
subnet mask.
7. Write a line that receives a text file, and outputs all words on a separate line.
8. Write a spell checker on the command line. (There may be a dictionary in /usr/share/dict/.)
Chapter 3. basic Unix tools
This chapter introduces commands to find or locate files and to compress files, together
with other common tools that were not discussed before. While the tools discussed here are
technically not considered filters, they can be used in pipes.
3.1. find
The find command can be very useful at the start of a pipe to search for files. Here are some
examples. You might want to add 2>/dev/null to the command lines to avoid cluttering your
screen with error messages.
Find files that end in .conf in the current directory (and all subdirs).
find . -name "*.conf"
Find files of type file (not a directory, pipe, etc.) that end in .conf.
find . -type f -name "*.conf"
Find can also execute another command on every file found. This example will look for
*.odf files and copy them to /backup/.
find /data -name "*.odf" -exec cp {} /backup/ \;
Find can also execute, after your confirmation, another command on every file found. This
example will remove *.odf files if you approve of it for every file found.
find /data -name "*.odf" -ok rm {} \;
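The -exec form can be rehearsed without a real /data or /backup. In this sketch two temporary directories stand in for them, and the file names are made up; the backup directory is kept outside the searched tree so that find does not meet its own copies.

```shell
#!/bin/sh
# Sketch: find with -type, -name and -exec on a throwaway tree.
data=$(mktemp -d)       # stands in for /data (assumption)
backup=$(mktemp -d)     # stands in for /backup (assumption)
mkdir "$data/sub"
touch "$data/a.odf" "$data/sub/b.odf" "$data/notes.conf"

# {} is replaced by each file found; \; ends the command for -exec.
find "$data" -type f -name "*.odf" -exec cp {} "$backup/" \;

copied=$(ls "$backup" | wc -l | tr -d ' ')
echo "files copied: $copied"
rm -r "$data" "$backup"
```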
3.2. locate
The locate tool is very different from find in that it uses an index to locate files. This is a
lot faster than traversing all the directories, but it also means that the results can be
outdated: files created or removed after the last index update are not reflected. If
the index does not exist yet, then you have to create it (as root on Red Hat Enterprise Linux)
with the updatedb command.
[paul@RHEL4b ~]$ locate Samba
Most Linux distributions will schedule updatedb to run once every day.
3.3. date
The date command can display the date, time, time zone and more.
paul@rhel55 ~$ date
Sat Apr 17 12:44:30 CEST 2010
A date string can be customized to display the format of your choice. Check the man page
for more options.
paul@rhel55 ~$ date +'%A %d-%m-%Y'
Saturday 17-04-2010
Time on any Unix is counted in seconds since the epoch: the first second is the
first second of the first of January 1970 (UTC). Use date +%s to display Unix time in seconds.
paul@rhel55 ~$ date +%s
1271501080
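A number of seconds can be turned back into a readable date. The sketch below assumes GNU date, where -d @N means N seconds since the epoch; other implementations (BSD date, for instance) use a different option. The -u switch prints UTC, which is two hours behind the CEST time shown earlier.

```shell
#!/bin/sh
# Sketch (assumes GNU date): convert Unix time back to a calendar date.
stamp=1271501080

readable=$(date -u -d "@$stamp" +'%Y-%m-%d %H:%M:%S')

echo "$stamp seconds since the epoch is $readable UTC"
```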
3.4. cal
The cal command displays the current month, with the current day highlighted.
paul@rhel55 ~$ cal
     April 2010
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
3.5. sleep
The sleep command is sometimes used in scripts to wait a number of seconds. This example
shows a five second sleep.
paul@rhel55 ~$ sleep 5
paul@rhel55 ~$
3.6. time
The time command can display how long it takes to execute a command. The date command
takes only a little time.
paul@rhel55 ~$ time date
Sat Apr 17 13:08:27 CEST 2010
real 0m0.014s
user 0m0.008s
sys 0m0.006s
The sleep 5 command takes five real seconds to execute, but consumes little cpu time.
paul@rhel55 ~$ time sleep 5
real 0m5.018s
user 0m0.005s
sys 0m0.011s
This bzip2 command compresses a file and uses a lot of cpu time.
paul@rhel55 ~$ time bzip2 text.txt
real 0m2.368s
user 0m0.847s
sys 0m0.539s
2. Explain the difference between these two statements. Will they both work when there
are 200 .odf files in /data ? How about when there are 2 million .odf files ?
find /data -name "*.odf" > data_odf.txt
3. Write a find command that finds all files created after January 30th 2010.
4. Write a find command that finds all *.odf files created in September 2009.
5. Count the number of *.conf files in /etc and all its subdirs.
6. Here are two commands that do the same thing: copy *.odf files to /backup/ . What
would be a reason to replace the first command with the second ? Again, this is an
important question.
cp -r /data/*.odf /backup/
7. Create a file called loctest.txt. Can you find this file with locate ? Why not ? How do
you make locate find this file ?
9. Issue the date command. Now display the date in YYYY/MM/DD format.
10. Issue the cal command. Display a calendar of 1582 and 1752. Notice anything special ?
Chapter 4. regular expressions
Regular expressions are a very powerful tool in Linux. They can be used with a variety of
programs like bash, vi, rename, grep, sed, and more.
Depending on the tool being used, one or more of these syntaxes can be used.
For example the grep tool has the -E option to force a string to be read as ERE while -G
forces BRE and -P forces PCRE.
Note that grep also has -F to force the string to be read literally.
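The difference between a regex and a literal string shows up as soon as the pattern contains a special character such as the dot. This sketch uses three made-up input lines and grep -c to count the matching lines.

```shell
#!/bin/sh
# Sketch: the dot as a regex metacharacter versus a literal dot with -F.
sample=$(printf 'a.c\nabc\naxc\n')

# As a regex, the dot matches any single character: all three lines match.
as_regex=$(printf '%s\n' "$sample" | grep -c 'a.c')
# With -F the pattern is taken literally: only the line "a.c" matches.
as_literal=$(printf '%s\n' "$sample" | grep -cF 'a.c')

echo "regex matches  : $as_regex"
echo "literal matches: $as_literal"
```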
4.2. grep
4.2.1. print lines matching a pattern
grep is a popular Linux tool to search for lines that match a certain pattern. Below are some
examples of the simplest regular expressions.
This is the contents of the test file. This file contains three lines (or three newline
characters).
paul@rhel65:~$ cat names
Tania
Laura
Valentina
When grepping for a single character, only the lines containing that character are returned.
paul@rhel65:~$ grep u names
Laura
paul@rhel65:~$ grep e names
Valentina
paul@rhel65:~$ grep i names
Tania
Valentina
The pattern matching in this example should be very straightforward; if the given character
occurs on a line, then grep will return that line.
This example demonstrates that ia will match Tania but not Valentina and in will match
Valentina but not Tania.
paul@rhel65:~$ grep a names
Tania
Laura
Valentina
paul@rhel65:~$ grep ia names
Tania
paul@rhel65:~$ grep in names
Valentina
paul@rhel65:~$
Note that we use the -E switch of grep to force interpretation of our string as an ERE.
We need to escape the pipe symbol in a BRE to get the same logical OR.
paul@debian7:~$ grep -G 'i|a' list
paul@debian7:~$ grep -G 'i\|a' list
Tania
Laura
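The logical OR in both syntaxes can be compared by counting matches. This sketch uses the three names from the earlier test file and the pattern u or e, which is chosen for illustration. The escaped pipe in a BRE is a GNU grep extension, so the third count is an assumption about the grep in use.

```shell
#!/bin/sh
# Sketch: alternation in ERE and in BRE (GNU grep assumed for '\|').
names=$(printf 'Tania\nLaura\nValentina\n')

ere=$(printf '%s\n' "$names" | grep -cE 'u|e')          # OR in an ERE
bre_plain=$(printf '%s\n' "$names" | grep -cG 'u|e')    # literal "u|e" in a BRE
bre_escaped=$(printf '%s\n' "$names" | grep -cG 'u\|e') # OR in a BRE (GNU)

echo "ERE 'u|e'  : $ere"
echo "BRE 'u|e'  : $bre_plain"
echo "BRE 'u\\|e' : $bre_escaped"
```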
The two examples below show how to use the dollar character to match the end of a string.
paul@debian7:~$ grep a$ names
Tania
Laura
Valentina
paul@debian7:~$ grep r$ names
Fleur
Floor
Both the dollar sign and the little hat are called anchors in a regex.
Surrounding the searched word with spaces is not a good solution (because other characters
can be word separators). The screenshot below shows how to use \b to find only the searched
word:
paul@debian7:~$ grep '\bover\b' text
The winter is over.
Can you get over there?
paul@debian7:~$
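Word boundaries matter when the searched word also occurs inside longer words. In this sketch the second input line is made up so that it contains over only as part of other words. The \b escape is a GNU grep extension, and -w is shown as an alternative way to ask for whole words.

```shell
#!/bin/sh
# Sketch: substring match versus whole-word match.
text=$(printf 'The winter is over.\nThe governor is governing.\n')

substring=$(printf '%s\n' "$text" | grep -c 'over')       # also hits "governor"
boundary=$(printf '%s\n' "$text" | grep -c '\bover\b')    # whole word only (GNU)
word_flag=$(printf '%s\n' "$text" | grep -cw 'over')      # same idea with -w

echo "lines containing over       : $substring"
echo "lines with over as a word   : $boundary"
echo "lines with over using -w    : $word_flag"
```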
grep -B5
grep -C5
4.3. rename
4.3.1. the rename command
On Debian Linux the /usr/bin/rename command is a link to /usr/bin/prename installed by
the perl package.
paul@pi ~ $ dpkg -S $(readlink -f $(which rename))
perl: /usr/bin/prename
Red Hat derived systems do not install the same rename command, so this section does not
describe rename on Red Hat (unless you copy the perl script manually).
There is often confusion on the internet about the rename command because solutions
that work fine in Debian (and Ubuntu, xubuntu, Mint, ...) cannot be used in Red Hat
(and CentOS, Fedora, ...).
4.3.2. perl
The rename command is actually a perl script that uses perl regular expressions. The
complete manual for these can be found by typing perldoc perlrequick (after installing
perldoc).
root@pi:~# aptitude install perl-doc
The following NEW packages will be installed:
  perl-doc
0 packages upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 8,170 kB of archives. After unpacking 13.2 MB will be used.
Get: 1 http://mirrordirector.raspbian.org/raspbian/ wheezy/main perl-do...
Fetched 8,170 kB in 19s (412 kB/s)
Selecting previously unselected package perl-doc.
(Reading database ... 67121 files and directories currently installed.)
Unpacking perl-doc (from .../perl-doc_5.14.2-21+rpi2_all.deb) ...
Adding 'diversion of /usr/bin/perldoc to /usr/bin/perldoc.stub by perl-doc'
Processing triggers for man-db ...
Setting up perl-doc (5.14.2-21+rpi2) ...
And here is another example that uses rename with the well-known syntax to change the
extensions of the same files once more:
paul@pi ~ $ ls abc
These two examples appear to work because the strings we used only exist at the end of the
filename. Remember that file extensions have no meaning in the bash shell.
The next example shows what can go wrong with this syntax.
paul@pi ~ $ touch atxt.txt
paul@pi ~ $
The syntax we use now can be described as s/regex/replacement/g where s signifies substitute
and g stands for global.
Note that this example used the -n switch to show what is being done (instead of actually
renaming the file).
paul@debian7:~/files$
Here is an example on how to use rename to only rename the file extension. It uses the
dollar sign to mark the ending of the filename.
paul@pi ~ $ ls *.txt
allfiles.txt bllfiles.txt cllfiles.txt really.txt.txt temp.txt tennis.txt
paul@pi ~ $ rename 's/.txt$/.TXT/' *.txt
paul@pi ~ $ ls *.TXT
allfiles.TXT bllfiles.TXT cllfiles.TXT
really.txt.TXT temp.TXT tennis.TXT
paul@pi ~ $
Note that the dollar sign in the regex means at the end. Without the dollar sign this
command would fail on the really.txt.txt file.
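The role of the dollar sign can be tested without renaming any files. This sketch applies the same kind of substitution to a file name with sed instead of rename, since the perl rename script is not installed everywhere; the dot is escaped here so that it only matches a literal dot.

```shell
#!/bin/sh
# Sketch: anchored versus unanchored substitution on a file name.
name='really.txt.txt'

# With $ only the .txt at the very end of the name is replaced.
anchored=$(printf '%s\n' "$name" | sed 's/\.txt$/.TXT/')
# Without $ the first .txt is replaced, which is not the extension.
unanchored=$(printf '%s\n' "$name" | sed 's/\.txt/.TXT/')

echo "with \$   : $anchored"
echo "without \$: $unanchored"
```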
4.4. sed
4.4.1. stream editor
The stream editor, or sed for short, uses regex for stream editing.
The slashes can be replaced by a couple of other characters, which can be handy in some
cases to improve readability.
echo Sunday | sed 's:Sun:Mon:'
Monday
echo Sunday | sed 's_Sun_Mon_'
Monday
echo Sunday | sed 's|Sun|Mon|'
Monday
In this example the ampersand is used to double the occurrence of the found string.
echo Sunday | sed 's/Sun/&&/'
SunSunday
echo Sunday | sed 's/day/&&/'
Sundayday
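An alternative delimiter is most useful when the strings themselves contain slashes, and the ampersand can also be used to wrap the match instead of doubling it. The paths and the brackets in this sketch are made up for illustration.

```shell
#!/bin/sh
# Sketch: alternative delimiters and the ampersand.
path='/usr/bin/rename'

# With slashes as delimiter every slash in the strings must be escaped.
escaped=$(echo "$path" | sed 's/\/usr\/bin/\/usr\/local\/bin/')
# With | as delimiter the same substitution needs no escaping.
readable=$(echo "$path" | sed 's|/usr/bin|/usr/local/bin|')

# & stands for the matched string, here wrapped in brackets.
wrapped=$(echo Sunday | sed 's/Sun/[&]/')

echo "$escaped"
echo "$readable"
echo "$wrapped"
```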
This example looks for white spaces (\s) globally and replaces them with 1 space.
paul@debian7:~$ echo -e 'today\tis\twarm'
today is warm
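The replacement of white space can be sketched as follows. The \s escape is a GNU sed extension, so the first command assumes GNU sed; the POSIX character class [[:space:]] is shown as a more portable spelling. printf is used instead of echo -e to produce the tabs.

```shell
#!/bin/sh
# Sketch: replace every white space character (here: tabs) with one space.
gnu_style=$(printf 'today\tis\twarm\n' | sed 's/\s/ /g')            # GNU sed
posix_style=$(printf 'today\tis\twarm\n' | sed 's/[[:space:]]/ /g') # portable

echo "$gnu_style"
echo "$posix_style"
```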
The example below searches for three consecutive letter o, but the third o is optional.
paul@debian7:~$ cat list2
ll
lol
lool
loool
paul@debian7:~$ grep -E 'ooo?' list2
lool
loool
paul@debian7:~$ cat list2 | sed 's/ooo\?/A/'
ll
lol
lAl
lAl
loool
paul@debian7:~$
This example shows how to manipulate the exclamation mark history feature of the bash
shell.
paul@debian7:~$ mkdir hist
paul@debian7:~$ cd hist/
ls -l file1
ls -l file3