Coreutils
Short Contents
1 Introduction
2 Common options
3 Output of entire files
4 Formatting file contents
5 Output of parts of files
6 Summarizing files
7 Operating on sorted files
8 Operating on fields
9 Operating on characters
10 Directory listing
11 Basic operations
12 Special file types
13 Changing file attributes
14 File space usage
15 Printing text
16 Conditions
17 Redirection
18 File name manipulation
19 Working context
20 User information
21 System context
22 SELinux context
23 Modified command invocation
24 Process control
25 Delaying
26 Numeric operations
27 File permissions
28 File timestamps
29 Date input formats
30 Version sort ordering
31 Opening the Software Toolbox
A GNU Free Documentation License
Index
Table of Contents
1 Introduction
2 Common options
  2.1 Backup options
  2.2 Block size
  2.3 Signal specifications
  2.4 chown, chgrp, chroot, id: Disambiguating user names and IDs
  2.5 Sources of random data
  2.6 Target directory
  2.7 Trailing slashes
  2.8 Traversing symlinks
  2.9 Treating / specially
  2.10 Special built-in utilities
  2.11 Exit status
  2.12 Floating point numbers
  2.13 Standards conformance
  2.14 coreutils: Multi-call program
6 Summarizing files
  6.1 wc: Print newline, word, and byte counts
  6.2 sum: Print checksum and block counts
8 Operating on fields
  8.1 cut: Print selected parts of lines
  8.2 paste: Merge lines of files
  8.3 join: Join lines on a common field
    8.3.1 General options
    8.3.2 Pre-sorting
    8.3.3 Working with fields
    8.3.4 Controlling join’s field matching
    8.3.5 Header lines
    8.3.6 Union, Intersection and Difference of files
9 Operating on characters
  9.1 tr: Translate, squeeze, and/or delete characters
    9.1.1 Specifying arrays of characters
    9.1.2 Translating
    9.1.3 Squeezing repeats and deleting
  9.2 expand: Convert tabs to spaces
  9.3 unexpand: Convert spaces to tabs
10 Directory listing
  10.1 ls: List directory contents
    10.1.1 Which files are listed
    10.1.2 What information is listed
    10.1.3 Sorting the output
    10.1.4 General output formatting
    10.1.5 Formatting file timestamps
    10.1.6 Formatting the file names
  10.2 dir: Briefly list directory contents
  10.3 vdir: Verbosely list directory contents
  10.4 dircolors: Color setup for ls
16 Conditions
  16.1 false: Do nothing, unsuccessfully
  16.2 true: Do nothing, successfully
  16.3 test: Check file types and compare values
    16.3.1 File type tests
    16.3.2 Access permission tests
    16.3.3 File characteristic tests
    16.3.4 String tests
    16.3.5 Numeric tests
    16.3.6 Connectives for test
  16.4 expr: Evaluate expressions
    16.4.1 String expressions
    16.4.2 Numeric expressions
    16.4.3 Relations for expr
    16.4.4 Examples of using expr
17 Redirection
  17.1 tee: Redirect output to multiple files or processes
25 Delaying
  25.1 sleep: Delay for a specified time
Index
1 Introduction
This manual is a work in progress: many sections make no attempt to explain basic concepts
in a way suitable for novices. Thus, if you are interested, please get involved in improving
this manual. The entire GNU community will benefit.
The GNU utilities documented here are mostly compatible with the POSIX standard.
Please report bugs to bug-coreutils@gnu.org. Include the version number, machine
architecture, input files, and any other information needed to reproduce the bug: your input,
what you expected, what you got, and why it is wrong.
If you have a problem with sort or date, try using the --debug option, as it can often
help find and fix problems without having to wait for an answer to a bug report. If the
debug output does not suffice to fix the problem on your own, please compress and attach it
to the rest of your bug report.
Although diffs are welcome, please include a description of the problem as well, since
this is sometimes difficult to infer. See Section “Bugs” in Using and Porting GNU CC.
This manual was originally derived from the Unix man pages in the distributions, which
were written by David MacKenzie and updated by Jim Meyering. What you are reading
now is the authoritative documentation for these utilities; the man pages are no longer being
maintained. The original fmt man page was written by Ross Paterson. François Pinard did
the initial conversion to Texinfo format. Karl Berry did the indexing, some reorganization,
and editing of the results. Brian Youmans of the Free Software Foundation office staff
combined the manuals for textutils, fileutils, and sh-utils to produce the present omnibus
manual. Richard Stallman contributed his usual invaluable insights to the overall process.
2 Common options
Certain options are available in all of these programs. Rather than writing identical
descriptions for each of the programs, they are described here. (In fact, every GNU program
accepts (or should accept) these options.)
Normally options and operands can appear in any order, and programs act as if all the
options appear before any operands. For example, ‘sort -r passwd -t :’ acts like ‘sort
-r -t : passwd’, since ‘:’ is an option-argument of -t. However, if the POSIXLY_CORRECT
environment variable is set, options must appear before operands, unless otherwise specified
for a particular command.
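As a minimal sketch of the reordering rule, the following compares the two spellings of the sort command above on a small colon-separated file. The sample data and the helper name same_output are invented for illustration, and the sketch assumes GNU sort with POSIXLY_CORRECT unset.

```shell
# Hypothetical helper: succeeds if both option orderings sort the file identically.
same_output() {
  file=$1
  a=$(sort -r "$file" -t :) || return 1   # options after the operand
  b=$(sort -r -t : "$file") || return 1   # options before the operand
  [ "$a" = "$b" ]
}

tmp=$(mktemp)                             # sample data, invented for this sketch
printf 'root:0\ndaemon:1\nbin:2\n' > "$tmp"
same_output "$tmp" && echo "same result"
rm -f "$tmp"
```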
A few programs can usefully have trailing operands with leading ‘-’. With such a program,
options must precede operands even if POSIXLY_CORRECT is not set, and this fact is noted in
the program description. For example, the env command’s options must appear before its
operands, since in some cases the operands specify a command that itself contains options.
Most programs that accept long options recognize unambiguous abbreviations of those
options. For example, ‘rmdir --ignore-fail-on-non-empty’ can be invoked as ‘rmdir
--ignore-fail’ or even ‘rmdir --i’. Ambiguous options, such as ‘ls --h’, are identified
as such.
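The abbreviation rule can be tried out with a sketch like the one below. The function name demo_abbrev and the scratch directory are hypothetical, and GNU rmdir is assumed.

```shell
# Sketch: an unambiguous prefix of a long option is accepted by GNU rmdir.
demo_abbrev() {
  dir=$(mktemp -d) || return 1
  mkdir "$dir/full" && touch "$dir/full/keep"
  # Both calls exit successfully and leave the non-empty directory alone.
  rmdir --ignore-fail-on-non-empty "$dir/full" &&
    rmdir --ignore-fail "$dir/full"
  status=$?
  [ -d "$dir/full" ] || status=1
  rm -rf "$dir"
  return $status
}

demo_abbrev && echo "abbreviation accepted"
```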
Some of these programs recognize the --help and --version options only when one of
them is the sole command line argument. For these programs, abbreviations of the long
options are not always recognized.
‘--help’ Print a usage message listing all available options, then exit successfully.
‘--version’
Print the version number, then exit successfully.
‘--’ Delimit the option list. Later arguments, if any, are treated as operands even if
they begin with ‘-’. For example, ‘sort -- -r’ reads from the file named -r.
A single ‘-’ operand is not really an option, though it looks like one. It stands for a file
operand, and some tools treat it as standard input, or as standard output if that is clear
from the context. For example, ‘sort -’ reads from standard input, and is equivalent to
plain ‘sort’. Unless otherwise specified, a ‘-’ can appear as any operand that requires a file
name.
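The difference between ‘--’ and a lone ‘-’ can be illustrated with the following sketch, which works in a scratch directory so that a file literally named -r can be created safely. The function name demo_dashes is hypothetical.

```shell
# Sketch: '--' ends option parsing; a lone '-' stands for standard input.
demo_dashes() {
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    printf 'b\na\n' > ./-r          # a file literally named "-r"
    sort -- -r                      # reads the file -r: prints a, then b
    printf 'd\nc\n' | sort -        # reads standard input: prints c, then d
  )
  status=$?
  rm -rf "$dir"
  return $status
}

demo_dashes
```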
If none of the above environment variables are set, the block size currently defaults to
1024 bytes in most contexts, but this number may change in the future. For ls file sizes,
the block size defaults to 1 byte.
A block size specification can be a positive integer specifying the number of bytes per
block, or it can be human-readable or si to select a human-readable format. Integers may
be followed by suffixes that are upward compatible with the SI prefixes
(http://www.bipm.org/en/publications/si-brochure/chapter3.html) for decimal
multiples and with the ISO/IEC 80000-13 (formerly IEC 60027-2) prefixes
(https://physics.nist.gov/cuu/Units/binary.html) for binary multiples.
With human-readable formats, output sizes are followed by a size letter such as ‘M’ for
megabytes. BLOCK_SIZE=human-readable uses powers of 1024; ‘M’ stands for 1,048,576
bytes. BLOCK_SIZE=si is similar, but uses powers of 1000 and appends ‘B’; ‘MB’ stands for
1,000,000 bytes.
A block size specification preceded by ‘'’ causes output sizes to be displayed with
thousands separators. The LC_NUMERIC locale specifies the thousands separator and grouping.
For example, in an American English locale, ‘--block-size="'1kB"’ would cause a size of
1234000 bytes to be displayed as ‘1,234’. In the default C locale, there is no thousands
separator so a leading ‘'’ has no effect.
An integer block size can be followed by a suffix to specify a multiple of that size. A bare
size letter, or one followed by ‘iB’, specifies a multiple using powers of 1024. A size letter
followed by ‘B’ specifies powers of 1000 instead. For example, ‘1M’ and ‘1MiB’ are equivalent
to ‘1048576’, whereas ‘1MB’ is equivalent to ‘1000000’.
A plain suffix without a preceding integer acts as if ‘1’ were prepended, except that it
causes a size indication to be appended to the output. For example, ‘--block-size="kB"’
displays 3000 as ‘3kB’.
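The suffix rules above can be checked against ls with a sketch along these lines. It assumes GNU ls, truncate, and awk, and that the size is the fifth field of ‘ls -l’ output (that is, the owner and group names contain no blanks); size_with_block is a made-up helper name.

```shell
# Sketch: print the size column of 'ls -l' for a file under a given block size.
size_with_block() {
  ls -l --block-size="$1" "$2" | awk '{ print $5 }'
}

f=$(mktemp)
truncate -s 1048576 "$f"         # exactly 1 MiB of apparent size
size_with_block 1M   "$f"        # 1048576 / 1048576, so 1
size_with_block 1MiB "$f"        # same as 1M
size_with_block 1MB  "$f"        # 1048576 / 1000000, rounded up to 2
truncate -s 3000 "$f"
size_with_block kB   "$f"        # bare suffix: the unit is appended, 3kB
rm -f "$f"
```

ls rounds sizes up to the next whole block, which is why a file one byte over a block boundary occupies an extra unit in the listing.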
The following suffixes are defined. Large sizes like 1Q may be rejected by your computer
due to limitations of its arithmetic.
‘kB’ kilobyte: 10^3 = 1000.
‘k’
‘K’
‘KiB’ kibibyte: 2^10 = 1024. ‘K’ is special: the SI prefix is ‘k’ and the ISO/IEC 80000-13
prefix is ‘Ki’, but tradition and POSIX use ‘k’ to mean ‘KiB’.
‘MB’ megabyte: 10^6 = 1,000,000.
‘M’
‘MiB’ mebibyte: 2^20 = 1,048,576.
‘GB’ gigabyte: 10^9 = 1,000,000,000.
‘G’
‘GiB’ gibibyte: 2^30 = 1,073,741,824.
‘TB’ terabyte: 10^12 = 1,000,000,000,000.
‘T’
‘TiB’ tebibyte: 2^40 = 1,099,511,627,776.
‘PB’ petabyte: 10^15 = 1,000,000,000,000,000.
‘P’
‘PiB’ pebibyte: 2^50 = 1,125,899,906,842,624.
‘EB’ exabyte: 10^18 = 1,000,000,000,000,000,000.
‘E’
‘EiB’ exbibyte: 2^60 = 1,152,921,504,606,846,976.
‘ZB’ zettabyte: 10^21 = 1,000,000,000,000,000,000,000.
‘Z’
‘ZiB’ zebibyte: 2^70 = 1,180,591,620,717,411,303,424.
‘YB’ yottabyte: 10^24 = 1,000,000,000,000,000,000,000,000.
‘Y’
‘YiB’ yobibyte: 2^80 = 1,208,925,819,614,629,174,706,176.
‘RB’ ronnabyte: 10^27 = 1,000,000,000,000,000,000,000,000,000.
‘R’
‘RiB’ robibyte: 2^90 = 1,237,940,039,285,380,274,899,124,224.
‘QB’ quettabyte: 10^30 = 1,000,000,000,000,000,000,000,000,000,000.
‘Q’
‘QiB’ quebibyte: 2^100 = 1,267,650,600,228,229,401,496,703,205,376.
Block size defaults can be overridden by an explicit --block-size=size option. The -k
option is equivalent to --block-size=1K, which is the default unless the POSIXLY_CORRECT
environment variable is set. The -h or --human-readable option is equivalent to
--block-size=human-readable. The --si option is equivalent to --block-size=si.
Note that for ls, the -k option does not control the display of the apparent file sizes,
whereas the --block-size option does.
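The option equivalences listed above can be spot-checked with du. This is only a sketch: du_agree is a hypothetical helper that compares the output of two du invocations on the same file, and GNU du is assumed.

```shell
# Sketch: succeed if 'du OPT1 FILE' and 'du OPT2 FILE' print the same thing.
du_agree() {
  [ "$(du "$1" "$3")" = "$(du "$2" "$3")" ]
}

f=$(mktemp)
printf 'some data\n' > "$f"
du_agree -k   --block-size=1K             "$f" && echo "-k agrees"
du_agree -h   --block-size=human-readable "$f" && echo "-h agrees"
du_agree --si --block-size=si             "$f" && echo "--si agrees"
rm -f "$f"
```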
GNU chown, chgrp, chroot, and id provide a way to work around this, that at the
same time may result in a significant performance improvement by eliminating a database
look-up. Simply precede each numeric user ID and/or group ID with a ‘+’, in order to force
its interpretation as an integer:
chown +42 F
chgrp +$numeric_group_id another-file
chown +0:+0 /
The name look-up process is skipped for each ‘+’-prefixed string, because a string
containing ‘+’ is never a valid user or group name. This syntax is accepted on most common
Unix systems, but not on Solaris 10.
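A variant that is safe to run without privileges is sketched below: it applies the ‘+’ syntax to the caller's own numeric IDs, so ownership does not actually change. The helper name chown_numeric_self is hypothetical, and the sketch assumes the temporary file is created with the caller's own user and group.

```shell
# Sketch: force numeric interpretation of the current user's own IDs.
chown_numeric_self() {
  chown "+$(id -u):+$(id -g)" "$1"
}

f=$(mktemp)
chown_numeric_self "$f" && echo "ownership set by numeric ID"
rm -f "$f"
```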
what is wanted, so these commands support the following options to allow more fine-grained
control:
‘-T’
‘--no-target-directory’
Do not treat the last operand specially when it is a directory or a symbolic link
to a directory. This can help avoid race conditions in programs that operate in
a shared area. For example, when the command ‘mv /tmp/source /tmp/dest’
succeeds, there is no guarantee that /tmp/source was renamed to /tmp/dest:
it could have been renamed to /tmp/dest/source instead, if some other process
created /tmp/dest as a directory. However, if mv -T /tmp/source /tmp/dest
succeeds, there is no question that /tmp/source was renamed to /tmp/dest.
In the opposite situation, where you want the last operand to be treated as a
directory and want a diagnostic otherwise, you can use the --target-directory
(-t) option.
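The effect of -T can be seen in a small sketch that contrasts it with plain mv when the destination is an existing directory. The file names and the function demo_no_target_dir are made up for illustration; GNU mv is assumed.

```shell
# Sketch: contrast plain mv with mv -T when the destination is a directory.
demo_no_target_dir() {
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    touch source1 source2
    mkdir dest
    mv source1 dest               # dest is a directory: result is dest/source1
    [ -e dest/source1 ] && echo "plain mv moved the file into dest"
    # With -T, dest is never treated as a directory to move into, so this
    # fails instead of quietly producing dest/source2.
    if mv -T source2 dest 2>/dev/null; then
      echo "unexpected: mv -T succeeded"
    else
      echo "mv -T refused to treat dest as a target directory"
    fi
  )
  rm -rf "$dir"
}

demo_no_target_dir
```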
‘-t directory’
‘--target-directory=directory’
Use directory as the directory component of each destination file name.
The interface for most programs is that after processing options and a finite
(possibly zero) number of fixed-position arguments, the remaining argument list
is either expected to be empty, or is a list of items (usually files) that will all
be handled identically. The xargs program is designed to work well with this
convention.
The commands in the mv-family are unusual in that they take a variable number
of arguments with a special case at the end (namely, the target directory). This
makes it nontrivial to perform some operations, e.g., “move all files from here
to ../d/”, because mv * ../d/ might exhaust the argument space, and ls |
xargs ... doesn’t have a clean way to specify an extra final argument for each
invocation of the subject command. (It can be done by going through a shell
command, but that requires more human labor and brain power than it should.)
The --target-directory (-t) option allows the cp, install, ln, and mv
programs to be used conveniently with xargs. For example, you can move the
files from the current directory to a sibling directory, d, like this:
ls | xargs mv -t ../d --
However, this doesn’t move files whose names begin with ‘.’. If you use the
GNU find program, you can move those files too, with this command:
find . -mindepth 1 -maxdepth 1 \
| xargs mv -t ../d
But both of the above approaches fail if there are no files in the current directory,
or if any file has a name containing a blank or some other special characters.
The following example removes those limitations and requires both GNU find
and GNU xargs:
find . -mindepth 1 -maxdepth 1 -print0 \
| xargs --null --no-run-if-empty \
mv -t ../d
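To experiment with the last command without touching real files, it can be wrapped as in the sketch below. The function move_all and the scratch layout are hypothetical; the pipeline itself is the one shown above, with the destination passed as an absolute path. GNU find, xargs, and mv are assumed.

```shell
# Sketch: move every entry of directory $1 into the existing directory $2.
move_all() {
  dest=$(cd "$2" && pwd) || return 1      # absolute path, since we cd below
  (
    cd "$1" || exit 1
    find . -mindepth 1 -maxdepth 1 -print0 |
      xargs --null --no-run-if-empty mv -t "$dest"
  )
}

work=$(mktemp -d)
mkdir "$work/here" "$work/d"
touch "$work/here/plain" "$work/here/.hidden" "$work/here/two words"
move_all "$work/here" "$work/d"
ls -A "$work/d"                           # lists all three names
rm -rf "$work"
```

Because find prints each name with a leading ‘./’, no moved name can be mistaken for an option.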
Commands that parse floating point also understand case-insensitive inf, infinity, and
NaN, although whether such values are useful depends on the command in question. Modern
C implementations also accept hexadecimal floating point numbers such as -0x.ep-3, which
stands for −14/16 times 2^−3, which equals −0.109375. See Section “Parsing of Floats” in
The GNU C Library Reference Manual.
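The hexadecimal example can be verified with printf, as in this sketch. It assumes the external printf found by env is the GNU coreutils one and forces the C locale so that the decimal point is ‘.’; hexfloat is a made-up helper name.

```shell
# Sketch: let printf parse a floating point argument and print it in decimal.
hexfloat() {
  LC_ALL=C env printf '%f\n' "$1"
}

hexfloat -0x.ep-3      # 0x.e is 14/16; times 2^-3 gives -0.109375
hexfloat 0x1p4         # 1 times 2^4 gives 16.000000
```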
Normally the LC_NUMERIC locale determines the decimal-point character. However, some
commands’ descriptions specify that they accept numbers in either the current or the C
locale; for example, they treat ‘3.14’ like ‘3,14’ if the current locale uses comma as a decimal
point.
An exit status of zero indicates success, and a nonzero value indicates failure.
Examples:
# Output f's contents, then standard input, then g's contents.
cat f - g
The beginnings of the sections of logical pages are indicated in the input file by a line
containing exactly one of these delimiter strings:
‘\:\:\:’ start of header;
‘\:\:’ start of body;
‘\:’ start of footer.
The characters from which these strings are made can be changed from ‘\’ and ‘:’ via
options (see below), but the pattern of each string cannot be changed.
A section delimiter is replaced by an empty line on output. Any text that comes before
the first section delimiter string in the input file is considered to be part of a body section,
so nl treats a file that contains no section delimiters as a single body section.
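The following sketch feeds nl one logical page containing all three delimiters. The sample text and the function name nl_sections are invented; with the default styles, only the two nonempty body lines should receive numbers, and each delimiter line should come out as an empty line.

```shell
# Sketch: a single logical page with header, body, and footer sections.
nl_sections() {
  printf '%s\n' '\:\:\:' 'Title' '\:\:' 'first' 'second' '\:' 'end' | nl
}

nl_sections
```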
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-b style’
‘--body-numbering=style’
Select the numbering style for lines in the body section of each logical page.
When a line is not numbered, the current line number is not incremented, but
the line number separator character is still prepended to the line. The styles
are:
‘a’ number all lines,
‘t’ number only nonempty lines (default for body),
‘n’ do not number lines (default for header and footer),
‘pbre’ number only lines that contain a match for the basic regular
expression bre. See Section “Regular Expressions” in The GNU Grep
Manual.
‘-d cd’
‘--section-delimiter=cd’
Set the section delimiter characters to cd; default is ‘\:’. If only c is given,
the second remains ‘:’. As a GNU extension more than two characters can be
specified, and also if cd is empty (-d ''), then section matching is disabled.
(Remember to protect ‘\’ or other metacharacters from shell expansion with
quotes or extra backslashes.)
‘-f style’
‘--footer-numbering=style’
Analogous to --body-numbering.
‘-h style’
‘--header-numbering=style’
Analogous to --body-numbering.
‘-i number’
‘--line-increment=number’
Increment line numbers by number (default 1). number can be negative to
decrement.
‘-l number’
‘--join-blank-lines=number’
Consider number (default 1) consecutive empty lines to be one logical line for
numbering, and only number the last one. Where fewer than number consecutive
empty lines occur, do not number them. An empty line is one that contains no
characters, not even spaces or tabs.
‘-n format’
‘--number-format=format’
Select the line numbering format (default is rn):
‘ln’ left justified, no leading zeros;
‘rn’ right justified, no leading zeros;
‘rz’ right justified, leading zeros.
‘-p’
‘--no-renumber’
Do not reset the line number at the start of a logical page.
‘-s string’
‘--number-separator=string’
Separate the line number from the text line in the output with string (default
is the TAB character).
‘-v number’
‘--starting-line-number=number’
Set the initial line number on each logical page to number (default 1). The
starting number can be negative.
‘-w number’
‘--number-width=number’
Use number characters for line numbers (default 6).
An exit status of zero indicates success, and a nonzero value indicates failure.
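Several of the options above can be combined, as in this sketch. The function name number_lines and the sample input are hypothetical; the option values are chosen only to make each effect visible.

```shell
# Sketch: number all lines, zero-padded to width 3, starting at 10 in steps of 5.
number_lines() {
  nl -b a -n rz -w 3 -s ': ' -v 10 -i 5
}

printf 'alpha\nbeta\ngamma\n' | number_lines
# 010: alpha
# 015: beta
# 020: gamma
```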
If a command is of both the first and second forms, the second form is assumed if the
last operand begins with ‘+’ or (if there are two operands) a digit. For example, in ‘od foo
10’ and ‘od +10’ the ‘10’ is an offset, whereas in ‘od 10’ the ‘10’ is a file name.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-A radix’
‘--address-radix=radix’
Select the base in which file offsets are printed. radix can be one of the following:
‘d’ decimal;
‘o’ octal;
‘x’ hexadecimal;
‘n’ none (do not print offsets).
The default is octal.
‘--endian=order’
Reorder input bytes, to handle inputs with differing byte orders, or to provide
consistent output independent of the endian convention of the current system.
Swapping is performed according to the specified --type size and endian order,
which can be ‘little’ or ‘big’.
‘-j bytes’
‘--skip-bytes=bytes’
Skip bytes input bytes before formatting and writing. If bytes begins with ‘0x’
or ‘0X’, it is interpreted in hexadecimal; otherwise, if it begins with ‘0’, in octal;
otherwise, in decimal. bytes may be one of the following multiplicative suffixes,
or an integer optionally followed by one of them:
‘b’ => 512 ("blocks")
‘KB’ => 1000 (KiloBytes)
‘K’ => 1024 (KibiBytes)
‘MB’ => 1000*1000 (MegaBytes)
‘M’ => 1024*1024 (MebiBytes)
‘GB’ => 1000*1000*1000 (GigaBytes)
‘G’ => 1024*1024*1024 (GibiBytes)
and so on for ‘T’, ‘P’, ‘E’, ‘Z’, ‘Y’, ‘R’, and ‘Q’. Binary prefixes can be used, too:
‘KiB’=‘K’, ‘MiB’=‘M’, and so on.
‘-N bytes’
‘--read-bytes=bytes’
Output at most bytes bytes of the input. Prefixes and suffixes on bytes are
interpreted as for the -j option.
‘-S bytes’
‘--strings[=bytes]’
Instead of the normal output, output only string constants: at least bytes
consecutive printable characters, followed by a zero byte (ASCII NUL). Prefixes
and suffixes on bytes are interpreted as for the -j option.
If bytes is omitted with --strings, the default is 3.
‘-t type’
‘--format=type’
Select the format in which to output the file data. type is a string of one or
more of the below type indicator characters. If you include more than one type
indicator character in a single type string, or use this option more than once,
od writes one copy of each output line using each of the data types that you
specified, in the order that you specified.
Adding a trailing “z” to any type specification appends a display of the single
byte character representation of the printable characters to the output line
generated by the type specification.
‘a’ named character, ignoring high-order bit
‘c’ printable single byte character, C backslash escape or a 3 digit octal
sequence
‘d’ signed decimal
‘f’ floating point (see Section 2.12 [Floating point], page 10)
‘o’ octal
‘u’ unsigned decimal
‘x’ hexadecimal
The type a outputs things like ‘sp’ for space, ‘nl’ for newline, and ‘nul’ for
a zero byte. Only the least significant seven bits of each byte are used; the
high-order bit is ignored. Type c outputs ‘ ’, ‘\n’, and ‘\0’, respectively.
Except for types ‘a’ and ‘c’, you can specify the number of bytes to use in
interpreting each number in the given data type by following the type indicator
character with a decimal integer. Alternately, you can specify the size of one of
the C compiler’s built-in data types by following the type indicator character
with one of the following characters. For integers (‘d’, ‘o’, ‘u’, ‘x’):
‘C’ char
‘S’ short
‘I’ int
‘L’ long
For floating point (‘f’):
‘B’ brain 16 bit float
(https://en.wikipedia.org/wiki/Bfloat16_floating-point_format)
‘H’ half precision float
(https://en.wikipedia.org/wiki/Half-precision_floating-point_format)
‘F’ float
‘D’ double
‘L’ long double
‘-v’
‘--output-duplicates’
Output consecutive lines that are identical. By default, when two or more
consecutive output lines would be identical, od outputs only the first line, and
puts just an asterisk on the following line to indicate the elision.
‘-w[n]’
‘--width[=n]’
Dump n input bytes per output line. This must be a multiple of the least
common multiple of the sizes associated with the specified output types.
If this option is not given at all, the default is 16. If n is omitted, the default is
32.
The next several options are shorthands for format specifications. GNU od accepts any
combination of shorthands and format specification options. These options accumulate.
‘-a’ Output as named characters. Equivalent to ‘-t a’.
‘-b’ Output as octal bytes. Equivalent to ‘-t o1’.
‘-c’ Output as printable single byte characters, C backslash escapes or 3 digit octal
sequences. Equivalent to ‘-t c’.
‘-d’ Output as unsigned decimal two-byte units. Equivalent to ‘-t u2’.
‘-f’ Output as floats. Equivalent to ‘-t fF’.
‘-i’ Output as decimal ints. Equivalent to ‘-t dI’.
‘-l’ Output as decimal long ints. Equivalent to ‘-t dL’.
‘-o’ Output as octal two-byte units. Equivalent to ‘-t o2’.
‘-s’ Output as decimal two-byte units. Equivalent to ‘-t d2’.
‘-x’ Output as hexadecimal two-byte units. Equivalent to ‘-t x2’.
‘--traditional’
Recognize the non-option label argument that traditional od accepted. The
following syntax:
od --traditional [file] [[+]offset[.][b] [[+]label[.][b]]]
can be used to specify at most one file and optional arguments specifying an
offset and a pseudo-start address, label. The label argument is interpreted just
like offset, but it specifies an initial pseudo-address. The pseudo-addresses are
displayed in parentheses following any normal address.
An exit status of zero indicates success, and a nonzero value indicates failure.
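A few of the od options above are exercised in the sketch below. The helper hex_slice is hypothetical; it combines -A, -t, -j, and -N to print a byte range as contiguous hexadecimal. The --endian lines show that the byte order of a two-byte unit can be fixed regardless of the host. GNU od is assumed.

```shell
# Sketch: hex bytes of standard input, skipping $1 bytes and reading at most $2.
hex_slice() {
  od -A n -t x1 -j "$1" -N "$2" | tr -d ' \n'
  echo
}

printf 'abcdef' | hex_slice 2 3                       # bytes c, d, e: 636465

# Byte order of a two-byte unit, independent of the host:
printf '\001\002' | od -A n -t x2 --endian=big        # 0102
printf '\001\002' | od -A n -t x2 --endian=little     # 0201

# A shorthand and its -t spelling produce the same dump:
printf 'abc' | od -b
printf 'abc' | od -t o1
```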
‘--base64url’
Encode into (or decode from with -d/--decode) file-and-url-safe base64 form
(using ‘-’ and ‘_’ instead of ‘+’ and ‘/’, respectively). The format conforms to
RFC 4648#5 (https://datatracker.ietf.org/doc/html/rfc4648#section-5).
‘--base32’
Encode into (or decode from with -d/--decode) base32 form. The encoded data
uses the ‘ABCDEFGHIJKLMNOPQRSTUVWXYZ234567=’ characters. The format conforms
to RFC 4648#6 (https://datatracker.ietf.org/doc/html/rfc4648#section-6).
Equivalent to the base32 command.
‘--base32hex’
Encode into (or decode from with -d/--decode) Extended Hex Alphabet base32
form. The encoded data uses the ‘0123456789ABCDEFGHIJKLMNOPQRSTUV=’ characters.
The format conforms to RFC 4648#7
(https://datatracker.ietf.org/doc/html/rfc4648#section-7).
‘--base16’
Encode into (or decode from with -d/--decode) base16 (hexadecimal) form.
The encoded data uses the ‘0123456789ABCDEF’ characters. The format conforms
to RFC 4648#8 (https://datatracker.ietf.org/doc/html/rfc4648#section-8).
‘--base2lsbf’
Encode into (or decode from with -d/--decode) binary string form (‘0’ and ‘1’)
with the least significant bit of every byte first.
‘--base2msbf’
Encode into (or decode from with -d/--decode) binary string form (‘0’ and ‘1’)
with the most significant bit of every byte first.
‘--z85’ Encode into (or decode from with -d/--decode) Z85 form (a modified Ascii85
form). The encoded data uses the
‘0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#’
characters. The format conforms to
ZeroMQ spec:32/Z85 (https://rfc.zeromq.org/spec:32/Z85/).
When encoding with --z85, input length must be a multiple of 4; when decoding
with --z85, input length must be a multiple of 5.
Encoding/decoding examples:
$ printf '\376\117\202' | basenc --base64
/k+C
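The same three input bytes can be run through the other encodings described above. This is a minimal sketch, assuming a basenc new enough to provide these options; the expected results in the comments follow from the alphabets listed for each option. The --z85 line appends a NUL byte so that the input length is a multiple of 4.

```shell
# Same three bytes (0xFE 0x4F 0x82) in several encodings.
printf '\376\117\202' | basenc --base64url   # _k-C
printf '\376\117\202' | basenc --base16      # FE4F82
printf '\376\117\202' | basenc --base2msbf   # 111111100100111110000010
printf '\376\117\202' | basenc --base2lsbf   # 011111111111001001000001

# Z85 needs a multiple of 4 input bytes, so pad with one NUL byte.
printf '\376\117\202\000' | basenc --z85     # @.FaC

# Decoding reverses the step; od shows the recovered bytes in hex.
printf 'FE4F82' | basenc --base16 -d | od -An -tx1
```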
‘-g goal’
‘--goal=goal’
fmt initially tries to make lines goal characters wide. By default, this is 7%
shorter than width.
‘-p prefix’
‘--prefix=prefix’
Only lines beginning with prefix (possibly preceded by whitespace) are subject
to formatting. The prefix and any preceding whitespace are stripped for the
formatting and then re-attached to each formatted output line. One use is to
format certain kinds of program comments, while leaving the code unchanged.
An exit status of zero indicates success, and a nonzero value indicates failure.
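As a rough illustration of the -p option described above, the following sketch feeds fmt a made-up fragment with two comment lines and one line of code. The input text is hypothetical; only the lines that start with the prefix ‘#’ should be refilled, while the code line should pass through unchanged.

```shell
# Two '#' comment lines followed by a line of code (hypothetical input).
printf '%s\n' '# first half of a comment' '# second half' 'x = 1' |
  fmt -p '#'
# The comment lines are candidates for refilling; 'x = 1' is left alone.
```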
as column increases; unless you use the -W/-w option to increase page width as
well. This option might well cause some lines to be truncated. The number of
lines in the columns on each page is balanced. The options -e and -i are on
for multiple text-column output. Together with the -J option, column alignment and
line truncation are turned off. Since spaces are converted to TABs in multicolumn
output, they can be converted back by further processing through pr -t -e or
expand. Lines of full length are joined in a free field format, and the -S option may
set field separators. -column may not be used with the -m option.
‘-a’
‘--across’
With each single file, print columns across rather than down. The -column
option must be given with column greater than one. If a line is too long to fit
in a column, it is truncated.
‘-c’
‘--show-control-chars’
Print control characters using hat notation (e.g., ‘^G’); print other nonprinting
characters in octal backslash notation. By default, nonprinting characters are
not changed.
‘-d’
‘--double-space’
Double space the output.
‘-D format’
‘--date-format=format’
Format header dates using format, using the same conventions as for the
command ‘date +format’. See Section 21.1 [date invocation], page 195. Except
for directives, which start with ‘%’, characters in format are printed unchanged.
You can use this option to specify an arbitrary string in place of the header
date, e.g., --date-format="Monday morning".
The default date format is ‘%Y-%m-%d %H:%M’ (for example, ‘2020-07-09 23:59’);
but if the POSIXLY_CORRECT environment variable is set and the LC_TIME locale
category specifies the POSIX locale, the default is ‘%b %e %H:%M %Y’ (for example,
‘Jul 9 23:59 2020’).
Timestamps are listed according to the time zone rules specified by the TZ
environment variable, or by the system default rules if TZ is not set. See Section
“Specifying the Time Zone with TZ” in The GNU C Library Reference Manual.
‘-e[in-tabchar[in-tabwidth]]’
‘--expand-tabs[=in-tabchar[in-tabwidth]]’
Expand tabs to spaces on input. Optional argument in-tabchar is the input tab
character (default is the TAB character). Second optional argument in-tabwidth
is the input tab character’s width (default is 8).
‘-f’
‘-F’
‘--form-feed’
Use a form feed instead of newlines to separate output pages. This does not
alter the default page length of 66 lines.
‘-h header’
‘--header=header’
Replace the file name in the header with the centered string header. When using
the shell, header should be quoted and should be separated from -h by a space.
‘-i[out-tabchar[out-tabwidth]]’
‘--output-tabs[=out-tabchar[out-tabwidth]]’
Replace spaces with tabs on output. Optional argument out-tabchar is the
output tab character (default is the TAB character). Second optional argument
out-tabwidth is the output tab character’s width (default is 8).
‘-J’
‘--join-lines’
Merge lines of full length. Used together with the column options -column, -a
-column or -m. Turns off -W/-w line truncation; no column alignment used; may
be used with --sep-string[=string]. -J has been introduced (together with
-W and --sep-string) to disentangle the old (POSIX-compliant) options -w
and -s along with the three column options.
‘-l page_length’
‘--length=page_length’
Set the page length to page_length (default 66) lines, including the lines of the
header [and the footer]. If page_length is less than or equal to 10, the header
and footer are omitted, as if the -t option had been given.
‘-m’
‘--merge’ Merge and print all files in parallel, one in each column. If a line is too long to fit in
a column, it is truncated, unless the -J option is used. --sep-string[=string]
may be used. Empty pages in some files (form feeds set) produce empty columns,
still marked by string. The result is a continuous line numbering and column
marking throughout the whole merged file. Completely empty merged pages
show no separators or line numbers. The default header becomes ‘date page’
with spaces inserted in the middle; this may be used with the -h or --header
option to fill up the middle blank part.
‘-n[number-separator[digits]]’
‘--number-lines[=number-separator[digits]]’
Provide digits digit line numbering (default for digits is 5). With multicolumn
output the number occupies the first digits column positions of each text column
or only each line of -m output. With single column output the number precedes
each line just as -m does. Default counting of the line numbers starts with the
first line of the input file (not the first line printed, compare the --page option
and -N option). Optional argument number-separator is the character appended
to the line number to separate it from the text that follows. The default separator
is the TAB character. In a strict sense a TAB is always printed with single
column output only. The TAB width varies with the TAB position, e.g., with
the left margin specified by -o option. With multicolumn output priority is
given to ‘equal width of output columns’ (a POSIX specification). The TAB
width is fixed to the value of the first column and does not change with different
values of left margin. That means a fixed number of spaces is always printed
in the place of the number-separator TAB. The tabification depends upon the
output position.
‘-N line_number’
‘--first-line-number=line_number’
Start line counting with the number line_number at the first line of the first page printed
(in most cases not the first line of the input file).
‘-o margin’
‘--indent=margin’
Indent each line with a margin margin spaces wide (default is zero). The total
page width is the size of the margin plus the page width set with the -W/-w
option. A limited overflow may occur with numbered single column output
(compare -n option).
‘-r’
‘--no-file-warnings’
Do not print a warning message when an argument file cannot be opened. (The
exit status will still be nonzero, however.)
‘-s[char]’
‘--separator[=char]’
Separate columns by a single character char. The default for char is the TAB
character without -w and ‘no character’ with -w. Without -s the default
separator ‘space’ is set. -s[char] turns off line truncation of all three column
options (-COLUMN|-a -COLUMN|-m) unless -w is set. This is a POSIX-compliant
formulation.
‘-S[string]’
‘--sep-string[=string]’
Use string to separate output columns. The -S option doesn’t affect the -W/-w
option, unlike the -s option which does. It does not affect line truncation
or column alignment. Without -S, and with -J, pr uses the default output
separator, TAB. Without -S or -J, pr uses a ‘space’ (same as -S" "). If no
‘string’ argument is specified, ‘""’ is assumed.
‘-t’
‘--omit-header’
Do not print the usual header [and footer] on each page, and do not fill out
the bottom of pages (with blank lines or a form feed). No page structure is
produced, but form feeds set in the input files are retained. The predefined
pagination is not changed. -t or -T may be useful together with other options;
e.g.: -t -e4, expand TAB characters in the input file to 4 spaces but don’t make
any other changes. Use of -t overrides -h.
‘-T’
‘--omit-pagination’
Do not print header [and footer]. In addition eliminate all form feeds set in the
input files.
‘-v’
‘--show-nonprinting’
Print nonprinting characters in octal backslash notation.
‘-w page_width’
‘--width=page_width’
Set page width to page_width characters for multiple text-column output only
(default for page_width is 72). The specified page_width is rounded down so
that columns have equal width. -s[CHAR] turns off the default page width
and any line truncation and column alignment. Lines of full length are merged,
regardless of the column options set. No page width setting is possible with
single column output. A POSIX-compliant formulation.
‘-W page_width’
‘--page-width=page_width’
Set the page width to page_width characters, honored with and without a
column option. With a column option, the specified page_width is rounded
down so that columns have equal width. Text lines are truncated, unless -J
is used. Together with one of the three column options (-column, -a -column
or -m) column alignment is always used. The separator options -S or -s don’t
disable the -W option. Default is 72 characters. Without -W page_width and
without any of the column options, no line truncation is used (defined to keep
backward compatibility and to meet the most frequent tasks). That’s equivalent to
-W 72 -J. The header line is never truncated.
An exit status of zero indicates success, and a nonzero value indicates failure.
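The column options above can be tried on a short numbered input. This is a minimal sketch: -t is used to drop the header so only the column layout is shown, and the scratch file names left and right are arbitrary choices for the illustration.

```shell
# Six lines in three columns, filled down each column (the default).
seq 6 | pr -t -3
# The same input filled across each row instead (-a).
seq 6 | pr -t -3 -a

# Two files printed in parallel, one per column (-m).
tmp=$(mktemp -d)
seq 3 > "$tmp/left"
seq 4 6 > "$tmp/right"
pr -m -t "$tmp/left" "$tmp/right"
rm -r "$tmp"
```

With the default down-the-column fill, the first output row holds 1, 3 and 5; with -a it holds 1, 2 and 3.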
‘-b’
‘--bytes’ Count bytes rather than columns, so that tabs, backspaces, and carriage returns
are each counted as taking up one column, just like other characters.
‘-s’
‘--spaces’
Break at word boundaries: the line is broken after the last blank before the
maximum line length. If the line contains no such blanks, the line is broken at
the maximum line length as usual.
‘-w width’
‘--width=width’
Use a maximum line length of width columns instead of 80.
For compatibility fold supports an obsolete option syntax -width. New scripts
should use -w width instead.
An exit status of zero indicates success, and a nonzero value indicates failure.
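The effect of -s is easiest to see on a line that is slightly longer than the chosen width. A small sketch, using an 8-column width picked only for the illustration:

```shell
# Hard break at 8 columns: the word "world" is cut in two.
printf 'hello world\n' | fold -w 8
# hello wo
# rld

# With -s the break moves back to just after the last blank that fits,
# so the first output line is "hello" plus a trailing blank.
printf 'hello world\n' | fold -s -w 8 | sed -n 2p
# world
```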
NUL. This option can be useful in conjunction with ‘perl -0’ or ‘find -print0’
and ‘xargs -0’ which do the same in order to reliably handle arbitrary file
names (even those containing blanks or other special characters).
For compatibility head also supports an obsolete option syntax -[num][bkm][cqv], which
is recognized only if it is specified first. num is a decimal number optionally followed by a
size letter (‘b’, ‘k’, ‘m’) as in -c, or ‘l’ to mean count by lines, or other option letters (‘cqv’).
Scripts intended for standard hosts should use -c num or -n num instead. If your script must
also run on hosts that support only the obsolete syntax, it is usually simpler to avoid head,
e.g., by using ‘sed 5q’ instead of ‘head -5’.
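As a quick check of the suggestion above, the standard head form and the sed replacement can be compared on the same input:

```shell
# Standard option syntax: first five lines.
seq 10 | head -n 5
# Portable alternative: sed prints lines and quits after line 5.
seq 10 | sed 5q
```

Both commands print the numbers 1 through 5.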
An exit status of zero indicates success, and a nonzero value indicates failure.
‘--max-unchanged-stats=n’
When tailing a file by name, if there have been n (default n=5) consecutive
iterations for which the file has not changed, then open/fstat the file to
determine if that file name is still associated with the same device/inode-number
pair as before. When following a log file that is rotated, this is approximately
the number of seconds between when tail prints the last pre-rotation lines and
when it prints the lines that have accumulated in the new log file. This option
is meaningful only when polling (i.e., without inotify) and when following by
name.
‘-n [+]num’
‘--lines=[+]num’
Output the last num lines. If num is prefixed with a ‘+’, start printing with line
num from the start of each file. For example to skip the first line use tail -n
+2, while to skip all but the last line use tail -n 1. Size multiplier suffixes are
the same as with the -c option.
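The two forms mentioned above can be compared on a three-line input:

```shell
printf 'a\nb\nc\n' | tail -n +2   # skips the first line: prints b and c
printf 'a\nb\nc\n' | tail -n 1    # keeps only the last line: prints c
```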
‘--pid=pid’
When following by name or by descriptor, you may specify the process ID, pid,
of one or more (by repeating --pid) writers of the file arguments. Then, shortly
after all the identified processes terminate, tail will also terminate. This will
work properly only if the writers and the tailing process are running on the
same machine. For example, to save the output of a build in a file and to watch
the file grow, if you invoke make and tail like this then the tail process will
stop when your build completes. Without this option, you would have had to
kill the tail -f process yourself.
$ make >& makerr & tail --pid=$! -f makerr
If you specify a pid that is not in use or that does not correspond to the process
that is writing to the tailed files, then tail may terminate long before any
files stop growing or it may not terminate until long after the real writer has
terminated. On some systems, --pid is not supported and tail outputs a
warning.
‘-q’
‘--quiet’
‘--silent’
Never print file name headers.
‘--retry’ Indefinitely try to open the specified file. This option is useful mainly when
following (and otherwise issues a warning).
When following by file descriptor (i.e., with --follow=descriptor), this option
only affects the initial open of the file, as after a successful open, tail will start
following the file descriptor.
When following by name (i.e., with --follow=name), tail infinitely retries to
re-open the given files until killed.
Without this option, when tail encounters a file that doesn’t exist or is otherwise
inaccessible, it reports that fact and never checks it again.
‘-s number’
‘--sleep-interval=number’
Change the number of seconds to wait between iterations (the default is 1.0).
During one iteration, every specified file is checked to see if it has changed size.
When tail uses inotify, this polling-related option is usually ignored. However,
if you also specify --pid=p, tail checks whether process p is alive at least every
number seconds. The number must be non-negative and can be a floating-point
number in either the current or the C locale. See Section 2.12 [Floating point],
page 10.
‘-v’
‘--verbose’
Always print file name headers.
‘-z’
‘--zero-terminated’
Delimit items with a zero byte rather than a newline (ASCII LF). I.e., treat
input as items separated by ASCII NUL and terminate output items with ASCII
NUL. This option can be useful in conjunction with ‘perl -0’ or ‘find -print0’
and ‘xargs -0’ which do the same in order to reliably handle arbitrary file
names (even those containing blanks or other special characters).
For compatibility tail also supports an obsolete usage ‘tail -[num][bcl][f] [file]’,
which is recognized only if it does not conflict with the usage described above. This obsolete
form uses exactly one option and at most one file. In the option, num is an optional decimal
number optionally followed by a size letter (‘b’, ‘c’, ‘l’) to mean count by 512-byte blocks,
bytes, or lines, optionally followed by ‘f’ which has the same meaning as -f.
On systems not conforming to POSIX 1003.1-2001, the leading ‘-’ can be replaced by
‘+’ in the traditional option syntax with the same meaning as in counts, and on obsolete
systems predating POSIX 1003.1-2001 traditional usage overrides normal usage when the two
conflict. This behavior can be controlled with the _POSIX2_VERSION environment variable
(see Section 2.13 [Standards conformance], page 11).
Scripts intended for use on standard hosts should avoid traditional syntax and should use
-c num[b], -n num, and/or -f instead. If your script must also run on hosts that support
only the traditional syntax, you can often rewrite it to avoid problematic usages, e.g., by
using ‘sed -n '$p'’ rather than ‘tail -1’. If that’s not possible, the script can use a test like
‘if tail -c +1 </dev/null >/dev/null 2>&1; then ...’ to decide which syntax to use.
Even if your script assumes the standard behavior, you should still beware usages whose
behaviors differ depending on the POSIX version. For example, avoid ‘tail - main.c’, since
it might be interpreted as either ‘tail main.c’ or as ‘tail -- - main.c’; avoid ‘tail -c 4’,
since it might mean either ‘tail -c4’ or ‘tail -c 10 4’; and avoid ‘tail +4’, since it might
mean either ‘tail ./+4’ or ‘tail -n +4’.
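The probe mentioned above could be wrapped as follows. This is only a sketch: last_line is a hypothetical helper name, not part of coreutils, and the choice of ‘tail -1’ for the fallback branch is an assumption about what a traditional-only host accepts.

```shell
# Pick a syntax once, based on whether the standard -c +N form is understood.
if tail -c +1 </dev/null >/dev/null 2>&1; then
  last_line() { tail -n 1; }   # standard option syntax
else
  last_line() { tail -1; }     # traditional syntax (assumed fallback)
fi

printf 'a\nb\n' | last_line    # prints b
```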
An exit status of zero indicates success, and a nonzero value indicates failure.
single run is assumed and the minimum suffix length required is automatically
determined.
‘-x’
‘--hex-suffixes[=from]’
Like --numeric-suffixes, but use hexadecimal numbers (in lower case).
‘--additional-suffix=suffix’
Append an additional suffix to output file names. suffix must not contain slash.
‘-e’
‘--elide-empty-files’
Suppress the generation of zero-length output files. This can happen with the
--number option if a file is (truncated to be) shorter than the number requested,
or if a line is so long as to completely span a chunk. The output file sequence
numbers always run consecutively even when this option is specified.
‘-t separator’
‘--separator=separator’
Use character separator as the record separator instead of the default newline
character (ASCII LF). To specify ASCII NUL as the separator, use the two-
character string ‘\0’, e.g., ‘split -t '\0'’.
‘-u’
‘--unbuffered’
Immediately copy input to output in --number r/... mode, which is a much
slower mode of operation.
‘--verbose’
Write a diagnostic just before each output file is opened.
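The suffix options above can be combined. A minimal sketch, run in a scratch directory, with the prefix ‘part.’ and the extension ‘.txt’ chosen only for the illustration:

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
# One line per output file, hexadecimal suffixes, and a ".txt" extension.
seq 3 | split -l 1 -x --additional-suffix=.txt - part.
ls
# part.00.txt  part.01.txt  part.02.txt
```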
An exit status of zero indicates success, and a nonzero value indicates failure.
Here are a few examples to illustrate how the --number (-n) option works:
Notice how, by default, one line may be split onto two or more:
$ seq -w 6 10 > k; split -n3 k; head xa?
==> xaa <==
06
07
==> xab <==
08
0
==> xac <==
9
10
Use the "l/" modifier to suppress that:
$ seq -w 6 10 > k; split -nl/3 k; head xa?
==> xaa <==
06
07
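A hedged way to check the difference between byte-based and line-based chunking is to compare chunk sizes and then reassemble the pieces. The ‘bytes.’ and ‘lines.’ prefixes below are arbitrary names chosen for this sketch.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
seq -w 6 10 > k             # five 3-byte lines, 15 bytes in total

split -n 3 k bytes.         # byte-based: 15 bytes become three 5-byte chunks
wc -c bytes.*

split -n l/3 k lines.       # line-based: no line is cut in two
cat lines.* | cmp - k && echo "chunks reassemble to the original"
```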
an integer, that can be preceded by ‘+’ or ‘-’. If it is given, the input up to (but
not including) the matching line plus or minus offset is put into the output file,
and the line after that begins the next section of input. Lines within a negative
offset of a regexp pattern are not matched in subsequent regexp patterns.
‘%regexp%[offset]’
Like the previous type, except that it does not create an output file, so that
section of the input file is effectively ignored.
‘{repeat-count}’
Repeat the previous pattern repeat-count additional times. The repeat-count
can either be a positive integer or an asterisk, meaning repeat as many times as
necessary until the input is exhausted.
The output files’ names consist of a prefix (‘xx’ by default) followed by a suffix. By
default, the suffix is an ascending sequence of two-digit decimal numbers from ‘00’ to ‘99’.
In any case, concatenating the output files in sorted order by file name produces the original
input file, excluding portions skipped with a %regexp% pattern or the --suppress-matched
option.
By default, if csplit encounters an error or receives a hangup, interrupt, quit, or
terminate signal, it removes any output files that it has created so far before it exits.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-f prefix’
‘--prefix=prefix’
Use prefix as the output file name prefix.
‘-b format’
‘--suffix-format=format’
Use format as the output file name suffix. When this option is specified, the
suffix string must include exactly one printf(3)-style conversion specification,
possibly including format specification flags, a field width, a precision specification,
or all of these kinds of modifiers. The format letter must convert a
binary unsigned integer argument to readable form. The format letters ‘d’ and
‘i’ are aliases for ‘u’, and the ‘u’, ‘o’, ‘x’, and ‘X’ conversions are allowed. The
entire format is given (with the current output file number) to sprintf(3) to
form the file name suffixes for each of the individual output files in turn. If this
option is used, the --digits option is ignored.
‘-n digits’
‘--digits=digits’
Use output file names containing numbers that are digits digits long instead of
the default 2.
‘-k’
‘--keep-files’
Do not remove output files when errors are encountered.
‘--suppress-matched’
Do not output lines matching the specified pattern. I.e., suppress the boundary
line from the start of the second and subsequent splits.
‘-z’
‘--elide-empty-files’
Suppress the generation of zero-length output files. (In cases where the section
delimiters of the input file are supposed to mark the first lines of each of the
sections, the first output file will generally be a zero-length file unless you use
this option.) The output file sequence numbers always run consecutively starting
from 0, even when this option is specified.
‘-s’
‘-q’
‘--silent’
‘--quiet’ Do not print counts of output file sizes.
An exit status of zero indicates success, and a nonzero value indicates failure.
Here is an example of its usage. First, create an empty directory for the exercise, and cd into it:
$ mkdir d && cd d
Now, split the sequence of 1..14 on lines that end with 0 or 5:
$ seq 14 | csplit - '/[05]$/' '{*}'
8
10
15
Each number printed above is the size of an output file that csplit has just created. List
the names of those output files:
$ ls
xx00 xx01 xx02
Use head to show their contents:
$ head xx*
==> xx00 <==
1
2
3
4
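The naming and filtering options described earlier can be tried on the same input. This is a minimal sketch run in its own scratch directory; the prefix ‘part’ is an arbitrary choice.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# -s silences the size report; -f and -n change the prefix and digit count.
seq 14 | csplit -s -f part -n 3 - '/[05]$/' '{*}'
ls                        # part000  part001  part002
rm part*

# --suppress-matched drops the boundary lines (5 and 10) from the output.
seq 14 | csplit -s --suppress-matched - '/[05]$/' '{*}'
head -n 1 xx01            # 6
```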
6 Summarizing files
These commands generate just a few numbers representing entire contents of files.
‘-L’
‘--max-line-length’
Print only the maximum display widths. Tabs are set at every 8th column.
Display widths of wide characters are considered. Non-printable characters are
given 0 width.
‘--total=when’
Control when and how the final line with cumulative counts is printed. when is
one of:
• auto - This is the default mode of wc when no --total option is specified.
Output a total line if more than one file is specified.
• always - Always output a total line, irrespective of the number of files
processed.
• only - Only output total counts. I.e., don’t print individual file counts,
suppress any leading spaces, and don’t print the ‘total’ word itself, to
simplify subsequent processing.
• never - Never output a total line.
‘--files0-from=file’
Disallow processing files named on the command line, and instead process those
named in file file; each name being terminated by a zero byte (ASCII NUL). This
is useful when the list of file names is so long that it may exceed a command line
length limitation. In such cases, running wc via xargs is undesirable because
it splits the list into pieces and makes wc print a total for each sublist rather
than for the entire list. One way to produce a list of ASCII NUL terminated
file names is with GNU find, using its -print0 predicate. If file is ‘-’ then the
ASCII NUL terminated file names are read from standard input.
For example, to find the length of the longest line in any .c or .h file in the
current hierarchy, do this:
find . -name '*.[ch]' -print0 |
wc -L --files0-from=- | tail -n1
An exit status of zero indicates success, and a nonzero value indicates failure.
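The display-width rule for -L described above (tabs set at every 8th column) can be observed on two small inputs:

```shell
printf 'ab\nabcd\n' | wc -L   # 4: the longer line has four characters
printf '\tx\n' | wc -L        # 9: the tab advances to column 8, then "x"
```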
‘-s’
‘--sysv’ Compute checksums using an algorithm compatible with System V sum’s default,
and print file sizes in units of 512-byte blocks.
sum is provided for compatibility; the cksum program (see next section) is preferable in
new applications.
An exit status of zero indicates success, and a nonzero value indicates failure.
of arbitrary file names. Since the backslash character itself is escaped, any other backslash
escape sequences are reserved for future use.
especially in combination with the --zero option. This does not identify the
digest algorithm used for the checksum. See Section 6.3.1 [cksum output modes],
page 43, for details of this format.
‘-c’
‘--check’ Read file names and checksum information (not data) from each file (or from
standard input if no file was specified) and report whether the checksums match
the contents of the named files. The input to this mode is usually the output of
a prior, checksum-generating run of the command.
Three input formats are supported: the default output format described
above, the --tag output format, and the BSD reversed mode format, which is
similar to the default mode but doesn’t use a character to distinguish binary
and text modes.
For the cksum command, the --check option supports auto-detecting the digest
algorithm to use, when presented with checksum information in the --tag
output format.
Also for the cksum command, the --check option auto-detects the digest encoding,
accepting both standard hexadecimal checksums and those generated via
cksum with its --base64 option.
Output with --zero enabled is not supported by --check.
For each such line, cksum reads the named file and computes its checksum. Then,
if the computed message digest does not match the one on the line with the
file name, the file is noted as having failed the test. Otherwise, the file passes
the test. By default, for each valid line, one line is written to standard output
indicating whether the named file passed the test. After all checks have been
performed, if there were any failures, a warning is issued to standard error. Use
the --status option to inhibit that output. If any listed file cannot be opened
or read, if any valid line has a checksum inconsistent with the associated file, or
if no valid line is found, cksum exits with nonzero status. Otherwise, it exits
successfully. The cksum command does not support --check with the older
‘sysv’, ‘bsd’, or ‘crc’ algorithms.
‘--ignore-missing’
This option is useful only when verifying checksums. When verifying checksums,
don’t fail or report any status for missing files. This is useful when verifying a
subset of downloaded files given a larger list of checksums.
‘--quiet’ This option is useful only when verifying checksums. When verifying checksums,
don’t generate an ‘OK’ message per successfully checked file. Files that fail the
verification are reported in the default one-line-per-file format. If there is any
checksum mismatch, print a warning summarizing the failures to standard error.
‘--status’
This option is useful only when verifying checksums. When verifying checksums,
don’t generate the default one-line-per-file diagnostic and don’t output the
warning summarizing any failures. Failures to open or read a file still evoke
individual diagnostics to standard error. If all listed files are readable and are
consistent with the associated checksums, exit successfully. Otherwise exit with
a status code indicating there was a failure.
‘--tag’ Output BSD style checksums, which indicate the checksum algorithm used. As
a GNU extension, if --zero is not used, file names with problematic characters
are escaped as described above, using the same escaping indicator of ‘\’ at the
start of the line, as used with the other output format. The --tag option implies
binary mode, and is disallowed with --text mode as supporting that would
unnecessarily complicate the output format, while providing little benefit. See
Section 6.3.1 [cksum output modes], page 43, for details of this format. The
cksum command uses --tag as its default output format.
‘-t’
‘--text’ This option is not supported by the cksum command. Treat each input file as
text, by reading it in text mode and outputting a ‘ ’ flag. This is the inverse
of --binary. This option is the default on systems like GNU that do not
distinguish between binary and text files. On other systems, it is the default for
reading standard input when standard input is a terminal. This mode is never
defaulted to if --tag is used.
‘-w’
‘--warn’ When verifying checksums, warn about improperly formatted checksum lines.
This option is useful only if all but a few lines in the checked input are valid.
‘--strict’
When verifying checksums, if one or more input line is invalid, exit nonzero after
all warnings have been issued.
‘-z’
‘--zero’ Output a zero byte (ASCII NUL) at the end of each line, rather than a newline.
This option enables other programs to parse the output even when that output
would contain data with embedded newlines. Also, file name escaping is not used.
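The verification options above can be exercised with any of the digest commands that accept them. The following sketch uses md5sum; the file names data.txt and sums.md5 are made up for the illustration.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'hello\n' > data.txt
md5sum data.txt > sums.md5               # record the checksum

LC_ALL=C md5sum --check sums.md5         # prints "data.txt: OK"
md5sum --check --quiet sums.md5          # prints nothing when every file passes

printf 'changed\n' > data.txt            # now the file no longer matches
md5sum --check --status sums.md5 || echo "verification failed"
```

With --status nothing is printed by md5sum itself, so the exit status is the only signal of the mismatch.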
The MD5 digest is more reliable than a simple CRC (provided by the cksum command)
for detecting accidental file corruption, as the chances of accidentally having two files with
identical MD5 are vanishingly small. However, it should not be considered secure against
malicious tampering: although finding a file with a given MD5 fingerprint is considered
infeasible at the moment, it is known how to modify certain files, including digital certificates,
so that they appear valid when signed with an MD5 digest. For more secure hashes, consider
using SHA-2 or b2sum. See Section 6.7 [sha2 utilities], page 48. See Section 6.5 [b2sum
invocation], page 47.
If a file is specified as ‘-’ or if no files are given, md5sum computes the checksum for the
standard input. md5sum can also determine whether a file and checksum are consistent.
Synopsis:
md5sum [option]... [file]...
md5sum uses the ‘Untagged output format’ for each specified file, as described in
Section 6.3.1 [cksum output modes], page 43.
The program accepts the options described in Section 6.3.3 [cksum common options], page 45. Also see Chapter 2
[Common options], page 2.
An exit status of zero indicates success, and a nonzero value indicates failure.
so that they appear valid when signed with an SHA-1 digest. For more secure hashes, consider
using SHA-2 or b2sum. See Section 6.7 [sha2 utilities], page 48. See Section 6.5 [b2sum
invocation], page 47.
If a file is specified as ‘-’ or if no files are given, sha1sum computes the checksum for the
standard input. sha1sum can also determine whether a file and checksum are consistent.
Synopsis:
sha1sum [option]... [file]...
sha1sum uses the ‘Untagged output format’ for each specified file, as described in
Section 6.3.1 [cksum output modes], page 43.
The program accepts the options described in Section 6.3.3 [cksum common options], page 45. Also see Chapter 2
[Common options], page 2.
Exit status:
0 if no error occurred
1 if invoked with -c or -C and the input is not sorted
2 if an error occurred
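The three exit statuses listed above can be observed directly. A small sketch; /nonexistent/file is assumed not to exist on the system where this is run.

```shell
# Already sorted input: -c succeeds silently.
printf 'a\nb\n' | sort -c && echo "status: 0"

# Unsorted input with -c: the diagnostic is discarded, the status is 1.
printf 'b\na\n' | sort -c 2>/dev/null || echo "status: $?"

# An unreadable input file is an error: the status is 2.
sort /nonexistent/file 2>/dev/null || echo "status: $?"
```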
If the environment variable TMPDIR is set, sort uses its value as the directory for
temporary files instead of /tmp. The --temporary-directory (-T) option in turn overrides
the environment variable.
The following options affect the ordering of output lines. They may be specified globally
or as part of a specific key field. If no key fields are specified, global options apply to
comparison of entire lines; otherwise the global options are inherited by key fields that do
not specify any special options of their own. In pre-POSIX versions of sort, global options
affect only later key fields, so portable shell scripts should specify global options first.
‘-b’
‘--ignore-leading-blanks’
Ignore leading blanks when finding sort keys in each line. By default a blank
is a space or a tab, but the LC_CTYPE locale can change this. Blanks may be
ignored by your locale’s collating rules, but without this option they will be
significant for character positions specified in keys with the -k option.
‘-d’
‘--dictionary-order’
Sort in phone directory order: ignore all characters except letters, digits and
blanks when sorting. By default letters and digits are those of ASCII and a
blank is a space or a tab, but the LC_CTYPE locale can change this.
‘-f’
‘--ignore-case’
Fold lowercase characters into the equivalent uppercase characters when comparing
so that, for example, ‘b’ and ‘B’ sort as equal. The LC_CTYPE locale
determines character types. When used with --unique those lower case equivalent
lines are thrown away. (There is currently no way to throw away the upper
case equivalent instead. (Any --reverse given would only affect the final result,
after the throwing away.))
‘-g’
‘--general-numeric-sort’
‘--sort=general-numeric’
Sort numerically, converting a prefix of each line to a long double-precision
floating point number. See Section 2.12 [Floating point], page 10. Do not report
overflow, underflow, or conversion errors. Use the following collating sequence:
• Lines that do not start with numbers (all considered to be equal).
• NaNs (“Not a Number” values, in IEEE floating point arithmetic) in a
consistent but machine-dependent order.
• Minus infinity.
• Finite numbers in ascending numeric order (with −0 and +0 equal).
• Plus infinity.
locale, spaces and tabs are blanks, there is no thousands separator, and ‘.’ is
the decimal point.
Neither a leading ‘+’ nor exponential notation is recognized. To compare such
strings numerically, use the --general-numeric-sort (-g) option.
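As a minimal sketch of that difference, assuming GNU sort in the C locale and made-up sample values: -n reads ‘+5’ as 0 and ‘1e3’ as 1, whereas -g reads them as 5 and 1000.

```shell
# Hypothetical sample input: a leading '+' and an exponent.
numeric=$(printf '1e3\n20\n+5\n' | LC_ALL=C sort -n)
general=$(printf '1e3\n20\n+5\n' | LC_ALL=C sort -g)

# -n order: +5 (read as 0), 1e3 (read as 1), 20
printf '%s\n' '-n order:' "$numeric"
# -g order: +5 (5), 20, 1e3 (1000)
printf '%s\n' '-g order:' "$general"
```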
‘-V’
‘--version-sort’
Sort by version name and number. It behaves like a standard sort, except
that each sequence of decimal digits is treated numerically as an index/version
number. (See Chapter 30 [Version sort ordering], page 253.)
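For illustration, a small sketch comparing a plain sort with a version sort, assuming the C locale and hypothetical package names:

```shell
# Hypothetical version strings; in a plain byte-wise sort "1.10" precedes "1.2".
plain=$(printf 'foo-1.10\nfoo-1.9\nfoo-1.2\n' | LC_ALL=C sort)
version=$(printf 'foo-1.10\nfoo-1.9\nfoo-1.2\n' | LC_ALL=C sort -V)

printf '%s\n' 'plain:' "$plain"      # foo-1.10, foo-1.2, foo-1.9
printf '%s\n' 'version:' "$version"  # foo-1.2, foo-1.9, foo-1.10
```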
‘-r’
‘--reverse’
Reverse the result of comparison, so that lines with greater key values appear
earlier in the output instead of later.
‘-R’
‘--random-sort’
‘--sort=random’
Sort by hashing the input keys and then sorting the hash values. Choose the
hash function at random, ensuring that it is free of collisions so that differing
keys have differing hash values. This is like a random permutation of the inputs
(see Section 7.2 [shuf invocation], page 57), except that keys with the same value
sort together.
If multiple random sort fields are specified, the same random hash function is
used for all fields. To use different random hash functions for different fields,
you can invoke sort more than once.
The choice of hash function is affected by the --random-source option.
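The grouping property described above can be checked with a short sketch (the input is made up): whatever order -R picks, identical keys stay adjacent, so uniq always collapses the result to one line per distinct key.

```shell
# Five lines, two distinct keys. The order of the groups is random,
# but equal keys sort together, so uniq leaves exactly two lines.
runs=$(printf 'a\nb\na\nb\na\n' | sort -R | uniq | wc -l)
echo "distinct groups after sort -R: $runs"
```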
Other options are:
‘--compress-program=prog’
Compress any temporary files with the program prog.
With no arguments, prog must compress standard input to standard output,
and when given the -d option it must decompress standard input to standard
output.
Terminate with an error if prog exits with nonzero status.
White space and the backslash character should not appear in prog; they are
reserved for future use.
‘--files0-from=file’
Disallow processing files named on the command line, and instead process those
named in file file; each name being terminated by a zero byte (ASCII NUL). This
is useful when the list of file names is so long that it may exceed a command
line length limitation. In such cases, running sort via xargs is undesirable
because it splits the list into pieces and makes sort print sorted output for each
sublist rather than for the entire list. One way to produce a list of ASCII NUL
terminated file names is with GNU find, using its -print0 predicate. If file is
‘-’ then the ASCII NUL terminated file names are read from standard input.
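A minimal sketch of this usage, assuming GNU find and a throwaway directory with hypothetical file names:

```shell
# Build two small input files in a temporary directory.
dir=$(mktemp -d)
printf 'pear\napple\n' > "$dir/one.txt"
printf 'banana\n' > "$dir/two.txt"

# find emits NUL-terminated names; sort reads the list from standard input
# and sorts the combined contents of all named files.
merged=$(find "$dir" -type f -name '*.txt' -print0 | LC_ALL=C sort --files0-from=-)
printf '%s\n' "$merged"

rm -r "$dir"
```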
‘-k pos1[,pos2]’
‘--key=pos1[,pos2]’
Specify a sort field that consists of the part of the line between pos1 and pos2
(or the end of the line, if pos2 is omitted), inclusive.
In its simplest form pos specifies a field number (starting with 1), with fields
being separated by runs of blank characters, and by default those blanks being
included in the comparison at the start of each field. To adjust the handling of
blank characters see the -b and -t options.
More generally, each pos has the form ‘f[.c][opts]’, where f is the number of
the field to use, and c is the number of the first character from the beginning
of the field. Fields and character positions are numbered starting with 1; a
character position of zero in pos2 indicates the field’s last character. If ‘.c’ is
omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from
pos2, it defaults to 0 (the end of the field). opts are ordering options, allowing
individual keys to be sorted according to different rules; see below for details.
Keys can span multiple fields.
Example: To sort on the second field, use --key=2,2 (-k 2,2). See below for
more notes on keys and more examples. See also the --debug option to help
determine the part of the line being used in the sort.
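As a sketch with made-up two-field input, the following contrasts sorting whole lines with sorting numerically on the second field only:

```shell
# Whole-line comparison: ordered by the first character of each line.
by_line=$(printf 'x 10\ny 2\nz 33\n' | LC_ALL=C sort)
# Key restricted to field 2, compared numerically.
by_field=$(printf 'x 10\ny 2\nz 33\n' | LC_ALL=C sort -k 2,2n)

printf '%s\n' 'whole line:' "$by_line"
printf '%s\n' 'field 2, numeric:' "$by_field"
```

Adding --debug to the second command is one way to see which part of each line the key covers.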
‘--debug’ Highlight the portion of each line used for sorting. Also issue warnings about
questionable usage to standard error.
‘--batch-size=nmerge’
Merge at most nmerge inputs at once.
When sort has to merge more than nmerge inputs, it merges them in groups of
nmerge, saving the result in a temporary file, which is then used as an input in
a subsequent merge.
A large value of nmerge may improve merge performance and decrease temporary
storage utilization at the expense of increased memory usage and I/O. Conversely
a small value of nmerge may reduce memory requirements and I/O at the expense
of temporary storage consumption and merge performance.
The value of nmerge must be at least 2. The default value is currently 16, but
this is implementation-dependent and may change in the future.
The value of nmerge may be bounded by a resource limit for open file descriptors.
The commands ‘ulimit -n’ or ‘getconf OPEN_MAX’ may display limits for your
systems; these limits may be modified further if your program already has some
files open, or if the operating system has other limits on the number of open
files. If the value of nmerge exceeds the resource limit, sort silently uses a
smaller value.
‘-o output-file’
‘--output=output-file’
Write output to output-file instead of standard output. Normally, sort reads
all input before opening output-file, so you can sort a file in place by using
commands like sort -o F F and cat F | sort -o F. However, it is often safer
to output to an otherwise-unused file, as data may be lost if the system crashes
or sort encounters an I/O or other serious error while a file is being sorted in
place. Also, sort with --merge (-m) can open the output file before reading all
input, so a command like cat F | sort -m -o F - G is not safe as sort might
start writing F before cat is done reading it.
On newer systems, -o cannot appear after an input file if POSIXLY_CORRECT is
set, e.g., ‘sort F -o F’. Portable scripts should specify -o output-file before
any input files.
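A short sketch of an in-place sort on a scratch file, with -o given before the input file as recommended above (the file contents are illustrative):

```shell
# Create a scratch file, sort it in place, then read it back.
f=$(mktemp)
printf 'c\na\nb\n' > "$f"
LC_ALL=C sort -o "$f" "$f"
sorted=$(cat "$f")
printf '%s\n' "$sorted"
rm -f "$f"
```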
‘--random-source=file’
Use file as a source of random data used to determine which random hash
function to use with the -R option. See Section 2.5 [Random sources], page 7.
‘-s’
‘--stable’
Make sort stable by disabling its last-resort comparison. This option has no
effect if no fields or global ordering options other than --reverse (-r) are
specified.
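The effect of the last-resort comparison can be seen in a small sketch (sample data is made up, C locale assumed): without -s, lines whose keys tie are ordered by the whole line; with -s they keep their input order.

```shell
input='b 1
a 2
b 0
a 1'

# Ties on field 1 are broken by comparing entire lines.
with_last_resort=$(printf '%s\n' "$input" | LC_ALL=C sort -k 1,1)
# Ties on field 1 keep their original relative order.
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k 1,1)

printf '%s\n' 'default:' "$with_last_resort"
printf '%s\n' 'stable:' "$stable"
```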
‘-S size’
‘--buffer-size=size’
Use a main-memory sort buffer of the given size. By default, size is in units
of 1024 bytes. Appending ‘%’ causes size to be interpreted as a percentage
of physical memory. Appending ‘K’ multiplies size by 1024 (the default), ‘M’
by 1,048,576, ‘G’ by 1,073,741,824, and so on for ‘T’, ‘P’, ‘E’, ‘Z’, ‘Y’, ‘R’, and
‘Q’. Appending ‘b’ causes size to be interpreted as a byte count, with no
multiplication.
This option can improve the performance of sort by causing it to start with a
larger or smaller sort buffer than the default. However, this option affects only
the initial buffer size. The buffer grows beyond size if sort encounters input
lines larger than size.
‘-t separator’
‘--field-separator=separator’
Use character separator as the field separator when finding the sort keys in each
line. By default, fields are separated by the empty string between a non-blank
character and a blank character. By default a blank is a space or a tab, but the
LC_CTYPE locale can change this.
That is, given the input line ‘ foo bar’, sort breaks it into fields ‘ foo’ and
‘ bar’. The field separator is not considered to be part of either the field preceding
or the field following, so with ‘sort -t " "’ the same input line has three fields:
an empty field, ‘foo’, and ‘bar’. However, fields that extend to the end of the
line, as -k 2, or fields consisting of a range, as -k 2,3, retain the field separators
present between the endpoints of the range.
To specify ASCII NUL as the field separator, use the two-character string ‘\0’,
e.g., ‘sort -t '\0'’.
‘-T tempdir’
‘--temporary-directory=tempdir’
Use directory tempdir to store temporary files, overriding the TMPDIR environment
variable. If this option is given more than once, temporary files are stored
in all the directories given. If you have a large sort or merge that is I/O-bound,
you can often improve performance by using this option to specify directories
on different file systems.
‘--parallel=n’
Set the number of sorts run in parallel to n. By default, n is set to the number
of available processors, but limited to 8, as performance gains diminish after
that. Using n threads increases the memory usage by a factor of log n. Also see
Section 21.3 [nproc invocation], page 203.
‘-u’
‘--unique’
Normally, output only the first of a sequence of lines that compare equal. For
the --check (-c or -C) option, check that no pair of consecutive lines compares
equal.
This option also disables the default last-resort comparison.
The commands sort -u and sort | uniq are equivalent, but this equivalence
does not extend to arbitrary sort options. For example, sort -n -u inspects only
the value of the initial numeric string when checking for uniqueness, whereas sort
-n | uniq inspects the entire line. See Section 7.3 [uniq invocation], page 59.
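The difference can be sketched by counting output lines for a made-up input in which two lines share the same leading number:

```shell
# sort -n -u treats '1 a' and '1 b' as equal (same numeric prefix).
with_u=$(printf '1 a\n1 b\n2 c\n' | LC_ALL=C sort -n -u | wc -l)
# uniq compares whole lines, so all three lines survive.
with_uniq=$(printf '1 a\n1 b\n2 c\n' | LC_ALL=C sort -n | uniq | wc -l)

echo "sort -n -u: $with_u lines; sort -n | uniq: $with_uniq lines"
```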
‘-z’
‘--zero-terminated’
Delimit items with a zero byte rather than a newline (ASCII LF). I.e., treat
input as items separated by ASCII NUL and terminate output items with ASCII
NUL. This option can be useful in conjunction with ‘perl -0’ or ‘find -print0’
and ‘xargs -0’ which do the same in order to reliably handle arbitrary file
names (even those containing blanks or other special characters).
Historical (BSD and System V) implementations of sort have differed in their
interpretation of some options, particularly -b, -f, and -n. GNU sort follows the POSIX behavior,
which is usually (but not always!) like the System V behavior. According to POSIX, -n no
longer implies -b. For consistency, -M has been changed in the same way. This may affect
the meaning of character positions in field specifications in obscure cases. The only fix is to
add an explicit -b.
A position in a sort field specified with -k may have any of the option letters ‘MbdfghinRrV’
appended to it, in which case no global ordering options are inherited by that particular
field. The -b option may be independently attached to either or both of the start and end
positions of a field specification, and if it is inherited from the global options it will be
attached to both. If input lines can contain leading or adjacent blanks and -t is not used,
then -k is typically combined with -b or an option that implicitly ignores leading blanks
(‘Mghn’) as otherwise the varying numbers of leading blanks in fields can cause confusing
results.
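A minimal sketch of such a confusing result, assuming the C locale and made-up input in which the first line has two blanks before its second field:

```shell
# Without b, the keys are '  z' and ' y': the extra blank sorts first.
no_b=$(printf 'a  z\nb y\n' | LC_ALL=C sort -k 2,2)
# With b on the start position, the keys are 'z' and 'y'.
with_b=$(printf 'a  z\nb y\n' | LC_ALL=C sort -k 2b,2)

printf '%s\n' 'without b:' "$no_b"
printf '%s\n' 'with b:' "$with_b"
```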
If the start position in a sort field specifier falls after the end of the line or after the end
field, the field is empty. If the -b option was specified, the ‘.c’ part of a field specification is
counted from the first nonblank character of the field.
On systems not conforming to POSIX 1003.1-2001, sort supports a traditional origin-zero
syntax ‘+pos1 [-pos2]’ for specifying sort keys. The traditional command ‘sort +a.x -b.y’
is equivalent to ‘sort -k a+1.x+1,b’ if y is ‘0’ or absent, otherwise it is equivalent to ‘sort
-k a+1.x+1,b+1.y’.
This traditional behavior can be controlled with the _POSIX2_VERSION environment
variable (see Section 2.13 [Standards conformance], page 11); it can also be enabled when
POSIXLY_CORRECT is not set by using the traditional syntax with ‘-pos2’ present.
Scripts intended for use on standard hosts should avoid traditional syntax and should
use -k instead. For example, avoid ‘sort +2’, since it might be interpreted as either ‘sort
./+2’ or ‘sort -k 3’. If your script must also run on hosts that support only the traditional
syntax, it can use a test like ‘if sort -k 1 </dev/null >/dev/null 2>&1; then ...’ to
decide which syntax to use.
Here are some examples to illustrate various combinations of options.
• Sort in descending (reverse) numeric order.
sort -n -r
• Run no more than 4 sorts concurrently, using a buffer size of 10M.
sort --parallel=4 -S 10M
• Sort alphabetically, omitting the first and second fields and the blanks at the start of
the third field. This uses a single key composed of the characters beginning at the start
of the first nonblank character in field three and extending to the end of each line.
sort -k 3b
• Sort numerically on the second field and resolve ties by sorting alphabetically on the
third and fourth characters of field five. Use ‘:’ as the field delimiter.
sort -t : -k 2,2n -k 5.3,5.4
If you had written -k 2n instead of -k 2,2n sort would have used all characters
beginning in the second field and extending to the end of the line as the primary
numeric key. For the large majority of applications, treating keys spanning more than
one field as numeric will not do what you expect.
Also, the ‘n’ modifier was applied to the field-end specifier for the first key. It would
have been equivalent to specify -k 2n,2 or -k 2n,2n. All modifiers except ‘b’ apply
to the associated field, regardless of whether the modifier character is attached to the
field-start and/or the field-end part of the key specifier.
• Sort the password file on the fifth field and ignore any leading blanks. Sort lines with
equal values in field five on the numeric user ID in field three. Fields are separated by
‘:’.
sort -t : -k 5b,5 -k 3,3n /etc/passwd
sort -t : -n -k 5b,5 -k 3,3 /etc/passwd
sort -t : -b -k 5,5 -k 3,3n /etc/passwd
These three commands have equivalent effect. The first specifies that the first key’s start
position ignores leading blanks and the second key is sorted numerically. The other
two commands rely on global options being inherited by sort keys that lack modifiers.
The inheritance works in this case because -k 5b,5b and -k 5b,5 are equivalent, as the
location of a field-end lacking a ‘.c’ character position is not affected by whether initial
blanks are skipped.
• Sort a set of log files, primarily by IPv4 address and secondarily by timestamp. If two
lines’ primary and secondary keys are identical, output the lines in the same order that
they were input. The log files contain lines that look like this:
4.150.156.3 - - [01/Apr/2020:06:31:51 +0000] message 1
211.24.3.231 - - [24/Apr/2020:20:17:39 +0000] message 2
Fields are separated by exactly one space. Sort IPv4 addresses lexicographically, e.g.,
212.61.52.2 sorts before 212.129.233.201 because 61 is less than 129.
sort -s -t ' ' -k 4.9n -k 4.5M -k 4.2n -k 4.14,4.21 file*.log |
sort -s -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n
This example cannot be done with a single POSIX sort invocation, since IPv4 address
components are separated by ‘.’ while dates come just after a space. So it is broken
down into two invocations of sort: the first sorts by timestamp and the second by
IPv4 address. The timestamp is sorted by year, then month, then day, and finally by
hour-minute-second field, using -k to isolate each field. Except for hour-minute-second
there’s no need to specify the end of each key field, since the ‘n’ and ‘M’ modifiers sort
based on leading prefixes that cannot cross field boundaries. The IPv4 addresses are
sorted lexicographically. The second sort uses ‘-s’ so that ties in the primary key are
broken by the secondary key; the first sort uses ‘-s’ so that the combination of the two
sorts is stable. As a GNU extension, the above example could be achieved in a single
sort invocation by sorting the IPv4 address field using a ‘V’ version type, like ‘-k1,1V’.
• Generate a tags file in case-insensitive sorted order.
find src -type f -print0 | sort -z -f | xargs -0 etags --append
The use of -print0, -z, and -0 in this case means that file names that contain blanks
or other special characters are not broken up by the sort operation.
• Use the common DSU (Decorate Sort Undecorate) idiom to sort lines according to their
length.
awk '{print length, $0}' /etc/passwd | sort -n | cut -f2- -d' '
In general this technique can be used to sort data that the sort command does not
support, or is inefficient at, sorting directly.
• Shuffle a list of directories, but preserve the order of files within each directory. For
instance, one could use this to generate a music playlist in which albums are shuffled
but the songs of each album are played in order.
ls */* | sort -t / -k 1,1R -k 2,2
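The ‘-t : -k 2,2n -k 5.3,5.4’ example from the list above can be tried on a small sample. The records here are hypothetical and the C locale is assumed; this is only a sketch of how the two keys interact.

```shell
# Field 2 is the primary numeric key; characters 3-4 of field 5 break ties.
records='x:10:a:b:zzba
y:10:a:b:zzab
z:2:a:b:zzzz'

keyed=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 2,2n -k 5.3,5.4)
printf '%s\n' "$keyed"
```

The line with 2 in field two comes first; the two lines with 10 are then ordered by ‘ab’ before ‘ba’.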
A man,
a canal:
a plan,
Similarly, the command:
shuf -e clubs hearts diamonds spades
might output:
clubs
diamonds
spades
hearts
and the command ‘shuf -i 1-4’ might output:
4
2
1
3
The above examples all have four input lines, so shuf might produce any of the twenty-four
possible permutations of the input. In general, if there are n input lines, there are n! (i.e., n
factorial, or n * (n - 1) * . . . * 1) possible output permutations.
To output 50 random numbers each in the range 0 through 9, use:
shuf -r -n 50 -i 0-9
To simulate 100 coin flips, use:
shuf -r -n 100 -e Head Tail
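Since shuf output is random, a sketch can only check properties that hold for every run: the number of flips, and that a shuffled range is a permutation of its input.

```shell
# 100 flips with repetition; the Head/Tail split varies from run to run.
flips=$(shuf -r -n 100 -e Head Tail)
total=$(printf '%s\n' "$flips" | wc -l)
printf '%s\n' "$flips" | sort | uniq -c

# Sorting a shuffled range recovers the original sequence.
perm=$(shuf -i 1-4 | sort -n | paste -s -d ' ' -)
echo "flips: $total, sorted permutation: $perm"
```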
An exit status of zero indicates success, and a nonzero value indicates failure.
characters followed by non-blank characters. Field numbers are one based, i.e.,
-f 1 will skip the first field (which may optionally have leading blanks).
For compatibility uniq supports a traditional option syntax -n. New scripts
should use -f n instead.
‘-s n’
‘--skip-chars=n’
Skip n characters before checking for uniqueness. Use a null string for comparison
if a line has fewer than n characters. If you use both the field and character
skipping options, fields are skipped over first.
On systems not conforming to POSIX 1003.1-2001, uniq supports a traditional
option syntax +n. Although this traditional behavior can be controlled with
the _POSIX2_VERSION environment variable (see Section 2.13 [Standards
conformance], page 11), portable scripts should avoid commands whose behavior
depends on this variable. For example, use ‘uniq ./+10’ or ‘uniq -s 10’ rather
than the ambiguous ‘uniq +10’.
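A brief sketch of field and character skipping, using made-up input:

```shell
# Skip the first field: '1 x' and '2 x' then compare equal.
skip_field=$(printf '1 x\n2 x\n3 y\n' | uniq -f 1)
# Skip the first character: 'ax' and 'bx' then compare equal.
skip_chars=$(printf 'ax\nbx\ncy\n' | uniq -s 1)

printf '%s\n' '-f 1:' "$skip_field"
printf '%s\n' '-s 1:' "$skip_chars"
```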
‘-c’
‘--count’ Print the number of times each line occurred along with the line.
‘-i’
‘--ignore-case’
Ignore differences in case when comparing lines.
‘-d’
‘--repeated’
Discard lines that are not repeated. When used by itself, this option causes
uniq to print the first copy of each repeated line, and nothing else.
‘-D’
‘--all-repeated[=delimit-method]’
Do not discard the second and subsequent repeated input lines, but discard lines
that are not repeated. This option is useful mainly in conjunction with other
options e.g., to ignore case or to compare only selected fields. The optional
delimit-method, supported with the long form option, specifies how to delimit
groups of repeated lines, and must be one of the following:
‘none’ Do not delimit groups of repeated lines. This is equivalent to
--all-repeated (-D).
‘prepend’ Output a newline before each group of repeated lines. With
--zero-terminated (-z), use a zero byte (ASCII NUL) instead of
a newline as the delimiter.
‘separate’
Separate groups of repeated lines with a single newline. This is the
same as using ‘prepend’, except that no delimiter is inserted before
the first group, and hence may be better suited for output direct
to users. With --zero-terminated (-z), use a zero byte (ASCII
NUL) instead of a newline as the delimiter.
Output is ambiguous when groups are delimited and the input stream contains
empty lines. To avoid that, filter the input through ‘tr -s '\n'’ to remove
blank lines. This is a GNU extension.
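For example, a sketch of the ‘separate’ method on made-up input, assuming GNU uniq:

```shell
# 'b' is not repeated and is dropped; the two repeated groups are
# separated by one empty line.
repeated=$(printf 'a\na\nb\nc\nc\n' | uniq --all-repeated=separate)
printf '%s\n' "$repeated"
```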
‘--group[=delimit-method]’
Output all lines, and delimit each unique group. With --zero-terminated
(-z), use a zero byte (ASCII NUL) instead of a newline as the delimiter. The
optional delimit-method specifies how to delimit groups, and must be one of
the following:
‘separate’
Separate unique groups with a single delimiter. This is the default
delimiting method if none is specified, and better suited for output
direct to users.
‘prepend’ Output a delimiter before each group of unique items.
‘append’ Output a delimiter after each group of unique items.
‘both’ Output a delimiter around each group of unique items.
Output is ambiguous when groups are delimited and the input stream contains
empty lines. To avoid that, filter the input through ‘tr -s '\n'’ to remove
blank lines. This is a GNU extension.
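A sketch of the default ‘separate’ method, assuming a GNU uniq recent enough to support --group and made-up input:

```shell
# All lines are kept; an empty line separates the 'a' group from the 'b' group.
grouped=$(printf 'a\na\nb\n' | uniq --group)
printf '%s\n' "$grouped"
```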
‘-u’
‘--unique’
Discard the last line that would be output for a repeated input group. When
used by itself, this option causes uniq to print unique lines, and nothing else.
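The -d and -u options select complementary sets of lines, as this sketch with made-up input shows:

```shell
# One copy of each line that occurs more than once in a row.
dups=$(printf 'a\na\nb\nc\nc\nc\n' | uniq -d)
# Only the lines that occur exactly once.
singles=$(printf 'a\na\nb\nc\nc\nc\n' | uniq -u)

printf '%s\n' 'repeated:' "$dups"
printf '%s\n' 'unique:' "$singles"
```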
‘-w n’
‘--check-chars=n’
Compare at most n characters on each line (after skipping any specified fields
and characters). By default the entire rest of the lines are compared.
‘-z’
‘--zero-terminated’
Delimit items with a zero byte rather than a newline (ASCII LF). I.e., treat
input as items separated by ASCII NUL and terminate output items with ASCII
NUL. This option can be useful in conjunction with ‘perl -0’ or ‘find -print0’
and ‘xargs -0’ which do the same in order to reliably handle arbitrary file
names (even those containing blanks or other special characters). With -z the
newline character is treated as a field separator.
An exit status of zero indicates success, and a nonzero value indicates failure.
newline is silently appended. The sort command with no options always outputs a file that
is suitable input to comm.
With no options, comm produces three-column output. Column one contains lines unique
to file1, column two contains lines unique to file2, and column three contains lines common
to both files. Columns are separated by a single TAB character.
The options -1, -2, and -3 suppress printing of the corresponding columns (and
separators). Also see Chapter 2 [Common options], page 2.
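As a sketch of column suppression, using two small sorted files in a temporary directory (the file names and contents are illustrative):

```shell
dir=$(mktemp -d)
printf '%s\n' a b c d e > "$dir/file1"
printf '%s\n' b c d e f g > "$dir/file2"

# Suppress two columns at a time to isolate the third.
common=$(LC_ALL=C comm -12 "$dir/file1" "$dir/file2")  # lines in both files
only1=$(LC_ALL=C comm -23 "$dir/file1" "$dir/file2")   # lines only in file1
only2=$(LC_ALL=C comm -13 "$dir/file1" "$dir/file2")   # lines only in file2

printf '%s\n' 'common:' "$common" 'only file1:' "$only1" 'only file2:' "$only2"
rm -r "$dir"
```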
Unlike some other comparison utilities, comm has an exit status that does not depend on
the result of the comparison. Upon normal completion comm produces an exit code of zero.
If there is an error it exits with nonzero status.
If the --check-order option is given, unsorted inputs will cause a fatal error message. If
the option --nocheck-order is given, unsorted inputs will never cause an error message. If
neither of these options is given, wrongly sorted inputs are diagnosed only if an input file is
found to contain unpairable lines. If an input file is diagnosed as being unsorted, the comm
command will exit with a nonzero status (and the output should not be used).
Forcing comm to process wrongly sorted input files containing unpairable lines by specifying
--nocheck-order is not guaranteed to produce any particular output. The output will
probably not correspond with whatever you hoped it would be.
‘--check-order’
Fail with an error message if either input file is wrongly ordered.
‘--nocheck-order’
Do not check that both input files are in sorted order.
Other options are:
‘--output-delimiter=str’
Print str between adjacent output columns, rather than the default of a single
TAB character.
The delimiter str may be empty, in which case the ASCII NUL character is used
to delimit output columns.
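A sketch with a comma as the delimiter makes the column structure visible. It assumes GNU comm and two illustrative sorted files:

```shell
dir=$(mktemp -d)
printf '%s\n' a b c d e > "$dir/file1"
printf '%s\n' b c d e f g > "$dir/file2"

# Column 1 has no prefix, column 2 has one delimiter, column 3 has two.
delimited=$(LC_ALL=C comm --output-delimiter=, "$dir/file1" "$dir/file2")
printf '%s\n' "$delimited"
rm -r "$dir"
```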
‘--total’ Output a summary at the end.
Similar to the regular output, column one contains the total number of lines
unique to file1, column two contains the total number of lines unique to file2, and
column three contains the total number of lines common to both files, followed
by the word ‘total’ in the additional column four.
In the following example, comm omits the regular output (-123), thus just
printing the summary:
$ printf '%s\n' a b c d e > file1
$ printf '%s\n' b c d e f g > file2
$ comm --total -123 file1 file2
1 2 4 total
This option is a GNU extension. Portable scripts should use wc to get the totals,
e.g. for the above example files:
$ comm -23 file1 file2 | wc -l # number of lines only in file1
1
$ comm -13 file1 file2 | wc -l # number of lines only in file2
2
$ comm -12 file1 file2 | wc -l # number of lines common to both files
4
‘-z’
‘--zero-terminated’
Delimit items with a zero byte rather than a newline (ASCII LF). I.e., treat
input as items separated by ASCII NUL and terminate output items with ASCII
NUL. This option can be useful in conjunction with ‘perl -0’ or ‘find -print0’
and ‘xargs -0’ which do the same in order to reliably handle arbitrary file
names (even those containing blanks or other special characters).
An exit status of zero indicates success, and a nonzero value indicates failure.
‘-o file’
‘--only-file=file’
The file associated with this option contains a list of words which will be retained
in concordance output; any word not mentioned in this file is ignored. The file
is called the Only file. The file contains exactly one word in each line; the end
of line separation of words is not subject to the value of the -S option.
There is no default for the Only file. When both an Only file and an Ignore file
are specified, a word is considered a keyword only if it is listed in the Only file
and not in the Ignore file.
‘-r’
‘--references’
On each input line, the leading sequence of non-white space characters will be
taken to be a reference that has the purpose of identifying this input line in the
resulting permuted index. See Section 7.5.4 [Output formatting in ptx], page 66,
for more information about reference production. Using this option changes the
default value for option -S.
Using this option, the program does not try very hard to remove references from
contexts in output, but it succeeds in doing so when the context ends exactly at
the newline. If option -r is used with -S default value, or when GNU extensions
are disabled, this condition is always met and references are completely excluded
from the output contexts.
‘-S regexp’
‘--sentence-regexp=regexp’
This option selects which regular expression will describe the end of a line or
the end of a sentence. In fact, this regular expression is not the only distinction
between ends of lines and ends of sentences, and input line boundaries have no
special significance outside this option. By default, when GNU extensions are
enabled and the -r option is not used, ends of sentences are used. In this case,
this regexp is imported from GNU Emacs:
[.?!][]\"')}]*\\($\\|\t\\| \\)[ \t\n]*
Whenever GNU extensions are disabled or if -r option is used, end of lines are
used; in this case, the default regexp is just:
\n
Using an empty regexp is equivalent to completely disabling end of line or end of
sentence recognition. In this case, the whole file is considered to be a single big
line or sentence. The user might want to disallow all truncation flag generation
as well, through option -F "". See Section “Syntax of Regular Expressions” in
The GNU Emacs Manual.
When the keywords happen to be near the beginning of the input line or sentence,
this often creates an unused area at the beginning of the output context line;
when the keywords happen to be near the end of the input line or sentence, this
often creates an unused area at the end of the output context line. The program
tries to fill those unused areas by wrapping around context in them; the tail of
the input line or sentence is used to fill the unused area on the left of the output
line; the head of the input line or sentence is used to fill the unused area on the
right of the output line.
As a matter of convenience to the user, many usual backslashed escape sequences
from the C language are recognized and converted to the corresponding characters
by ptx itself.
‘-W regexp’
‘--word-regexp=regexp’
This option selects which regular expression will describe each keyword. By
default, if GNU extensions are enabled, a word is a sequence of letters; the
regexp used is ‘\w+’. When GNU extensions are disabled, a word is by default
anything which ends with a space, a tab or a newline; the regexp used is
‘[^ \t\n]+’.
An empty regexp is equivalent to not using this option. See Section “Syntax of
Regular Expressions” in The GNU Emacs Manual.
As a matter of convenience to the user, many usual backslashed escape sequences,
as found in the C language, are recognized and converted to the corresponding
characters by ptx itself.
not take into account the space taken by references, nor the gap that precedes
them.
‘-A’
‘--auto-reference’
Select automatic references. Each input line will have an automatic reference
made up of the file name and the line ordinal, with a single colon between them.
However, the file name will be empty when standard input is being read. If both
-A and -r are selected, then the input reference is still read and skipped, but
the automatic reference is used at output time, overriding the input reference.
‘-R’
‘--right-side-refs’
In the default output format, when option -R is not used, any references produced
by the effect of options -r or -A are placed to the far right of output lines, after
the right context. With default output format, when the -R option is specified,
references are rather placed at the beginning of each output line, before the left
context. For any other output format, option -R is ignored, with one exception:
with -R the width of references is not taken into account in total output width
given by -w.
This option is automatically selected whenever GNU extensions are disabled.
‘-F string’
‘--flag-truncation=string’
This option will request that any truncation in the output be reported using the
string string. Most output fields theoretically extend towards the beginning or
the end of the current line, or current sentence, as selected with option -S. But
there is a maximum allowed output line width, changeable through option -w,
which is further divided into space for various output fields. When a field has
to be truncated because it cannot extend beyond the beginning or the end of
the current line to fit in, then a truncation occurs. By default, the string used
is a single slash, as in -F /.
string may have more than one character, as in -F .... Also, in the particular
case when string is empty (-F ""), truncation flagging is disabled, and no
truncation marks are appended in this case.
As a matter of convenience to the user, many usual backslashed escape sequences,
as found in the C language, are recognized and converted to the corresponding
characters by ptx itself.
‘-M string’
‘--macro-name=string’
Select another string to be used instead of ‘xx’, while generating output suitable
for nroff, troff or TeX.
‘-O’
‘--format=roff’
Choose an output format suitable for nroff or troff processing. Each output
line will look like:
.xx "tail" "before" "keyword_and_after" "head" "ref"
so it will be possible to write a ‘.xx’ roff macro to take care of the output
typesetting. This is the default output format when GNU extensions are
disabled. Option -M can be used to change ‘xx’ to another macro name.
In this output format, each non-graphical character, like newline and tab, is
merely changed to exactly one space, with no special attempt to compress
consecutive spaces. Each quote character ‘"’ is doubled so it will be correctly
processed by nroff or troff.
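A minimal sketch of roff-format output, assuming GNU ptx reading a made-up line from standard input. The exact context fields depend on the input and output width, so only the macro name at the start of each line is checked here.

```shell
# Default macro name 'xx'.
roff=$(printf 'alpha beta\n' | ptx -O)
# Hypothetical replacement macro name 'yy', selected with -M.
custom=$(printf 'alpha beta\n' | ptx -O -M yy)

printf '%s\n' "$roff"
printf '%s\n' "$custom"
```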
‘-T’
‘--format=tex’
Choose an output format suitable for TeX processing. Each output line will
look like:
\xx {tail}{before}{keyword}{after}{head}{ref}
so it will be possible to write a \xx definition to take care of the output
typesetting. When references are not being produced, that is, neither option
-A nor option -r is selected, the last parameter of each \xx call is inhibited.
Option -M can be used to change ‘xx’ to another macro name.
In this output format, some special characters, like ‘$’, ‘%’, ‘&’, ‘#’ and ‘_’ are
automatically protected with a backslash. Curly brackets ‘{’, ‘}’ are protected
with a backslash and a pair of dollar signs (to force mathematical mode). The
backslash itself produces the sequence \backslash{}. Circumflex and tilde
diacritical marks produce the sequence ^\{ } and ~\{ } respectively. Other
diacriticized characters of the underlying character set produce an appropriate
TeX sequence as far as possible. The other non-graphical characters, like newline
and tab, and all other characters which are not part of ASCII, are merely changed
to exactly one space, with no special attempt to compress consecutive spaces.
Let me know how to improve this special character processing for TeX.
Moreover, some options have a slightly different meaning when GNU extensions are
enabled, as explained below.
• By default, concordance output is not formatted for troff or nroff. It is rather
formatted for a dumb terminal. troff or nroff output may still be selected through
option -O.
• Unless -R option is used, the maximum reference width is subtracted from the total
output line width. With GNU extensions disabled, width of references is not taken into
account in the output line width computations.
• All 256 bytes, even ASCII NUL bytes, are always read and processed from input file
with no adverse effect, even if GNU extensions are disabled. However, System V ptx
does not accept 8-bit characters, a few control characters are rejected, and the tilde ‘~’
is also rejected.
• Input line length is only limited by available memory, even if GNU extensions are
disabled. However, System V ptx processes only the first 200 characters in each line.
• The break (non-word) characters default to every character except the letters of the
underlying character set, diacriticized or not. When GNU extensions are disabled, the
break characters default to space, tab and newline only.
• The program makes better use of output line width. If GNU extensions are disabled, the
program rather tries to imitate System V ptx, but still, there are some slight disposition
glitches this program does not completely reproduce.
• The user can specify both an Ignore file and an Only file. This is not allowed with
System V ptx.
Consider a more realistic example. You have a large set of functions all in one file, and
they may all be declared static except one. Currently that one (say main) is the first function
defined in the file, and the ones it calls directly follow it, followed by those they call, etc.
Let’s say that you are determined to take advantage of prototypes, so you have to choose
between declaring all of those functions (which means duplicating a lot of information from
the definitions) and rearranging the functions so that as many as possible are defined before
they are used. One way to automate the latter process is to get a list for each function of
the functions it calls directly. Many programs can generate such lists. They describe a call
graph. Consider the following list, in which a given line indicates that the function on the
left calls the one on the right directly.
main parse_options
main tail_file
main tail_forever
tail_file pretty_name
tail_file write_header
tail_file tail
tail_forever recheck
tail_forever pretty_name
tail_forever write_header
tail_forever dump_remainder
tail tail_lines
tail tail_bytes
tail_lines start_lines
tail_lines dump_remainder
tail_lines file_lines
tail_lines pipe_lines
tail_bytes xlseek
tail_bytes start_bytes
tail_bytes dump_remainder
tail_bytes pipe_bytes
file_lines dump_remainder
recheck pretty_name
Given such a list, you can use tsort to produce an ordering of those functions that satisfies your
requirement.
example$ tsort call-graph | tac
dump_remainder
start_lines
file_lines
pipe_lines
xlseek
start_bytes
pipe_bytes
tail_lines
tail_bytes
pretty_name
write_header
tail
recheck
parse_options
tail_file
tail_forever
main
tsort detects any cycles in the input and writes the first cycle encountered to standard
error.
For a given partial ordering, generally there is no unique total ordering. In the context
of the call graph above, the function parse_options may be placed anywhere in the list as
long as it precedes main.
The only options are --help and --version. See Chapter 2 [Common options], page 2.
An exit status of zero indicates success, and a nonzero value indicates failure.
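As a minimal sketch of the behavior described above (the node names a, b and c are hypothetical, and GNU tsort is assumed), a simple chain has exactly one valid ordering, while an input containing a loop makes tsort report the loop on standard error and exit with a nonzero status:

```shell
# A chain a -> b -> c admits only one total ordering.
chain=$(printf 'a b\nb c\n' | tsort)
printf '%s\n' "$chain"

# A loop (a before b, and b before a) is diagnosed on standard error;
# the diagnostic is discarded here and only the exit status is recorded.
if printf 'a b\nb a\n' | tsort >/dev/null 2>&1; then
  loop_failed=0
else
  loop_failed=1
fi
echo "loop input failed: $loop_failed"
```

Reading the pairs from standard input keeps the sketch free of temporary files; a real call graph would normally be stored in a file and passed as the operand.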
8 Operating on fields
An exit status of zero indicates success, and a nonzero value indicates failure.
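The examples below assume that the file num2 contains the two lines ‘1’ and ‘2’, and that
let3 contains the three lines ‘a’, ‘b’ and ‘c’.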
Take lines sequentially from each file:
$ paste num2 let3
1 a
2 b
c
Duplicate lines from a file:
$ paste num2 let3 num2
1 a 1
2 b 2
c
Intermix lines from standard input:
$ paste - let3 - < num2
1 a 2
b
c
Join consecutive lines with a space:
$ seq 4 | paste -d ' ' - -
1 2
3 4
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-s’
‘--serial’
Paste the lines of one file at a time rather than one line from each file. Using
the above example data:
$ paste -s num2 let3
1 2
a b c
‘-d delim-list’
‘--delimiters=delim-list’
Consecutively use the characters in delim-list instead of TAB to separate merged
lines. When delim-list is exhausted, start again at its beginning. Using the
above example data:
$ paste -d '%_' num2 let3 num2
1%a_1
2%b_2
%c_
‘-z’
‘--zero-terminated’
Delimit items with a zero byte rather than a newline (ASCII LF). I.e., treat
input as items separated by ASCII NUL and terminate output items with ASCII
NUL. This option can be useful in conjunction with ‘perl -0’ or ‘find -print0’
and ‘xargs -0’ which do the same in order to reliably handle arbitrary file
names (even those containing blanks or other special characters).
An exit status of zero indicates success, and a nonzero value indicates failure.
$ cat file2
a X
e Y
f Z
‘--header’
Treat the first line of each input file as a header line. The header lines will be
joined and printed as the first output line. If -o is used to specify output format,
the header line will be printed according to the specified format. The header
lines will not be checked for ordering even if --check-order is specified. Also
if the header lines from each file do not match, the heading fields from the first
file will be used.
‘-i’
‘--ignore-case’
Ignore differences in case when comparing keys. With this option, the lines of
the input files must be ordered in the same way. Use ‘sort -f’ to produce this
ordering.
‘-1 field’ Join on field field (a positive integer) of file 1.
‘-2 field’ Join on field field (a positive integer) of file 2.
‘-j field’ Equivalent to -1 field -2 field.
‘-o field-list’
‘-o auto’ If the keyword ‘auto’ is specified, infer the output format from the first line in
each file. This is the same as the default output format but also ensures the
same number of fields are output for each line. Missing fields are replaced with
the string given by the -e option, and extra fields are discarded.
Otherwise, construct each output line according to the format in field-list. Each
element in field-list is either the single character ‘0’ or has the form m.n where
the file number, m, is ‘1’ or ‘2’ and n is a positive field number.
A field specification of ‘0’ denotes the join field. In most cases, the functionality
of the ‘0’ field spec may be reproduced using the explicit m.n that corresponds
to the join field. However, when printing unpairable lines (using either of the -a
or -v options), there is no way to specify the join field using m.n in field-list if
there are unpairable lines in both files. To give join that functionality, POSIX
invented the ‘0’ field specification notation.
The elements in field-list are separated by commas or blanks. Blank separators
typically need to be quoted for the shell. For example, the commands ‘join -o
1.2,2.2’ and ‘join -o '1.2 2.2'’ are equivalent.
All output lines – including those printed because of any -a or -v option – are
subject to the specified field-list.
‘-t char’ Use character char as the input and output field separator. Treat as significant
each occurrence of char in the input file. Use ‘sort -t char’, without the -b
option of ‘sort’, to produce this ordering. If ‘join -t ''’ is specified, the whole
line is considered, matching the default operation of sort. If ‘-t '\0'’ is specified
then the ASCII NUL character is used to delimit the fields.
‘-v file-number’
Print a line for each unpairable line in file file-number (either ‘1’ or ‘2’), instead
of the normal output.
‘-z’
‘--zero-terminated’
Delimit items with a zero byte rather than a newline (ASCII LF). I.e., treat
input as items separated by ASCII NUL and terminate output items with ASCII
NUL. This option can be useful in conjunction with ‘perl -0’ or ‘find -print0’
and ‘xargs -0’ which do the same in order to reliably handle arbitrary file
names (even those containing blanks or other special characters). With -z the
newline character is treated as a field separator.
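The --header and -o options described above can be tried with a small sketch like the following. The file names people and places and their contents are hypothetical sample data, and GNU join is assumed.

```shell
dir=$(mktemp -d)

# Hypothetical sample data; each file starts with a header line.
printf 'Name Age\nAlice 25\nCharlie 34\n' > "$dir/people"
printf 'Name Country\nAlice France\nBob Spain\n' > "$dir/places"

# --header joins the first lines without order checking; -o auto with -e NA
# pads unpairable lines so that every output line has three fields.
header_out=$(LC_ALL=C join --header -o auto -e NA -a 1 -a 2 "$dir/people" "$dir/places")
printf '%s\n' "$header_out"

# An explicit field list: the join field, then field 2 of file 2, then
# field 2 of file 1. Commas and quoted blanks are interchangeable.
with_commas=$(LC_ALL=C join --header -o 0,2.2,1.2 "$dir/people" "$dir/places")
with_blanks=$(LC_ALL=C join --header -o '0 2.2 1.2' "$dir/people" "$dir/places")
printf '%s\n' "$with_commas"

rm -r "$dir"
```

LC_ALL=C is used only to keep the comparison order predictable across systems.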
An exit status of zero indicates success, and a nonzero value indicates failure. If the
--check-order option is given, unsorted inputs will cause a fatal error message. If the
option --nocheck-order is given, unsorted inputs will never cause an error message. If
neither of these options is given, wrongly sorted inputs are diagnosed only if an input file is
found to contain unpairable lines, and when both input files are non-empty. If an input file
is diagnosed as being unsorted, the join command will exit with a nonzero status (and the
output should not be used).
Forcing join to process wrongly sorted input files containing unpairable lines by specifying
--nocheck-order is not guaranteed to produce any particular output. The output will
probably not correspond with whatever you hoped it would be.
8.3.2 Pre-sorting
join requires sorted input files. Each input file should be sorted according to the key
(=field/column number) used in join. The recommended sorting option is ‘sort -k 1b,1’
(assuming the desired key is in the first column).
Typical usage:
$ sort -k 1b,1 file1 > file1.sorted
$ sort -k 1b,1 file2 > file2.sorted
$ join file1.sorted file2.sorted > file3
Normally, the sort order is that of the collating sequence specified by the LC_COLLATE
locale. Unless the -t option is given, the sort comparison ignores blanks at the start of the
join field, as in sort -b. If the --ignore-case option is given, the sort comparison ignores
the case of characters in the join field, as in sort -f:
$ sort -k 1bf,1 file1 > file1.sorted
$ sort -k 1bf,1 file2 > file2.sorted
$ join --ignore-case file1.sorted file2.sorted > file3
The sort and join commands should use consistent locales and options if the output of
sort is fed to join. You can use a command like ‘sort -k 1b,1’ to sort a file on its default
join field, but if you select a non-default locale, join field, separator, or comparison options,
then you should do so consistently between join and sort.
To avoid any locale-related issues, it is recommended to use the ‘C’ locale for both commands:
$ LC_ALL=C sort -k 1b,1 file1 > file1.sorted
$ LC_ALL=C sort -k 1b,1 file2 > file2.sorted
$ LC_ALL=C join file1.sorted file2.sorted > file3
(1) The $'\t' syntax is supported in most modern shells. For older shells, use a literal tab.
Command                          Outcome
$ join -a 1 file1 file2          common lines and unpaired lines from the first file
a 1 A
b 2
$ join -a 2 file1 file2          common lines and unpaired lines from the second file
a 1 A
c C
$ join -a 1 -a 2 file1 file2     all lines (paired and unpaired) from both files (union);
a 1 A                            see note below regarding -o auto
b 2
c C
$ join -v 1 file1 file2          unpaired lines from the first file (difference)
b 2
$ join -v 2 file1 file2          unpaired lines from the second file (difference)
c C
$ join -v 1 -v 2 file1 file2     unpaired lines from both files, omitting common lines
b 2                              (symmetric difference)
c C
The -o auto -e X options are useful when dealing with unpaired lines. The following example
prints all lines (common and unpaired) from both files. Without -o auto it is not easy to
discern which fields originate from which file:
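A sketch of such an invocation follows. It assumes that file1 holds the lines ‘a 1’ and ‘b 2’ and that file2 holds ‘a A’ and ‘c C’; these contents are inferred from the table above.

```shell
dir=$(mktemp -d)
printf 'a 1\nb 2\n' > "$dir/file1"
printf 'a A\nc C\n' > "$dir/file2"

# Without -o auto, unpaired lines are printed with fewer fields, so the
# second column may come from either file.
plain=$(LC_ALL=C join -a 1 -a 2 "$dir/file1" "$dir/file2")
printf '%s\n' "$plain"

# With -o auto -e X, every line has three fields and each gap is marked X.
filled=$(LC_ALL=C join -o auto -e X -a 1 -a 2 "$dir/file1" "$dir/file2")
printf '%s\n' "$filled"

rm -r "$dir"
```

In the second output the position of X shows which file lacked a matching line.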
If the input has no unpairable lines, a GNU extension is available; the sort order can be
any order that considers two fields to be equal if and only if the sort comparison described
above considers them to be equal. For example:
$ cat file1
a a1
c c1
b b1
$ cat file2
a a2
c c2
b b2
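Assuming file1 and file2 hold exactly the lines shown above, joining them could look like the following sketch. Both files use the same non-sorted order and every line has a partner, so GNU join pairs them without complaint.

```shell
dir=$(mktemp -d)
printf 'a a1\nc c1\nb b1\n' > "$dir/file1"
printf 'a a2\nc c2\nb b2\n' > "$dir/file2"

# No unpairable lines exist, so the consistent but unsorted order is accepted.
unsorted_out=$(LC_ALL=C join "$dir/file1" "$dir/file2")
printf '%s\n' "$unsorted_out"

rm -r "$dir"
```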
$ cat file2
Name Country
Alice France
Bob Spain
join -t '' -v1 -v2 file1 file2 Symmetric Difference of sorted files
All examples above operate on entire lines and not on specific fields: sort without -k
and join -t '' both consider entire lines as the key.
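As a small sketch of whole-line matching with join -t '' (the file contents are hypothetical, both files are already sorted, and GNU join is assumed):

```shell
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/file1"
printf 'b\nc\nd\n' > "$dir/file2"

# Lines present in exactly one of the two files (symmetric difference).
sym_diff=$(LC_ALL=C join -t '' -v 1 -v 2 "$dir/file1" "$dir/file2")
printf '%s\n' "$sym_diff"

# Lines present in both files (intersection).
common=$(LC_ALL=C join -t '' "$dir/file1" "$dir/file2")
printf '%s\n' "$common"

rm -r "$dir"
```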
9 Operating on characters
These commands operate on individual characters.
Unfortunately, this means GNU tr will not handle commands like ‘tr ö Ł’ the way you
might expect, since (assuming a UTF-8 encoding) this is equivalent to ‘tr '\303\266'
'\305\201'’ and GNU tr will simply transliterate all ‘\303’ bytes to ‘\305’ bytes, etc.
POSIX does not clearly specify the behavior of tr in locales where characters are represented
by byte sequences instead of by individual bytes, or where data might contain invalid bytes
that are encoding errors. To avoid problems in this area, you can run tr in a safe single-byte
locale by using a shell command like ‘LC_ALL=C tr’ instead of plain tr.
Although most characters simply represent themselves in string1 and string2, the strings
can contain shorthands listed below, for convenience. Some shorthands can be used only in
string1 or string2, as noted below.
Backslash escapes
The following backslash escape sequences are recognized:
‘\a’ Bell (BEL, Control-G).
‘\b’ Backspace (BS, Control-H).
‘\f’ Form feed (FF, Control-L).
‘\n’ Newline (LF, Control-J).
‘\r’ Carriage return (CR, Control-M).
‘\t’ Tab (HT, Control-I).
‘\v’ Vertical tab (VT, Control-K).
‘\ooo’ The eight-bit byte with the value given by ooo, which is the longest
sequence of one to three octal digits following the backslash. For
portability, ooo should represent a value that fits in eight bits. As a
GNU extension to POSIX, if the value would not fit, then only the
first two digits of ooo are used, e.g., ‘\400’ is equivalent to ‘\0400’
and represents a two-byte sequence.
‘\\’ A backslash.
It is an error if no character follows an unescaped backslash. As a GNU
extension, a backslash followed by a character not listed above is interpreted as
that character, removing any special significance; this can be used to escape the
characters ‘[’ and ‘-’ when they would otherwise be special.
Ranges
The notation ‘m-n’ expands to the characters from m through n, in ascending
order. m should not collate after n; if it does, an error results. As an example,
‘0-9’ is the same as ‘0123456789’.
GNU tr does not support the System V syntax that uses square brackets to
enclose ranges. Translations specified in that format sometimes work as expected,
since the brackets are often transliterated to themselves. However, they should
be avoided because they sometimes behave unexpectedly. For example, ‘tr -d
'[0-9]'’ deletes brackets as well as digits.
Many historically common and even accepted uses of ranges are not fully portable.
For example, on EBCDIC hosts using the ‘A-Z’ range will not do what most
would expect because ‘A’ through ‘Z’ are not contiguous as they are in ASCII.
One way to work around this is to use character classes (see below). Otherwise,
it is most portable (and most ugly) to enumerate the members of the ranges.
Repeated characters
The notation ‘[c*n]’ in string2 expands to n copies of character c. Thus, ‘[y*6]’
is the same as ‘yyyyyy’. The notation ‘[c*]’ in string2 expands to as many
copies of c as are needed to make array2 as long as array1. If n begins with ‘0’,
it is interpreted in octal, otherwise in decimal. A zero-valued n is treated as if
it were absent.
Character classes
The notation ‘[:class:]’ expands to all characters in the (predefined) class
class. When the --delete (-d) and --squeeze-repeats (-s) options are
both given, any character class can be used in string2. Otherwise, only the
character classes lower and upper are accepted in string2, and then only if the
corresponding character class (upper and lower, respectively) is specified in the
same relative position in string1. Doing this specifies case conversion. Except
for case conversion, a class’s characters appear in no particular order. The class
names are given below; an error results when an invalid class name is given.
alnum Letters and digits.
alpha Letters.
blank Horizontal whitespace.
cntrl Control characters.
digit Digits.
graph Printable characters, not including space.
lower Lowercase letters.
print Printable characters, including space.
punct Punctuation characters.
space Horizontal or vertical whitespace.
upper Uppercase letters.
xdigit Hexadecimal digits.
Equivalence classes
The syntax ‘[=c=]’ expands to all characters equivalent to c, in no particular
order. These equivalence classes are allowed in string2 only when --delete
(-d) and --squeeze-repeats (-s) are both given.
Although equivalence classes are intended to support non-English alphabets,
there seems to be no standard way to define them or determine their contents.
Therefore, they are not fully implemented in GNU tr; each character’s
equivalence class consists only of that character, which is of no particular use.
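Two of the shorthands above, ranges and repeated characters, can be illustrated with a short sketch. The input strings are arbitrary samples, GNU tr is assumed, and LC_ALL=C keeps the ranges unambiguous.

```shell
# Brackets around a range are ordinary characters for GNU tr, so they are
# deleted together with the digits.
with_brackets=$(printf '[a1]b2\n' | LC_ALL=C tr -d '[0-9]')
# A plain range deletes only the digits.
plain_range=$(printf '[a1]b2\n' | LC_ALL=C tr -d '0-9')
printf '%s\n' "$with_brackets" "$plain_range"

# [x*] pads string2 with as many copies of x as string1 requires.
padded=$(printf 'abc\n' | LC_ALL=C tr 'abc' '[x*]')
# [z*2] expands to two copies of z, so a and b map to z and c maps to y.
counted=$(printf 'abc\n' | LC_ALL=C tr 'abc' '[z*2]y')
printf '%s\n' "$padded" "$counted"
```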
9.1.2 Translating
tr performs translation when string1 and string2 are both given and the --delete (-d)
option is not given. tr translates each character of its input that is in array1 to the
corresponding character in array2. Characters not in array1 are passed through unchanged.
As a GNU extension to POSIX, when a character appears more than once in array1, only
the final instance is used. For example, these two commands are equivalent:
tr aaa xyz
tr a z
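The equivalence can be checked with a quick sketch (the word ‘banana’ is just sample input, and GNU tr is assumed):

```shell
# With a repeated three times in string1, only the last mapping (a -> z) applies.
dup=$(printf 'banana\n' | LC_ALL=C tr aaa xyz)
single=$(printf 'banana\n' | LC_ALL=C tr a z)
printf '%s\n' "$dup" "$single"
```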
A common use of tr is to convert lowercase characters to uppercase. This can be done
in many ways. Here are three of them:
tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
tr a-z A-Z
tr '[:lower:]' '[:upper:]'
However, ranges like a-z are not portable outside the C locale.
When tr is performing translation, array1 and array2 typically have the same length. If
array1 is shorter than array2, the extra characters at the end of array2 are ignored.
On the other hand, making array1 longer than array2 is not portable; POSIX says that
the result is undefined. In this situation, BSD tr pads array2 to the length of array1 by
repeating the last character of array2 as many times as necessary. System V tr truncates
array1 to the length of array2.
By default, GNU tr handles this case like BSD tr. When the --truncate-set1 (-t)
option is given, GNU tr handles this case like the System V tr instead. This option is
ignored for operations other than translation.
Acting like System V tr in this case breaks the relatively common BSD idiom:
tr -cs A-Za-z0-9 '\012'
because it converts only zero bytes (the first element in the complement of array1), rather
than all non-alphanumerics, to newlines.
By the way, the above idiom is not portable because it uses ranges, and it assumes that the
octal code for newline is 012. Here is a better way to write it:
tr -cs '[:alnum:]' '[\n*]'
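The padding and truncation behaviors, and the word-splitting idiom, can be compared with a sketch like this one (sample inputs are arbitrary and GNU tr is assumed):

```shell
# Default (BSD-like): string2 is padded with its last character, so c and d map to y.
bsd_style=$(printf 'abcd\n' | LC_ALL=C tr abcd xy)
# With -t (System V-like): string1 is truncated to ab, so c and d pass through.
sysv_style=$(printf 'abcd\n' | LC_ALL=C tr -t abcd xy)
printf '%s\n' "$bsd_style" "$sysv_style"

# Put each run of alphanumerics on its own line: every other character is
# turned into a newline and repeated newlines are squeezed into one.
words=$(printf 'Hello, world!  42\n' | LC_ALL=C tr -cs '[:alnum:]' '[\n*]')
printf '%s\n' "$words"
```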
‘-t tab1[,tab2]...’
‘--tabs=tab1[,tab2]...’
If only one tab stop is given, set the tabs tab1 spaces apart (default is 8).
Otherwise, set the tabs at columns tab1, tab2, . . . (numbered from 0), and
replace any tabs beyond the last tab stop given with single spaces. Tab stops
can be separated by blanks as well as by commas.
As a GNU extension the last tab specified can be prefixed with a ‘/’ to indicate
a tab size to use for remaining positions. For example, --tabs=2,4,/8 will set
tab stops at position 2 and 4, and every multiple of 8 after that.
Also the last tab specified can be prefixed with a ‘+’ to indicate a tab size to
use for remaining positions, offset from the final explicitly specified tab stop.
For example, to ignore the 1 character gutter present in diff output, one can
specify a 1 character offset using --tabs=1,+8, which will set tab stops at
positions 1,9,17,. . . For compatibility, GNU expand also accepts the obsolete
option syntax, -t1[,t2].... New scripts should use -t t1[,t2]... instead.
‘-i’
‘--initial’
Only convert initial tabs (those that precede all non-space or non-tab characters)
on each line to spaces.
An exit status of zero indicates success, and a nonzero value indicates failure.
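A brief sketch of the two forms of the -t option of expand, using an arbitrary sample line:

```shell
# A single value sets tab stops every 4 columns.
every4=$(printf 'a\tb\tc\n' | expand -t 4)
# A list sets explicit tab stops at columns 2 and 6 (numbered from 0).
listed=$(printf 'a\tb\tc\n' | expand -t 2,6)
printf '%s\n' "$every4" "$listed"
```

In the first case ‘b’ lands in column 4 and ‘c’ in column 8; in the second they land in columns 2 and 6.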
a 1 character offset using --tabs=1,+8, which will set tab stops at positions
1,9,17,. . . This option implies the -a option.
For compatibility, GNU unexpand supports the obsolete option syntax,
-tab1[,tab2]..., where tab stops must be separated by commas. (Unlike -t,
this obsolete option does not imply -a.) New scripts should use --first-only
-t tab1[,tab2]... instead.
‘-a’
‘--all’ Also convert all sequences of two or more blanks just before a tab stop, even if
they occur after non-blank characters in a line.
An exit status of zero indicates success, and a nonzero value indicates failure.
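As a sketch of the difference that -a makes (sample lines are arbitrary, the default tab size of 8 is used, and GNU unexpand is assumed):

```shell
# By default only leading blanks are converted: 8 spaces become one tab.
leading=$(printf '        x\n' | unexpand)
# Blanks after non-blank text are left alone by default ...
untouched=$(printf 'a       b\n' | unexpand)
# ... but -a converts the 7 blanks that end at the tab stop in column 8.
all=$(printf 'a       b\n' | unexpand -a)
printf '%s\n' "$leading" "$untouched" "$all"
```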
10 Directory listing
This chapter describes the ls command and its variants dir and vdir, which list information
about files.
‘-B’
‘--ignore-backups’
In directories, ignore files that end with ‘~’. This option is equivalent to
‘--ignore='*~' --ignore='.*~'’.
‘-d’
‘--directory’
List just the names of directories, as with other types of files, rather than
listing their contents. Do not follow symbolic links listed on the command
line unless the --dereference-command-line (-H), --dereference (-L), or
--dereference-command-line-symlink-to-dir options are specified.
‘-H’
‘--dereference-command-line’
If a command line argument specifies a symbolic link, show information for the
file the link references rather than for the link itself.
‘--dereference-command-line-symlink-to-dir’
Do not dereference symbolic links, with one exception: if a command line
argument specifies a symbolic link that refers to a directory, show information
for that directory rather than for the link itself. This is the default
behavior unless long format is being used or any of the following options
is in effect: --classify (-F), --directory (-d), --dereference (-L), or
--dereference-command-line (-H).
‘--group-directories-first’
Group all the directories before the files and then sort the directories and the
files separately using the selected sort key (see --sort option). That is, this
option specifies a primary sort key, and the --sort option specifies a secondary
key. However, any use of --sort=none (-U) disables this option altogether.
‘--hide=pattern’
In directories, ignore files whose names match the shell pattern pattern, unless
--all (-a) or --almost-all (-A) is also given. This option acts like
--ignore=pattern except that it has no effect if --all (-a) or --almost-all
(-A) is also given.
This option can be useful in shell aliases. For example, if lx is an alias for ‘ls
--hide='*~'’ and ly is an alias for ‘ls --ignore='*~'’, then the command
‘lx -A’ lists the file README~ even though ‘ly -A’ would not.
‘-I pattern’
‘--ignore=pattern’
In directories, ignore files whose names match the shell pattern (not regular
expression) pattern. As in the shell, an initial ‘.’ in a file name does not match
a wildcard at the start of pattern. Sometimes it is useful to give this option
several times. For example,
$ ls --ignore='.??*' --ignore='.[^.]' --ignore='#*'
The first option ignores names of length 3 or more that start with ‘.’, the second
ignores all two-character names that start with ‘.’ except ‘..’, and the third
ignores names that start with ‘#’.
‘-L’
‘--dereference’
When showing file information for a symbolic link, show information for the file
the link references rather than the link itself. However, even with this option,
ls still prints the name of the link itself, not the name of the file that the link
points to.
‘-R’
‘--recursive’
List the contents of all directories recursively.
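The difference between --hide and --ignore described above can be seen in a small sketch. The directory is a throwaway created with mktemp, the file names are sample data, and GNU ls is assumed.

```shell
dir=$(mktemp -d)
touch "$dir/README" "$dir/README~"

# --hide suppresses the backup file ...
hide_only=$(LC_ALL=C ls --hide='*~' "$dir")
# ... unless -A (or -a) is also given, which cancels --hide.
hide_with_A=$(LC_ALL=C ls -A --hide='*~' "$dir")
# --ignore keeps suppressing it even with -A.
ignore_with_A=$(LC_ALL=C ls -A --ignore='*~' "$dir")
printf '%s\n' "$hide_only" "$hide_with_A" "$ignore_with_A"

rm -r "$dir"
```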
a/sub:
total 4
a/sub/deeper:
total 0
-rw-r--r-- 1 0 Jun 10 12:27 file
a/sub2:
total 0
//DIRED// 48 50 84 86 120 123 158 162 217 223 282 286
//SUBDIRED// 2 3 167 172 228 240 290 296
//DIRED-OPTIONS// --quoting-style=literal
The pairs of offsets on the ‘//DIRED//’ line above delimit these names: f1, f2,
sub, sub2, deeper, file. The offsets on the ‘//SUBDIRED//’ line delimit the
following directory names: a, a/sub, a/sub/deeper, a/sub2.
Here is an example of how to extract the fifth entry name, ‘deeper’, corresponding
to the pair of offsets, 217 and 223:
$ ls -gloRF --dired a > out
$ dd bs=1 skip=217 count=6 < out 2>/dev/null; echo
deeper
Although the listing above includes a trailing slash for the ‘deeper’ entry, the
offsets select the name without the trailing slash. However, if you invoke ls
with --dired (-D) along with an option like --escape (-b) and operate on a
file whose name contains special characters, the backslash is included:
$ touch 'a b'
$ ls -blog --dired 'a b'
-rw-r--r-- 1 0 Jun 10 12:28 a\ b
//DIRED// 30 34
//DIRED-OPTIONS// --quoting-style=escape
If you use a quoting style like --quoting-style=c (-Q) that adds quote marks,
then the offsets include the quote marks. So beware that the user may select the
quoting style via the environment variable QUOTING_STYLE. Hence, applications
using --dired should either specify an explicit --quoting-style=literal (-N)
option on the command line, or else be prepared to parse the escaped names.
The --dired (-D) option implies long format output with hyperlinks disabled,
and takes precedence over previously specified output formats or hyperlink
mode.
‘--full-time’
Produce long format, and list times in full. It is equivalent to using
--format=long (-l) with --time-style=full-iso (see Section 10.1.5
[Formatting file timestamps], page 99).
‘-g’ Produce long format, but omit owner information.
‘-G’
‘--no-group’
Inhibit display of group information in long format. (This is the default in some
non-GNU versions of ls, so we provide this option for compatibility.)
‘-h’
‘--human-readable’
Append a size letter to each size, such as ‘M’ for mebibytes. Powers of 1024
are used, not 1000; ‘M’ stands for 1,048,576 bytes. This option is equivalent to
--block-size=human-readable. Use the --si option if you prefer powers of
1000.
‘-i’
‘--inode’ Print the inode number (also called the file serial number and index number) of
each file to the left of the file name. (This number uniquely identifies each file
within a particular file system.)
‘-l’
‘--format=long’
‘--format=verbose’
Produce long format. In addition to the name of each file, print the file type,
file mode bits, number of hard links, owner name, group name, size, and
timestamp (see Section 10.1.5 [Formatting file timestamps], page 99), normally
the modification timestamp (the mtime, see Chapter 28 [File timestamps],
page 244). If the owner or group name cannot be determined, print the owner
or group ID instead, right-justified as a cue that it is a number rather than
a textual name. Print question marks for other information that cannot be
determined.
For block special files and character special files, the size field is replaced by
the corresponding device major and minor numbers as two decimal numbers
separated by a comma and at least one space.
Normally the size is printed as a byte count without punctuation, but
this can be overridden (see Section 2.2 [Block size], page 3). For example,
--human-readable (-h) prints an abbreviated, human-readable count, and
‘--block-size="'1"’ prints a byte count with the thousands separator of the
current locale.
For each directory that is listed, preface the files with a line ‘total blocks’,
where blocks is the file system allocation for all files in that directory. The block
size currently defaults to 1024 bytes, but this can be overridden (see Section 2.2
[Block size], page 3). The blocks computed counts each hard link separately;
this is arguably a deficiency.
The file type is one of the following characters:
‘-’ regular file
‘b’ block special file
‘c’ character special file
‘C’ high performance (“contiguous data”) file
‘d’ directory
‘D’ door (Solaris)
‘l’ symbolic link
For files that are NFS-mounted from an HP-UX system to a BSD system, this
option reports sizes that are half the correct values. On HP-UX systems, it
reports sizes that are twice the correct values for files that are NFS-mounted
from BSD systems. This is due to a flaw in HP-UX; it also affects the HP-UX
ls program.
‘--si’ Append an SI-style abbreviation to each size, such as ‘M’ for megabytes. Powers of
1000 are used, not 1024; ‘M’ stands for 1,000,000 bytes. This option is equivalent
to --block-size=si. Use the -h or --human-readable option if you prefer
powers of 1024.
‘-Z’
‘--context’
Display the SELinux security context or ‘?’ if none is found. In long format,
print the security context to the left of the size column.
‘-c’
‘--time=ctime’
‘--time=status’
In long format, print the status change timestamp (the ctime) instead of the
mtime. When sorting by time or when not using long format, sort according to
the ctime. See Chapter 28 [File timestamps], page 244.
‘-f’ Produce an unsorted directory listing. This is like --sort=none (-U), but it also
enables --all (-a) and disables any previous use of -l, --color, --size,
or --hyperlink.
‘-r’
‘--reverse’
Reverse whatever the sorting method is – e.g., list files in reverse alphabetical
order, youngest first, smallest first, or whatever. This option has no effect when
--sort=none (-U) is in effect.
‘-S’
‘--sort=size’
Sort by file size, largest first.
‘-t’
‘--sort=time’
Sort by modification timestamp (mtime) by default, newest first. The timestamp
to order by can be changed with the --time option. See Chapter 28 [File
timestamps], page 244.
‘-u’
‘--time=atime’
‘--time=access’
‘--time=use’
In long format, print the last access timestamp (the atime). When sorting by
time or when not using long format, sort according to the atime. See Chapter 28
[File timestamps], page 244.
‘--time=mtime’
‘--time=modification’
This is the default timestamp display and sorting mode. In long format, print
the last data modification timestamp (the mtime). When sorting by time or
when not using long format, sort according to the mtime. See Chapter 28 [File
timestamps], page 244.
‘--time=birth’
‘--time=creation’
In long format, print the file creation timestamp if available, falling back to
the file modification timestamp (mtime) if not. When sorting by time or when
not using long format, sort according to the birth time. See Chapter 28 [File
timestamps], page 244.
‘-U’
‘--sort=none’
Do not sort; list the files in whatever order they are stored in the directory. (Do
not do any of the other unrelated things that -f does.) This can be useful when
listing large directories, where sorting can take some time.
‘-v’
‘--sort=version’
Sort by version name and number, lowest first. It behaves like a default
sort, except that each sequence of decimal digits is treated numerically as an
index/version number. See Chapter 30 [Version sort ordering], page 253.
‘--sort=width’
Sort by printed width of file names. This can be useful with the
--format=vertical (-C) output format, to most densely display the listed
files.
‘-X’
‘--sort=extension’
Sort directory contents alphabetically by file extension (characters after the last
‘.’); files with no extension are sorted first.
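As a sketch of how version sorting differs from the default order (file names are sample data in a throwaway directory, and LC_ALL=C is set so the default order is plain byte order):

```shell
dir=$(mktemp -d)
touch "$dir/f1" "$dir/f2" "$dir/f10"

# Default order compares names character by character, so f10 precedes f2.
default_order=$(LC_ALL=C ls "$dir")
# Version order compares digit sequences numerically, so f2 precedes f10.
version_order=$(LC_ALL=C ls -v "$dir")
printf '%s\n' "$default_order" "$version_order"

rm -r "$dir"
```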
‘--file-type’
‘--indicator-style=file-type’
Append a character to each file name indicating the file type. This is like
--classify (-F), except that executables are not marked.
‘--hyperlink[=when]’
Output codes recognized by some terminals to link to files using the ‘file://’
URI format. when may be omitted, or one of:
• none - Do not use hyperlinks at all. This is the default.
• auto - Only use hyperlinks if standard output is a terminal.
• always - Always use hyperlinks.
Specifying --hyperlink and no when is equivalent to --hyperlink=always.
‘--indicator-style=word’
Append a character indicator with style word to entry names, as follows:
‘none’ Do not append any character indicator; this is the default.
‘slash’ Append ‘/’ for directories. This is the same as the -p option.
‘file-type’
Append ‘/’ for directories, ‘@’ for symbolic links, ‘|’ for FIFOs, ‘=’
for sockets, and nothing for regular files. This is the same as the
--file-type option.
‘classify’
Append ‘*’ for executable regular files, otherwise behave as for
‘file-type’. This is the same as the --classify (-F) option.
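The three non-default indicator styles can be compared side by side. This is only a sketch: the directory, file, and link names are invented, and the listing is taken in the C locale so that the order is predictable.

```shell
# Scratch directory holding one entry of each kind of interest.
tmp=$(mktemp -d)
mkdir "$tmp/dir"
touch "$tmp/plain" "$tmp/script"
chmod +x "$tmp/script"
ln -s plain "$tmp/link"

# Each style adds more indicators than the previous one.
slash=$(LC_ALL=C ls --indicator-style=slash "$tmp" | paste -sd' ' -)
file_type=$(LC_ALL=C ls --indicator-style=file-type "$tmp" | paste -sd' ' -)
classify=$(LC_ALL=C ls --indicator-style=classify "$tmp" | paste -sd' ' -)

echo "slash:     $slash"       # dir/ link plain script
echo "file-type: $file_type"   # dir/ link@ plain script
echo "classify:  $classify"    # dir/ link@ plain script*
rm -rf "$tmp"
```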
‘-k’
‘--kibibytes’
Set the default block size to its normal value of 1024 bytes, overriding any
contrary specification in environment variables (see Section 2.2 [Block size],
page 3). If --block-size, --human-readable (-h), or --si options are used,
they take precedence even if --kibibytes (-k) is placed after them.
The --kibibytes (-k) option affects the per-directory block count written in
long format, and the file system allocation written by the --size (-s) option.
It does not affect the file size in bytes that is written in long format.
‘-m’
‘--format=commas’
List files horizontally, with as many as will fit on each line, separated by ‘, ’ (a
comma and a space), and with no other information.
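For instance, with three short (made-up) file names the whole listing fits on one line:

```shell
# Three hypothetical files in a scratch directory.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"

# -m joins the names with a comma and a space.
listing=$(LC_ALL=C ls -m "$tmp")
echo "$listing"   # a, b, c
rm -rf "$tmp"
```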
‘-p’
‘--indicator-style=slash’
Append a ‘/’ to directory names.
‘-x’
‘--format=across’
‘--format=horizontal’
List the files in columns, sorted horizontally.
‘-T cols’
‘--tabsize=cols’
Assume that each tab stop is cols columns wide. The default is 8. ls uses tabs
where possible in the output, for efficiency. If cols is zero, do not use tabs at all.
Some terminal emulators might not properly align columns to the right of a
TAB following a non-ASCII byte. You can avoid that issue by using the -T0
option or by putting TABSIZE=0 in your environment, to tell ls to align using
spaces, not tabs.
If you set a terminal’s hardware tabs to anything other than the default, you
should also use a --tabsize option or TABSIZE environment variable either
to match the hardware tabs, or to disable the use of hardware tabs. Otherwise,
the output of ls may not line up. For example, if you run the shell
command ‘tabs -4’ to set hardware tabs to every four columns, you should
also run ‘export TABSIZE=4’ or ‘export TABSIZE=0’, or use the corresponding
--tabsize options.
‘-w cols’
‘--width=cols’
Assume the screen is cols columns wide. The default is taken from the terminal
settings if possible; otherwise the environment variable COLUMNS is used if it is
set; otherwise the default is 80. With a cols value of ‘0’, there is no limit on
the length of the output line, and that single output line will be delimited with
spaces, not tabs.
‘--zero’ Output a zero byte (ASCII NUL) at the end of each line, rather than a
newline. This option enables other programs to parse the output even when
that output would contain data with embedded newlines. This option is
incompatible with the --dired (-D) option. This option also implies the options
--show-control-chars, -1, --color=none, and --quoting-style=literal
(-N).
‘literal’ Output strings as-is; this is the same as the --literal (-N) option.
‘shell’ Quote strings for the shell if they contain shell metacharacters
or would cause ambiguous output. The quoting is suitable for
POSIX-compatible shells like bash, but it does not always work for
incompatible shells like csh.
‘shell-always’
Quote strings for the shell, even if they would normally not require
quoting.
‘shell-escape’
Like ‘shell’, but also quoting non-printable characters using the
POSIX proposed ‘$''’ syntax suitable for most shells.
‘shell-escape-always’
Like ‘shell-escape’, but quote strings even if they would normally
not require quoting.
‘c’ Quote strings as for C character string literals, including the
surrounding double-quote characters; this is the same as the --quote-name
(-Q) option.
‘escape’ Quote strings as for C character string literals, except omit the
surrounding double-quote characters; this is the same as the --escape
(-b) option.
‘clocale’ Quote strings as for C character string literals, except use
surrounding quotation marks appropriate for the locale.
‘locale’ Quote strings as for C character string literals, except use
surrounding quotation marks appropriate for the locale, and quote 'like
this' instead of "like this" in the default C locale. This looks
nicer on many displays.
You can specify the default value of the --quoting-style option with the
environment variable QUOTING_STYLE. If that environment variable is not set,
the default value is ‘shell-escape’ when the output is a terminal, and ‘literal’
otherwise.
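To make the styles concrete, the sketch below lists a single file whose (invented) name contains a space, once for each of four styles. The style is given explicitly on the command line, so any QUOTING_STYLE setting in the environment does not matter here.

```shell
# One hypothetical file whose name needs quoting.
tmp=$(mktemp -d)
touch "$tmp/two words"

literal=$(ls --quoting-style=literal "$tmp")
shell_q=$(ls --quoting-style=shell "$tmp")
c_q=$(ls --quoting-style=c "$tmp")
escape_q=$(ls --quoting-style=escape "$tmp")

printf '%s\n' "literal: $literal"   # two words
printf '%s\n' "shell:   $shell_q"   # 'two words'
printf '%s\n' "c:       $c_q"       # "two words"
printf '%s\n' "escape:  $escape_q"  # two\ words
rm -rf "$tmp"
```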
‘--show-control-chars’
Print nongraphic characters as-is in file names. This is the default unless the
output is a terminal and the program is ls.
11 Basic operations
This chapter describes the commands for basic file manipulation: copying, moving (renaming),
and deleting (removing).
‘ls -U’ may list the entries in a copied directory in a different order). Try to
preserve SELinux security context and extended attributes (xattr), but ignore
any failure to do that and print no corresponding diagnostic. Equivalent to -dR
--preserve=all with the reduced diagnostics.
‘--attributes-only’
Copy only the specified attributes of the source file to the destination. If the
destination already exists, do not alter its contents. See the --preserve option
for controlling which attributes to copy.
‘-b’
‘--backup[=method]’
See Section 2.1 [Backup options], page 2. Make a backup of each file that would
otherwise be overwritten or removed. As a special case, cp makes a backup
of source when the force and backup options are given and source and dest
are the same name for an existing, regular file. One useful application of this
combination of options is this tiny Bourne shell script:
#!/bin/sh
# Usage: backup FILE...
# Create a GNU-style backup of each listed FILE.
fail=0
for i; do
  cp --backup --force --preserve=all -- "$i" "$i" || fail=1
done
exit $fail
‘--copy-contents’
If copying recursively, copy the contents of any special files (e.g., FIFOs and
device files) as if they were regular files. This means trying to read the data
in each source file and writing it to the destination. It is usually a mistake
to use this option, as it normally has undesirable effects on special files like
FIFOs and the ones typically found in the /dev directory. In most cases, cp -R
--copy-contents will hang indefinitely trying to read from FIFOs and special
files like /dev/console, and it will fill up your destination file system if you use
it to copy /dev/zero. This option has no effect unless copying recursively, and
it does not affect the copying of symbolic links.
‘-d’ Copy symbolic links as symbolic links rather than copying the files that they
point to, and preserve hard links between source files in the copies. Equivalent
to --no-dereference --preserve=links.
‘--debug’ Print extra information to stdout, explaining how files are copied. This option
implies the --verbose option.
‘-f’
‘--force’ When copying without this option and an existing destination file cannot be
opened for writing, the copy fails. However, with --force, when a destination
file cannot be opened, cp then tries to recreate the file by first removing it. The
--force option alone will not remove dangling symlinks. When this option is
combined with --link (-l) or --symbolic-link (-s), the destination link is
replaced, and unless --backup (-b) is also given there is no brief moment when the
destination does not exist. Also see the description of --remove-destination.
destination does not exist. Also see the description of --remove-destination.
This option is independent of the --interactive or -i option: neither cancels
the effect of the other.
This option is ignored when the --no-clobber or -n option is also used.
‘-H’ If a command line argument specifies a symbolic link, then copy the file it points
to rather than the symbolic link itself. However, copy (preserving its nature)
any symbolic link that is encountered via recursive traversal.
‘-i’
‘--interactive’
When copying a file other than a directory, prompt whether to overwrite an
existing destination file, and fail if the response is not affirmative. The -i option
overrides a previous -n option.
‘-l’
‘--link’ Make hard links instead of copies of non-directories.
‘-L’
‘--dereference’
Follow symbolic links when copying from them. With this option, cp cannot
create a symbolic link. For example, a symlink (to regular file) in the source
tree will be copied to a regular file in the destination tree.
‘-n’
‘--no-clobber’
Do not overwrite an existing file; silently skip instead. This option overrides
a previous -i option. This option is mutually exclusive with the -b or --backup
option. This option is deprecated due to having a different exit status from
other platforms. See also the --update option which will give more control over
how to deal with existing files in the destination, and over the exit status in
particular.
‘-P’
‘--no-dereference’
Copy symbolic links as symbolic links rather than copying the files that they
point to. This option affects only symbolic links in the source; symbolic links in
the destination are always followed if possible.
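The effect of -P and -L on a symbolic link named on the command line can be checked with a small experiment. The sketch below is illustrative only; the file names are invented and everything happens in a scratch directory.

```shell
# A regular file and a symbolic link pointing at it.
tmp=$(mktemp -d)
mkdir "$tmp/out"
echo data > "$tmp/target"
ln -s "$tmp/target" "$tmp/link"

cp -P "$tmp/link" "$tmp/out/kept"       # copy the link itself
cp -L "$tmp/link" "$tmp/out/followed"   # copy what the link points to

if [ -L "$tmp/out/kept" ]; then kept_type=symlink; else kept_type=file; fi
if [ -L "$tmp/out/followed" ]; then followed_type=symlink; else followed_type=file; fi
followed_data=$(cat "$tmp/out/followed")

echo "-P produced a $kept_type, -L produced a $followed_type"
rm -rf "$tmp"
```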
‘-p’
‘--preserve[=attribute_list]’
Preserve the specified attributes of the original files. If specified, the attribute list
must be a comma-separated list of one or more of the following strings:
‘mode’ Preserve attributes relevant to access permissions, including file mode
bits and (if possible) access control lists (ACLs). ACL preservation
is system-dependent, and ACLs are not necessarily translated when
the source and destination are on file systems with different ACL
formats (e.g., NFSv4 versus POSIX formats).
‘ownership’
Preserve the owner and group. On most modern systems, only users
with appropriate privileges may change the owner of a file, and
ordinary users may preserve the group ownership of a file only if
they happen to be a member of the desired group.
‘timestamps’
Preserve the times of last access and last modification, when possible.
On older systems, it is not possible to preserve these attributes when
the affected file is a symbolic link. However, many systems now
provide the utimensat function, which makes it possible even for
symbolic links.
‘links’ Preserve in the destination files any links between corresponding
source files. With -L or -H, this option can convert symbolic links
to hard links. For example,
$ mkdir c; : > a; ln -s a b; cp -aH a b c; ls -i1 c
74161745 a
74161745 b
Although b is a symlink to regular file a, the files in the destination
directory c/ are hard-linked. Since -a implies --no-dereference it
would copy the symlink, but the later -H tells cp to dereference the
command line arguments where it then sees two files with the same
inode number. Then the --preserve=links option also implied by
-a will preserve the perceived hard link.
Here is a similar example that exercises cp’s -L option:
$ mkdir b c; (cd b; : > a; ln -s a b); cp -aL b c; ls -i1 c/b
74163295 a
74163295 b
‘context’ Preserve SELinux security context of the file, or fail with full
diagnostics.
‘xattr’ Preserve extended attributes of the file, or fail with full diagnostics.
If cp is built without xattr support, ignore this option. If SELinux
context, ACLs or Capabilities are implemented using xattrs, they
are preserved implicitly by this option as well, i.e., even without
specifying --preserve=mode or --preserve=context.
‘all’ Preserve all file attributes. Equivalent to specifying all of the above,
but with the difference that failure to preserve SELinux security
context or extended attributes does not change cp’s exit status.
In contrast to -a, all but ‘Operation not supported’ warnings are
output.
Using --preserve with no attribute list is equivalent to
--preserve=mode,ownership,timestamps.
In the absence of this option, the permissions of existing destination files are
unchanged. Each new file is created with the mode of the corresponding source
file minus the set-user-ID, set-group-ID, and sticky bits as the create mode;
the operating system then applies either the umask or a default ACL, possibly
resulting in a more restrictive file mode. See Chapter 27 [File permissions],
page 236.
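As a small check of --preserve=timestamps, the sketch below gives a source file an old modification time and copies it with and without the option. The file names and the chosen date are arbitrary; stat -c %Y (GNU stat) prints the modification time in seconds since the Epoch.

```shell
# Source file with a deliberately old modification time.
tmp=$(mktemp -d)
echo data > "$tmp/src"
touch -d '2001-02-03 04:05:06' "$tmp/src"

cp "$tmp/src" "$tmp/plain"                        # new timestamp
cp --preserve=timestamps "$tmp/src" "$tmp/kept"   # source timestamp

src_mtime=$(stat -c %Y "$tmp/src")
plain_mtime=$(stat -c %Y "$tmp/plain")
kept_mtime=$(stat -c %Y "$tmp/kept")

echo "source: $src_mtime  plain copy: $plain_mtime  preserved copy: $kept_mtime"
rm -rf "$tmp"
```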
‘--no-preserve=attribute_list’
Do not preserve the specified attributes. The attribute list has the same form
as for --preserve.
‘--parents’
Form the name of each destination file by appending to the target directory a
slash and the specified name of the source file. The last argument given to cp
must be the name of an existing directory. For example, the command:
cp --parents a/b/c existing_dir
copies the file a/b/c to existing_dir/a/b/c, creating any missing intermediate
directories.
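The command above can be tried in a scratch directory as follows. This is a sketch only; the subshell keeps the change of directory from affecting the calling shell.

```shell
tmp=$(mktemp -d)
(
  cd "$tmp" || exit 1
  mkdir -p a/b existing_dir
  echo data > a/b/c
  # Recreate the source path a/b/c below existing_dir.
  cp --parents a/b/c existing_dir
)

if [ -f "$tmp/existing_dir/a/b/c" ]; then parents_result=created; else parents_result=missing; fi
echo "existing_dir/a/b/c: $parents_result"
rm -rf "$tmp"
```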
‘-R’
‘-r’
‘--recursive’
Copy directories recursively. By default, do not follow symbolic links in the
source unless used together with the --link (-l) option; see the --archive (-a),
-d, --dereference (-L), --no-dereference (-P), and -H options. Special files
are copied by creating a destination file of the same type as the source; see the
--copy-contents option. It is not portable to use -r to copy symbolic links or
special files. On some non-GNU systems, -r implies the equivalent of -L and
--copy-contents for historical reasons. Also, it is not portable to use -R to
copy symbolic links unless you also specify -P, as POSIX allows implementations
that dereference symbolic links by default.
‘--reflink[=when]’
Perform a lightweight, copy-on-write (COW) copy, if supported by the file
system. Once it has succeeded, beware that the source and destination files
share the same data blocks as long as they remain unmodified. Thus, if an I/O
error affects data blocks of one of the files, the other suffers the same fate.
The when value can be one of the following:
‘always’ If the copy-on-write operation is not supported then report the
failure for each file and exit with a failure status. Plain --reflink
is equivalent to --reflink=always.
‘auto’ If the copy-on-write operation is not supported then fall back to the
standard copy behavior. This is the default if no --reflink option
is given.
‘never’ Disable copy-on-write operation and use the standard copy behavior.
This option is overridden by the --link, --symbolic-link and
--attributes-only options, thus allowing it to be used to configure the
default data copying behavior for cp.
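With --reflink=auto the copy should succeed whether or not the file system supports copy-on-write, since cp falls back to a standard copy. The following sketch (file names invented) only verifies that the contents match; it does not show whether blocks are actually shared.

```shell
# A small file of random data to copy.
tmp=$(mktemp -d)
head -c 65536 /dev/urandom > "$tmp/src"

# Use a lightweight copy where possible, a standard copy otherwise.
cp --reflink=auto "$tmp/src" "$tmp/dst"

if cmp -s "$tmp/src" "$tmp/dst"; then reflink_copy=identical; else reflink_copy=different; fi
echo "copy is $reflink_copy"
rm -rf "$tmp"
```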
‘--remove-destination’
Remove each existing destination file before attempting to open it (contrast
with -f above).
‘--sparse=when’
A sparse file contains holes: sequences of zero bytes that do not occupy any
file system blocks; the ‘read’ system call reads these as zeros. This can both
save considerable space and increase speed, since many binary files contain lots
of consecutive zero bytes. By default, cp detects holes in input source files via a
crude heuristic and makes the corresponding output file sparse as well. Only
regular files may be sparse.
The when value can be one of the following:
‘auto’ The default behavior: if the input file is sparse, attempt to make the
output file sparse, too. However, if an output file exists but refers
to a non-regular file, then do not attempt to make it sparse.
‘always’ For each sufficiently long sequence of zero bytes in the input file,
attempt to create a corresponding hole in the output file, even if
the input file does not appear to be sparse. This is useful when the
input file resides on a file system that does not support sparse files
(for example, ‘efs’ file systems in SGI IRIX 5.3 and earlier), but
the output file is on a type of file system that does support them.
Holes may be created only in regular files, so if the destination file
is of some other type, cp does not even try to make it sparse.
‘never’ Never make the output file sparse. This is useful in creating a file
for use with the mkswap command, since such a file must not have
any holes.
For example, with the following alias, cp will use the minimum amount of
space supported by the file system. (Older versions of cp can also benefit from
--reflink=auto here.)
alias cp='cp --sparse=always'
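To see what --sparse=always does to a file full of zero bytes, one can compare the apparent size with the number of allocated blocks before and after copying. In this sketch the file names are invented, and the block counts printed by stat -c %b depend on the file system, so no particular values should be expected.

```shell
# A 4 MiB file consisting only of zero bytes, written out in full.
tmp=$(mktemp -d)
head -c 4194304 /dev/zero > "$tmp/dense"

# Ask cp to turn long runs of zero bytes into holes.
cp --sparse=always "$tmp/dense" "$tmp/sparse"

dense_size=$(stat -c %s "$tmp/dense")
sparse_size=$(stat -c %s "$tmp/sparse")
dense_blocks=$(stat -c %b "$tmp/dense")
sparse_blocks=$(stat -c %b "$tmp/sparse")
if cmp -s "$tmp/dense" "$tmp/sparse"; then sparse_copy=identical; else sparse_copy=different; fi

echo "dense:  $dense_size bytes, $dense_blocks blocks"
echo "sparse: $sparse_size bytes, $sparse_blocks blocks ($sparse_copy contents)"
rm -rf "$tmp"
```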
‘--strip-trailing-slashes’
Remove any trailing slashes from each source argument. See Section 2.7 [Trailing
slashes], page 9.
‘-s’
‘--symbolic-link’
Make symbolic links instead of copies of non-directories. All source file names
must be absolute (starting with ‘/’) unless the destination files are in the current
directory. This option merely results in an error message on systems that do
not support symbolic links.
‘-S suffix’
‘--suffix=suffix’
Append suffix to each backup file made with -b. See Section 2.1 [Backup
options], page 2.
‘-t directory’
‘--target-directory=directory’
Specify the destination directory. See Section 2.6 [Target directory], page 7.
‘-T’
‘--no-target-directory’
Do not treat the last operand specially when it is a directory or a symbolic link
to a directory. See Section 2.6 [Target directory], page 7.
‘-u’
‘--update[=which]’
Do not copy a non-directory that has an existing destination with the same
or newer modification timestamp; instead, silently skip the file without failing.
If timestamps are being preserved, the comparison is to the source timestamp
truncated to the resolutions of the destination file system and of the system
calls used to update timestamps; this avoids duplicate work if several ‘cp
-pu’ commands are executed with the same source and destination. This
option is ignored if the -n or --no-clobber option is also specified. Also, if
--preserve=links is also specified (like with ‘cp -au’ for example), that will
take precedence; consequently, depending on the order that files are processed
from the source, newer files in the destination may be replaced, to mirror hard
links in the source.
which gives more control over which existing files in the destination are replaced,
and its value can be one of the following:
‘all’ This is the default operation when an --update option is not
specified, and results in all existing files in the destination being replaced.
‘none’ This is like the deprecated --no-clobber option, where no files in
the destination are replaced, and also skipping a file does not induce
a failure.
‘none-fail’
This is similar to ‘none’, in that no files in the destination are
replaced, but any skipped files are diagnosed and induce a failure.
‘older’ This is the default operation when --update is specified, and results
in files being replaced if they’re older than the corresponding source
file.
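The default behavior of -u can be sketched as follows, using touch -d to set arbitrary modification times on two invented files. The first copy is skipped because the destination is newer; the second replaces the destination once the source has become the newer file.

```shell
tmp=$(mktemp -d)
echo from-source > "$tmp/src"
echo from-destination > "$tmp/dst"
touch -d '2001-01-01 00:00:00' "$tmp/src"
touch -d '2002-01-01 00:00:00' "$tmp/dst"

# Destination is newer than the source: the copy is silently skipped.
cp -u "$tmp/src" "$tmp/dst"
after_skip=$(cat "$tmp/dst")

# Make the source newer than the destination: now it is copied.
touch -d '2003-01-01 00:00:00' "$tmp/src"
cp -u "$tmp/src" "$tmp/dst"
after_copy=$(cat "$tmp/dst")

echo "first attempt left:  $after_skip"
echo "second attempt left: $after_copy"
rm -rf "$tmp"
```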
‘-v’
‘--verbose’
Print the name of each file before copying it.
‘-x’
‘--one-file-system’
Skip subdirectories that are on different file systems from the one that the copy
started on. However, mount point directories are copied.
‘-Z’
‘--context[=context]’
Without a specified context, adjust the SELinux security context according
to the system default type for destination files, similarly to the restorecon
command. The long form of this option, with a specific context specified, will set
the context for newly created files only. With a specified context, if both SELinux
and SMACK are disabled, a warning is issued. This option is mutually exclusive
with the --preserve=context option, and overrides the --preserve=all and
-a options.
An exit status of zero indicates success, and a nonzero value indicates failure.
‘sync’ Pad every input block to size of ‘ibs’ with trailing zero bytes. When
used with ‘block’ or ‘unblock’, pad with spaces instead of zero
bytes.
The following “conversions” are really file flags and don’t affect internal
processing:
‘excl’ Fail if the output file already exists; dd must create the output file
itself.
‘nocreat’ Do not create the output file; the output file must already exist.
The ‘excl’ and ‘nocreat’ conversions are mutually exclusive, and
are GNU extensions to POSIX.
‘notrunc’ Do not truncate the output file.
‘noerror’ Continue after read errors.
‘fdatasync’
Synchronize output data just before finishing, even if there were
write errors. This forces a physical write of output data, so that
even if power is lost the output data will be preserved. If neither
this nor ‘fsync’ is specified, output is treated as usual with file
systems, i.e., output data and metadata may be cached in primary
memory for some time before the operating system physically writes
it, and thus output data and metadata may be lost if power is lost.
See Section 14.4 [sync invocation], page 159. This conversion is a
GNU extension to POSIX.
‘fsync’ Synchronize output data and metadata just before finishing, even
if there were write errors. This acts like ‘fdatasync’ except it also
preserves output metadata, such as the last-modified time of the
output file; for this reason it may be a bit slower than ‘fdatasync’
although the performance difference is typically insignificant for dd.
This conversion is a GNU extension to POSIX.
‘iflag=flag[,flag]...’
Access the input file using the flags specified by the flag argument(s). (No
spaces around any comma(s).)
‘oflag=flag[,flag]...’
Access the output file using the flags specified by the flag argument(s). (No
spaces around any comma(s).)
Here are the flags.
‘append’ Write in append mode, so that even if some other process is writing
to this file, every dd write will append to the current contents of the
file. This flag makes sense only for output. If you combine this flag
with the ‘of=file’ operand, you should also specify ‘conv=notrunc’
unless you want the output file to be truncated before being
appended to.
‘cio’ Use concurrent I/O mode for data. This mode performs direct I/O
and drops the POSIX requirement to serialize all I/O to the same
file. A file cannot be opened in CIO mode and with a standard
open at the same time.
‘direct’ Use direct I/O for data, avoiding the buffer cache. The kernel may
impose restrictions on read or write buffer sizes. For example, with
an ext4 destination file system and a Linux-based kernel, using
‘oflag=direct’ will cause writes to fail with EINVAL if the output
buffer size is not a multiple of 512.
‘directory’
Fail unless the file is a directory. Most operating systems do not
allow I/O to a directory, so this flag has limited utility.
‘dsync’ Use synchronized I/O for data. For the output file, this forces a
physical write of output data on each write. For the input file,
this flag can matter when reading from a remote file that has been
written to synchronously by some other process. Metadata (e.g.,
last-access and last-modified time) is not necessarily synchronized.
‘nocache’ Request to discard the system data cache for a file. When count=0,
the request covers all cached data for the file; otherwise the cache is dropped
for the processed portion of the file. Also when count=0, failure to
discard the cache is diagnosed and reflected in the exit status.
Because data not already persisted to storage is not discarded from
the cache, the ‘sync’ conversions in the following examples maximize
the effectiveness of the ‘nocache’ flag.
Here are some usage examples:
# Advise to drop cache for whole file
dd if=ifile iflag=nocache count=0
‘nonblock’
Use non-blocking I/O.
‘noatime’ Do not update the file’s access timestamp. See Chapter 28 [File
timestamps], page 244. Some older file systems silently ignore this
flag, so it is a good idea to test it on your files before relying on it.
‘noctty’ Do not assign the file to be a controlling terminal for dd. This
has no effect when the file is not a terminal. On many hosts (e.g.,
GNU/Linux hosts), this flag has no effect at all.
‘nofollow’
Do not follow symbolic links.
‘nolinks’ Fail if the file has multiple hard links.
‘binary’ Use binary I/O. This flag has an effect only on nonstandard platforms
that distinguish binary from text I/O.
‘text’ Use text I/O. Like ‘binary’, this flag has no effect on standard
platforms.
‘fullblock’
Accumulate full blocks from input. The read system call may return
early if a full block is not available. When that happens, continue
calling read to fill the remainder of the block. This flag can be
used only with iflag. This flag is useful with pipes for example
as they may return short reads. In that case, this flag is needed
to ensure that a ‘count=’ argument is interpreted as a block count
rather than a count of read operations.
These flags are all GNU extensions to POSIX. They are not supported on all
systems, and ‘dd’ rejects attempts to use them when they are not supported.
When reading from standard input or writing to standard output, the ‘nofollow’
and ‘noctty’ flags should not be specified, and the other flags (e.g., ‘nonblock’)
can affect how other processes behave with the affected file descriptors, even
after dd exits.
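Two of the flags above, ‘append’ and ‘fullblock’, lend themselves to a short experiment. The sketch below uses an invented log file and a pipe that delivers its data in two pieces one second apart; dd statistics on standard error are discarded.

```shell
tmp=$(mktemp -d)

# oflag=append with conv=notrunc adds to the file instead of replacing it.
printf 'one\n' > "$tmp/log"
printf 'two\n' | dd of="$tmp/log" oflag=append conv=notrunc 2>/dev/null
appended=$(paste -sd' ' "$tmp/log")

# The writer sends 3 bytes, pauses, then sends 3 more. With
# iflag=fullblock, dd keeps reading until the 6-byte block is full,
# so count=1 yields all 6 bytes rather than one short read.
joined=$( { printf 'abc'; sleep 1; printf 'def'; } |
          dd bs=6 count=1 iflag=fullblock 2>/dev/null )

echo "log lines: $appended"   # one two
echo "one block: $joined"     # abcdef
rm -rf "$tmp"
```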
The behavior of dd is unspecified if operands other than ‘conv=’, ‘iflag=’, ‘oflag=’, and
‘status=’ are specified more than once.
The numeric-valued strings above (n and bytes) are unsigned decimal integers that can
be followed by a multiplier: ‘b’=512, ‘c’=1, ‘w’=2, ‘xm’=m, or any of the standard block
size suffixes like ‘k’=1024 (see Section 2.2 [Block size], page 3). These multipliers are GNU
extensions to POSIX, except that POSIX allows bytes to be followed by ‘k’, ‘b’, and ‘xm’.
An ‘xm’ can be used more than once in a number. Block sizes (i.e., specified by bytes strings)
must be nonzero.
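The multipliers can be checked by counting the bytes that dd copies from /dev/zero. The helper function count_bytes below is a hypothetical name introduced only for this sketch.

```shell
# Copy from /dev/zero with the given operands and count the output bytes.
count_bytes() { dd if=/dev/zero "$@" 2>/dev/null | wc -c; }

b_bytes=$(count_bytes bs=1b count=3)      # 3 blocks of 512 bytes
k_bytes=$(count_bytes bs=1k count=2)      # 2 blocks of 1024 bytes
x_bytes=$(count_bytes bs=2x512 count=1)   # 1 block of 2*512 bytes
c_bytes=$(count_bytes bs=10c count=1)     # 1 block of 10 bytes

echo "b: $b_bytes  k: $k_bytes  x: $x_bytes  c: $c_bytes"
```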
Any block size you specify via ‘bs=’, ‘ibs=’, ‘obs=’, ‘cbs=’ should not be too large –
values larger than a few megabytes are generally wasteful or (as in the gigabyte..exabyte
case) downright counterproductive or error-inducing.
To process data with offset or size that is not a multiple of the I/O block size, you can
use a numeric string n that ends in the letter ‘B’. For example, the following shell commands
copy data in 1 MiB blocks between a flash drive and a tape, but do not save or restore a
512-byte area at the start of the flash drive:
flash=/dev/sda
tape=/dev/st0
# Copy all but the initial 512 bytes from flash to tape.
dd if=$flash iseek=512B bs=1MiB of=$tape
# Copy from tape back to flash, leaving initial 512 bytes alone.
dd if=$tape bs=1MiB of=$flash oseek=512B
For failing storage devices, other tools come with a great variety of extra functionality
to ease the saving of as much data as possible before the device finally dies, e.g. GNU
ddrescue (https://www.gnu.org/software/ddrescue/). However, in some cases such
a tool is not available or the administrator feels more comfortable with the handling of
dd. As a simple rescue method, call dd as shown in the following example: the operand
‘conv=noerror,sync’ is used to continue after read errors and to pad out bad reads with
NULs, while ‘iflag=fullblock’ caters for short reads (which traditionally never occur on
flash or similar devices):
# Rescue data from an (unmounted!) partition of a failing device.
dd conv=noerror,sync iflag=fullblock </dev/sda1 > /mnt/rescue.img
Sending an ‘INFO’ signal (or ‘USR1’ signal where that is unavailable) to a running dd
process makes it print I/O statistics to standard error and then resume copying. In the
example below, dd is run in the background to copy 5GB of data. The kill command
makes it output intermediate I/O statistics, and when dd completes normally or is killed by
the SIGINT signal, it outputs the final statistics.
# Ignore the signal so we never inadvertently terminate the dd child.
# (This is not needed when SIGINFO is available.)
trap '' USR1
‘--debug’ Print extra information to stdout, explaining how files are copied. This option
implies the --verbose option.
‘-g group’
‘--group=group’
Set the group ownership of installed files or directories to group. The default is
the process’s current group. group may be either a group name or a numeric
group ID.
‘-m mode’
‘--mode=mode’
Set the file mode bits for the installed file or directory to mode, which can be
either an octal number, or a symbolic mode as in chmod, with ‘a=’ (no access
allowed to anyone) as the point of departure (see Chapter 27 [File permissions],
page 236). The default mode is ‘u=rwx,go=rx,a-s’ – read, write, and execute
for the owner, read and execute for group and other, and with set-user-ID and
set-group-ID disabled. This default is not quite the same as ‘755’, since it
disables instead of preserving set-user-ID and set-group-ID on directories. See
Section 27.5 [Directory Setuid and Setgid], page 242.
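The default mode and the two ways of writing an explicit mode can be compared with stat -c %a, which prints the mode bits in octal. The file names below are invented; note that the symbolic mode starts from ‘a=’, so ‘u=rw,g=r’ grants nothing to others.

```shell
tmp=$(mktemp -d)
echo data > "$tmp/src"

install "$tmp/src" "$tmp/default"                 # default mode
install -m 644 "$tmp/src" "$tmp/explicit"         # octal mode
install -m u=rw,g=r "$tmp/src" "$tmp/symbolic"    # symbolic mode

default_mode=$(stat -c %a "$tmp/default")
explicit_mode=$(stat -c %a "$tmp/explicit")
symbolic_mode=$(stat -c %a "$tmp/symbolic")

echo "default: $default_mode  -m 644: $explicit_mode  -m u=rw,g=r: $symbolic_mode"
rm -rf "$tmp"
```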
‘-o owner’
‘--owner=owner’
If install has appropriate privileges (is run as root), set the ownership of
installed files or directories to owner. The default is root. owner may be either
a user name or a numeric user ID.
‘--preserve-context’
Preserve the SELinux security context of files and directories. Failure to preserve
the context in all of the files or directories will result in an exit status of 1. If
SELinux is disabled then print a warning and ignore the option.
‘-p’
‘--preserve-timestamps’
Set the time of last access and the time of last modification of each installed
file to match those of each corresponding original file. When a file is installed
without this option, its last access and last modification timestamps are both
set to the time of installation. This option is useful if you want to use the last
modification timestamps of installed files to keep track of when they were last
built as opposed to when they were last installed.
‘-s’
‘--strip’ Strip the symbol tables from installed binary executables.
‘--strip-program=program’
Program used to strip binaries.
‘-S suffix’
‘--suffix=suffix’
Append suffix to each backup file made with -b. See Section 2.1 [Backup
options], page 2.
‘-t directory’
‘--target-directory=directory’
Specify the destination directory. See Section 2.6 [Target directory], page 7. Also
specifying the -D option will ensure the directory is present.
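For example, the following sketch installs a file into a target directory that does not exist yet; the directory and file names are invented.

```shell
tmp=$(mktemp -d)
echo data > "$tmp/src"

# -D creates the missing components of the target directory first.
install -D -t "$tmp/new/nested/dir" "$tmp/src"

if [ -f "$tmp/new/nested/dir/src" ]; then install_result=installed; else install_result=missing; fi
echo "new/nested/dir/src: $install_result"
rm -rf "$tmp"
```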
‘-T’
‘--no-target-directory’
Do not treat the last operand specially when it is a directory or a symbolic link
to a directory. See Section 2.6 [Target directory], page 7.
‘-v’
‘--verbose’
Print the name of each file before copying it.
‘-Z’
‘--context[=context]’
Without a specified context, adjust the SELinux security context according
to the system default type for destination files, similarly to the restorecon
command. The long form of this option, with a specific context specified, will
set the context for newly created files only. With a specified context, if both
SELinux and SMACK are disabled, a warning is issued. This option is mutually
exclusive with the --preserve-context option.
An exit status of zero indicates success, and a nonzero value indicates failure.
Avoid specifying a source name with a trailing slash, when it might be a symlink to a
directory. Otherwise, mv may do something very surprising, since its behavior depends on
the underlying rename system call. On a system with a modern Linux-based kernel, it fails
with errno=ENOTDIR. However, on other systems (at least FreeBSD 6.1 and Solaris 10) it
silently renames not the symlink but rather the directory referenced by the symlink. See
Section 2.7 [Trailing slashes], page 9.
The mv command replaces destination directories only if they are empty. Conflicting
populated directories are skipped with a diagnostic.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-b’
‘--backup[=method]’
See Section 2.1 [Backup options], page 2. Make a backup of each file that would
otherwise be overwritten or removed.
‘--debug’ Print extra information to stdout, explaining how files are copied. This option
implies the --verbose option.
‘-f’
‘--force’ Do not prompt the user before removing a destination file. If you specify more
than one of the -i, -f, -n options, only the final one takes effect.
‘-i’
‘--interactive’
Prompt whether to overwrite each existing destination file, regardless of its
permissions, and fail if the response is not affirmative. If you specify more than
one of the -i, -f, -n options, only the final one takes effect.
‘-n’
‘--no-clobber’
Do not overwrite an existing file; silently fail instead. If you specify more than
one of the -i, -f, -n options, only the final one takes effect. This option is
mutually exclusive with the -b or --backup option. See also the --update=none
option which will skip existing files but not fail.
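As a small sketch of the effect of -n, assuming GNU mv and a scratch directory (the file names src and dst are hypothetical): the destination keeps its contents and the source stays where it was. The exit status is deliberately ignored here, since it has not been the same in every coreutils release.

```shell
tmp=$(mktemp -d)
echo incoming > "$tmp/src"
echo existing > "$tmp/dst"

# -n refuses to overwrite dst; ignore the exit status of the skip.
mv -n "$tmp/src" "$tmp/dst" || true

cat "$tmp/dst"     # the destination still holds its original contents
ls "$tmp"          # src was not moved
```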
‘--no-copy’
If a file cannot be renamed because the destination file system differs, fail with
a diagnostic instead of copying and then removing the file.
‘--exchange’
Exchange source and destination instead of renaming source to destination.
Both files must exist; they need not be the same type. This exchanges all data
and metadata.
This option can be used to replace one directory with another. When used
this way, it should be combined with --no-target-directory (-T) to avoid
confusion about the destination location. For example, you might use ‘mv -T
--exchange d1 d2’ to exchange two directories d1 and d2.
Exchanges are atomic if the source and destination are both in a single file system
that supports atomic exchange. Non-atomic exchanges are not yet supported.
If the source and destination might not be on the same file system, using
--no-copy will prevent future versions of mv from implementing the exchange
by copying.
‘-u’
‘--update’
Do not move a non-directory that has an existing destination with the same
or newer modification timestamp; instead, silently skip the file without failing.
If the move is across file system boundaries, the comparison is to the source
timestamp truncated to the resolutions of the destination file system and of the
system calls used to update timestamps; this avoids duplicate work if several ‘mv
-u’ commands are executed with the same source and destination. This option
is ignored if the -n or --no-clobber option is also specified.
The long form accepts an optional argument, --update=which, that gives more
control over which existing files in the destination are replaced. Its value
can be one of the following:
‘all’ This is the default operation when an --update option is not specified,
and results in all existing files in the destination being replaced.
‘none’ This is like the deprecated --no-clobber option, where no files in
the destination are replaced, and also skipping a file does not induce
a failure.
‘none-fail’
This is similar to ‘none’, in that no files in the destination are
replaced, but any skipped files are diagnosed and induce a failure.
‘older’ This is the default operation when --update is specified, and results
in files being replaced if they’re older than the corresponding source
file.
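The timestamp comparison made by -u can be observed with a short experiment. This is only a sketch, assuming GNU mv and touch; the file names and dates are made up for illustration.

```shell
tmp=$(mktemp -d)
echo old-src > "$tmp/src"
echo new-dst > "$tmp/dst"
touch -d '2020-01-01 00:00:00' "$tmp/src"
touch -d '2021-01-01 00:00:00' "$tmp/dst"

# The destination is newer than the source, so the move is skipped.
mv -u "$tmp/src" "$tmp/dst"
after_skip=$(cat "$tmp/dst")

# Make the source newer than the destination; now the move happens.
touch -d '2022-01-01 00:00:00' "$tmp/src"
mv -u "$tmp/src" "$tmp/dst"
after_move=$(cat "$tmp/dst")

echo "after skip: $after_skip"
echo "after move: $after_move"
```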
‘-v’
‘--verbose’
Print the name of each file before moving it.
‘--keep-directory-symlink’
Follow existing symlinks to directories when copying. Use this option only
when the destination directory’s contents are trusted, as an attacker can place
symlinks in the destination to cause cp to write to arbitrary target directories.
‘--strip-trailing-slashes’
Remove any trailing slashes from each source argument. See Section 2.7 [Trailing
slashes], page 9.
‘-S suffix’
‘--suffix=suffix’
Append suffix to each backup file made with -b. See Section 2.1 [Backup
options], page 2.
‘-t directory’
‘--target-directory=directory’
Specify the destination directory. See Section 2.6 [Target directory], page 7.
‘-T’
‘--no-target-directory’
Do not treat the last operand specially when it is a directory or a symbolic link
to a directory. See Section 2.6 [Target directory], page 7.
‘-Z’
‘--context’
This option functions similarly to the restorecon command, by adjusting the
SELinux security context according to the system default type for destination
files and each created directory.
An exit status of zero indicates success, and a nonzero value indicates failure.
One common question is how to remove files whose names begin with a ‘-’. GNU rm,
like every program that uses the getopt function to parse its arguments, lets you use the
‘--’ option to indicate that all following arguments are non-options. To remove a file called
-f in the current directory, you could type either:
rm -- -f
or:
rm ./-f
The Unix rm program’s use of a single ‘-’ for this purpose predates the development of
the getopt standard syntax.
An exit status of zero indicates success, and a nonzero value indicates failure.
For ext3 and ext4 file systems, shred is less effective when the file system is in
data=journal mode, which journals file data in addition to just metadata. In both the
data=ordered (default) and data=writeback modes, shred works as usual. The ext3/ext4
journaling modes can be changed by adding the data=something option to the mount
options for a particular file system in the /etc/fstab file, as documented in the mount man
page (‘man mount’). Alternatively, if you know how large the journal is, you can shred the
journal by shredding enough file data so that the journal cycles around and fills up with
shredded data.
If you are not sure how your file system operates, then you should assume that it does
not overwrite data in place, which means shred cannot reliably operate on regular files in
your file system.
Generally speaking, it is more reliable to shred a device than a file, since this bypasses file
system design issues mentioned above. However, devices are also problematic for shredding,
for reasons such as the following:
• Solid-state storage devices (SSDs) typically do wear leveling to prolong service life, and
this means writes are distributed to other blocks by the hardware, so “overwritten”
data blocks are still present in the underlying device.
• Most storage devices map out bad blocks invisibly to the application; if the bad blocks
contain sensitive data, shred won’t be able to destroy it.
• With some obsolete storage technologies, it may be possible to take (say) a floppy disk
back to a laboratory and use a lot of sensitive (and expensive) equipment to look for
the faint “echoes” of the original data underneath the overwritten data. With these
older technologies, if the file has been overwritten only once, it’s reputedly not even that
hard. Luckily, this kind of data recovery has become difficult, and there is no public
evidence that today’s higher-density storage devices can be analyzed in this way.
The shred command can use many overwrite passes, with data patterns chosen to
maximize the damage they do to the old data. By default the patterns are designed for
best effect on hard drives using now-obsolete technology; for newer devices, a single pass
should suffice. For more details, see the source code and Peter Gutmann’s paper Secure
Deletion of Data from Magnetic and Solid-State Memory
(https://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html), from the proceedings
of the Sixth USENIX Security Symposium (San Jose, California, July 22–25, 1996).
shred makes no attempt to detect or report these problems, just as it makes no attempt
to do anything about backups. However, since it is more reliable to shred devices than files,
shred by default does not deallocate or remove the output file. This default is more suitable
for devices, which typically cannot be deallocated and should not be removed.
Finally, consider the risk of backups and mirrors. File system backups and remote mirrors
may contain copies of the file that cannot be removed, and that will allow a shredded file to
be recovered later. So if you keep any data you may later want to destroy using shred, be
sure that it is not backed up or mirrored.
shred [option]... file[...]
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-f’
‘--force’ Override file permissions if necessary to allow overwriting.
‘-n number’
‘--iterations=number’
By default, shred uses 3 passes of overwrite. You can reduce this to save time,
or increase it if you think it’s appropriate. After 25 passes all of the internal
overwrite patterns will have been used at least once.
‘--random-source=file’
Use file as a source of random data used to overwrite and to choose pass ordering.
See Section 2.5 [Random sources], page 7.
‘-s bytes’
‘--size=bytes’
Shred the first bytes bytes of the file. The default is to shred the whole file.
bytes can be followed by a size specification like ‘K’, ‘M’, or ‘G’ to specify a
multiple. See Section 2.2 [Block size], page 3.
‘-u’
‘--remove[=how]’
After shredding a file, deallocate it (if possible) and then remove it. If a file has
multiple links, only the named links will be removed. Often the file name is less
sensitive than the file data, in which case the optional how parameter, supported
with the long form option, gives control of how to more efficiently remove each
directory entry. The ‘unlink’ parameter will just use a standard unlink call,
‘wipe’ will also first obfuscate bytes in the name, and ‘wipesync’ will also sync
each obfuscated byte in the name to the file system. Although ‘wipesync’ is
the default method, it can be expensive, requiring a sync for every character in
every file. This can become significant with many files, or is redundant if your
file system provides synchronous metadata updates.
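The removal methods can be compared on throwaway files. The following sketch assumes GNU shred and uses hypothetical file names in a scratch directory; a single pass (-n 1) keeps it quick.

```shell
tmp=$(mktemp -d)
printf 'secret\n' > "$tmp/plain"
printf 'secret\n' > "$tmp/wiped"

# One overwrite pass, then a plain unlink of the directory entry.
shred -n 1 --remove=unlink "$tmp/plain"

# One overwrite pass, then obfuscate the name before unlinking,
# without the per-byte sync that 'wipesync' would add.
shred -n 1 --remove=wipe "$tmp/wiped"

ls -A "$tmp"       # prints nothing: both directory entries are gone
```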
‘-v’
‘--verbose’
Display to standard error all status updates as sterilization proceeds.
‘-x’
‘--exact’ By default, shred rounds the size of a regular file up to the next multiple of the
file system block size to fully erase the slack space in the last block of the file.
On some systems, for example, this space may contain portions of the current
system memory. Use --exact to suppress that behavior. Thus, by default if you
shred a 10-byte regular file on a system with 512-byte blocks, the resulting file
will be 512 bytes long. With this option, shred does not increase the apparent
size of the file.
‘-z’
‘--zero’ Normally, the last pass that shred writes is made up of random data. If this
would be conspicuous on your storage device (for example, because it looks
like encrypted data), or you just think it’s tidier, the --zero option adds an
additional overwrite pass with all zero bits. This is in addition to the number of
passes specified by the --iterations option.
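The effect of --exact and --zero on a small file can be checked as follows. This is a sketch under the assumption that GNU shred, wc, and tr are available; the file names are hypothetical, and the rounded size depends on the block size of the file system in use.

```shell
tmp=$(mktemp -d)
printf '0123456789' > "$tmp/rounded"
printf '0123456789' > "$tmp/exact"

shred -n 1 "$tmp/rounded"        # default: size is rounded up to a block multiple
shred -n 1 -x -z "$tmp/exact"    # keep 10 bytes; the extra final pass writes zeros

rounded_size=$(wc -c < "$tmp/rounded")
exact_size=$(wc -c < "$tmp/exact")
# Count the bytes in "exact" that are not NUL.
nonzero=$(tr -d '\000' < "$tmp/exact" | wc -c)

echo "rounded=$rounded_size exact=$exact_size nonzero=$nonzero"
```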
You might use the following command to erase the file system you created on a USB
flash drive. This command typically takes several minutes, depending on the drive’s size
and write speed. On modern storage devices a single pass should be adequate, and will take
one third the time of the default three-pass approach.
shred -v -n 1 /dev/sdd1
Similarly, to erase all data on a selected partition of your device, you could give a
command like the following.
# 1 pass, write pseudo-random data; 3x faster than the default
shred -v -n1 /dev/sda5
To be on the safe side, use at least one pass that overwrites using pseudo-random data.
I.e., don’t be tempted to use ‘-n0 --zero’, in case some device controller optimizes the
process of writing blocks of all zeros, and thereby does not clear all bytes in a block. Some
SSDs may do just that.
A file of ‘-’ denotes standard output. The intended use of this is to shred a removed
temporary file. For example:
i=$(mktemp)
exec 3<>"$i"
rm -- "$i"
echo "Hello, world" >&3
shred - >&3
exec 3>&-
However, the command ‘shred - >file’ does not shred the contents of file, since the
shell truncates file before invoking shred. Use the command ‘shred file’ or (if using a
Bourne-compatible shell) the command ‘shred - 1<>file’ instead.
An exit status of zero indicates success, and a nonzero value indicates failure.
Normally ln does not replace existing files. Use the --force (-f) option to replace
them unconditionally, the --interactive (-i) option to replace them conditionally, and
the --backup (-b) option to rename them. Unless the --backup (-b) option is used there
is no brief moment when the destination does not exist; this is an extension to POSIX.
A hard link is another name for an existing file; the link and the original are
indistinguishable. Technically speaking, they share the same inode, and the inode contains all the
information about a file – indeed, it is not incorrect to say that the inode is the file. Most
systems prohibit making a hard link to a directory; on those where it is allowed, only the
super-user can do so (and with caution, since creating a cycle will cause problems to many
other utilities). Hard links cannot cross file system boundaries. (These restrictions are not
mandated by POSIX, however.)
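The shared inode can be seen directly. The following is a minimal sketch, assuming GNU ln and stat; the names orig and alias are made up for illustration.

```shell
tmp=$(mktemp -d)
echo data > "$tmp/orig"
ln "$tmp/orig" "$tmp/alias"

# %i is the inode number and %h the hard link count (GNU stat format).
stat -c '%i %h %n' "$tmp/orig" "$tmp/alias"
inode_orig=$(stat -c '%i' "$tmp/orig")
inode_alias=$(stat -c '%i' "$tmp/alias")

# Data written through one name is visible through the other.
echo more >> "$tmp/alias"
cat "$tmp/orig"

# Removing one name leaves the file reachable through the other.
rm "$tmp/orig"
cat "$tmp/alias"
```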
Symbolic links (symlinks for short), on the other hand, are a special file type (which not
all kernels support: System V release 3 (and older) systems lack symlinks) in which the link
file actually refers to a different file, by name. When most operations (opening, reading,
writing, and so on) are passed the symbolic link file, the kernel automatically dereferences
the link and operates on the target of the link. But some operations (e.g., removing) work
on the link file itself, rather than on its target. The owner and group of a symlink are not
significant to file access performed through the link, but do have implications on deleting a
symbolic link from a directory with the restricted deletion bit set. On the GNU system, the
mode of a symlink has no significance and cannot be changed, but on some BSD systems,
the mode can be changed and will affect whether the symlink will be traversed in file name
resolution. See Section “Symbolic Links” in The GNU C Library Reference Manual.
Symbolic links can contain arbitrary strings; a dangling symlink occurs when the string
in the symlink does not resolve to a file. There are no restrictions against creating dangling
symbolic links. There are trade-offs to using absolute or relative symlinks. An absolute
symlink always points to the same file, even if the directory containing the link is moved.
However, if the symlink is visible from more than one machine (such as on a networked
file system), the file pointed to might not always be the same. A relative symbolic link is
resolved in relation to the directory that contains the link, and is often useful in referring
to files on the same device without regards to what name that device is mounted on when
accessed via networked machines.
When creating a relative symlink in a different location than the current directory, the
resolution of the symlink will be different than the resolution of the same string from the
current directory. Therefore, many users prefer to first change directories to the location
where the relative symlink will be created, so that tab-completion or other file resolution
will find the same target as what will be placed in the symlink.
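The difference between what the shell resolves and what the link stores can be sketched as follows, assuming GNU ln in a scratch directory; the names data, links, dangling, and good are hypothetical.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/data" "$tmp/links"
echo hello > "$tmp/data/file"
cd "$tmp" || exit 1

# From here, data/file names a real file, but the string stored in the
# link is resolved relative to links/, giving links/data/file: dangling.
ln -s data/file links/dangling

# Creating the link from inside its own directory makes the name the
# shell resolves match the name the link will store.
(cd links && ln -s ../data/file good)

cat links/good
[ -e links/dangling ] || echo "links/dangling does not resolve"
```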
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-b’
‘--backup[=method]’
See Section 2.1 [Backup options], page 2. Make a backup of each file that would
otherwise be overwritten or removed.
‘-d’
‘-F’
‘--directory’
Allow users with appropriate privileges to attempt to make hard links to
directories. However, this will probably fail due to system restrictions, even for the
super-user.
‘-f’
‘--force’ Remove existing destination files.
‘-i’
‘--interactive’
Prompt whether to remove existing destination files, and fail if the response is
not affirmative.
‘-L’
‘--logical’
If -s is not in effect, and the source file is a symbolic link, create the hard link
to the file referred to by the symbolic link, rather than the symbolic link itself.
‘-n’
‘--no-dereference’
Do not treat the last operand specially when it is a symbolic link to a directory.
Instead, treat it as if it were a normal file.
When the destination is an actual directory (not a symlink to one), there is
no ambiguity. The link is created in that directory. But when the specified
destination is a symlink to a directory, there are two ways to treat the user’s
request. ln can treat the destination just as it would a normal directory and
create the link in it. On the other hand, the destination can be viewed as a
non-directory – as the symlink itself. In that case, ln must delete or backup
that symlink before creating the new link. The default is to treat a destination
that is a symlink to a directory just like a directory.
This option is weaker than the --no-target-directory (-T) option, so it has
no effect if both options are given.
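A common use of -n is to repoint a symlink that currently refers to a directory. The sketch below assumes GNU ln; the directory names v1 and v2 and the link name current are made up for illustration.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir v1 v2
ln -s v1 current          # current -> v1

# Without -n, "current" is treated as the directory it points to,
# so the new link is created inside v1.
ln -s v2 current          # creates v1/v2 -> v2

# With -n (and -f to replace), "current" itself is the destination.
ln -sfn v2 current        # current -> v2

readlink current
```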
‘-P’
‘--physical’
If -s is not in effect, and the source file is a symbolic link, create the hard link to
the symbolic link itself. On platforms where this is not supported by the kernel,
this option creates a symbolic link with identical contents; since symbolic link
contents cannot be edited, any file name resolution performed through either
link will be the same as if a hard link had been created.
‘-r’
‘--relative’
Make symbolic links relative to the link location. This option is only valid with
the --symbolic option.
Example:
ln -srv /a/file /tmp
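The same idea can be tried without touching /a or /tmp directly. This is a sketch assuming GNU ln with -r support; the directories a and b under a scratch directory are hypothetical.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"
echo content > "$tmp/a/file"

# The source is named absolutely on the command line;
# -r stores a name relative to the link's location instead.
ln -sr "$tmp/a/file" "$tmp/b"

readlink "$tmp/b/file"    # ../a/file
cat "$tmp/b/file"         # content
```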
‘-v’
‘--verbose’
Print a message for each created directory. This is most useful with --parents.
‘-Z’
‘--context[=context]’
Without a specified context, adjust the SELinux security context according
to the system default type for destination files, similarly to the restorecon
command. The long form of this option with a specific context specified will
set the context for newly created files only. With a specified context, if both
SELinux and SMACK are disabled, a warning is issued.
An exit status of zero indicates success, and a nonzero value indicates failure.
‘-m mode’
‘--mode=mode’
Set the mode of created FIFOs to mode, which is symbolic as in chmod and
uses ‘a=rw’ (read and write allowed for everyone) for the point of departure.
mode should specify only file permission bits. See Chapter 27 [File permissions],
page 236.
‘-Z’
‘--context[=context]’
Without a specified context, adjust the SELinux security context according
to the system default type for destination files, similarly to the restorecon
command. The long form of this option with a specific context specified will
set the context for newly created files only. With a specified context, if both
SELinux and SMACK are disabled, a warning is issued.
An exit status of zero indicates success, and a nonzero value indicates failure.
‘Canonicalize mode’
readlink outputs the absolute name of the given files which contain no ., ..
components nor any repeated separators (/) or symbolic links. The realpath
command is the preferred command to use for canonicalization. See Section 18.5
[realpath invocation], page 179.
readlink [option]... file...
By default, readlink operates in readlink mode.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-f’
‘--canonicalize’
Activate canonicalize mode. If any component of the file name except the last
one is missing or unavailable, readlink produces no output and exits with a
nonzero exit code. A trailing slash is ignored.
‘-e’
‘--canonicalize-existing’
Activate canonicalize mode. If any component is missing or unavailable,
readlink produces no output and exits with a nonzero exit code. A trailing
slash requires that the name resolve to a directory.
‘-m’
‘--canonicalize-missing’
Activate canonicalize mode. If any component is missing or unavailable,
readlink treats it as a directory.
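The three canonicalize options differ only in how they treat missing components. The following sketch, assuming GNU readlink, contrasts them in a scratch directory; the names real, link, missing, and no/such/path are hypothetical.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir real
ln -s real link
here=$(pwd -P)               # physical name of the scratch directory

readlink link                # readlink mode: prints the stored string, "real"
readlink -f link/missing     # only the last component may be missing
readlink -e link/missing     # every component must exist: no output
echo "exit status of -e: $?"
readlink -m no/such/path     # missing components are treated as directories
```

With -f and -m the output begins with the value of here, since the result is an absolute name with symbolic links resolved.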
‘-n’
‘--no-newline’
Do not print the output delimiter, when a single file is specified. Print a warning
if specified along with multiple files.
‘-s’
‘-q’
‘--silent’
‘--quiet’ Suppress most error messages. On by default.
‘-v’
‘--verbose’
Report error messages.
‘-z’
‘--zero’ Output a zero byte (ASCII NUL) at the end of each line, rather than a newline.
This option enables other programs to parse the output even when that output
would contain data with embedded newlines.
chown command. For example, the chown command might not affect those bits when invoked
by a user with appropriate privileges, or when the bits signify some function other than
executable permission (e.g., mandatory locking). When in doubt, check the underlying
system behavior.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-c’
‘--changes’
Verbosely describe the action for each file whose ownership actually changes.
‘-f’
‘--silent’
‘--quiet’ Do not print error messages about files whose ownership cannot be changed.
‘--from=old-owner’
Change a file’s ownership only if it has current attributes specified by old-owner.
old-owner has the same form as new-owner described above. This option is
useful primarily from a security standpoint in that it narrows considerably the
window of potential abuse. For example, to reflect a user ID numbering change
for one user’s files without an option like this, root might run
find / -owner OLDUSER -print0 | xargs -0 chown -h NEWUSER
But that is dangerous because the interval between when the find tests the
existing file’s owner and when the chown is actually run may be quite large.
One way to narrow the gap would be to invoke chown for each file as it is found:
find / -owner OLDUSER -exec chown -h NEWUSER {} \;
But that is very slow if there are many affected files. With this option, it is
safer (the gap is narrower still) though still not perfect:
chown -h -R --from=OLDUSER NEWUSER /
‘--dereference’
Do not act on symbolic links themselves but rather on what they point to. This
is the default when not operating recursively.
Combining this dereferencing option with the --recursive option may create
a security risk: During the traversal of the directory tree, an attacker may be
able to introduce a symlink to an arbitrary target; when the tool reaches that,
the operation will be performed on the target of that symlink, possibly allowing
the attacker to escalate privileges.
‘-h’
‘--no-dereference’
Act on symbolic links themselves instead of what they point to. This mode
relies on the lchown system call. On systems that do not provide the lchown
system call, no diagnostic is issued, but see --verbose.
‘--preserve-root’
Fail upon any attempt to recursively change the root directory, /. Without
--recursive, this option has no effect. See Section 2.9 [Treating / specially],
page 9.
‘--no-preserve-root’
Cancel the effect of any preceding --preserve-root option. See Section 2.9
[Treating / specially], page 9.
‘--reference=ref_file’
Change the user and group of each file to be the same as those of ref_file. If
ref_file is a symbolic link, do not use the user and group of the symbolic link,
but rather those of the file it refers to.
‘-v’
‘--verbose’
Output a diagnostic for every file processed. If a symbolic link is encountered
during a recursive traversal on a system without the lchown system call,
and --no-dereference is in effect, then issue a diagnostic saying neither the
symbolic link nor its referent is being changed.
‘-R’
‘--recursive’
Recursively change ownership of directories and their contents.
‘-H’ If --recursive (-R) is specified and a command line argument is a symbolic
link to a directory, traverse it. See Section 2.8 [Traversing symlinks], page 9.
‘-L’ In a recursive traversal, traverse every symbolic link to a directory that is
encountered.
Combining this dereferencing option with the --recursive option may create
a security risk: During the traversal of the directory tree, an attacker may be
able to introduce a symlink to an arbitrary target; when the tool reaches that,
the operation will be performed on the target of that symlink, possibly allowing
the attacker to escalate privileges. See Section 2.8 [Traversing symlinks], page 9.
‘-P’ Do not traverse any symbolic links. This is the default if none of -H, -L, or -P is
specified. See Section 2.8 [Traversing symlinks], page 9.
An exit status of zero indicates success, and a nonzero value indicates failure.
Examples:
# Change the owner of /u to "root".
chown root /u
If group is intended to represent a numeric group ID, then you may specify it with a
leading ‘+’. See Section 2.4 [Disambiguating names and IDs], page 6.
It is system dependent whether a user can change the group to an arbitrary one, or
the more portable behavior of being restricted to setting a group of which the user is a
member.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-c’
‘--changes’
Verbosely describe the action for each file whose group actually changes.
‘-f’
‘--silent’
‘--quiet’ Do not print error messages about files whose group cannot be changed.
‘--from=old-owner’
Change a file’s ownership only if it has current attributes specified by old-owner.
old-owner has the same form as new-owner described above. This option is
useful primarily from a security standpoint in that it narrows considerably the
window of potential abuse. For example, to reflect a user ID numbering change
for one user’s files without an option like this, root might run
find / -owner OLDUSER -print0 | xargs -0 chgrp -h NEWUSER
But that is dangerous because the interval between when the find tests the
existing file’s owner and when the chgrp is actually run may be quite large.
One way to narrow the gap would be to invoke chgrp for each file as it is found:
find / -owner OLDUSER -exec chgrp -h NEWUSER {} \;
But that is very slow if there are many affected files. With this option, it is
safer (the gap is narrower still) though still not perfect:
chgrp -h -R --from=OLDUSER NEWUSER /
‘--dereference’
Do not act on symbolic links themselves but rather on what they point to. This
is the default when not operating recursively.
Combining this dereferencing option with the --recursive option may create
a security risk: During the traversal of the directory tree, an attacker may be
able to introduce a symlink to an arbitrary target; when the tool reaches that,
the operation will be performed on the target of that symlink, possibly allowing
the attacker to escalate privileges.
‘-h’
‘--no-dereference’
Act on symbolic links themselves instead of what they point to. This mode
relies on the lchown system call. On systems that do not provide the lchown
system call, no diagnostic is issued, but see --verbose.
‘--preserve-root’
Fail upon any attempt to recursively change the root directory, /. Without
--recursive, this option has no effect. See Section 2.9 [Treating / specially],
page 9.
‘--no-preserve-root’
Cancel the effect of any preceding --preserve-root option. See Section 2.9
[Treating / specially], page 9.
‘--reference=ref_file’
Change the group of each file to be the same as that of ref_file. If ref_file is a
symbolic link, do not use the group of the symbolic link, but rather that of the
file it refers to.
‘-v’
‘--verbose’
Output a diagnostic for every file processed. If a symbolic link is encountered
during a recursive traversal on a system without the lchown system call,
and --no-dereference is in effect, then issue a diagnostic saying neither the
symbolic link nor its referent is being changed.
‘-R’
‘--recursive’
Recursively change the group ownership of directories and their contents.
‘-H’ If --recursive (-R) is specified and a command line argument is a symbolic
link to a directory, traverse it. See Section 2.8 [Traversing symlinks], page 9.
‘-L’ In a recursive traversal, traverse every symbolic link to a directory that is
encountered.
Combining this dereferencing option with the --recursive option may create
a security risk: During the traversal of the directory tree, an attacker may be
able to introduce a symlink to an arbitrary target; when the tool reaches that,
the operation will be performed on the target of that symlink, possibly allowing
the attacker to escalate privileges. See Section 2.8 [Traversing symlinks], page 9.
‘-P’ Do not traverse any symbolic links. This is the default if none of -H, -L, or -P is
specified. See Section 2.8 [Traversing symlinks], page 9.
An exit status of zero indicates success, and a nonzero value indicates failure.
Examples:
# Change the group of /u to "staff".
chgrp staff /u
Only a process whose effective user ID matches the user ID of the file, or a process with
appropriate privileges, is permitted to change the file mode bits of a file.
A successful use of chmod clears the set-group-ID bit of a regular file if the file’s group ID
does not match the user’s effective group ID or one of the user’s supplementary group IDs,
unless the user has appropriate privileges. Additional restrictions may cause the set-user-ID
and set-group-ID bits of mode or ref_file to be ignored. This behavior depends on the policy
and functionality of the underlying chmod system call. When in doubt, check the underlying
system behavior.
If used, mode specifies the new file mode bits. For details, see Chapter 27
[File permissions], page 236. If you really want mode to have a leading ‘-’, you should use
-- first, e.g., ‘chmod -- -w file’. Typically, though, ‘chmod a-w file’ is preferable, and
‘chmod -w file’ (without the --) complains if it behaves differently from what ‘chmod a-w
file’ would do.
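The two forms can differ because a symbolic mode with no ‘ugoa’ letters leaves alone any bits that are set in the umask. The following is a minimal sketch of that difference, assuming GNU chmod and stat, a umask of 022, and a hypothetical file f in a scratch directory.

```shell
tmp=$(mktemp -d)
umask 022                        # group and other write bits are masked
touch "$tmp/f"
chmod 666 "$tmp/f"

# With no "who" letters the umask is honored: only the user's
# write bit is removed.
chmod -- -w "$tmp/f"
masked=$(stat -c '%a' "$tmp/f")

# An explicit "a" ignores the umask and removes every write bit.
chmod a-w "$tmp/f"
explicit=$(stat -c '%a' "$tmp/f")

echo "chmod -- -w: $masked   chmod a-w: $explicit"
```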
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-c’
‘--changes’
Verbosely describe the action for each file whose permissions actually change.
‘--dereference’
Do not act on symbolic links themselves but rather on what they point to.
This is the default for command line arguments, but not for symbolic links
encountered when recursing.
Combining this dereferencing option with the --recursive option may create
a security risk: During the traversal of the directory tree, an attacker may be
able to introduce a symlink to an arbitrary target; when the tool reaches that,
the operation will be performed on the target of that symlink, possibly allowing
the attacker to escalate privileges.
‘-h’
‘--no-dereference’
Act on symbolic links themselves instead of what they point to. On systems
that do not support this, no diagnostic is issued, but see --verbose.
‘-f’
‘--silent’
‘--quiet’ Do not print error messages about files whose permissions cannot be changed.
‘--preserve-root’
Fail upon any attempt to recursively change the root directory, /. Without
--recursive, this option has no effect. See Section 2.9 [Treating / specially],
page 9.
‘--no-preserve-root’
Cancel the effect of any preceding --preserve-root option. See Section 2.9
[Treating / specially], page 9.
‘-v’
‘--verbose’
Verbosely describe the action or non-action taken for every file.
‘--reference=ref_file’
Change the mode of each file to be the same as that of ref_file. See Chapter 27
[File permissions], page 236. If ref_file is a symbolic link, do not use the mode
of the symbolic link, but rather that of the file it refers to.
‘-R’
‘--recursive’
Recursively change permissions of directories and their contents.
‘-H’ If --recursive (-R) is specified and a command line argument is a symbolic
link to a directory, traverse it. This is the default if none of -H, -L, or -P is
specified. See Section 2.8 [Traversing symlinks], page 9.
‘-L’ In a recursive traversal, traverse every symbolic link to a directory that is
encountered.
Combining this dereferencing option with the --recursive option may create
a security risk: During the traversal of the directory tree, an attacker may be
able to introduce a symlink to an arbitrary target; when the tool reaches that,
the operation will be performed on the target of that symlink, possibly allowing
the attacker to escalate privileges. See Section 2.8 [Traversing symlinks], page 9.
‘-P’ Do not traverse any symbolic links. See Section 2.8 [Traversing symlinks], page 9.
An exit status of zero indicates success, and a nonzero value indicates failure.
Examples:
# Change file permissions of FOO to be world readable
# and user writable, with no other permissions.
chmod 644 foo
chmod a=r,u+w foo
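The two commands above should leave foo with the same mode. The following is a minimal sketch that checks this on a scratch file, assuming GNU stat is available for its -c %a format; the scratch directory and the variable names are illustrative only.

```shell
# Illustrative sketch: confirm that the numeric and symbolic forms agree.
# The scratch directory and variable names are made up for this example.
tmp=$(mktemp -d) || exit 1
touch "$tmp/foo"

chmod 644 "$tmp/foo"
numeric=$(stat -c %a "$tmp/foo")    # mode after the numeric form

chmod 000 "$tmp/foo"                # reset so the next result is not inherited
chmod a=r,u+w "$tmp/foo"
symbolic=$(stat -c %a "$tmp/foo")   # mode after the symbolic form

rm -rf "$tmp"
echo "numeric=$numeric symbolic=$symbolic"
```

Both variables should hold 644, since a=r grants read to everyone and u+w then adds write for the owner.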
the files. Some older systems have a further restriction: the user must own the files unless
both the access and modification timestamps are being set to the current time.
The touch command cannot set a file’s status change timestamp to a user-specified value,
and cannot change the file’s birth time (if supported) at all. Also, touch has issues similar
to those affecting all programs that update file timestamps. For example, touch may set
a file’s timestamp to a value that differs slightly from the requested time. See Chapter 28
[File timestamps], page 244.
Timestamps assume the time zone rules specified by the TZ environment variable, or by
the system default rules if TZ is not set. See Section “Specifying the Time Zone with TZ” in
The GNU C Library Reference Manual. You can avoid ambiguities during daylight saving
transitions by using UTC timestamps.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-a’
‘--time=atime’
‘--time=access’
‘--time=use’
Change the access timestamp only. See Chapter 28 [File timestamps], page 244.
‘-c’
‘--no-create’
Do not warn about or create files that do not exist.
‘-d time’
‘--date=time’
Use time instead of the current time. It can contain month names, time
zones, ‘am’ and ‘pm’, ‘yesterday’, etc. For example, --date="2020-07-21
14:19:13.489392193 +0530" specifies the instant of time that is 489,392,193
nanoseconds after July 21, 2020 at 2:19:13 PM in a time zone that is 5 hours
and 30 minutes east of UTC. See Chapter 29 [Date input formats], page 245.
File systems that do not support high-resolution timestamps silently ignore any
excess precision here.
‘-h’
‘--no-dereference’
Attempt to change the timestamps of a symbolic link, rather than what the link
refers to. When using this option, empty files are not created, but option -c
must also be used to avoid warning about files that do not exist. Not all systems
support changing the timestamps of symlinks, since underlying system support
for this action was not required until POSIX 2008. Also, on some systems,
the mere act of examining a symbolic link changes the access timestamp, such
that only changes to the modification timestamp will persist long enough to be
observable. When coupled with option -r, a reference timestamp is taken from
a symbolic link rather than the file it refers to.
‘-m’
‘--time=mtime’
‘--time=modify’
Change the modification timestamp only.
‘-r file’
‘--reference=file’
Use the times of the reference file instead of the current time. If this option is
combined with the --date=time (-d time) option, the reference file’s time is
the origin for any relative times given, but is otherwise ignored. For example,
‘-r foo -d '-5 seconds'’ specifies a timestamp equal to five seconds before
the corresponding timestamp for foo. If file is a symbolic link, the reference
timestamp is taken from the target of the symlink, unless -h was also in effect.
‘-t [[cc]yy]mmddhhmm[.ss]’
Use the argument (optional four-digit or two-digit years, months, days, hours,
minutes, optional seconds) instead of the current time. If the year is specified
with only two digits, then cc is 20 for years in the range 0 . . . 68, and 19
for years in 69 . . . 99. If no digits of the year are specified, the argument is
interpreted as a date in the current year. On the atypical systems that support
leap seconds, ss may be ‘60’.
On systems predating POSIX 1003.1-2001, touch supports an obsolete syntax, as follows.
If no timestamp is given with any of the -d, -r, or -t options, and if there are two or more
files and the first file is of the form ‘mmddhhmm[yy]’ and this would be a valid argument to
the -t option (if the yy, if any, were moved to the front), and if the represented year is in
the range 1969–1999, that argument is interpreted as the time for the other files instead of as
a file name. Although this obsolete behavior can be controlled with the _POSIX2_VERSION
environment variable (see Section 2.13 [Standards conformance], page 11), portable scripts
should avoid commands whose behavior depends on this variable. For example, use ‘touch
./12312359 main.c’ or ‘touch -t 12312359 main.c’ rather than the ambiguous ‘touch
12312359 main.c’.
An exit status of zero indicates success, and a nonzero value indicates failure.
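As a rough illustration of how the -d, -r, and -t options described above interact, the following sketch sets timestamps on scratch files and compares them with GNU stat. The file names and the scratch directory are assumptions made for the example, and TZ is set to UTC to sidestep daylight saving ambiguity.

```shell
# Illustrative sketch; file names and the scratch directory are made up.
TZ=UTC0; export TZ                  # avoid daylight-saving ambiguity
tmp=$(mktemp -d) || exit 1

touch -d '2020-07-21 14:19:13' "$tmp/foo"        # -d: free-form date string
touch -r "$tmp/foo" -d '-5 seconds' "$tmp/bar"   # -r as origin for relative -d
touch -t 202007211419.13 "$tmp/baz"              # -t: [[cc]yy]mmddhhmm[.ss]

foo_mtime=$(stat -c %Y "$tmp/foo")
bar_mtime=$(stat -c %Y "$tmp/bar")
baz_mtime=$(stat -c %Y "$tmp/baz")
rm -rf "$tmp"

echo "foo - bar = $((foo_mtime - bar_mtime)) seconds"
echo "foo - baz = $((foo_mtime - baz_mtime)) seconds"
```

On a file system with at least one-second timestamp resolution, bar should end up five seconds older than foo, and baz should match foo exactly.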
‘-t fstype’
‘--type=fstype’
Limit the listing to file systems of type fstype. Multiple file system types can
be specified by giving multiple -t options. By default, nothing is omitted.
‘-T’
‘--print-type’
Print each file system’s type. The types printed here are the same ones you can
include or exclude with -t and -x. The particular types printed are whatever
is supported by the system. Here are some of the common names (this list is
certainly not exhaustive):
‘nfs’ An NFS file system, i.e., one mounted over a network from an-
other machine. This is the one type name which seems to be used
uniformly by all systems.
‘ext2, ext3, ext4, xfs, btrfs...’
A file system on a locally-mounted device. (The system might even
support more than one type here; GNU/Linux does.)
‘iso9660, cdfs’
A file system on a CD or DVD drive. HP-UX uses ‘cdfs’, most
other systems use ‘iso9660’.
‘ntfs,fat’ File systems used by MS-Windows / MS-DOS.
‘-x fstype’
‘--exclude-type=fstype’
Limit the listing to file systems not of type fstype. Multiple file system types
can be eliminated by giving multiple -x options. By default, no file system
types are omitted.
‘-v’ Ignored; for compatibility with System V versions of df.
df is installed only on systems that have usable mount tables, so portable scripts should
not rely on its existence.
An exit status of zero indicates success, and a nonzero value indicates failure. Failure
includes the case where no output is generated, so you can inspect the exit status of a
command like ‘df -t ext3 -t reiserfs dir’ to test whether dir is on a file system of type
‘ext3’ or ‘reiserfs’.
Since the list of file systems (mtab) is needed to determine the file system type, failure
includes the cases when that list cannot be read and one or more of the options -a, -l, -t
or -x is used together with a file name argument.
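The exit-status test described above can be wrapped in a small shell function. This is only a sketch: the function name is_on_fs_type is hypothetical, and it assumes file system type names contain no whitespace.

```shell
# is_on_fs_type DIR TYPE...
# Succeed if DIR resides on a file system of one of the given types.
# The function name is hypothetical; it only wraps the df call shown above.
is_on_fs_type() {
  dir=$1
  shift
  type_opts=
  for fstype in "$@"; do
    type_opts="$type_opts -t $fstype"
  done
  # $type_opts is deliberately unquoted so that it splits into options.
  df $type_opts -- "$dir" >/dev/null 2>&1
}

if is_on_fs_type . ext3 reiserfs; then
  echo "current directory is on ext3 or reiserfs"
else
  echo "current directory is on another file system type, or df failed"
fi
```

Because df also fails when the mount table cannot be read, a nonzero status from this function does not by itself prove that the type differs.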
overridden (see Section 2.2 [Block size], page 3). Non-integer quantities are rounded up to
the next higher unit.
If two or more hard links point to the same file, only one of the hard links is counted.
The file argument order affects which links are counted, and changing the argument order
may change the numbers and entries that du outputs.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-0’
‘--null’ Output a zero byte (ASCII NUL) at the end of each line, rather than a newline.
This option enables other programs to parse the output even when that output
would contain data with embedded newlines.
‘-a’
‘--all’ Show counts for all files, not just directories.
‘--apparent-size’
Print apparent sizes, rather than file system usage. The apparent size of a file is
the number of bytes reported by wc -c on regular files, or more generally, ls
-l --block-size=1 or stat --format=%s. For example, a file containing the
word ‘zoo’ with no newline would, of course, have an apparent size of 3. Such a
small file may require anywhere from 0 to 16 KiB or more of file system space,
depending on the type and configuration of the file system on which the file
resides. However, a sparse file created with this command:
dd bs=1 seek=2GiB if=/dev/null of=big
has an apparent size of 2 GiB, yet on most modern file systems, it actually uses
almost no space.
Apparent sizes are meaningful only for regular files and symbolic links. Other
file types do not contribute to apparent size.
‘-B size’
‘--block-size=size’
Scale sizes by size before printing them (see Section 2.2 [Block size], page 3).
For example, -BG prints sizes in units of 1,073,741,824 bytes.
‘-b’
‘--bytes’ Equivalent to --apparent-size --block-size=1.
‘-c’
‘--total’ Print a grand total of all arguments after all arguments have been processed.
This can be used to find out the total file system usage of a given set of files or
directories.
‘-D’
‘--dereference-args’
Dereference symbolic links that are command line arguments. Does not affect
other symbolic links. This is helpful for finding out the file system usage of
directories, such as /usr/tmp, which are often symbolic links.
Chapter 14: File space usage 152
‘-d depth’
‘--max-depth=depth’
Show the total for each directory (and file if --all) that is at most depth
levels down from the root of the hierarchy. The root is at level 0, so du
--max-depth=0 is equivalent to du -s.
‘--files0-from=file’
Disallow processing files named on the command line, and instead process those
named in file file; each name being terminated by a zero byte (ASCII NUL). This
is useful when the list of file names is so long that it may exceed a command line
length limitation. In such cases, running du via xargs is undesirable because it
splits the list into pieces and makes du print with the --total (-c) option for
each sublist rather than for the entire list. One way to produce a list of ASCII
NUL terminated file names is with GNU find, using its -print0 predicate. If
file is ‘-’ then the ASCII NUL terminated file names are read from standard
input.
‘-H’ Equivalent to --dereference-args (-D).
‘-h’
‘--human-readable’
Append a size letter to each size, such as ‘M’ for mebibytes. Powers of 1024
are used, not 1000; ‘M’ stands for 1,048,576 bytes. This option is equivalent to
--block-size=human-readable. Use the --si option if you prefer powers of
1000.
‘--inodes’
List inode usage information instead of block usage. This option is useful for
finding directories which contain many files, and therefore eat up most of the
inodes space of a file system (see df, option --inodes). It can well be combined
with the options -a, -c, -h, -l, -s, -S, -t and -x; however, passing other
options regarding the block size, for example -b, -m and --apparent-size, is
ignored.
‘-k’ Print sizes in 1024-byte blocks, overriding the default block size (see Section 2.2
[Block size], page 3). This option is equivalent to --block-size=1K.
‘-L’
‘--dereference’
Dereference symbolic links (show the file system space used by the file or directory
that the link points to instead of the space used by the link).
‘-l’
‘--count-links’
Count the size of all files, even if they have appeared already (as a hard link).
‘-m’ Print sizes in 1,048,576-byte blocks, overriding the default block size (see
Section 2.2 [Block size], page 3). This option is equivalent to --block-size=1M.
‘-P’
‘--no-dereference’
For each symbolic link encountered by du, consider the file system space used
by the symbolic link itself.
‘-S’
‘--separate-dirs’
Normally, in the output of du (when not using --summarize), the size listed
next to a directory name, d, represents the sum of sizes of all entries beneath d
as well as the size of d itself. With --separate-dirs, the size reported for a
directory name, d, will exclude the size of any subdirectories.
‘--si’ Append an SI-style abbreviation to each size, such as ‘M’ for megabytes. Powers of
1000 are used, not 1024; ‘M’ stands for 1,000,000 bytes. This option is equivalent
to --block-size=si. Use the -h or --human-readable option if you prefer
powers of 1024.
‘-s’
‘--summarize’
Display only a total for each argument.
‘-t size’
‘--threshold=size’
Exclude entries based on a given size. The size refers to used blocks in normal
mode (see Section 2.2 [Block size], page 3), or inodes count in conjunction with
the --inodes option.
If size is positive, then du will only print entries with a size greater than or
equal to that.
If size is negative, then du will only print entries with a size smaller than or
equal to that.
Although GNU find can be used to find files of a certain size, du’s --threshold
option can be used to also filter directories based on a given size.
When combined with the --apparent-size option, the --threshold option
elides entries based on apparent size. When combined with the --inodes option,
it elides entries based on inode counts.
Here’s how you would use --threshold to find directories with a size greater
than or equal to 200 megabytes:
du --threshold=200MB
Here’s how you would use --threshold to find directories and files (note the -a
option) with an apparent size smaller than or equal to 500 bytes:
du -a -t -500 --apparent-size
Here’s how you would use --threshold to find directories on the root file system
with more than 20000 inodes used in the directory tree below:
du --inodes -x --threshold=20000 /
‘--time’ Show the most recent modification timestamp (mtime) of any file in the directory,
or any of its subdirectories. See Chapter 28 [File timestamps], page 244.
‘--time=ctime’
‘--time=status’
‘--time=use’
Show the most recent status change timestamp (ctime) of any file in the directory,
or any of its subdirectories. See Chapter 28 [File timestamps], page 244.
‘--time=atime’
‘--time=access’
Show the most recent access timestamp (atime) of any file in the directory, or
any of its subdirectories. See Chapter 28 [File timestamps], page 244.
‘--time-style=style’
List timestamps in style style. This option has an effect only if the --time
option is also specified. The style should be one of the following:
‘+format’ List timestamps using format, where format is interpreted like the for-
mat argument of date (see Section 21.1 [date invocation], page 195).
For example, --time-style="+%Y-%m-%d %H:%M:%S" causes du to
list timestamps like ‘2020-07-21 23:45:56’. As with date, for-
mat’s interpretation is affected by the LC_TIME locale category.
‘full-iso’
List timestamps in full using ISO 8601-like date, time, and time
zone components with nanosecond precision, e.g., ‘2020-07-21
23:45:56.477817180 -0400’. This style is equivalent to
‘+%Y-%m-%d %H:%M:%S.%N %z’.
‘long-iso’
List ISO 8601 date and time components with minute precision, e.g.,
‘2020-07-21 23:45’. These timestamps are shorter than ‘full-iso’
timestamps, and are usually good enough for everyday work. This
style is equivalent to ‘+%Y-%m-%d %H:%M’.
‘iso’ List ISO 8601 dates for timestamps, e.g., ‘2020-07-21’. This style
is equivalent to ‘+%Y-%m-%d’.
You can specify the default value of the --time-style option with the en-
vironment variable TIME_STYLE; if TIME_STYLE is not set the default style is
‘long-iso’. For compatibility with ls, if TIME_STYLE begins with ‘+’ and con-
tains a newline, the newline and any later characters are ignored; if TIME_STYLE
begins with ‘posix-’ the ‘posix-’ is ignored; and if TIME_STYLE is ‘locale’ it
is ignored.
‘-X file’
‘--exclude-from=file’
Like --exclude, except take the patterns to exclude from file, one per line. If
file is ‘-’, take the patterns from standard input.
‘--exclude=pattern’
When recursing, skip subdirectories or files matching pattern. For example, du
--exclude='*.o' excludes files whose names end in ‘.o’.
‘-x’
‘--one-file-system’
Skip directories that are on different file systems from the one that the argument
being processed is on.
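To make the distinction drawn under --apparent-size more concrete, the following sketch creates a small sparse file and reports its size in several ways. It is a scaled-down variant of the 2 GiB dd example given for that option; the scratch directory and variable names are assumptions for illustration.

```shell
# Illustrative sketch; the scratch directory and file name are made up.
tmp=$(mktemp -d) || exit 1

# Create a sparse file whose apparent size is 1 MiB.
dd bs=1 seek=1M if=/dev/null of="$tmp/big" 2>/dev/null

apparent=$(du --apparent-size --block-size=1 "$tmp/big" | cut -f1)
bytes=$(du -b "$tmp/big" | cut -f1)               # -b is shorthand for the above
stat_size=$(stat --format=%s "$tmp/big")          # same number via stat
usage=$(du --block-size=1 "$tmp/big" | cut -f1)   # space actually allocated
rm -rf "$tmp"

echo "apparent=$apparent bytes=$bytes stat=$stat_size usage=$usage"
```

The first three values should all be 1048576. The last value depends on the file system and is typically far smaller for a sparse file.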
Since du relies on information reported by the operating system, its output might not
reflect the space consumed in the underlying devices. For example;
‘shell-escape-always’
Like ‘shell-escape’, but quote strings even if they would normally not require
quoting.
‘c’ Quote strings as for C character string literals, including the surrounding double-
quote characters; this is the same as the --quote-name (-Q) option.
‘escape’ Quote strings as for C character string literals, except omit the surrounding
double-quote characters; this is the same as the --escape (-b) option.
‘clocale’ Quote strings as for C character string literals, except use surrounding quotation
marks appropriate for the locale.
‘locale’ Quote strings as for C character string literals, except use surrounding quotation
marks appropriate for the locale, and quote ’like this’ instead of "like
this" in the default C locale. This looks nicer on many displays.
The ‘r’, ‘R’, ‘%t’, and ‘%T’ formats operate on the st_rdev member of the stat(2) structure,
i.e., the represented device rather than the containing device, and so are only defined for
character and block special files. On some systems or file types, st_rdev may be used to
represent other quantities.
The ‘%W’, ‘%X’, ‘%Y’, and ‘%Z’ formats accept a precision preceded by a period to specify
the number of digits to print after the decimal point. For example, ‘%.3X’ outputs the access
timestamp to millisecond precision. If a period is given but no precision, stat uses 9 digits,
so ‘%.X’ is equivalent to ‘%.9X’. When discarding excess precision, timestamps are truncated
toward minus infinity.
zero pad:
$ stat -c '[%015Y]' /usr
[000001288929712]
space align:
$ stat -c '[%15Y]' /usr
[     1288929712]
$ stat -c '[%-15Y]' /usr
[1288929712     ]
precision:
$ stat -c '[%.3Y]' /usr
[1288929712.114]
$ stat -c '[%.Y]' /usr
[1288929712.114951834]
The mount point printed by ‘%m’ is similar to that output by df, except that:
• stat does not dereference symlinks by default (unless -L is specified)
• stat does not search for specified device nodes in the file system list, instead operating
on them directly
• stat outputs the alias for a bind mounted file, rather than the initial mount point of its
backing device. One can recursively call stat until there is no change in output, to get
the current base mount point
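The last item above suggests calling stat repeatedly until its output stops changing. One way to sketch that loop is shown below; the function name base_mount_point and the iteration cap of 32 are assumptions added for illustration, not part of stat itself.

```shell
# base_mount_point FILE
# Apply 'stat -c %m' repeatedly until the result no longer changes.
# The function name and the cap of 32 rounds are hypothetical choices.
base_mount_point() {
  current=$1
  rounds=0
  while [ "$rounds" -lt 32 ]; do
    next=$(stat -c %m -- "$current") || return 1
    if [ "$next" = "$current" ]; then
      break
    fi
    current=$next
    rounds=$((rounds + 1))
  done
  printf '%s\n' "$current"
}

base_mount_point /
```

For a path that is not bind mounted, the loop usually settles after one or two calls.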
When listing file system information (--file-system (-f)), you must use a different set
of format directives:
• %a – Free blocks available to non-super-user
15 Printing text
This section describes commands that display text strings.
If the POSIXLY_CORRECT environment variable is set, then when echo’s first argument is
not -n it outputs option-like arguments instead of treating them as options. For example,
echo -ne hello outputs ‘-ne hello’ instead of plain ‘hello’. Also backslash escapes are
always enabled. To echo the string ‘-n’, one of the characters can be escaped in either octal
or hexadecimal representation. For example, echo -e '\x2dn'.
POSIX does not require support for any options, and says that the behavior of echo
is implementation-defined if any string contains a backslash or if the first argument is
-n. Portable programs should use the printf command instead. See Section 15.2 [printf
invocation], page 162.
An exit status of zero indicates success, and a nonzero value indicates failure.
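As a small illustration of the advice to prefer printf, the following sketch prints strings that echo might treat as options or escapes. The helper name print_line is hypothetical.

```shell
# print_line is a hypothetical helper name used only for this sketch.
# It prints its arguments verbatim, followed by a newline.
print_line() {
  printf '%s\n' "$*"
}

dash_n=$(print_line -n)           # the literal string "-n"
backslash=$(print_line 'a\tb')    # backslash kept as-is, no tab expansion

print_line "$dash_n"
print_line "$backslash"
```

Because the data is passed as an argument to the %s conversion rather than as the format, neither leading hyphens nor backslashes are interpreted.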
16 Conditions
This section describes commands that are primarily useful for their exit status, rather than
their output. Thus, they are often used as the condition of shell if statements, or as the
last command in a pipeline.
test has an alternate form that uses opening and closing square brackets instead of a
leading ‘test’. For example, instead of ‘test -d /’, you can write ‘[ -d / ]’. The square
brackets must be separate arguments; for example, ‘[-d /]’ does not have the desired effect.
Since ‘test expr’ and ‘[ expr ]’ have the same meaning, only the former form is discussed
below.
Synopses:
test expression
test
[ expression ]
[ ]
[ option
Due to shell aliases and built-in test functions, using an unadorned test interactively
or in a script may get you different functionality than that described here. Invoke it via env
(i.e., env test ...) to avoid interference from the shell.
If expression is omitted, test returns false. If expression is a single argument, test
returns false if the argument is null and true otherwise. The argument can be any string,
including strings like ‘-d’, ‘-1’, ‘--’, ‘--help’, and ‘--version’ that most other programs
would treat as options. To get help and version information, invoke the commands ‘[ --help’
and ‘[ --version’, without the usual closing brackets. See Chapter 2 [Common options],
page 2.
Exit status:
0 if the expression is true,
1 if the expression is false,
2 if an error occurred.
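The exit statuses listed above can be observed directly. The sketch below runs the external test program through env, as recommended earlier, and records each status; the helper name status_of is hypothetical.

```shell
# status_of is a hypothetical helper: run a command, print its exit status.
status_of() {
  if "$@" 2>/dev/null; then
    echo 0
  else
    echo $?
  fi
}

# env bypasses any shell built-in version of test.
dir_status=$(status_of env test -d /)       # true: / is a directory
empty_status=$(status_of env test '')       # false: single null argument
none_status=$(status_of env test)           # false: expression omitted
single_status=$(status_of env test -d)      # true: '-d' alone is a non-null string
error_status=$(status_of env test 1 -eq)    # error: malformed expression
bracket_status=$(status_of env [ -d / ])    # bracket form of 'test -d /'

echo "$dir_status $empty_status $none_status $single_status $error_status $bracket_status"
```

With GNU test this should print 0 1 1 0 2 0, matching the three documented statuses.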
‘-h file’
‘-L file’ True if file exists and is a symbolic link. Unlike all other file-related tests, this
test does not dereference file if it is a symbolic link.
‘string1 == string2’
True if the strings are equal (synonym for ‘=’). This form is not as portable to
other shells and systems.
‘string1 != string2’
True if the strings are not equal.
‘expr1 -o expr2’
True if either expr1 or expr2 is true. ‘-o’ is left associative.
If regex uses ‘\(’ and ‘\)’, the : expression returns the part of string that
matched the subexpression, or the null string if the match failed or the subex-
pression did not contribute to the match.
Only the first ‘\( ... \)’ pair is relevant to the return value; additional pairs
are meaningful only for grouping the regular expression operators.
In the regular expression, \+, \?, and \| are operators which respectively match
one or more, zero or one, or separate alternatives. These operators are GNU
extensions. See Section “Regular Expressions” in The GNU Grep Manual,
for details of regular expression syntax. Some examples are in Section 16.4.4
[Examples of expr], page 171.
‘match string regex’
An alternative way to do pattern matching. This is the same as
‘string : regex’.
‘substr string position length’
Returns the substring of string beginning at position with length at most length.
If either position or length is negative, zero, or non-numeric, returns the null
string.
‘index string charset’
Returns the first position in string at which any character in charset occurs.
If no character in charset is found in string, returns 0.
‘length string’
Returns the length of string.
‘+ token’ Interpret token as a string, even if it is a keyword like match or an operator
like /. This makes it possible to test expr length + "$x" or expr + "$x" :
'.*/\(.\)' and have it do the right thing even if the value of $x happens to
be (for example) / or index. This operator is a GNU extension. Portable shell
scripts should use " $token" : ' \(.*\)' instead of + "$token".
To make expr interpret keywords as strings, you must use the quote operator.
‘|’ Returns its first argument if that is neither null nor zero, otherwise its second
argument if it is neither null nor zero, otherwise 0. It does not evaluate its
second argument if its first argument is neither null nor zero.
‘&’ Return its first argument if neither argument is null or zero, otherwise 0. It does
not evaluate its second argument if its first argument is null or zero.
‘< <= = == != >= >’
Compare the arguments and return 1 if the relation is true, 0 otherwise. == is a
synonym for =. expr first tries to convert both arguments to integers and do a
numeric comparison; if either conversion fails, it does a lexicographic comparison
using the character collating sequence specified by the LC_COLLATE locale.
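The string and relational operators described above can be tried as follows. This is a sketch with made-up input values, assuming GNU expr (several of these operators are GNU extensions); LC_ALL is set to C so that the string comparison is predictable.

```shell
# Illustrative values only; none of the strings come from a real script.
LC_ALL=C; export LC_ALL                  # make string comparison predictable

stem=$(expr 'include/stdio.h' : '.*/\(.*\)\.h')   # text matched by \( ... \)
sub=$(expr substr abcdef 2 3)            # 3 characters starting at position 2
pos=$(expr index abcdef dc)              # first position of 'd' or 'c'
len=$(expr length abcdef)
x=index
len_kw=$(expr length + "$x")             # '+' makes expr treat $x as a string
fallback=$(expr '' \| default)           # null first argument: use the second
numeric=$(expr 9 \< 10)                  # both integers: numeric comparison
lexical=$(expr 10a \< 9a)                # not integers: string comparison

echo "$stem $sub $pos $len $len_kw $fallback $numeric $lexical"
```

This should print "stdio bcd 3 6 5 default 1 1". Note that expr exits with status 1 when its result is null or 0, which matters in scripts that run under set -e.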
17 Redirection
Unix shells commonly provide several forms of redirection – ways to change the input source
or output destination of a command. But one useful redirection is performed by a separate
command, not by the shell; it’s described here.
‘exit-nopipe’
Exit on error opening or writing any output, except pipes. Exit
immediately if all remaining outputs become broken pipes.
The tee command is useful when you happen to be transferring a large amount of data
and also want to summarize that data without reading it a second time. For example, when
you are downloading a DVD image, you often want to verify its signature or checksum right
away. The inefficient way to do it is simply:
wget https://example.com/some.iso && sha1sum some.iso
One problem with the above is that it makes you wait for the download to complete
before starting the time-consuming SHA1 computation. Perhaps even more importantly,
the above requires reading the DVD image a second time (the first was from the network).
The efficient way to do it is to interleave the download and SHA1 computation. Then,
you’ll get the checksum for free, because the entire process parallelizes so well:
# slightly contrived, to demonstrate process substitution
wget -O - https://example.com/dvd.iso \
| tee >(sha1sum > dvd.sha1) > dvd.iso
That makes tee write not just to the expected output file, but also to a pipe running
sha1sum and saving the final checksum in a file named dvd.sha1.
However, this example relies on a feature of modern shells called process substitution
(the ‘>(command)’ syntax, above; see Section “Process Substitution” in The Bash Reference
Manual), so it works with zsh, bash, and ksh, but not with /bin/sh. So if you write code
like this in a shell script, start the script with ‘#!/bin/bash’.
If any of the process substitutions (or piped standard output) might exit early without
consuming all the data, the -p option is needed to allow tee to continue to process the
input to any remaining outputs.
Since the above example writes to one file and one process, a more conventional and
portable use of tee is even better:
wget -O - https://example.com/dvd.iso \
| tee dvd.iso | sha1sum > dvd.sha1
You can extend this example to make tee write to two processes, computing MD5 and
SHA1 checksums in parallel. In this case, process substitution is required:
wget -O - https://example.com/dvd.iso \
| tee >(sha1sum > dvd.sha1) \
>(md5sum > dvd.md5) \
> dvd.iso
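The portable single-pipe form shown earlier can be tried without a network connection. In the sketch below, seq stands in for the download, and the scratch directory and file names are assumptions made for the example.

```shell
# Illustrative sketch: seq stands in for the download, and the scratch
# directory and file names are made up.
tmp=$(mktemp -d) || exit 1

# Save the stream to a file and checksum it in the same pass.
seq 1 100000 | tee "$tmp/data.txt" | sha1sum > "$tmp/data.sha1"

streamed=$(cut -d ' ' -f 1 "$tmp/data.sha1")            # computed from the pipe
reread=$(sha1sum < "$tmp/data.txt" | cut -d ' ' -f 1)   # computed from the copy
rm -rf "$tmp"

if [ "$streamed" = "$reread" ]; then
  echo "checksums match"
fi
```

The second sha1sum run is there only to confirm the result; in real use the point of tee is to avoid reading the data twice.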
This technique is also useful when you want to make a compressed copy of the contents
of a pipe. Consider a tool to graphically summarize file system usage data from ‘du -ak’.
For a large hierarchy, ‘du -ak’ can run for a long time, and can easily produce terabytes of
data, so you won’t want to rerun the command unnecessarily. Nor will you want to save the
uncompressed output.
Doing it the inefficient way, you can’t even start the GUI until after you’ve compressed
all of the du output:
du -ak | gzip -9 > /tmp/du.gz
Chapter 17: Redirection 174
# Output "stdio".
basename include/stdio.h .h
# Output "stdio".
basename -s .h include/stdio.h
# Output ".".
dirname stdio.h
2. A file name contains a character outside the POSIX portable file name
character set, namely, the ASCII letters and digits, ‘.’, ‘_’, ‘-’, and ‘/’.
3. The length of a file name or one of its components exceeds the POSIX
minimum limits for portability.
‘-P’ Print an error message if a file name is empty, or if it contains a component that
begins with ‘-’.
‘--portability’
Print an error message if a file name is not portable to all POSIX hosts. This
option is equivalent to ‘-p -P’.
Exit status:
0 if all specified file names passed all checks,
1 otherwise.
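The exit status makes pathchk convenient as a filter in scripts. The following sketch classifies a few made-up names with --portability; the wrapper name is_portable_name is hypothetical.

```shell
# is_portable_name is a hypothetical wrapper used only for this sketch.
is_portable_name() {
  pathchk --portability -- "$1" 2>/dev/null
}

for name in Makefile 'bad*name' -leading-dash a-file-name-longer-than-14; do
  if is_portable_name "$name"; then
    echo "portable:     $name"
  else
    echo "not portable: $name"
  fi
done
```

Only Makefile should pass here: the others fail on a non-portable character, a leading ‘-’, and a component longer than the POSIX minimum limit, respectively.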
file-H08W.txt
$ mktemp file-XXXX-XXXX.txt
file-XXXX-eI9L.txt
• Create a secure fifo relative to the user’s choice of TMPDIR, but falling back to the
current directory rather than /tmp. Although mktemp does not create fifos, it can create
a secure directory in which fifos can live. Exit the shell if the directory or fifo could not
be created.
$ dir=$(mktemp -p "${TMPDIR:-.}" -d dir-XXXX) || exit 1
$ fifo=$dir/fifo
$ mkfifo "$fifo" || { rmdir "$dir"; exit 1; }
• Create and use a temporary file if possible, but ignore failure. The file will reside in the
directory named by TMPDIR, if specified, or else in /tmp.
$ file=$(mktemp -q) && {
> # Safe to use $file only within this block. Use quotes,
> # since $TMPDIR, and thus $file, may contain whitespace.
> echo ... > "$file"
> rm "$file"
> }
• Act as a semi-random character generator (it is not fully random, since it is impacted by
the contents of the current directory). To avoid security holes, do not use the resulting
names to create a file.
$ mktemp -u XXX
Gb9
$ mktemp -u XXX
nzC
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘-d’
‘--directory’
Create a directory rather than a file. The directory will have read, write, and
search permissions for the current user, but no permissions for the group or
others; these permissions are reduced if the current umask is more restrictive.
‘-q’
‘--quiet’ Suppress diagnostics about failure to create a file or directory. The exit status
will still reflect whether a file was created.
‘-u’
‘--dry-run’
Generate a temporary name that does not name an existing file, without changing
the file system contents. Using the output of this command to create a new file
is inherently unsafe, as there is a window of time between generating the name
and using it where another process can create an object by the same name.
‘-p dir’
‘--tmpdir[=dir]’
Treat template relative to the directory dir. If dir is not specified (only possible
with the long option --tmpdir) or is the empty string, use the value of TMPDIR
Chapter 18: File name manipulation 179
‘--suffix=suffix’
Append suffix to the template. suffix must not contain slash. If --suffix is
specified, template must end in ‘X’; if it is not specified, then an appropriate
--suffix is inferred by finding the last ‘X’ in template. This option exists for
use with the default template and for the creation of a suffix that starts with
‘X’.
‘-t’ Treat template as a single file relative to the value of TMPDIR if available, or to
the directory specified by -p, otherwise to ‘/tmp’. template must not contain
slashes. This option is deprecated; the use of -p without -t offers better defaults
(by favoring the command line over TMPDIR) and more flexibility (by allowing
intermediate directories).
Exit status:
0 if the file was created,
1 otherwise.
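A script can rely on this exit status to stop early when nothing was created. The sketch below combines -d, -p, and --suffix; the template names are made up, and the cleanup trap is one possible convention rather than something mktemp requires.

```shell
# Illustrative sketch; the template names are made up.
workdir=$(mktemp -d -p "${TMPDIR:-/tmp}" demo-XXXXXX) || exit 1
trap 'rm -rf "$workdir"' EXIT     # remove the directory when the shell exits

# A file inside that directory, with a fixed suffix after the random part.
log=$(mktemp -p "$workdir" --suffix=.log run-XXXXXX) || exit 1

echo "directory: $workdir"
echo "log file:  $log"
```

Quoting "$workdir" and "$log" matters because TMPDIR may contain whitespace.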
‘-e’
‘--canonicalize-existing’
Ensure that all components of the specified file names exist. If any component is
missing or unavailable, realpath will output a diagnostic unless the -q option
is specified, and exit with a nonzero exit code. A trailing slash requires that the
name resolve to a directory.
‘-m’
‘--canonicalize-missing’
If any component of a specified file name is missing or unavailable, treat it as a
directory.
‘-L’
‘--logical’
Symbolic links are resolved in the specified file names, but they are resolved
after any subsequent ‘..’ components are processed.
‘-P’
‘--physical’
Symbolic links are resolved in the specified file names, and they are resolved
before any subsequent ‘..’ components are processed. This is the default mode
of operation.
‘-q’
‘--quiet’ Suppress diagnostic messages for specified file names.
‘--relative-to=dir’
Print the resolved file names relative to the specified directory. This option
honors the -m and -e options pertaining to file existence.
‘--relative-base=dir’
Print the resolved file names as relative if the files are descendants of dir. Other-
wise, print the resolved file names as absolute. This option honors the -m and -e
options pertaining to file existence. For details about combining --relative-to
and --relative-base, see Section 18.5.1 [Realpath usage examples], page 180.
‘-s’
‘--strip’
‘--no-symlinks’
Do not resolve symbolic links. Only resolve references to ‘/./’, ‘/../’ and
remove extra ‘/’ characters. When combined with the -m option, realpath
operates only on the file name, and does not touch any actual file.
‘-z’
‘--zero’ Output a zero byte (ASCII NUL) at the end of each line, rather than a newline.
This option enables other programs to parse the output even when that output
would contain data with embedded newlines.
Exit status:
0 if all file names were printed without issue.
1 otherwise.
With --relative-to, file names are printed relative to the given directory:
realpath --relative-to=/usr/bin \
/usr/bin/sort /tmp/foo /usr/share/dict/words 1.txt
⇒ sort
⇒ ../../tmp/foo
⇒ ../share/dict/american-english
⇒ ../../home/user/1.txt
With --relative-base, relative file names are printed if the resolved file name is below
the given base directory. For files outside the base directory absolute file names are printed:
realpath --relative-base=/usr \
/usr/bin/sort /tmp/foo /usr/share/dict/words 1.txt
⇒ bin/sort
⇒ /tmp/foo
⇒ share/dict/american-english
⇒ /home/user/1.txt
When both --relative-to=DIR1 and --relative-base=DIR2 are used, file names are
printed relative to dir1 if they are located below dir2. If the files are not below dir2, they
are printed as absolute file names:
realpath --relative-to=/usr/bin --relative-base=/usr \
/usr/bin/sort /tmp/foo /usr/share/dict/words 1.txt
⇒ sort
⇒ /tmp/foo
⇒ ../share/dict/american-english
⇒ /home/user/1.txt
When both --relative-to=DIR1 and --relative-base=DIR2 are used, dir1 must be a
subdirectory of dir2. Otherwise, realpath prints absolute file names.
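The three combinations above can also be tried without touching the file system by adding -s and -m, so that the result depends only on the names given. The following is a minimal sketch; /a, /a/b, /a/c/d and /x/y are made-up names chosen for illustration and are not assumed to exist:

```shell
# -s: do not resolve symbolic links; -m: components need not exist.
# Together they make realpath operate on the names alone.

# Every name is printed relative to /a/b.
rel_to=$(realpath -sm --relative-to=/a/b /a/c/d /x/y)

# Names below /a become relative to /a; others stay absolute.
rel_base=$(realpath -sm --relative-base=/a /a/c/d /x/y)

# Names below /a are printed relative to /a/b; others stay absolute.
both=$(realpath -sm --relative-to=/a/b --relative-base=/a /a/c/d /x/y)

printf '%s\n' "--relative-to:" "$rel_to" \
              "--relative-base:" "$rel_base" \
              "both:" "$both"
```

With these names, /a/c/d should come out as ‘../c/d’, ‘c/d’ and ‘../c/d’ respectively, while /x/y is relative (‘../../x/y’) only in the first case.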
19 Working context
This section describes commands that display or alter the context in which you are working:
the current directory, the terminal settings, and so forth. See also the user-related commands
in the next section.
‘-F device’
‘--file=device’
Set the line opened by the file name specified in device instead of the tty line
connected to standard input. This option is necessary because opening a POSIX
tty requires use of the O_NONDELAY flag to prevent a POSIX tty from blocking
until the carrier detect line is high if the clocal flag is not set. Hence, it is not
always possible to allow the shell to open the device in the traditional manner.
‘-g’
‘--save’ Print all current settings in a form that can be used as an argument to another
stty command to restore the current settings. This option may not be used in
combination with any line settings.
Many settings can be turned off by preceding them with a ‘-’. Such arguments are
marked below with “May be negated” in their description. The descriptions themselves refer
to the positive case, that is, when not negated (unless stated otherwise, of course).
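A common use of --save together with a negated setting is to turn something off temporarily and then restore the terminal exactly as it was. The following is a minimal sketch, not a hardened implementation; read_secret is a hypothetical helper name, and the echo setting it negates is a standard stty setting chosen for illustration:

```shell
# Read one line into the variable "secret" without echoing it.
read_secret() {
  if [ -t 0 ]; then
    saved=$(stty -g)      # snapshot of all current settings
    stty -echo            # negated setting: turn echoing off
    IFS= read -r secret
    stty "$saved"         # feed the snapshot back to restore everything
  else
    IFS= read -r secret   # not a terminal: nothing to save or restore
  fi
}
```

A caller would run ‘read_secret’ and then use "$secret". Because the saved string is passed back as a single argument, no individual setting has to be remembered.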
Some settings are not available on all POSIX systems, since they use extensions. Such
arguments are marked below with “Non-POSIX” in their description. On non-POSIX
systems, those or other settings also may not be available, but it’s not feasible to document
all the variations: just try it and see.
stty is installed only on platforms with the POSIX terminal interface, so portable scripts
should not rely on its existence on non-POSIX platforms.
An exit status of zero indicates success, and a nonzero value indicates failure.
‘nl1’
‘nl0’ Newline delay style. Non-POSIX.
‘cr3’
‘cr2’
‘cr1’
‘cr0’ Carriage return delay style. Non-POSIX.
‘tab3’
‘tab2’
‘tab1’
‘tab0’ Horizontal tab delay style. Non-POSIX.
‘bs1’
‘bs0’ Backspace delay style. Non-POSIX.
‘vt1’
‘vt0’ Vertical tab delay style. Non-POSIX.
‘ff1’
‘ff0’ Form feed delay style. Non-POSIX.
‘echoke’
‘crtkill’ Echo the kill special character by erasing each character on the line as indicated
by the echoprt and echoe settings, instead of by the echoctl and echok settings.
Non-POSIX. May be negated.
‘extproc’ Enable ‘LINEMODE’, which is used to avoid echoing each character over high
latency links. See also Internet RFC 1116 (https://datatracker.ietf.org/doc/rfc1116/).
Non-POSIX. May be negated.
‘flusho’ Discard output. This setting is currently ignored on GNU/Linux systems. Non-
POSIX. May be negated.
‘-0’
‘--null’ Output a zero byte (ASCII NUL) at the end of each line, rather than a newline.
This option enables other programs to parse the output even when that output
would contain data with embedded newlines.
Exit status:
0 if all variables specified were found
1 if at least one specified variable was not found
2 if a write error occurred
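These statuses (they are printenv's) let a script test whether a variable is present without inspecting any output. A minimal sketch follows; DEMO_PRESENT and DEMO_ABSENT are hypothetical variable names used only for illustration:

```shell
# A variable that is set: printenv prints its value and exits with 0.
DEMO_PRESENT=hello printenv DEMO_PRESENT >/dev/null
status_found=$?

# A variable that is not set: nothing is printed and the status is 1.
env -u DEMO_ABSENT printenv DEMO_ABSENT >/dev/null
status_missing=$?

echo "found=$status_found missing=$status_missing"
```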
20 User information
This section describes commands that print user-related information: logins, groups, and so
forth.
Primary and supplementary groups for a process are normally inherited from its parent
and are usually unchanged since login. This means that if you change the group database
after logging in, id will not reflect your changes within your existing login session. Running
id with a user argument causes the user and group database to be consulted afresh, and so
will give a different result.
An exit status of zero indicates success, and a nonzero value indicates failure.
With no file argument, users extracts its information from a system-maintained file
(often /var/run/utmp or /etc/utmp). If a file argument is given, users uses that file instead.
A common choice is /var/log/wtmp.
The only options are --help and --version. See Chapter 2 [Common options], page 2.
The users command is installed only on platforms with the POSIX <utmpx.h> include
file or equivalent, so portable scripts should not rely on its existence on non-POSIX platforms.
An exit status of zero indicates success, and a nonzero value indicates failure.
‘-p’
‘--process’
List active processes spawned by init.
‘-q’
‘--count’ Print only the login names and the number of users logged on. Overrides all
other options.
‘-r’
‘--runlevel’
Print the current (and maybe previous) run-level of the init process.
‘-s’ Ignored; for compatibility with other versions of who.
‘-t’
‘--time’ Print last system clock change.
‘-u’ After the login time, print the number of hours and minutes that the user has
been idle. ‘.’ means the user was active in the last minute. ‘old’ means the
user has been idle for more than 24 hours.
‘-w’
‘-T’
‘--mesg’
‘--message’
‘--writable’
After each login name print a character indicating the user’s message status:
‘+’ allowing write messages
‘-’ disallowing write messages
‘?’ cannot find terminal device
The who command is installed only on platforms with the POSIX <utmpx.h> include file
or equivalent, so portable scripts should not rely on its existence on non-POSIX platforms.
An exit status of zero indicates success, and a nonzero value indicates failure.
‘-s’ Produce short format output. This is the default behavior when no options are
given.
‘-f’ Omit the column headings when printing in short format.
‘-w’ Omit the user’s full name when printing in short format.
‘-i’ Omit the user’s full name and remote host when printing in short format.
‘-q’ Omit the user’s full name, remote host, and idle time when printing in short
format.
‘--lookup’
Attempt to canonicalize hostnames found in utmp through a DNS lookup. This
is not the default because of potential delays.
An exit status of zero indicates success, and a nonzero value indicates failure.
21 System context
This section describes commands that print or change system-wide information.
‘%S’ second (‘00’. . . ‘60’). This may be ‘60’ if leap seconds are supported.
‘%T’ 24-hour hour, minute, and second. Same as ‘%H:%M:%S’.
‘%X’ locale’s time representation (e.g., ‘23:13:48’)
‘%z’ Four-digit numeric time zone, e.g., ‘-0600’ or ‘+0530’, or ‘-0000’ if no time zone
is determinable. This value reflects the numeric time zone appropriate for the
current time, using the time zone rules specified by the TZ environment variable.
A time zone is not determinable if its numeric offset is zero and its abbreviation
begins with ‘-’. The time (and optionally, the time zone rules) can be overridden
by the --date option.
‘%:z’ Numeric time zone with ‘:’ (e.g., ‘-06:00’ or ‘+05:30’), or ‘-00:00’ if no time
zone is determinable. This is a GNU extension.
‘%::z’ Numeric time zone to the nearest second with ‘:’ (e.g., ‘-06:00:00’ or
‘+05:30:00’), or ‘-00:00:00’ if no time zone is determinable. This is a GNU
extension.
‘%:::z’ Numeric time zone with ‘:’ using the minimum necessary precision (e.g., ‘-06’,
‘+05:30’, or ‘-04:56:02’), or ‘-00’ if no time zone is determinable. This is a
GNU extension.
‘%Z’ alphabetic time zone abbreviation (e.g., ‘EDT’), or nothing if no time zone is
determinable. See ‘%z’ for how it is determined.
‘%g’ year corresponding to the ISO week number, but without the century (range ‘00’
through ‘99’). This has the same format and value as ‘%y’, except that if the
ISO week number (see ‘%V’) belongs to the previous or next year, that year is
used instead.
‘%G’ year corresponding to the ISO week number. This has the same format and value
as ‘%Y’, except that if the ISO week number (see ‘%V’) belongs to the previous or
next year, that year is used instead. It is normally useful only if ‘%V’ is also used;
for example, the format ‘%G-%m-%d’ is probably a mistake, since it combines the
ISO week number year with the conventional month and day.
‘%h’ same as ‘%b’
‘%j’ day of year (‘001’. . . ‘366’)
‘%m’ month (‘01’. . . ‘12’)
‘%q’ quarter of year (‘1’. . . ‘4’)
‘%u’ day of week (‘1’. . . ‘7’) with ‘1’ corresponding to Monday
‘%U’ week number of year, with Sunday as the first day of the week (‘00’. . . ‘53’).
Days in a new year preceding the first Sunday are in week zero.
‘%V’ ISO week number, that is, the week number of year, with Monday as the first
day of the week (‘01’. . . ‘53’). If the week containing January 1 has four or more
days in the new year, then it is considered week 1; otherwise, it is week 53 of
the previous year, and the next week is week 1. (See the ISO 8601 standard.)
‘%w’ day of week (‘0’. . . ‘6’) with 0 corresponding to Sunday
‘%W’ week number of year, with Monday as first day of week (‘00’. . . ‘53’). Days in a
new year preceding the first Monday are in week zero.
‘%x’ locale’s date representation (e.g., ‘12/31/99’)
‘%y’ last two digits of year (‘00’. . . ‘99’)
‘%Y’ year. This is normally at least four characters, but it may be more. Year ‘0000’
precedes year ‘0001’, and year ‘-001’ precedes year ‘0000’.
POSIX specifies the behavior of flags and field widths only for ‘%C’, ‘%F’, ‘%G’, and ‘%Y’
(all without modifiers), and requires a flag to be present if and only if a field width is also
present. Other combinations of flags, field widths and modifiers are GNU extensions.
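Two groups of the conversion specifiers above are easier to tell apart side by side: the ‘%z’ family, and ‘%G’ versus ‘%Y’. The sketch below assumes GNU date; the TZ value ‘IST-5:30’ is a POSIX-style rule (5 hours 30 minutes east of UTC) chosen so that no time zone database is needed:

```shell
# The numeric time zone at increasing levels of punctuation, then %Z.
zones=$(TZ='IST-5:30' date -d @0 +'%z %:z %::z %:::z %Z')
echo "$zones"        # +0530 +05:30 +05:30:00 +05:30 IST

# 2021-01-01 belongs to ISO week 53 of 2020, so %G and %Y disagree.
iso_week=$(date -u -d 2021-01-01 +'%G-W%V-%u')
calendar=$(date -u -d 2021-01-01 +'%Y-%m-%d')
echo "$iso_week"     # 2020-W53-5
echo "$calendar"     # 2021-01-01
```

This is why ‘%G’ is normally paired with ‘%V’ and ‘%Y’ with ‘%m’ and ‘%d’.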
‘-f datefile’
‘--file=datefile’
Parse each line in datefile as with -d and display the resulting date and time. If
datefile is ‘-’, use standard input. This is useful when you have many dates to
process, because the system overhead of starting up the date executable many
times can be considerable.
‘-I[timespec]’
‘--iso-8601[=timespec]’
Display the date using an ISO 8601 format, ‘%Y-%m-%d’.
The argument timespec specifies the number of additional terms of the time to
include. It can be one of the following:
‘auto’ Print just the date. This is the default if timespec is omitted. This
is like the format %Y-%m-%d.
‘hours’ Also print hours and time zone. This is like the format
%Y-%m-%dT%H%:z.
‘minutes’ Also print minutes. This is like the format %Y-%m-%dT%H:%M%:z.
‘seconds’ Also print seconds. This is like the format %Y-%m-%dT%H:%M:%S%:z.
‘ns’ Also print nanoseconds. This is like the format
%Y-%m-%dT%H:%M:%S,%N%:z.
This format is always suitable as input for the --date (-d) and --file (-f)
options, regardless of the current locale.
‘-r file’
‘--reference=file’
Display the date and time of the last modification of file, instead of the current
date and time.
‘--resolution’
Display the timestamp resolution instead of the time. Current clock timestamps
that are output by date are integer multiples of the timestamp resolution. With
this option, the format defaults to ‘%s.%N’. For example, if the clock resolution
is 1 millisecond, the output is:
0.001000000
‘-R’
‘--rfc-email’
Display the date and time using the format ‘%a, %d %b %Y %H:%M:%S %z’, evalu-
ated in the C locale so abbreviations are always in English. For example:
Thu, 09 Jul 2020 17:00:00 -0400
This format conforms to Internet RFCs 5322
(https://datatracker.ietf.org/doc/rfc5322/), 2822
(https://datatracker.ietf.org/doc/rfc2822/) and 822
(https://datatracker.ietf.org/doc/rfc822/), the current and previous
standards for Internet email. For compatibility with older versions of date,
--rfc-2822 and --rfc-822 are aliases for --rfc-email.
‘--rfc-3339=timespec’
Display the date using a format specified by Internet RFC 3339
(https://datatracker.ietf.org/doc/rfc3339/). This is like --iso-8601, except that
a space rather than a ‘T’ separates dates from times, and a period rather than a
comma separates seconds from subseconds. This format is always suitable as
input for the --date (-d) and --file (-f) options, regardless of the current
locale.
The argument timespec specifies how much of the time to include. It can
be one of the following:
‘date’ Print just the full-date, e.g., ‘2020-07-21’. This is like the format
‘%Y-%m-%d’.
‘seconds’ Print the full-date and full-time separated by a space, e.g.,
‘2020-07-21 04:30:37+05:30’. The output ends with a numeric
time-offset; here the ‘+05:30’ means that local time is five hours
and thirty minutes east of UTC. This is like the format ‘%Y-%m-%d
%H:%M:%S%:z’.
‘ns’ Like ‘seconds’, but also print nanoseconds, e.g., ‘2020-07-21
04:30:37.998458565+05:30’. This is like the format ‘%Y-%m-%d
%H:%M:%S.%N%:z’.
‘-s datestr’
‘--set=datestr’
Set the date and time to datestr. See -d above. See also Section 21.1.2 [Setting
the time], page 199.
‘-u’
‘--utc’
‘--universal’
Use Universal Time by operating as if the TZ environment variable were set
to the string ‘UTC0’. UTC stands for Coordinated Universal Time, established
in 1960. Universal Time is often called “Greenwich Mean Time” (GMT) for
historical reasons. Typically, systems ignore leap seconds and thus implement
an approximation to UTC rather than true UTC.
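To compare the standard output formats described above, the same instant can be printed with each option. This is a sketch assuming GNU date; the timestamp is arbitrary, and -u is used so that the results do not depend on the local time zone:

```shell
stamp='2020-07-21 04:30:37'

iso_date=$(date -u -d "$stamp" --iso-8601)             # 2020-07-21
iso_minutes=$(date -u -d "$stamp" --iso-8601=minutes)  # 2020-07-21T04:30+00:00
iso_seconds=$(date -u -d "$stamp" --iso-8601=seconds)  # 2020-07-21T04:30:37+00:00
rfc_3339=$(date -u -d "$stamp" --rfc-3339=seconds)     # 2020-07-21 04:30:37+00:00
rfc_email=$(date -u -d "$stamp" --rfc-email)           # Tue, 21 Jul 2020 04:30:37 +0000

printf '%s\n' "$iso_date" "$iso_minutes" "$iso_seconds" "$rfc_3339" "$rfc_email"

# Both machine-oriented formats are accepted back by --date.
from_iso=$(date -u -d "$iso_seconds" +%s)
from_rfc=$(date -u -d "$rfc_3339" +%s)
echo "$from_iso $from_rfc"
```

The last two values should be equal, since both strings name the same second.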
But this may not be what you want because for the first nine days of the month, the
‘%d’ expands to a zero-padded two-digit field, for example ‘date -d 1may '+%B %d'’ will
print ‘May 01’.
• To print a date without the leading zero for one-digit days of the month, you can use
the (GNU extension) ‘-’ flag to suppress the padding altogether:
date -d 1may '+%B %-d'
• To print the current date and time in the format required by many non-GNU versions
of date when setting the system clock:
date +%m%d%H%M%Y.%S
• To set the system clock forward by two minutes:
date --set='+2 minutes'
• To print the date in Internet RFC 5322 format, use ‘date --rfc-email’. Here is some
example output:
Thu, 09 Jul 2020 19:00:37 -0400
• To convert a date string to the number of seconds since the Epoch (which is 1970-01-01
00:00 UTC), use the --date option with the ‘%s’ format. That can be useful in sorting
and/or graphing and/or comparing data by date. The following command outputs the
number of the seconds since the Epoch for the time two minutes after the Epoch:
date --date='1970-01-01 00:02:00 +0000' +%s
120
To convert a date string from one time zone (from) to another (to), specify ‘TZ="to"’
in the environment and ‘TZ="from"’ in the --date option. See Section 29.10 [Specifying
time zone rules], page 250. For example:
TZ="Asia/Tokyo" date --date='TZ="America/New_York" 2023-05-07 12:23'
Mon May 8 01:23:00 JST 2023
If you do not specify time zone information in the date string, date uses your computer’s
idea of the time zone when interpreting the string. For example, if your computer’s
time zone is that of Cambridge, Massachusetts, which was then 5 hours (i.e., 18,000
seconds) behind UTC:
# local time zone used
date --date='1970-01-01 00:02:00' +%s
18120
• If you’re sorting or graphing dated data, your raw date values may be represented
as seconds since the Epoch. But few people can look at the date ‘1577836800’ and
casually note “Oh, that’s the first second of the year 2020 in Greenwich, England.”
date --date='2020-01-01 UTC' +%s
1577836800
An alternative is to use the --utc (-u) option. Then you may omit ‘UTC’ from the
date string. Although this produces the same result for ‘%s’ and many other format
sequences, with a time zone offset different from zero, it would give a different result for
zone-dependent formats like ‘%z’.
date -u --date=2020-07-21 +%s
1595289600
To convert such an unwieldy number of seconds back to a more readable form, use a
command like this:
date -d @1595289600 +"%F %T %z"
2020-07-20 20:00:00 -0400
Often it is better to output UTC-relative date and time:
date -u -d @1595289600 +"%F %T %z"
2020-07-21 00:00:00 +0000
• Typically the seconds count omits leap seconds, but some systems are exceptions.
Because leap seconds are not predictable, the mapping between the seconds count and
a future timestamp is not reliable on the atypical systems that include leap seconds in
their counts.
Here is how the two kinds of systems handle the leap second at the end of the year 2016:
# Typical systems ignore leap seconds:
date --date='2016-12-31 23:59:59 +0000' +%s
1483228799
date --date='2016-12-31 23:59:60 +0000' +%s
date: invalid date '2016-12-31 23:59:60 +0000'
date --date='2017-01-01 00:00:00 +0000' +%s
1483228800
# Atypical systems count leap seconds:
date --date='2016-12-31 23:59:59 +0000' +%s
1483228825
date --date='2016-12-31 23:59:60 +0000' +%s
1483228826
date --date='2017-01-01 00:00:00 +0000' +%s
1483228827
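On typical systems, where the seconds count omits leap seconds, the ‘%s’ format also makes simple date arithmetic possible in the shell. The following sketch reuses two dates from the examples above and assumes GNU date and a POSIX shell:

```shell
start=$(date -u -d 2020-01-01 +%s)    # 1577836800
end=$(date -u -d 2020-07-21 +%s)      # 1595289600

# Whole days between the two midnights (86400 seconds per day).
days=$(( (end - start) / 86400 ))
echo "$days"                          # 202

# Converting the count back gives the original calendar date.
back=$(date -u -d "@$end" +%F)
echo "$back"                          # 2020-07-21
```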
An exit status of zero indicates success, and a nonzero value indicates failure.
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
‘--all’ Print the number of installed processors on the system, which may be greater than
the number online or available to the current process. The OMP_NUM_THREADS
or OMP_THREAD_LIMIT environment variables are not honored in this case.
‘--ignore=number’
If possible, exclude this number of processing units.
An exit status of zero indicates success, and a nonzero value indicates failure.
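A typical use of these options (they are nproc's) is to size a parallel job while leaving some capacity free. A minimal sketch; the variable names are illustrative only:

```shell
available=$(nproc)           # units available to the current process
installed=$(nproc --all)     # units installed, possibly more than available
spare=$(nproc --ignore=1)    # leave one unit free where possible

echo "available=$available installed=$installed jobs=$spare"
```

The value in "$spare" could then be handed to a tool that accepts a job count, for example ‘make -j "$spare"’. The result is never less than 1, even when only one unit is available.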
‘-o’
‘--operating-system’
Print the name of the operating system.
‘-r’
‘--kernel-release’
Print the kernel release.
‘-s’
‘--kernel-name’
Print the kernel name. POSIX 1003.1-2001 (see Section 2.13 [Standards con-
formance], page 11) calls this “the implementation of the operating system”,
because the POSIX specification itself has no notion of “kernel”. The kernel
name might be the same as the operating system name printed by the -o or
--operating-system option, but it might differ. Some operating systems (e.g.,
FreeBSD, HP-UX) have the same name as their underlying kernels; others (e.g.,
GNU/Linux, Solaris) do not.
‘-v’
‘--kernel-version’
Print the kernel version.
An exit status of zero indicates success, and a nonzero value indicates failure.
22 SELinux context
This section describes commands for operations with SELinux contexts.
‘-r role’
‘--role=role’
Set role role in the target security context.
‘-t type’
‘--type=type’
Set type type in the target security context.
‘-l range’
‘--range=range’
Set range range in the target security context.
An exit status of zero indicates success, and a nonzero value indicates failure.
Exit status:
125 if runcon itself fails
126 if command is found but cannot be invoked
127 if command cannot be found
the exit status of command otherwise
Here are a few tips to help avoid common problems in using chroot. To start with a
simple example, make command refer to a statically linked binary. If you were to use a
dynamically linked executable, then you’d have to arrange to have the shared libraries in
the right place under your new root directory.
For example, if you create a statically linked ls executable, and put it in /tmp/empty,
you can run this command as root:
$ chroot /tmp/empty /ls -Rl /
Then you’ll see output like this:
/:
total 1023
-rwxr-xr-x 1 0 0 1041745 Aug 16 11:17 ls
If you want to use a dynamically linked executable, say bash, then first run ‘ldd bash’ to
see what shared objects it needs. Then, in addition to copying the actual binary, also copy
the listed files to the required positions under your intended new root directory. Finally, if
the executable requires any other files (e.g., data, state, device files), copy them into place,
too.
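The copying steps for a dynamically linked executable can be sketched as a small helper. This is a rough illustration and not a complete recipe: populate_root is a hypothetical name, the parsing of ldd output is a heuristic that keeps only fields that look like absolute file names, and any extra data or device files still have to be copied by hand:

```shell
# Copy PROG (an absolute file name) and the shared objects that ldd
# lists for it into ROOT_DIR, keeping each file at the same absolute
# position under the new root.
populate_root() {
  root_dir=$1
  prog=$2
  mkdir -p "$root_dir$(dirname "$prog")"
  cp -L "$prog" "$root_dir$prog"
  # Keep only the fields of ldd's output that are absolute file names.
  ldd "$prog" 2>/dev/null |
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) print $i }' |
    while IFS= read -r lib; do
      mkdir -p "$root_dir$(dirname "$lib")"
      cp -L "$lib" "$root_dir$lib"
    done
}

new_root=$(mktemp -d)
shell_path=$(command -v sh)
populate_root "$new_root" "$shell_path"
# As root, one could then try:  chroot "$new_root" "$shell_path"
```

The temporary directory is left in place so that it can be inspected or passed to chroot; remove it when done.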
chroot is installed only on systems that have the chroot function, so portable scripts
should not rely on its existence.
Exit status:
125 if chroot itself fails
126 if command is found but cannot be invoked
127 if command cannot be found
the exit status of command otherwise
passed as arguments to that program. The program should not be a special built-in utility
(see Section 2.10 [Special built-in utilities], page 10).
Modifications to PATH take effect prior to searching for command. Use caution when
reducing PATH; behavior is not portable when PATH is undefined or omits key directories
such as /bin.
In the rare case that a utility contains a ‘=’ in the name, the only way to disambiguate
it from a variable assignment is to use an intermediate command for command, and pass
the problematic program name via args. For example, if ./prog= is an executable in the
current PATH:
env prog= true # runs 'true', with prog= in environment
env ./prog= true # runs 'true', with ./prog= in environment
env -- prog= true # runs 'true', with prog= in environment
env sh -c '\prog= true' # runs 'prog=' with argument 'true'
env sh -c 'exec "$@"' sh prog= true # also runs 'prog='
If no command name is specified following the environment specifications, the resulting
environment is printed. This is like specifying the printenv program.
For some examples, suppose the environment passed to env contains ‘LOGNAME=rms’,
‘EDITOR=emacs’, and ‘PATH=.:/gnubin:/hacks’:
• Output the current environment.
$ env | LC_ALL=C sort
EDITOR=emacs
LOGNAME=rms
PATH=.:/gnubin:/hacks
• Run foo with a reduced environment, preserving only the original PATH to avoid
problems in locating foo.
env - PATH="$PATH" foo
• Run foo with the environment containing ‘LOGNAME=rms’, ‘EDITOR=emacs’, and
‘PATH=.:/gnubin:/hacks’, and guarantee that foo is found in the file system rather
than as a shell built-in.
env foo
• Run nemacs with the environment containing ‘LOGNAME=foo’, ‘EDITOR=emacs’,
‘PATH=.:/gnubin:/hacks’, and ‘DISPLAY=gnu:0’.
env DISPLAY=gnu:0 LOGNAME=foo nemacs
• Attempt to run the program /energy/-- (as that is the only possible path search result);
if the command exists, the environment will contain ‘LOGNAME=rms’ and ‘PATH=/energy’,
and the arguments will be ‘e=mc2’, ‘bar’, and ‘baz’.
env -u EDITOR PATH=/energy -- e=mc2 bar baz
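The effect of starting from an empty environment, or of removing a single variable, can be checked from inside the command that env runs. A minimal sketch, in which DEMO_INNER, DEMO_OUTER and DEMO are hypothetical variable names:

```shell
# -i: only the assignments given on the command line survive.
kept=$(DEMO_OUTER=x env -i PATH="$PATH" DEMO_INNER=1 \
         sh -c 'echo "${DEMO_INNER-unset} ${DEMO_OUTER-unset}"')
echo "$kept"       # 1 unset

# -u: remove one variable and leave the rest of the environment alone.
removed=$(DEMO=1 env -u DEMO sh -c 'echo "${DEMO-unset}"')
echo "$removed"    # unset
```

PATH is passed through explicitly in the first command so that sh can still be located.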
‘-0’
‘--null’ Output a zero byte (ASCII NUL) at the end of each line, rather than a newline.
This option enables other programs to parse the output even when that output
would contain data with embedded newlines.
‘-a arg’
‘--argv0=arg’
Override the zeroth argument passed to the command being executed. Without
this option a default value of command is used.
‘-u name’
‘--unset=name’
Remove variable name from the environment, if it was in the environment.
‘-’
‘-i’
‘--ignore-environment’
Start with an empty environment, ignoring the inherited environment.
‘-C dir’
‘--chdir=dir’
Change the working directory to dir before invoking command. This differs
from the shell built-in cd in that it starts command as a subprocess rather than
altering the shell’s own working directory; this allows it to be chained with other
commands that run commands in a different context. For example:
# Run 'true' with /chroot as its root directory and /srv as its working
# directory.
chroot /chroot env --chdir=/srv true
# Run 'true' with /build as its working directory, FOO=bar in its
# environment, and a time limit of five seconds.
env --chdir=/build FOO=bar timeout 5 true
‘--default-signal[=sig]’
Unblock and reset signal sig to its default signal handler. Without sig all
known signals are unblocked and reset to their defaults. Multiple signals can be
comma-separated. An empty sig argument is a no-op. The following command
runs seq with SIGINT and SIGPIPE set to their default (which is to terminate
the program):
env --default-signal=PIPE,INT seq 1000 | head -n1
In the following example, we see how this is not possible to do with traditional
shells. Here the first trap command sets SIGPIPE to ignore. The second trap
command ostensibly sets it back to its default, but POSIX mandates that the
shell must not change inherited state of the signal – so it is a no-op.
trap '' PIPE && sh -c 'trap - PIPE ; seq inf | head -n1'
Using --default-signal=PIPE we can ensure the signal handling is set to its
default behavior:
trap '' PIPE && sh -c 'env --default-signal=PIPE seq inf | head -n1'
‘--ignore-signal[=sig]’
Ignore signal sig when running a program. Without sig all known signals are set
to ignore. Multiple signals can be comma-separated. An empty sig argument
is a no-op. The following command runs seq with SIGINT set to be ignored –
pressing Ctrl-C will not terminate it:
env --ignore-signal=INT seq inf > /dev/null
‘SIGCHLD’ is special, in that --ignore-signal=CHLD might have no effect (POSIX
says it’s unspecified).
Most operating systems do not allow ignoring ‘SIGKILL’, ‘SIGSTOP’ (and possibly
other signals). Attempting to ignore these signals will fail.
Multiple (and contradictory) --default-signal=SIG and --ignore-signal=SIG
options are processed left-to-right, with the latter taking precedence. In the
following example, ‘SIGPIPE’ is set to default while ‘SIGINT’ is ignored:
env --default-signal=INT,PIPE --ignore-signal=INT
‘--block-signal[=sig]’
Block signal(s) sig from being delivered. Without sig all known signals are set
to blocked. Multiple signals can be comma-separated. An empty sig argument
is a no-op.
‘--list-signal-handling’
List blocked or ignored signals to standard error, before executing a command.
‘-v’
‘--debug’ Show verbose information for each processing step.
$ env -v -uTERM A=B uname -s
unset: TERM
setenv: A=B
executing: uname
arg[0]= 'uname'
arg[1]= '-s'
Linux
When combined with -S it is recommended to list -v first, e.g. env -vS'string'.
‘-S string’
‘--split-string=string’
Process and split string into separate arguments. This is used to pass multiple
arguments on shebang lines. env supports FreeBSD’s syntax of several escape
sequences and environment variable expansions. See below for details and examples.
Exit status:
0 if no command is specified and the environment is output
125 if env itself fails
126 if command is found but cannot be invoked
127 if command cannot be found
the exit status of command otherwise
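The statuses above can be observed directly. The following sketch assumes GNU env; no-such-command-demo and --no-such-option-demo are deliberately bogus names, and ‘/’ stands in for a file that exists but cannot be executed:

```shell
# --chdir changes the directory for the command only.
where=$(env --chdir=/ pwd)
echo "$where"                                        # /

env no-such-command-demo 2>/dev/null
status_not_found=$?                                  # 127: cannot be found

env / 2>/dev/null
status_not_invocable=$?                              # 126: found, not invocable

env --no-such-option-demo 2>/dev/null
status_env_failed=$?                                 # 125: env itself failed

echo "$status_not_found $status_not_invocable $status_env_failed"
```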
Most operating systems (e.g. GNU/Linux, BSDs) treat all text after the first space as
a single argument. When using env in a script it is thus not possible to specify multiple
arguments.
In the following example:
#!/usr/bin/env perl -T -w
print "hello\n";
The operating system treats ‘perl -T -w’ as one argument (the program’s name), and
executing the script fails with:
/usr/bin/env: 'perl -T -w': No such file or directory
The -S option instructs env to split the single string into multiple arguments. The
following example works as expected:
$ cat hello.pl
#!/usr/bin/env -S perl -T -w
print "hello\n";
In the following contrived example the awk variable ‘OFS’ will be <space>xyz<space>
as these spaces are inside double quotes. The other space characters are used as argument
separators:
$ cat one.awk
#!/usr/bin/env -S awk -v OFS=" xyz " -f
BEGIN {print 1,2,3}
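The splitting can be observed without relying on the kernel's shebang handling by passing env the same single argument that the kernel would construct. This is a sketch that assumes an env supporting -S and an awk on PATH; the script is written to a temporary file:

```shell
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env -S awk -v OFS=" xyz " -f
BEGIN {print 1,2,3}
EOF

# The kernel hands env everything after the interpreter name as ONE
# argument, followed by the script's file name; this call imitates that.
out=$(env '-S awk -v OFS=" xyz " -f' "$script")
echo "$out"          # 1 xyz 2 xyz 3

rm -f "$script"
```

awk treats the first line of the script as a comment, so the same file works whether it is run directly or through this imitation.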
Escape sequences
env supports several escape sequences. These sequences are processed when unquoted or
inside double quotes (unless otherwise noted). Single quotes disable escape sequences except
‘\'’ and ‘\\’.
\c Ignore the remaining characters in the string. Cannot be used inside double
quotes.
\# A hash ‘#’ character. Used when a ‘#’ character is needed as the first character
of an argument (see the ‘Comments’ section below).
\$ A dollar-sign character ‘$’. Unescaped ‘$’ characters are used to expand
environment variables (see the ‘Variables’ section below).
\' A single-quote character. This escape sequence works inside single-quoted strings.
The following awk script uses a tab character as both the input and output field
separator (instead of spaces and tabs):
$ cat tabs.awk
#!/usr/bin/env -S awk -v FS="\t" -v OFS="\t" -f
...
Comments
The escape sequence ‘\c’ (used outside single/double quotes) causes env to ignore the rest
of the string.
The ‘#’ character causes env to ignore the rest of the string when it appears as the first
character of an argument. Use ‘\#’ to reverse this behavior.
$ env -S'printf %s\n A B C'
A
B
C
The following Python script prepends /opt/custom/modules to the Python module search
path environment variable (‘PYTHONPATH’):
$ cat custom.py
#!/usr/bin/env -S PYTHONPATH=/opt/custom/modules/:${PYTHONPATH} python
print("hello")
...
The expansion of ‘${PYTHONPATH}’ is performed by env, not by a shell. If the curly
braces are omitted, env will fail:
$ cat custom.py
#!/usr/bin/env -S PYTHONPATH=/opt/custom/modules/:$PYTHONPATH python
print("hello")
...
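The difference between the two forms can also be seen from the command line, using single quotes so that the shell leaves the ‘$’ for env to process. This is a sketch assuming GNU env; DEMO_DIR is a hypothetical variable name:

```shell
# With braces, env itself expands the variable.
expanded=$(DEMO_DIR=/opt/custom env -S 'printf %s\n ${DEMO_DIR}/modules')
echo "$expanded"     # /opt/custom/modules

# Without braces, env rejects the string and exits with a nonzero status.
DEMO_DIR=/opt/custom env -S 'printf %s\n $DEMO_DIR/modules' 2>/dev/null
status_no_braces=$?
echo "$status_no_braces"
```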
limits. An attempt to set the niceness outside the supported range is treated as an attempt
to use the minimum or maximum supported value.
A niceness should not be confused with a scheduling priority, which lets applications
determine the order in which threads are scheduled to run. Unlike a priority, a niceness is
merely advice to the scheduler, which the scheduler is free to ignore. Also, as a point of
terminology, POSIX defines the behavior of nice in terms of a nice value, which is the non-
negative difference between a niceness and the minimum niceness. Though nice conforms
to POSIX, its documentation and diagnostics use the term “niceness” for compatibility with
historical practice.
command must not be a special built-in utility (see Section 2.10 [Special built-in utilities],
page 10).
Due to shell aliases and built-in nice functions, using an unadorned nice interactively
or in a script may get you different functionality than that described here. Invoke it via env
(i.e., env nice ...) to avoid interference from the shell.
To change the niceness of an existing process, one needs to use the renice command.
The program accepts the following option. Also see Chapter 2 [Common options], page 2.
Options must precede operands.
‘-n adjustment’
‘--adjustment=adjustment’
Add adjustment instead of 10 to the command’s niceness. If adjustment is
negative and you lack appropriate privileges, nice issues a warning but otherwise
acts as if you specified a zero adjustment.
For compatibility nice also supports an obsolete option syntax -adjustment.
New scripts should use -n adjustment instead.
nice is installed only on systems that have the POSIX setpriority function, so portable
scripts should not rely on its existence on non-POSIX platforms.
Exit status:
0 if no command is specified and the niceness is output
125 if nice itself fails
126 if command is found but cannot be invoked
127 if command cannot be found
the exit status of command otherwise
It is sometimes useful to run a non-interactive program with increased niceness.
$ nice factor 4611686018427387903
Since nice prints the current niceness, you can invoke it through itself to demonstrate
how it works.
The default behavior is to increase the niceness by ‘10’:
$ nice
0
$ nice nice
10
$ nice -n 10 nice
10
The adjustment is relative to the current niceness. In the next example, the first nice
invocation runs the second one with niceness 10, and it in turn runs the final one with a
niceness that is 3 more:
$ nice nice -n 3 nice
13
Specifying a niceness larger than the supported range is the same as specifying the
maximum supported value:
$ nice -n 10000000000 nice
19
Only a privileged user may run a process with lower niceness:
$ nice -n -1 nice
nice: cannot set niceness: Permission denied
0
$ sudo nice -n -1 nice
-1
Exit status:
125 if nohup itself fails, and POSIXLY_CORRECT is not set
126 if command is found but cannot be invoked
127 if command cannot be found
the exit status of command otherwise
If POSIXLY_CORRECT is set, internal failures give status 127 instead of 125.
The specified duration starts from the point in time when timeout sends the
initial signal to command, i.e., not from the beginning when the command is
started.
This option has no effect if either the main duration of the timeout command,
or the duration specified to this option, is 0.
This option may be useful if the selected signal did not kill the command, either
because the signal was blocked or ignored, or if the command takes too long
(e.g. for cleanup work) to terminate itself within a certain amount of time.
‘-s signal’
‘--signal=signal’
Send this signal to command on timeout, rather than the default ‘TERM’ sig-
nal. signal may be a name like ‘HUP’ or a number. See Section 2.3 [Signal
specifications], page 5.
‘-v’
‘--verbose’
Diagnose to standard error, any signal sent upon timeout.
duration is a floating point number in either the current or the C locale (see Section 2.12
[Floating point], page 10) followed by an optional unit:
‘s’ for seconds (the default)
‘m’ for minutes
‘h’ for hours
‘d’ for days
A duration of 0 disables the associated timeout. The actual timeout duration is dependent
on system conditions, which should be especially considered when specifying sub-second
timeouts.
Exit status:
124 if command times out, and --preserve-status is not specified
125 if timeout itself fails
126 if command is found but cannot be invoked
127 if command cannot be found
137 if command or timeout is sent the KILL(9) signal (128+9)
the exit status of command otherwise
In the case of the ‘KILL(9)’ signal, timeout returns with exit status 137, regardless of
whether that signal is sent to command or to timeout itself, i.e., these cases cannot be
distinguished. In the latter case, the command process may still be alive after timeout has
forcefully been terminated.
Examples:
# Send the default TERM signal after 20s to a short-living 'sleep 1'.
# As that terminates long before the given duration, 'timeout' returns
# with the same exit status as the command, 0 in this case.
timeout 20 sleep 1
# Send the INT signal after 5s to the 'sleep' command. Returns after
# 5 seconds with exit status 124 to indicate the sending of the signal.
timeout -s INT 5 sleep 20
# Likewise, but the command ignoring the INT signal due to being started
# via 'env --ignore-signal'. Thus, 'sleep' terminates regularly after
# the full 20 seconds, still 'timeout' returns with exit status 124.
timeout -s INT 5s env --ignore-signal=INT sleep 20
# Likewise, but sending the KILL signal 3 seconds after the initial
# INT signal. Hence, 'sleep' is forcefully terminated after about
# 8 seconds (5+3), and 'timeout' returns with an exit status of 137.
timeout -s INT -k 3s 5s env --ignore-signal=INT sleep 20
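A calling script can branch on the exit statuses listed above. The following is a minimal sketch, not part of timeout itself; the function name run_limited is a hypothetical helper introduced only for illustration, and it discards standard error to keep the output tidy.

```shell
# run_limited DURATION COMMAND [ARG]...
# Hypothetical helper: run COMMAND under 'timeout' and describe the outcome.
run_limited() {
    status=0
    # Diagnostics from 'timeout' and from COMMAND are discarded here.
    timeout "$@" 2>/dev/null || status=$?
    case $status in
        0)   echo "succeeded" ;;
        124) echo "timed out" ;;
        125) echo "timeout itself failed" ;;
        126) echo "found but could not be invoked" ;;
        127) echo "not found" ;;
        137) echo "killed by KILL" ;;
        *)   echo "exited with status $status" ;;
    esac
}

quick=$(run_limited 5 true)                # finishes well within 5 seconds
slow=$(run_limited 0.2 sleep 5)            # still running after 0.2 seconds
missing=$(run_limited 5 /nonexistent/cmd)  # no such command
```

With these inputs, quick should be "succeeded", slow "timed out", and missing "not found". Status 137 cannot tell whether KILL was sent to the command or to timeout, as noted above.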
24 Process control
25 Delaying
26 Numeric operations
These programs do numerically-related operations.
real 0m0.004s
user 0m0.004s
sys 0m0.000s
For larger numbers, factor uses a slower algorithm. On the same platform, factoring the
eighth Fermat number 2^256 + 1 takes about 14 seconds, and the slower algorithm would have
taken about 750 ms to factor 2^127 − 3 instead of the 50 ms needed by the faster algorithm.
Factoring large numbers is, in general, hard. The Pollard-Brent rho algorithm used by
factor is particularly effective for numbers with relatively small factors. If you wish to
factor large numbers which do not have small factors (for example, numbers which are the
product of two large primes), other methods are far better.
An exit status of zero indicates success, and a nonzero value indicates failure.
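As a small illustration of the output format, the sketch below factors one number with only small prime factors and the number 2^62 − 1 used in the nice example earlier in this manual. The variable names are arbitrary.

```shell
# A number with only small prime factors.
small=$(factor 360)
# 2^62 - 1 = 3 * 715827883 * 2147483647.
large=$(factor 4611686018427387903)

echo "$small"   # 360: 2 2 2 3 3 5
echo "$large"   # 4611686018427387903: 3 715827883 2147483647
```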
numfmt converts each number on the command-line according to the specified options
(see below). If no numbers are given, it reads numbers from standard input. numfmt can
optionally extract numbers from specific columns, maintaining proper line padding and
alignment.
An exit status of zero indicates success, and a nonzero value indicates failure. See
--invalid for additional information regarding exit status.
‘--header[=n]’
Print the first n (default: 1) lines without any conversion.
‘--invalid=mode’
The default action on input errors is to exit immediately with status code 2.
--invalid=‘abort’ explicitly specifies this default mode. With a mode of
‘fail’, print a warning for each conversion error, and exit with status 2. With
a mode of ‘warn’, exit with status 0, even in the presence of conversion errors,
and with a mode of ‘ignore’ do not even print diagnostics.
‘--padding=n’
Pad the output numbers to n characters, by adding spaces. If n is a positive
number, numbers will be right-aligned. If n is a negative number, numbers will
be left-aligned. By default, numbers are automatically aligned based on the
input line’s width (only with the default delimiter).
‘--round=method’
When converting number representations, round the number according to method,
which can be ‘up’, ‘down’, ‘from-zero’ (the default), ‘towards-zero’, ‘nearest’.
‘--suffix=suffix’
Add ‘SUFFIX’ to the output numbers, and accept optional ‘SUFFIX’ in input
numbers.
‘--to=unit’
Auto-scales output numbers according to unit. See Units below. The default is
no scaling, meaning all the digits of the number are printed.
‘--to-unit=n’
Specify the output unit size (instead of the default 1). Use this option when
the output numbers represent other units (e.g. to represent ‘4,000,000’ bytes
in blocks of 1kB, use ‘--to=si --to-unit=1000’). Suffixes are handled as with
‘--from=auto’.
‘-z’
‘--zero-terminated’
Delimit items with a zero byte rather than a newline (ASCII LF). I.e., treat
input as items separated by ASCII NUL and terminate output items with ASCII
NUL. This option can be useful in conjunction with ‘perl -0’ or ‘find -print0’
and ‘xargs -0’ which do the same in order to reliably handle arbitrary file
names (even those containing blanks or other special characters). With -z the
newline character is treated as a field separator.
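The following sketch exercises three of the options described above: --invalid, --round and --padding. It assumes the C locale so that the decimal point is ‘.’; the helper name status_of and the sample values are hypothetical.

```shell
LC_ALL=C; export LC_ALL    # assumption: '.' as the decimal point

# --invalid: 'abc' is not a number, so each mode treats it differently.
# status_of is a hypothetical helper that reports numfmt's exit status.
status_of() {
    s=0
    numfmt --invalid="$1" --to=si 5000 abc >/dev/null 2>&1 || s=$?
    echo "$s"
}
abort_status=$(status_of abort)     # 2: stop at the first error
fail_status=$(status_of fail)       # 2: warn about each error, then fail
warn_status=$(status_of warn)       # 0: diagnose, but succeed
ignore_status=$(status_of ignore)   # 0: succeed silently

# --round: 1910000 is 1.91M and must be rounded to one decimal place.
round_default=$(numfmt --to=si 1910000)                  # from-zero: 2.0M
round_down=$(numfmt --to=si --round=down 1910000)        # 1.9M
round_nearest=$(numfmt --to=si --round=nearest 1910000)  # 1.9M

# --padding: positive widths right-align, negative widths left-align.
right=$(numfmt --padding=8 42)
left=$(numfmt --padding=-8 42)
printf '[%s]\n[%s]\n' "$right" "$left"
```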
numbers, values larger than 1000 will be rounded, and printed with one of the
following suffixes:
‘K’ => 1000^1 = 10^3 (Kilo) (uppercase accepted on input)
‘k’ => 1000^1 = 10^3 (Kilo) (lowercase used on output)
‘M’ => 1000^2 = 10^6 (Mega)
‘G’ => 1000^3 = 10^9 (Giga)
‘T’ => 1000^4 = 10^12 (Tera)
‘P’ => 1000^5 = 10^15 (Peta)
‘E’ => 1000^6 = 10^18 (Exa)
‘Z’ => 1000^7 = 10^21 (Zetta)
‘Y’ => 1000^8 = 10^24 (Yotta)
‘R’ => 1000^9 = 10^27 (Ronna)
‘Q’ => 1000^10 = 10^30 (Quetta)
iec Auto-scale numbers according to the International Electrotechnical Commission
(IEC) standard. For input numbers, accept one of the following suffixes. For
output numbers, values larger than 1024 will be rounded, and printed with one
of the following suffixes:
‘K’ => 1024^1 = 2^10 (Kibi) (uppercase used on output)
‘k’ => 1024^1 = 2^10 (Kibi) (lowercase accepted on input)
‘M’ => 1024^2 = 2^20 (Mebi)
‘G’ => 1024^3 = 2^30 (Gibi)
‘T’ => 1024^4 = 2^40 (Tebi)
‘P’ => 1024^5 = 2^50 (Pebi)
‘E’ => 1024^6 = 2^60 (Exbi)
‘Z’ => 1024^7 = 2^70 (Zebi)
‘Y’ => 1024^8 = 2^80 (Yobi)
‘R’ => 1024^9 = 2^90 (Robi)
‘Q’ => 1024^10 = 2^100 (Quebi)
The iec option uses a single-letter suffix (e.g. ‘G’), which is not fully standard,
as the IEC standard recommends a two-letter symbol (e.g. ‘Gi’) – but in practice,
this method is common. Compare with the iec-i option.
iec-i Auto-scale numbers according to the International Electrotechnical Commission
(IEC) standard. For input numbers, accept one of the following suffixes. For
output numbers, values larger than 1024 will be rounded, and printed with one
of the following suffixes:
‘Ki’ => 1024^1 = 2^10 (Kibi) (uppercase used on output)
‘ki’ => 1024^1 = 2^10 (Kibi) (lowercase accepted on input)
‘Mi’ => 1024^2 = 2^20 (Mebi)
‘Gi’ => 1024^3 = 2^30 (Gibi)
‘Ti’ => 1024^4 = 2^40 (Tebi)
‘Pi’ => 1024^5 = 2^50 (Pebi)
‘Ei’ => 1024^6 = 2^60 (Exbi)
‘Zi’ => 1024^7 = 2^70 (Zebi)
‘Yi’ => 1024^8 = 2^80 (Yobi)
‘Ri’ => 1024^9 = 2^90 (Robi)
Chapter 26: Numeric operations 232
$ numfmt --from=si 1M
1000000
$ numfmt --from=iec 1M
1048576
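The same suffix can stand for different multipliers depending on the unit, as the tables above show. The sketch below compares the three units side by side; the sample values are arbitrary and the C locale is assumed for the decimal point.

```shell
LC_ALL=C; export LC_ALL    # assumption: '.' as the decimal point

# Input: 'K' means 1000 under si and 1024 under iec.
from_si=$(numfmt --from=si 2K)          # 2000
from_iec=$(numfmt --from=iec 2K)        # 2048
from_iec_i=$(numfmt --from=iec-i 2Ki)   # 2048, with a two-letter suffix

# Output: iec and iec-i scale identically and differ only in the suffix.
to_iec=$(numfmt --to=iec 2048)          # 2.0K
to_iec_i=$(numfmt --to=iec-i 4096)      # 4.0Ki
```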
2,14,74,83,648
seq prints the numbers from first to last by increment. By default, each number is
printed on a separate line. When increment is not specified, it defaults to ‘1’, even when
first is larger than last. first also defaults to ‘1’. So seq 1 prints ‘1’, but seq 0 and seq 10 5
produce no output. The sequence of numbers ends when the sum of the current number and
increment would become greater than last, so seq 1 10 10 only produces ‘1’. increment must
not be ‘0’; use the tool yes to get repeated output of a constant number. first, increment
and last must not be NaN, but inf is supported. Floating-point numbers may be specified
in either the current or the C locale. See Section 2.12 [Floating point], page 10.
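The defaults described above can be checked directly. The following sketch captures the output of each case in a shell variable (the names are arbitrary). The last line adds a case not spelled out above: counting down requires an explicit negative increment.

```shell
one=$(seq 1)              # first defaults to 1, so this prints just "1"
none_a=$(seq 0)           # 1 is already greater than 0: no output
none_b=$(seq 10 5)        # increment defaults to 1, not -1: no output
capped=$(seq 1 10 10)     # 1 + 10 would exceed 10: only "1" is printed
down=$(seq -s ' ' 10 -5 0)   # explicit negative increment: "10 5 0"
```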
The program accepts the following options. Also see Chapter 2 [Common options], page 2.
Options must precede operands.
‘-f format’
‘--format=format’
Print all numbers using format. format must contain exactly one of the ‘printf’-
style floating point conversion specifications ‘%a’, ‘%e’, ‘%f’, ‘%g’, ‘%A’, ‘%E’, ‘%F’,
‘%G’. The ‘%’ may be followed by zero or more flags taken from the set ‘-+#0 '’,
then an optional width containing one or more digits, then an optional precision
consisting of a ‘.’ followed by zero or more digits. format may also contain any
number of ‘%%’ conversion specifications. All conversion specifications have the
same meaning as with ‘printf’.
The default format is derived from first, step, and last. If these all use a
fixed point decimal representation, the default format is ‘%.pf’, where p is the
minimum precision that can represent the output numbers exactly. Otherwise,
the default format is ‘%g’.
‘-s string’
‘--separator=string’
Separate numbers with string; default is a newline. The output always terminates
with a newline.
‘-w’
‘--equal-width’
Print all numbers with the same width, by padding with leading zeros. first,
step, and last should all use a fixed point decimal representation. (To have other
kinds of padding, use --format).
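As a brief sketch of the three options above, each command below prints its sequence on a single line. The values are arbitrary, and the C locale is forced for the -f case so that the decimal point is ‘.’.

```shell
csv=$(seq -s, 1 5)                               # 1,2,3,4,5
padded=$(seq -s ' ' -w 8 11)                     # 08 09 10 11
fixed=$(LC_ALL=C seq -s ' ' -f '%.2f' 1 0.5 2)   # 1.00 1.50 2.00
```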
You can get finer-grained control over output with -f:
$ seq -f '(%9.2E)' -9e5 1.1e6 1.3e6
(-9.00E+05)
( 2.00E+05)
( 1.30E+06)
If you want hexadecimal integer output, you can use printf to perform the conversion:
$ printf '%x\n' $(seq 1048575 1024 1050623)
fffff
1003ff
1007ff
For very long lists of numbers, use xargs to avoid system limitations on the length of an
argument list:
$ seq 1000000 | xargs printf '%x\n' | tail -n 3
f423e
f423f
f4240
To generate octal output, use the printf %o format instead of %x.
On most systems, seq can produce whole-number output for values up to at least 2^53.
Larger integers are approximated. The details differ depending on your floating-point
implementation. See Section 2.12 [Floating point], page 10. A common case is that seq
works with integers through 2^64, and larger integers may not be numerically correct:
$ seq 50000000000000000000 2 50000000000000000004
50000000000000000000
50000000000000000000
50000000000000000004
However, when limited to non-negative whole numbers, an increment of less than 200,
and no format-specifying option, seq can print arbitrarily large numbers. Therefore seq inf
can be used to generate an infinite sequence of numbers.
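For instance, the following sketch crosses from 20-digit to 21-digit numbers, well beyond 2^64, using the default increment and no format option. It assumes a seq recent enough to have this arbitrary-precision behavior.

```shell
# Non-negative whole numbers, default increment, no -f: printed exactly.
big=$(seq 99999999999999999999 100000000000000000001)
echo "$big"
# 99999999999999999999
# 100000000000000000000
# 100000000000000000001
```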
Be careful when using seq with outlandish values: otherwise you may see surprising
results, as seq uses floating point internally. For example, on the x86 platform, where the
internal representation uses a 64-bit fraction, the command:
seq 1 0.0000000000000000001 1.0000000000000000009
outputs 1.0000000000000000007 twice and skips 1.0000000000000000008.
An exit status of zero indicates success, and a nonzero value indicates failure.
27 File permissions
Each file has a set of file mode bits that control the kinds of access that users have to that
file. They can be represented either in symbolic form or as an octal number.
In addition to the file mode bits listed above, there may be file attributes specific to the
file system, e.g., access control lists (ACLs), whether a file is compressed, whether a file can
be modified (immutability), and whether a file can be dumped. These are usually set using
programs specific to the file system. For example:
ext2 On GNU and GNU/Linux the file attributes specific to the ext2 file system are
set using chattr.
FFS On FreeBSD the file flags specific to the FFS file system are set using chflags.
Even if a file’s mode bits allow an operation on that file, that operation may still fail,
because:
• the file-system-specific attributes or flags do not permit it; or
• the file system is mounted as read-only.
For example, if the immutable attribute is set on a file, it cannot be modified, regardless
of the fact that you may have just run chmod a+w FILE.
The operation part tells how to change the affected users’ access to the file, and is one of
the following symbols:
+ to add the permissions to whatever permissions the users already have for the
file;
- to remove the permissions from whatever permissions the users already have for
the file;
= to make the permissions the only permissions that the users have for the file.
The permissions part tells what kind of access to the file should be changed; it is normally
zero or more of the following letters. As with the users part, the order does not matter
when more than one letter is given. Omitting the permissions part is useful only with the ‘=’
operation, where it gives the specified users no access at all to the file.
For example, to give everyone permission to read and write a regular file, but not to
execute it, use:
a=rw
To remove write permission for all users other than the file’s owner, use:
go-w
The above command does not affect the access that the owner of the file has to it, nor does
it affect whether other users can read or execute the file.
To give everyone except a file’s owner no permission to do anything with that file, use
the mode below. Other users could still remove the file, if they have write permission on the
directory it is in.
go=
Another way to specify the same thing is:
og-rwx
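The effect of these symbolic modes can be observed on a scratch file. This is a sketch only: it assumes GNU stat for the ‘-c %a’ format, which prints the mode bits in octal, and the variable names are arbitrary.

```shell
f=$(mktemp)                  # scratch file
chmod 777 "$f"               # start with every permission bit set

chmod a=rw "$f";   all_rw=$(stat -c %a "$f")       # 666
chmod go-w "$f";   no_go_write=$(stat -c %a "$f")  # 644
chmod go= "$f";    owner_only=$(stat -c %a "$f")   # 600

chmod 755 "$f"
chmod og-rwx "$f"; same_thing=$(stat -c %a "$f")   # 700

rm -f "$f"
```

Note that ‘go=’ and ‘og-rwx’ leave the owner's bits untouched in both cases; only the starting mode differs between the two runs.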
chmod u=rwx,go=rx,a+s G
mkdir -m 6755 H
mkdir -m +6000 I
mkdir -m u=rwx,go=rx,a+s J
If you want to try to clear these bits, you must mention them explicitly in a symbolic
mode, or use an operator numeric mode, or specify a numeric mode with five or more octal
digits, e.g.:
# These commands try to clear the set-user-ID
# and set-group-ID bits of the directory D.
chmod a-s D
chmod -6000 D
chmod =755 D
chmod 00755 D
This behavior is a GNU extension. Portable scripts should not rely on requests to set or
clear these bits on directories, as POSIX allows implementations to ignore these requests.
The GNU behavior with numeric modes of four or fewer digits is intended for scripts portable
to systems that preserve these bits; the behavior with numeric modes of five or more digits
is for scripts portable to systems that do not preserve the bits.
28 File timestamps
Standard POSIX files have three timestamps: the access timestamp (atime) of the last read,
the modification timestamp (mtime) of the last write, and the status change timestamp
(ctime) of the last change to the file’s meta-information. Some file systems support a fourth
time: the birth timestamp (birthtime) of when the file was created; by definition, birthtime
never changes.
One common example of a ctime change is when the permissions of a file change. Changing
the permissions doesn’t access the file, so atime doesn’t change, nor does it modify the file, so
the mtime doesn’t change. Yet, something about the file itself has changed, and this must be
noted somewhere. This is the job of the ctime field. This is necessary, so that, for example,
a backup program can make a fresh copy of the file, including the new permissions value.
Another operation that modifies a file’s ctime without affecting the others is renaming.
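The permission-change case can be observed with stat. The sketch below assumes GNU stat (‘%y’ prints the mtime, ‘%z’ the ctime) and a file system whose timestamp resolution is one second or finer; on other systems the result may differ, for the reasons discussed in the rest of this chapter.

```shell
f=$(mktemp)                      # created with mode 600
mtime_before=$(stat -c %y "$f")
ctime_before=$(stat -c %z "$f")

sleep 1                          # make the change visible at coarse resolutions
chmod 644 "$f"                   # changes meta-information only

mtime_after=$(stat -c %y "$f")
ctime_after=$(stat -c %z "$f")
rm -f "$f"

[ "$mtime_before" = "$mtime_after" ]  && echo "mtime unchanged"
[ "$ctime_before" != "$ctime_after" ] && echo "ctime updated"
```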
Naively, a file’s atime, mtime, and ctime are set to the current time whenever you read,
write, or change the attributes of the file respectively, and searching a directory counts as
reading it. A file’s atime and mtime can also be set directly, via the touch command (see
Section 13.4 [touch invocation], page 144). In practice, though, timestamps are not updated
quite that way.
For efficiency reasons, many systems are lazy about updating atimes: when a program
accesses a file, they may delay updating the file’s atime, or may not update the file’s atime
if the file has been accessed recently, or may not update the atime at all. Similar laziness,
though typically not quite so extreme, applies to mtimes and ctimes.
Some systems emulate timestamps instead of supporting them directly, and these emula-
tions may disagree with the naive interpretation. For example, a system may fake an atime
or ctime by using the mtime.
The determination of what time is “current” depends on the platform. Platforms with
network file systems often use different clocks for the operating system and for file systems;
because updates typically use file systems’ clocks by default, clock skew can cause the
resulting file timestamps to appear to be in a program’s “future” or “past”.
When the system updates a file timestamp to a desired time t (which is either the
current time, or a time specified via the touch command), there are several reasons the file’s
timestamp may be set to a value that differs from t. First, t may have a higher resolution
than supported. Second, a file system may use different resolutions for different types of times.
Third, file timestamps may use a different resolution than operating system timestamps.
Fourth, the operating system primitives used to update timestamps may employ yet a
different resolution. For example, in theory a file system might use 10-microsecond resolution
for access timestamp and 100-nanosecond resolution for modification timestamp, and the
operating system might use nanosecond resolution for the current time and microsecond
resolution for the primitive that touch uses to set a file’s timestamp to an arbitrary value.
‘fourth’ for 4, ‘fifth’ for 5, ‘sixth’ for 6, ‘seventh’ for 7, ‘eighth’ for 8, ‘ninth’ for 9,
‘tenth’ for 10, ‘eleventh’ for 11 and ‘twelfth’ for 12.
When a month is written this way, it is still considered to be written numerically, instead
of being “spelled in full”; this changes the allowed strings.
In the current implementation, only English is supported for words and abbreviations
like ‘AM’, ‘DST’, ‘EST’, ‘first’, ‘January’, ‘Sunday’, ‘tomorrow’, and ‘year’.
The output of the date command is not always acceptable as a date string, not only
because of the language problem, but also because there is no standard meaning for time
zone items like ‘IST’. When using date to generate a date string intended to be parsed later,
specify a date format that is independent of language and that does not use time zone items
other than ‘UTC’ and ‘Z’. Here are some ways to do this:
$ LC_ALL=C TZ=UTC0 date
Tue Nov 15 02:02:42 UTC 2022
$ TZ=UTC0 date +'%Y-%m-%d %H:%M:%SZ'
2022-11-15 02:02:42Z
$ date --rfc-3339=ns # --rfc-3339 is a GNU extension.
2022-11-14 21:02:42.000000000-05:00
$ date --rfc-email # a GNU extension
Mon, 14 Nov 2022 21:02:42 -0500
$ date +'%Y-%m-%d %H:%M:%S %z' # %z is a GNU extension.
2022-11-14 21:02:42 -0500
$ date +'@%s.%N' # %s and %N are GNU extensions.
@1668477762.692722128
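A quick way to check that a chosen format survives a round trip is to format a known instant and parse the result back. The sketch below uses the instant from the transcript above; it assumes GNU date for --date and ‘%s’.

```shell
# Format a fixed instant in the language-independent 'Z' style...
stamp=$(TZ=UTC0 date --date='@1668477762' +'%Y-%m-%d %H:%M:%SZ')
# ...then parse the string back into seconds since the Epoch.
epoch=$(date --date="$stamp" +%s)

echo "$stamp"   # 2022-11-15 02:02:42Z
echo "$epoch"   # 1668477762
```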
Alphabetic case is completely ignored in dates. Comments may be introduced between
round parentheses, as long as included parentheses are properly nested. Hyphens not followed
by a digit are currently ignored. Leading zeros on numbers are ignored.
Invalid dates like ‘2022-02-29’ or times like ‘24:00’ are rejected. In the typical case
of a host that does not support leap seconds, a time like ‘23:59:60’ is rejected even if it
corresponds to a valid leap second.
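These rules can be probed from the shell by checking whether date accepts a string. In this sketch, accepts is a hypothetical helper, and UTC is used to keep daylight saving transitions out of the picture.

```shell
# Hypothetical helper: succeed if GNU date can parse its argument.
accepts() {
    TZ=UTC0 date --date="$1" >/dev/null 2>&1
}

accepts '2024-02-29'  && leap_day=accepted    || leap_day=rejected
accepts '2022-02-29'  && no_leap_day=accepted || no_leap_day=rejected
accepts '24:00'       && hour_24=accepted     || hour_24=rejected
accepts 'NOV 14 2022' && upper_case=accepted  || upper_case=rejected
```

Here 2024 is a leap year, so the first string is accepted, while the second and third are rejected. The last string is accepted because alphabetic case is ignored.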
11/14
nov 14
Here are the rules.
For numeric months, the ISO 8601 format ‘year-month-day’ is allowed, where year is
any positive number, month is a number between 01 and 12, and day is a number between
01 and 31. A leading zero must be present if a number is less than ten. If year is 68 or
smaller, then 2000 is added to it; otherwise, if year is less than 100, then 1900 is added
to it. The construct ‘month/day/year’, popular in the United States, is accepted. Also
‘month/day’, omitting the year.
Literal months may be spelled out in full: ‘January’, ‘February’, ‘March’, ‘April’, ‘May’,
‘June’, ‘July’, ‘August’, ‘September’, ‘October’, ‘November’ or ‘December’. Literal months
may be abbreviated to their first three letters, possibly followed by an abbreviating dot. It
is also permitted to write ‘Sept’ instead of ‘September’.
When months are written literally, the calendar date may be given as any of the following:
day month year
day month
month day year
day-month-year
Or, omitting the year:
month day
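The forms above that include a year can be compared by parsing each one and printing the result in a single format. This is a sketch assuming GNU date; the sample date is arbitrary.

```shell
# Parse several spellings of the same calendar date; after removing
# duplicates, only one distinct result should remain.
distinct=$(
    for form in '2022-11-14' '11/14/2022' '14 November 2022' \
                'November 14 2022' '14-nov-2022' 'Nov. 14 2022'
    do
        TZ=UTC0 date --date="$form" +%F
    done | sort -u
)
echo "$distinct"   # 2022-11-14
```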
Coordinated Universal Time (UTC), overriding any previous specification for the time zone
or the local time zone. For example, ‘+0530’ and ‘+05:30’ both stand for the time zone 5.5
hours ahead of UTC (e.g., India). This is the best way to specify a time zone correction by
fractional parts of an hour. The maximum zone correction is 24 hours.
Either ‘am’/‘pm’ or a time zone correction may be specified, but not both.
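As a sketch of the two equivalent spellings, the commands below interpret noon at 5.5 hours ahead of UTC and print the corresponding UTC time. The sample date is arbitrary.

```shell
compact=$(date --utc --date='2022-11-14 12:00 +0530' +'%Y-%m-%d %H:%M')
colon=$(date --utc --date='2022-11-14 12:00 +05:30' +'%Y-%m-%d %H:%M')

echo "$compact"   # 2022-11-14 06:30
echo "$colon"     # 2022-11-14 06:30
```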
A number may precede a day of the week item to move forward supplementary weeks.
It is best used in expressions like ‘third monday’. In this context, ‘last day’ or ‘next day’
is also acceptable; they move one week before or after the day that day by itself would
represent.
A comma following a day of the week item is ignored.
so it is often wise to adopt universal time by setting the TZ environment variable to ‘UTC0’
before embarking on calendrical calculations.
For example, with the GNU date command you can answer the question “What time is
it in New York when a Paris clock shows 6:30am on October 31, 2022?” by using a date
beginning with ‘TZ="Europe/Paris"’ as shown in the following shell transcript:
$ export TZ="America/New_York"
$ date --date='TZ="Europe/Paris" 2022-10-31 06:30'
Mon Oct 31 01:30:00 EDT 2022
In this example, the --date operand begins with its own TZ setting, so the rest of that
operand is processed according to ‘Europe/Paris’ rules, treating the string ‘2022-10-31
06:30’ as if it were in Paris. However, since the output of the date command is processed
according to the overall time zone rules, it uses New York time. (Paris was normally six
hours ahead of New York in 2022, but this example refers to a brief Halloween period when
the gap was five hours.)
A TZ value is a rule that typically names a location in the ‘tz’ database
(https://www.iana.org/time-zones). A recent catalog of location names appears in the TWiki Date and
Time Gateway (https://twiki.org/cgi-bin/xtra/tzdatepick.html). A few non-GNU
hosts require a colon before a location name in a TZ setting, e.g., ‘TZ=":America/New_York"’.
The ‘tz’ database includes a wide variety of locations ranging from ‘Africa/Abidjan’ to
‘Pacific/Tongatapu’, but if you are at sea and have your own private time zone, or if you
are using a non-GNU host that does not support the ‘tz’ database, you may need to use a
POSIX rule instead. The previously-mentioned POSIX rule ‘UTC0’ says that the time zone
abbreviation is ‘UTC’, the zone is zero hours away from Greenwich, and there is no daylight
saving time. POSIX rules can also specify nonzero Greenwich offsets. For example, the
following shell transcript answers the question “What time is it five and a half hours east of
Greenwich when a clock seven hours west of Greenwich shows 9:50pm on July 12, 2022?”
$ TZ="<+0530>-5:30" date --date='TZ="<-07>+7" 2022-07-12 21:50'
Wed Jul 13 10:20:00 +0530 2022
This example uses the somewhat-confusing POSIX convention for rules. ‘TZ="<-07>+7"’
says that the time zone abbreviation is ‘-07’ and the time zone is 7 hours west of Green-
wich, and ‘TZ="<+0530>-5:30"’ says that the time zone abbreviation is ‘+0530’ and the
time zone is 5 hours 30 minutes east of Greenwich. (One should never use a setting
like ‘TZ="UTC-5"’, since this would incorrectly imply that local time is five hours east
of Greenwich and the time zone is called “UTC”.) Although trickier POSIX TZ settings
like ‘TZ="<-05>+5<-04>,M3.2.0/2,M11.1.0/2"’ can specify some daylight saving regimes,
location-based settings like ‘TZ="America/New_York"’ are typically simpler and more accu-
rate historically. See Section “Specifying the Time Zone with TZ” in The GNU C Library.
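The sign convention is easy to get backwards, so it can help to print the same instant under two settings. The sketch below formats the Epoch (1970-01-01 00:00 UTC) first under a rule for five hours west of Greenwich, then under the ‘UTC-5’ setting warned against above. It assumes GNU date and a C library that accepts the ‘<...>’ form.

```shell
# Abbreviation "-05", five hours west of Greenwich, no daylight saving.
west=$(TZ='<-05>+5' date --date='@0' +'%H:%M %Z %z')
# Called "UTC", yet actually five hours east of Greenwich.
misleading=$(TZ='UTC-5' date --date='@0' +'%H:%M %Z %z')

echo "$west"         # 19:00 -05 -0500
echo "$misleading"   # 05:00 UTC +0500
```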
parse more locale-specific dates using strptime, but relies on an environment variable and
external file, and lacks the thread-safety of parse_datetime.
This chapter was originally produced by François Pinard
from the parse_datetime.y source code, and then edited by K. Berry.
a1 a1
a120 a2
a13 a13
a2 a120
Version sort functionality in GNU Coreutils is available in the ‘ls -v’, ‘ls
--sort=version’, ‘sort -V’, and ‘sort --version-sort’ commands.
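The difference between the two orderings can be reproduced with sort alone. This sketch feeds the four names from the comparison above to a plain sort in the C locale and to ‘sort -V’, joining each result onto one line with paste.

```shell
plain=$(printf '%s\n' a1 a120 a13 a2 | LC_ALL=C sort | paste -s -d ' ')
version=$(printf '%s\n' a1 a120 a13 a2 | sort -V | paste -s -d ' ')

echo "$plain"     # a1 a120 a13 a2
echo "$version"   # a1 a2 a13 a120
```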
$ ls -1 $ ls -1 -v
a1 a1
a100 a1.4
a1.13 a1.13
a1.4 a1.40
a1.40 a2
a2 a100
To sort text files in version sort order, use sort with the -V or --version-sort option:
$ cat input
b3
b11
b1
b20
b3 b20
To sort a specific field in a file, use -k/--key with ‘V’ type sorting, which is often
combined with ‘b’ to ignore leading blanks in the field:
$ cat input2
100 b3 apples
2000 b11 oranges
3000 b1 potatoes
4000 b20 bananas
$ sort -k 2bV,2 input2
3000 b1 potatoes
100 b3 apples
2000 b11 oranges
4000 b20 bananas
foo07.7z
See Section 30.3 [Differences from Debian version sort], page 259, for additional rules
that extend the Debian algorithm in Coreutils.
1.0.5_src.tar.gz
1.0_src.tar.gz
Why is 1.0.5_src.tar.gz listed before 1.0_src.tar.gz?
Based on the version-sort ordering rules, the strings are broken down into the following
parts:
1 vs 1 (rule 3, all digits)
. vs . (rule 2, all non-digits)
0 vs 0 (rule 3)
. vs _src.tar.gz (rule 2)
5 vs empty string (no more bytes in the file name)
_src.tar.gz vs empty string
The fourth parts (‘.’ and ‘_src.tar.gz’) are compared lexically by ASCII order. The ‘.’
(ASCII value 46) is less than ‘_’ (ASCII value 95) – and should be listed before it.
Hence, 1.0.5_src.tar.gz is listed first.
If a different byte appears instead of the underscore (for example, percent sign ‘%’ ASCII
value 37, which is less than dot’s ASCII value of 46), that file will be listed first:
$ touch 1.0.5_src.tar.gz 1.0%zzzzz.gz
$ ls -1 -v
1.0%zzzzz.gz
1.0.5_src.tar.gz
The same reasoning applies to the following example, as ‘.’ with ASCII value 46 is less
than ‘/’ with ASCII value 47:
$ cat input5
3.0/
3.0.5
$ sort -V input5
3.0.5
3.0/
Then, percent sign ‘%’ (ASCII value 37) is compared to the first byte of the UTF-8
sequence of ‘α’, which is 0xCE (or 206). The value 37 is smaller, hence ‘a%’ is listed before
‘aα’.
• ‘hello-8.2.txt’: the suffix is ‘.txt’ (‘.2’ is not included because the dot is not followed
by a letter)
• ‘hello-8.0.12.tar.gz’: the suffix is ‘.tar.gz’ (‘.0.12’ is not included)
• ‘hello-8.2’: no suffix (suffix is an empty string)
• ‘hello.foobar65’: the suffix is ‘.foobar65’
• ‘gcc-c++-10.8.12-0.7rc2.fc9.tar.bz2’: the suffix is ‘.fc9.tar.bz2’ (‘.7rc2’ is not
included as it begins with a digit)
• ‘.autom4te.cfg’: the suffix is the entire string.
Examples for rule 2:
• Comparing ‘hello-8.txt’ to ‘hello-8.2.12.txt’, the ‘.txt’ suffix is temporarily
removed from both strings.
• Comparing ‘foo-10.3.tar.gz’ to ‘foo-10.tar.xz’, the suffixes ‘.tar.gz’ and
‘.tar.xz’ are temporarily removed from the strings.
Example for rule 3:
• Comparing ‘hello.foobar65’ to ‘hello.foobar4’, the suffixes (‘.foobar65’ and
‘.foobar4’) are temporarily removed. The remaining strings are identical (‘hello’).
The suffixes are then restored, and the entire strings are compared (‘hello.foobar4’
comes first).
Examples for rule 4:
• When comparing the strings ‘hello-8.2.txt’ and ‘hello-8.10.txt’, the suffixes
(‘.txt’) are temporarily removed. The remaining strings (‘hello-8.2’ and
‘hello-8.10’) are compared as previously described (‘hello-8.2’ comes first). (In this
case the suffix removal algorithm does not have a noticeable effect on the resulting
order.)
How does the suffix-removal algorithm affect ordering results?
Consider the comparison of hello-8.txt and hello-8.2.txt.
Without the suffix-removal algorithm, the strings will be broken down to the following
parts:
hello- vs hello- (rule 2, all non-digits)
8 vs 8 (rule 3, all digits)
.txt vs . (rule 2)
empty vs 2
empty vs .txt
The comparison of the third parts (‘.txt’ vs ‘.’) will determine that the shorter string
comes first – resulting in hello-8.2.txt appearing first.
Indeed this is the order in which Debian’s dpkg compares the strings.
A more natural result is that hello-8.txt should come before hello-8.2.txt, and this
is where the suffix-removal comes into play:
The suffixes (‘.txt’) are removed, and the remaining strings are broken down into the
following parts:
hello- vs hello- (rule 2, all non-digits)
Chapter 30: Version sort ordering 262
abb ab-cd
To illustrate the different handling of file extensions (see Section 30.3.3 [Special handling
of file extensions], page 260):
$ compver hello-8.txt hello-8.2.txt 2>/dev/null
hello-8.2.txt
hello-8.txt
$ printf '%s\n' hello-8.txt hello-8.2.txt | sort -V
hello-8.txt
hello-8.2.txt
Toolbox Introduction
This month’s column is only peripherally related to the GNU Project, in that it describes a
number of the GNU tools on your GNU/Linux system and how they might be used. What
it’s really about is the “Software Tools” philosophy of program development and usage.
The software tools philosophy was an important and integral concept in the initial
design and development of Unix (of which GNU/Linux and GNU are essentially clones).
Unfortunately, in the modern day press of Internetworking and flashy GUIs, it seems to
have fallen by the wayside. This is a shame, since it provides a powerful mental model for
solving many kinds of problems.
Many people carry a Swiss Army knife around in their pants pockets (or purse). A Swiss
Army knife is a handy tool to have: it has several knife blades, a screwdriver, tweezers,
toothpick, nail file, corkscrew, and perhaps a number of other things on it. For the everyday,
small miscellaneous jobs where you need a simple, general purpose tool, it’s just the thing.
On the other hand, an experienced carpenter doesn’t build a house using a Swiss Army
knife. Instead, he has a toolbox chock full of specialized tools – a saw, a hammer, a
screwdriver, a plane, and so on. And he knows exactly when and where to use each tool;
you won’t catch him hammering nails with the handle of his screwdriver.
The Unix developers at Bell Labs were all professional programmers and trained computer
scientists. They had found that while a one-size-fits-all program might appeal to a user
because there’s only one program to use, in practice such programs are
a. difficult to write,
b. difficult to maintain and debug, and
c. difficult to extend to meet new situations.
Instead, they felt that programs should be specialized tools. In short, each program
“should do one thing well.” No more and no less. Such programs are simpler to design, write,
and get right – they only do one thing.
Furthermore, they found that with the right machinery for hooking programs together,
the whole was greater than the sum of the parts. By combining several special purpose
programs, you could accomplish a specific task that none of the programs was designed
for, and accomplish it much more quickly and easily than if you had to write a special
purpose program. We will see some (classic) examples of this further on in the column. (An
important additional point was that, if necessary, you should take a detour and build any
software tools you may need first, if you don’t already have something appropriate in the toolbox.)
I/O Redirection
Hopefully, you are familiar with the basics of I/O redirection in the shell, in particular the
concepts of “standard input,” “standard output,” and “standard error”. Briefly, “standard
Chapter 31: Opening the Software Toolbox 266
input” is a data source, where data comes from. A program should not need to either know
or care if the data source is a regular file, a keyboard, a magnetic tape, or even a punched
card reader. Similarly, “standard output” is a data sink, where data goes to. The program
should neither know nor care where this might be. Programs that only read their standard
input, do something to the data, and then send it on, are called filters, by analogy to filters
in a water pipeline.
With the Unix shell, it’s very easy to set up data pipelines:
program_to_create_data | filter1 | ... | filterN > final.pretty.data
We start out by creating the raw data; each filter applies some successive transformation
to the data, until by the time it comes out of the pipeline, it is in the desired form.
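As a concrete instance of such a pipeline, here is a minimal sketch. The choice of seq as the data generator, the particular filters, and the temporary output file are assumptions made for illustration; any program that writes lines to standard output could stand at the head of the pipeline.

```shell
# Scratch file standing in for final.pretty.data (hypothetical name).
outfile=$(mktemp)

# seq creates the raw data; grep, sort, and head are the successive filters.
seq 1 10 | grep -v '^5$' | sort -n -r | head -n 3 > "$outfile"

result=$(cat "$outfile")
rm -f "$outfile"
printf '%s\n' "$result"
```

Removing any one filter and rerunning shows what that stage contributes.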
This is fine and good for standard input and standard output. Where does the standard
error come into play? Well, think about filter1 in the pipeline above. What happens if it
encounters an error in the data it sees? If it writes an error message to standard output, it
will just disappear down the pipeline into filter2’s input, and the user will probably never
see it. So programs need a place where they can send error messages so that the user will
notice them. This is standard error, and it is usually connected to your console or window,
even if you have redirected standard output of your program away from your screen.
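The separation of the two output streams can be sketched as follows. The filter1 function here is a hypothetical stand-in written for this illustration, and the error stream is captured in a scratch file only so that it can be inspected afterwards; normally it would simply appear on the terminal.

```shell
# Hypothetical filter: passes good lines through and complains on
# standard error about lines that start with "bad".
filter1() {
    while IFS= read -r line; do
        case $line in
            bad*) printf 'filter1: skipping %s\n' "$line" >&2 ;;
            *)    printf '%s\n' "$line" ;;
        esac
    done
}

errlog=$(mktemp)

# Only standard output travels down the pipe to the next filter.
piped=$(printf 'good1\nbad2\ngood3\n' |
    filter1 2>"$errlog" |
    tr '[:lower:]' '[:upper:]')

errors=$(cat "$errlog")
rm -f "$errlog"

printf 'pipeline output:\n%s\n' "$piped"
printf 'error stream:\n%s\n' "$errors"
```

The complaint about the bad line never reaches the second filter, so it is not upper-cased and not mixed into the data.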
For filter programs to work together, the format of the data has to be agreed upon.
The most straightforward and easiest format to use is simply lines of text. Unix data
files are generally just streams of bytes, with lines delimited by the ASCII LF (Line Feed)
character, conventionally called a “newline” in the Unix literature. (This is '\n' if you’re a
C programmer.) This is the format used by all the traditional filtering programs. (Many
earlier operating systems had elaborate facilities and special purpose programs for managing
binary data. Unix has always shied away from such things, under the philosophy that it’s
easiest to simply be able to view and edit your data with a text editor.)
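To make the "stream of bytes with newline delimiters" idea concrete, here is a small sketch. The sample file and its contents are assumptions made for illustration.

```shell
sample=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$sample"

# wc -l counts newline characters, so it reports the number of lines.
line_count=$(wc -l < "$sample")

# wc -c counts bytes: (5+1) + (4+1) + (5+1) = 17, newlines included.
byte_count=$(wc -c < "$sample")

# od -c dumps the bytes one by one; each line ends in \n.
od -c "$sample"

rm -f "$sample"
printf 'lines: %s, bytes: %s\n' "$line_count" "$byte_count"
```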
OK, enough introduction. Let’s take a look at some of the tools, and then we’ll see how
to hook them together in interesting ways. In the following discussion, we will only present
those command line options that interest us. As you should always do, double check your
system documentation for the full story.
list of logged in users. Furthermore, even if a user is logged in multiple times, his or her
name should only show up in the output once.
The administrator could sit down with the system documentation and write a C program
that did this. It would take perhaps a couple of hundred lines of code and about two hours
to write it, test it, and debug it. However, knowing the software toolbox, the administrator
can instead start out by generating just a list of logged on users:
$ who | cut -c1-8
⊣ arnold
⊣ miriam
⊣ bill
⊣ arnold
Next, sort the list:
$ who | cut -c1-8 | sort
⊣ arnold
⊣ arnold
⊣ bill
⊣ miriam
Finally, run the sorted list through uniq, to weed out duplicates:
$ who | cut -c1-8 | sort | uniq
⊣ arnold
⊣ bill
⊣ miriam
The sort command actually has a -u option that does what uniq does. However, uniq
has other uses for which one cannot substitute ‘sort -u’.
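A short sketch of the difference follows. The login names are made-up sample data standing in for the sorted output of who; the -d and -u options of uniq are two of the uses that ‘sort -u’ cannot replace.

```shell
# Hypothetical, already-sorted login names.
logins=$(printf 'arnold\narnold\nbill\nmiriam\n')

# Either command yields one line per distinct name.
via_uniq=$(printf '%s\n' "$logins" | uniq)
via_sort_u=$(printf '%s\n' "$logins" | sort -u)

# uniq -d prints only names that occur more than once;
# uniq -u prints only names that occur exactly once.
repeated=$(printf '%s\n' "$logins" | uniq -d)
singles=$(printf '%s\n' "$logins" | uniq -u)

printf 'logged in more than once: %s\n' "$repeated"
```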
The administrator puts this pipeline into a shell script, and makes it available for all the
users on the system (‘#’ is the system administrator, or root, prompt):
# cat > /usr/local/bin/listusers
who | cut -c1-8 | sort | uniq
^D
# chmod +x /usr/local/bin/listusers
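The same steps can be sketched non-interactively. This variant makes several assumptions that are not in the session above: it installs into a scratch directory so that no root access is needed, it adds a ‘#!/bin/sh’ line, and it uses ‘cut -d ' ' -f 1’ with ‘sort -u’ so that user names longer than eight characters are not truncated.

```shell
# Scratch directory standing in for /usr/local/bin (an assumption).
bindir=$(mktemp -d)

# A here-document replaces typing the script and pressing ^D.
cat > "$bindir/listusers" <<'EOF'
#!/bin/sh
# listusers: print each logged-in user name once.
who | cut -d ' ' -f 1 | sort -u
EOF

chmod +x "$bindir/listusers"

# Run it like any other command.
"$bindir/listusers"
```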
There are four major points to note here. First, with just four programs, on one command
line, the administrator was able to save about two hours’ worth of work. Furthermore, the
shell pipeline is just about as efficient as the C program would be, and it is much more
efficient in terms of programmer time. People time is much more expensive than computer
time, and in our modern “there’s never enough time to do everything” society, saving two
hours of programmer time is no mean feat.
Second, it is also important to emphasize that with the combination of the tools, it
is possible to do a special purpose job never imagined by the authors of the individual
programs.
Third, it is also valuable to build up your pipeline in stages, as we did here. This allows
you to view the data at each stage in the pipeline, which helps you acquire the confidence
that you are indeed using these tools correctly.
Finally, by bundling the pipeline in a shell script, other users can use your command,
without having to remember the fancy plumbing you set up for them. In terms of how you
run them, shell scripts and compiled programs are indistinguishable.
After the previous warm-up exercise, we’ll look at two additional, more complicated
pipelines. For them, we need to introduce two more tools.
The first is the tr command, which stands for “transliterate.” The tr command works
on a character-by-character basis, changing characters. Normally it is used for things like
mapping upper case to lower case:
$ echo ThIs ExAmPlE HaS MIXED case! | tr '[:upper:]' '[:lower:]'
⊣ this example has mixed case!
There are several options of interest:
-c      work on the complement of the listed characters, i.e., operations apply to
        characters not in the given set
-d      delete characters in the first set from the output
-s      squeeze repeated characters in the output into just one character.
We will be using all three options in a moment.
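Before combining them, it may help to see each option on its own. The input strings below are made up for illustration.

```shell
# -d: delete every character in the set (here, the digits).
no_digits=$(echo 'a1b22c333' | tr -d '[:digit:]')

# -s: squeeze each run of a repeated character into a single one.
squeezed=$(echo 'too    many   spaces' | tr -s ' ')

# -c with -d: delete everything NOT in the set, so only letters
# and the newline survive.
letters_only=$(echo 'Hello, World! 123' | tr -cd '[:alpha:]\n')

printf '%s\n%s\n%s\n' "$no_digits" "$squeezed" "$letters_only"
```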
The other command we’ll look at is comm. The comm command takes two sorted input
files as input data, and prints out the files’ lines in three columns. The output columns
are the data lines unique to the first file, the data lines unique to the second file, and the
data lines that are common to both. The -1, -2, and -3 command line options omit the
respective columns. (This is non-intuitive and takes a little getting used to.) For example:
$ cat f1
⊣ 11111
⊣ 22222
⊣ 33333
⊣ 44444
$ cat f2
⊣ 00000
⊣ 22222
⊣ 33333
⊣ 55555
$ comm f1 f2
⊣         00000
⊣ 11111
⊣                 22222
⊣                 33333
⊣ 44444
⊣         55555
The file name - tells comm to read standard input instead of a regular file.
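Since the column-suppressing options take some getting used to, here is a sketch that recreates f1 and f2 in a scratch directory (the directory is an assumption) and asks for one column at a time.

```shell
workdir=$(mktemp -d)
printf '11111\n22222\n33333\n44444\n' > "$workdir/f1"
printf '00000\n22222\n33333\n55555\n' > "$workdir/f2"

# Suppress columns 2 and 3: lines found only in f1.
only_f1=$(comm -23 "$workdir/f1" "$workdir/f2")

# Suppress columns 1 and 3: lines found only in f2.
only_f2=$(comm -13 "$workdir/f1" "$workdir/f2")

# Suppress columns 1 and 2: lines common to both files.
both=$(comm -12 "$workdir/f1" "$workdir/f2")

# The file name - makes comm read the first file from standard input.
via_stdin=$(comm -12 - "$workdir/f2" < "$workdir/f1")

rm -rf "$workdir"
printf 'only in f1:\n%s\nonly in f2:\n%s\nin both:\n%s\n' \
    "$only_f1" "$only_f2" "$both"
```

In other words, the options name the columns to drop, so the column you want is the one you do not mention.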
Now we’re ready to build a fancy pipeline. The first application is a word frequency
counter. This helps an author determine if he or she is over-using certain words.
The first step is to change the case of all the letters in our input file to one case. “The”
and “the” are the same word when doing counting.
$ tr '[:upper:]' '[:lower:]' < whats.gnu | ...
The next step is to get rid of punctuation. Quoted words and unquoted words should be
treated identically; it’s easiest to just get the punctuation out of the way.
$ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' | ...
The second tr command operates on the complement of the listed characters, which are
all the letters, the digits, the underscore, and the blank. The ‘\n’ represents the newline
character; it has to be left alone. (The ASCII tab character should also be included for good
measure in a production script.)
At this point, we have data consisting of words separated by blank space. The words
only contain alphanumeric characters (and the underscore). The next step is to break the data
apart so that we have one word per line. This makes the counting operation much easier, as
we will see shortly.
$ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' |
> tr -s ' ' '\n' | ...
This command turns blanks into newlines. The -s option squeezes multiple newline
characters in the output into just one, removing blank lines. (The ‘>’ is the shell’s “secondary
prompt.” This is what the shell prints when it notices you haven’t finished typing in all of a
command.)
We now have data consisting of one word per line, no punctuation, all one case. We’re
ready to count each word:
$ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' |
> tr -s ' ' '\n' | sort | uniq -c | ...
At this point, the data might look something like this:
60 a
2 able
6 about
1 above
2 accomplish
1 acquire
1 actually
2 additional
The output is sorted by word, not by count! What we want is the most frequently used
words first. Fortunately, this is easy to accomplish, with the help of two more sort options:
-n      do a numeric sort, not a textual one
-r      reverse the order of the sort
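The effect of these two options is easiest to see on a tiny example. The three numbers below are sample data chosen so that textual and numeric order differ.

```shell
counts=$(printf '9\n10\n2\n')

# Textual sort compares character by character, so "10" sorts before "2".
textual=$(printf '%s\n' "$counts" | sort)

# -n compares the values as numbers.
numeric=$(printf '%s\n' "$counts" | sort -n)

# -n -r gives the largest number first.
descending=$(printf '%s\n' "$counts" | sort -n -r)

printf 'textual:\n%s\nnumeric:\n%s\ndescending:\n%s\n' \
    "$textual" "$numeric" "$descending"
```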
The final pipeline looks like this:
$ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' |
> tr -s ' ' '\n' | sort | uniq -c | sort -n -r
⊣ 156 the
⊣  60 a
⊣  58 to
⊣  51 of
⊣  51 and
...
Whew! That’s a lot to digest. Yet, the same principles apply. With six commands, on
two lines (really one long one split for convenience), we’ve created a program that does
something interesting and useful, in much less time than we could have written a C program
to do the same thing.
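As with listusers, the pipeline can be wrapped up for reuse. The sketch below puts it in a shell function; the name wordfreq, the sample sentences, and the use of head to keep only the top two entries are assumptions made for illustration.

```shell
# wordfreq: hypothetical helper. Reads text on standard input and prints
# "count word" lines, most frequent word first.
wordfreq() {
    tr '[:upper:]' '[:lower:]' |
        tr -cd '[:alnum:]_ \n' |
        tr -s ' ' '\n' |
        sort | uniq -c | sort -n -r
}

# Sample text instead of whats.gnu, so the sketch is self-contained.
top=$(printf 'The cat and the dog.\nThe "dog" barked!\n' |
    wordfreq | head -n 2)

printf '%s\n' "$top"
```

Here “the” occurs three times and “dog” twice, so those are the two lines that survive head; the exact padding in front of each count depends on the uniq implementation.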
A minor modification to the above pipeline can give us a simple spelling checker! To
determine if you’ve spelled a word correctly, all you have to do is look it up in a dictionary.
If it is not there, then chances are that your spelling is incorrect. So, we need a dictionary.
The conventional location for a dictionary is /usr/share/dict/words.
Now, how to compare our file with the dictionary? As before, we generate a sorted list of
words, one per line:
$ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' |
> tr -s ' ' '\n' | sort -u | ...
Now, all we need is a list of words that are not in the dictionary. Here is where
the comm command comes in. Unfortunately comm operates on sorted input and
/usr/share/dict/words is not sorted the way that sort and comm normally use, so we first
create a properly-sorted copy of the dictionary and then run a pipeline that uses the copy.
$ sort /usr/share/dict/words > sorted-words
$ tr '[:upper:]' '[:lower:]' < whats.gnu | tr -cd '[:alnum:]_ \n' |
> tr -s ' ' '\n' | sort -u |
> comm -23 - sorted-words
The -2 and -3 options eliminate lines that are only in the dictionary (the second file),
and lines that are in both files. Lines only in the first file (standard input, our stream of
words), are words that are not in the dictionary. These are likely candidates for spelling
errors. This pipeline was the first cut at a production spelling checker on Unix.
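The whole procedure can be tried on a small scale. In this sketch the four-word dictionary and the sample sentence are made up, and setting LC_ALL=C is an assumption added here so that sort and comm are certain to agree on the collation order; a real run would sort /usr/share/dict/words instead.

```shell
# Force one collation order for both sort and comm (an assumption).
export LC_ALL=C

workdir=$(mktemp -d)

# Tiny stand-in dictionary, deliberately not in sorted order.
printf 'toolbox\nthe\nsoftware\nopening\n' > "$workdir/dict"
sort "$workdir/dict" > "$workdir/sorted-words"

# "Softwear" is the deliberate misspelling.
misspelled=$(printf 'Opening the Softwear Toolbox\n' |
    tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]_ \n' |
    tr -s ' ' '\n' | sort -u |
    comm -23 - "$workdir/sorted-words")

rm -rf "$workdir"
printf 'not in dictionary: %s\n' "$misspelled"
```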
There are some other tools that deserve brief mention.
grep    search files for text that matches a regular expression
wc      count lines, words, characters
tee     a T-fitting for data pipes, copies data to files and to standard output
sed     the stream editor, an advanced tool
awk     a data manipulation language, another advanced tool
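Three of these tools fit naturally into one short pipeline. The sketch below uses made-up input and a scratch directory; sed and awk are left out because they deserve a fuller treatment than fits here.

```shell
workdir=$(mktemp -d)

# grep keeps the lines starting with "a"; tee saves a copy of that
# intermediate data while passing it on; wc -l counts what arrives.
matches=$(printf 'apple\nbanana\ncherry\navocado\n' |
    grep '^a' |
    tee "$workdir/a-words" |
    wc -l)

saved=$(cat "$workdir/a-words")
rm -rf "$workdir"

printf 'matching lines: %s\n' "$matches"
printf 'saved by tee:\n%s\n' "$saved"
```

Placing tee in the middle of a pipeline is also a convenient way to inspect the data at one stage without interrupting the rest.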
The software tools philosophy also espoused the following bit of advice: “Let someone
else do the hard part.” This means, take something that gives you most of what you need,
and then massage it the rest of the way until it’s in the form that you want.
To summarize:
1. Each program should do one thing well. No more, no less.
2. Combining programs with appropriate plumbing leads to results where the whole is
greater than the sum of the parts. It also leads to novel uses of programs that the
authors might never have imagined.
3. Programs should never print extraneous header or trailer data, since these could get
sent on down a pipeline. (A point we didn’t mention earlier.)
4. Let someone else do the hard part.
5. Know your toolbox! Use each program appropriately. If you don’t have an appropriate
tool, build one.
All the programs discussed are available as described in GNU core utilities
(https://www.gnu.org/software/coreutils/coreutils.html).
None of what I have presented in this column is new. The Software Tools philosophy
was first introduced in the book Software Tools, by Brian Kernighan and P.J. Plauger
(Addison-Wesley, ISBN 0-201-03669-X). This book showed how to write and use software
tools. It was written in 1976, using a preprocessor for FORTRAN named ratfor (RATional
FORtran). At the time, C was not as ubiquitous as it is now; FORTRAN was. The last
chapter presented a ratfor to FORTRAN processor, written in ratfor. ratfor looks an
awful lot like C; if you know C, you won’t have any problem following the code.
In 1981, the book was updated and made available as Software Tools in Pascal
(Addison-Wesley, ISBN 0-201-10342-7). Both books are still in print and are well worth reading if
you’re a programmer. They certainly made a major change in how I view programming.
The programs in both books are available from Brian Kernighan’s home page
(https://www.cs.princeton.edu/~bwk/). For a number of years, there was an active Software Tools
Users Group, whose members had ported the original ratfor programs to essentially every
computer system with a FORTRAN compiler. The popularity of the group waned in the
middle 1980s as Unix began to spread beyond universities.
With the current proliferation of GNU code and other clones of Unix programs, these
programs now receive little attention; modern C versions are much more efficient and do
more than these programs do. Nevertheless, as exposition of good programming style, and
evangelism for a still-valuable philosophy, these books are unparalleled, and I recommend
them highly.
Acknowledgment: I would like to express my gratitude to Brian Kernighan of Bell Labs,
the original Software Toolsmith, for reviewing this column.
under this License. If a section does not fit the above definition of Secondary then it is
not allowed to be designated as Invariant. The Document may contain zero Invariant
Sections. If the Document does not identify any Invariant Sections then there are none.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover
Texts or Back-Cover Texts, in the notice that says that the Document is released under
this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may
be at most 25 words.
A “Transparent” copy of the Document means a machine-readable copy, represented in a
format whose specification is available to the general public, that is suitable for revising
the document straightforwardly with generic text editors or (for images composed of
pixels) generic paint programs or (for drawings) some widely available drawing editor,
and that is suitable for input to text formatters or for automatic translation to a
variety of formats suitable for input to text formatters. A copy made in an otherwise
Transparent file format whose markup, or absence of markup, has been arranged to
thwart or discourage subsequent modification by readers is not Transparent. An image
format is not Transparent if used for any substantial amount of text. A copy that is
not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without
markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly
available DTD, and standard-conforming simple HTML, PostScript or PDF designed
for human modification. Examples of transparent image formats include PNG, XCF
and JPG. Opaque formats include proprietary formats that can be read and edited only
by proprietary word processors, SGML or XML for which the DTD and/or processing
tools are not generally available, and the machine-generated HTML, PostScript or PDF
produced by some word processors for output purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following
pages as are needed to hold, legibly, the material this License requires to appear in the
title page. For works in formats which do not have any title page as such, “Title Page”
means the text near the most prominent appearance of the work’s title, preceding the
beginning of the body of the text.
The “publisher” means any person or entity that distributes copies of the Document to
the public.
A section “Entitled XYZ” means a named subunit of the Document whose title either
is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in
another language. (Here XYZ stands for a specific section name mentioned below, such
as “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.) To “Preserve
the Title” of such a section when you modify the Document means that it remains a
section “Entitled XYZ” according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that
this License applies to the Document. These Warranty Disclaimers are considered to be
included by reference in this License, but only as regards disclaiming warranties: any
other implication that these Warranty Disclaimers may have is void and has no effect
on the meaning of this License.
2. VERBATIM COPYING
You may copy and distribute the Document in any medium, either commercially or
noncommercially, provided that this License, the copyright notices, and the license
notice saying this License applies to the Document are reproduced in all copies, and
that you add no other conditions whatsoever to those of this License. You may not use
technical measures to obstruct or control the reading or further copying of the copies
you make or distribute. However, you may accept compensation in exchange for copies.
If you distribute a large enough number of copies you must also follow the conditions in
section 3.
You may also lend copies, under the same conditions stated above, and you may publicly
display copies.
3. COPYING IN QUANTITY
If you publish printed copies (or copies in media that commonly have printed covers) of
the Document, numbering more than 100, and the Document’s license notice requires
Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all
these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
the back cover. Both covers must also clearly and legibly identify you as the publisher
of these copies. The front cover must present the full title with all words of the title
equally prominent and visible. You may add other material on the covers in addition.
Copying with changes limited to the covers, as long as they preserve the title of the
Document and satisfy these conditions, can be treated as verbatim copying in other
respects.
If the required texts for either cover are too voluminous to fit legibly, you should put
the first ones listed (as many as fit reasonably) on the actual cover, and continue the
rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100,
you must either include a machine-readable Transparent copy along with each Opaque
copy, or state in or with each Opaque copy a computer-network location from which
the general network-using public has access to download using public-standard network
protocols a complete Transparent copy of the Document, free of added material. If
you use the latter option, you must take reasonably prudent steps, when you begin
distribution of Opaque copies in quantity, to ensure that this Transparent copy will
remain thus accessible at the stated location until at least one year after the last time
you distribute an Opaque copy (directly or through your agents or retailers) of that
edition to the public.
It is requested, but not required, that you contact the authors of the Document well
before redistributing any large number of copies, to give them a chance to provide you
with an updated version of the Document.
4. MODIFICATIONS
You may copy and distribute a Modified Version of the Document under the conditions
of sections 2 and 3 above, provided that you release the Modified Version under precisely
this License, with the Modified Version filling the role of the Document, thus licensing
distribution and modification of the Modified Version to whoever possesses a copy of it.
In addition, you must do these things in the Modified Version:
A. Use in the Title Page (and on the covers, if any) a title distinct from that of the
Document, and from those of previous versions (which should, if there were any,
be listed in the History section of the Document). You may use the same title as a
previous version if the original publisher of that version gives permission.
B. List on the Title Page, as authors, one or more persons or entities responsible for
authorship of the modifications in the Modified Version, together with at least five
of the principal authors of the Document (all of its principal authors, if it has fewer
than five), unless they release you from this requirement.
C. State on the Title page the name of the publisher of the Modified Version, as the
publisher.
D. Preserve all the copyright notices of the Document.
E. Add an appropriate copyright notice for your modifications adjacent to the other
copyright notices.
F. Include, immediately after the copyright notices, a license notice giving the public
permission to use the Modified Version under the terms of this License, in the form
shown in the Addendum below.
G. Preserve in that license notice the full lists of Invariant Sections and required Cover
Texts given in the Document’s license notice.
H. Include an unaltered copy of this License.
I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item
stating at least the title, year, new authors, and publisher of the Modified Version as
given on the Title Page. If there is no section Entitled “History” in the Document,
create one stating the title, year, authors, and publisher of the Document as given
on its Title Page, then add an item describing the Modified Version as stated in
the previous sentence.
J. Preserve the network location, if any, given in the Document for public access to
a Transparent copy of the Document, and likewise the network locations given in
the Document for previous versions it was based on. These may be placed in the
“History” section. You may omit a network location for a work that was published
at least four years before the Document itself, or if the original publisher of the
version it refers to gives permission.
K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title
of the section, and preserve in the section all the substance and tone of each of the
contributor acknowledgements and/or dedications given therein.
L. Preserve all the Invariant Sections of the Document, unaltered in their text and
in their titles. Section numbers or the equivalent are not considered part of the
section titles.
M. Delete any section Entitled “Endorsements”. Such a section may not be included
in the Modified Version.
N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in
title with any Invariant Section.
O. Preserve any Warranty Disclaimers.
If the Modified Version includes new front-matter sections or appendices that qualify as
Secondary Sections and contain no material copied from the Document, you may at
your option designate some or all of these sections as invariant. To do this, add their
titles to the list of Invariant Sections in the Modified Version’s license notice. These
titles must be distinct from any other section titles.
You may add a section Entitled “Endorsements”, provided it contains nothing but
endorsements of your Modified Version by various parties—for example, statements of
peer review or that the text has been approved by an organization as the authoritative
definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up
to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified
Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be
added by (or through arrangements made by) any one entity. If the Document already
includes a cover text for the same cover, previously added by you or by arrangement
made by the same entity you are acting on behalf of, you may not add another; but
you may replace the old one, on explicit permission from the previous publisher that
added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission
to use their names for publicity for or to assert or imply endorsement of any Modified
Version.
5. COMBINING DOCUMENTS
You may combine the Document with other documents released under this License,
under the terms defined in section 4 above for modified versions, provided that you
include in the combination all of the Invariant Sections of all of the original documents,
unmodified, and list them all as Invariant Sections of your combined work in its license
notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical
Invariant Sections may be replaced with a single copy. If there are multiple Invariant
Sections with the same name but different contents, make the title of each such section
unique by adding at the end of it, in parentheses, the name of the original author or
publisher of that section if known, or else a unique number. Make the same adjustment
to the section titles in the list of Invariant Sections in the license notice of the combined
work.
In the combination, you must combine any sections Entitled “History” in the various
original documents, forming one section Entitled “History”; likewise combine any
sections Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You
must delete all sections Entitled “Endorsements.”
6. COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents released
under this License, and replace the individual copies of this License in the various
documents with a single copy that is included in the collection, provided that you follow
the rules of this License for verbatim copying of each of the documents in all other
respects.
You may extract a single document from such a collection, and distribute it individually
under this License, provided you insert a copy of this License into the extracted
document, and follow this License in all other respects regarding verbatim copying of
that document.
Index
! --binary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 --block-size . . . . . . . . . . . . . . . . . . . . . . . . . 5, 147, 151
!= . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 --block-size=size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
--body-numbering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
--boot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
% --bourne-shell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
--break-file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
% . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
--buffer-size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
%b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
--bytes . . . . . . . . . . . . . . . . 27, 29, 30, 34, 41, 72, 151
%q . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
--c-shell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
--cached=mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
--canonicalize . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
& --canonicalize-existing . . . . . . . . . . . . . . . 136, 179
& . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 --canonicalize-missing . . . . . . . . . . . . . . . . 136, 179
--changes . . . . . . . . . . . . . . . . . . . . . . . . . . 139, 141, 143
--characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
* --chars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
* . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 --chdir . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
--check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
--check-chars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
+ --classify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
+ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 --color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
+page_range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 --columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
--compare . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
--complement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73, 82
– --compute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
- . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170, 213 --context. . . . . . . . . 95, 110, 120, 123, 134, 135, 190
- and Unix rm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 --count . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60, 193
‘-’, removing files beginning with. . . . . . . . . . . . . . 124 --count-links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
-- . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 --crown-margin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
--across . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 --csh. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
--additional-suffix. . . . . . . . . . . . . . . . . . . . . . . . . . 36 --data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
--address-radix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 --date . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145, 199
--adjustment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 --dead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
--algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 --debug . . . . . . . . . . . 44, 105, 119, 121, 199, 214, 229
--all . . . . . . . . . . . . . . 88, 89, 147, 151, 182, 192, 204 --decode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
--all-repeated . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 --delete . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
--almost-all . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 --delimiter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73, 229
--apparent-size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 --delimiters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
--append . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 --dereference . . . 91, 106, 139, 141, 143, 152, 155,
--archive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 207
--argv0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 --dereference-args . . . . . . . . . . . . . . . . . . . . . . . . . . 151
--attributes-only . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 --dereference-command-line. . . . . . . . . . . . . . . . . . 90
--author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 --dereference-command-line-symlink-to-dir . . 90
--auto-reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 --dictionary-order . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
--backup . . . . . . . . . . . . . . . . . . . . 2, 105, 118, 121, 130 --digits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
--base16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 --dir. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
--base2lsbf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 --directory. . . . . . . . . . . . . . . . . . . . . 90, 118, 131, 178
--base2msbf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 --dired . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
--base32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 --double-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
--base32hex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 --dry-run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
--base64 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19, 44 --echo. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
--base64url . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 --elide-empty-files . . . . . . . . . . . . . . . . . . . . . . 36, 39
--batch-size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 --endian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
--before . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 --equal-width . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
5
512-bit checksum . . . . . . . . . . . . . . . . . . . . . . . . . . . 47, 48
=
= . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167, 171
== . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168, 171
A B
abbreviations for months . . . . . . . . . . . . . . . . . . . . . . 247 b for block special file . . . . . . . . . . . . . . . . . . . . . . . . . 135
access control lists (ACLs) . . . . . . . . . . . . . . . 106, 107 b2sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
access permission tests . . . . . . . . . . . . . . . . . . . . . . . . 167 background jobs, stopping at terminal write . . . 185
access permissions, changing . . . . . . . . . . . . . . . . . . 142 backslash escapes . . . . . . . . . . . . . . . . . . . . . . . . . 83, 161
access time, changing . . . . . . . . . . . . . . . . . . . . . . . . 145 backslash sequences for file names . . . . . . . . . . . . . 101
access timestamp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 backup files, ignoring. . . . . . . . . . . . . . . . . . . . . . . . . . . 90
backup options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
access timestamp, printing or sorting files by . . 96
backup suffix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
access timestamp, show the most recent . . . . . . 154
backups, making . . . . . . . . . . . . . 2, 105, 118, 121, 130
across columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
backups, making only . . . . . . . . . . . . . . . . . . . . . . . . . 104
across, listing files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 base32. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
adding permissions. . . . . . . . . . . . . . . . . . . . . . . . . . . . 238 base32 encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18, 19
addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 base64. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
ago in date strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249 base64 checksum encoding . . . . . . . . . . . . . . . . . . . . . 44
all lines, grouping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Base64 decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
all repeated lines, outputting . . . . . . . . . . . . . . . . . . . 60 base64 encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
alnum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 basename . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
alpha . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 basenc. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
alternate ebcdic, converting to . . . . . . . . . . . . . . 113 baud rate, setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
always classify option . . . . . . . . . . . . . . . . . . . . . . . . . . 97 beeping at input buffer full . . . . . . . . . . . . . . . . . . . . 184
always color option . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 beginning of time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
always hyperlink option . . . . . . . . . . . . . . . . . . . . . . . 98 beginning of time, for POSIX . . . . . . . . . . . . . . . . . 250
always interactive option . . . . . . . . . . . . . . . . . . . . . 124 Bellovin, Steven M. . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
always total option . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 Berets, Jim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
am i . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 Berry, K. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1, 252
am in date strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247 binary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
binary I/O. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
and operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168, 171
binary input files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
append . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
bind mount . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124, 158
appending to the output file . . . . . . . . . . . . . . . . . . 114
birth time, printing or sorting files by . . . . . . . . . 96
appropriate privileges . . . . . . . . . . . 119, 199, 205, 219 birthtime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
arbitrary date strings, debugging . . . . . . . . . . . . . . 199 BLAKE2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
arbitrary date strings, parsing . . . . . . . . . . . . . . . . . 199 BLAKE2 hash length. . . . . . . . . . . . . . . . . . . . . . . 44, 47
arbitrary text, displaying . . . . . . . . . . . . . . . . . . . . . 161 blank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
arch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 blank lines, numbering . . . . . . . . . . . . . . . . . . . . . . . . . 15
arithmetic tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 blanks, ignoring leading . . . . . . . . . . . . . . . . . . . . . . . . 50
arrays of characters in tr . . . . . . . . . . . . . . . . . . . . . . 82 block (space-padding) . . . . . . . . . . . . . . . . . . . . . . . . 113
ascii, converting to . . . . . . . . . . . . . . . . . . . . . . . . . . 113 block size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3, 111
ASCII dump of files . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 block size of conversion . . . . . . . . . . . . . . . . . . . . . . . 112
atime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244 block size of input . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
atime, changing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 block size of output . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
atime, printing or sorting files by. . . . . . . . . . . . . . . 96 block special check . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
atime, show the most recent . . . . . . . . . . . . . . . . . . 154 block special files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
attribute caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 block special files, creating . . . . . . . . . . . . . . . . . . . . 135
attributes, file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 BLOCK_SIZE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
authors of parse_datetime . . . . . . . . . . . . . . . . . . . 251 BLOCKSIZE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
auto classify option . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 body, numbering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Bourne shell syntax for color setup . . . . . . . . . . . . 103
auto color option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
breaks, cause interrupts . . . . . . . . . . . . . . . . . . . . . . . 184
auto hyperlink option . . . . . . . . . . . . . . . . . . . . . . . . . . 98
breaks, ignoring. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
auto total option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
brkint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
bs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
BSD output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
BSD sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
BSD tail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
BSD touch compatibility . . . . . . . . . . . . . . . . . . . . . 145
bsn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
I J
ibs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
icanon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
icrnl. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 K
id . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 kernel name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
idle time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 kernel release . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
IEEE floating point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 kernel version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
iexten . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 kibibyte, definition of . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
if . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 kibibytes for file sizes . . . . . . . . . . . . . . . . . . . . . . . . . 152
iflag. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 kibibytes for file system sizes . . . . . . . . . . . . . . . . . . 148
ignbrk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 kill . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187, 226
igncr. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 kilobyte, definition of. . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
ignore file systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 Knuth, Donald E. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Ignore garbage in base64 stream. . . . . . . . . . . . . . . . 19
ignoring case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
ignpar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 L
imaxbel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 language, in dates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
immunity to hangups . . . . . . . . . . . . . . . . . . . . . . . . . 221 last day . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199, 248
implementation, hardware. . . . . . . . . . . . . . . . . . . . . 204 last in date strings . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
indenting lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 last modified dates, displaying in du . . . . . . . . . . . 153
index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 last part of files, outputting . . . . . . . . . . . . . . . . . . . . 30
information, about current users . . . . . . . . . . . . . . 192 LC_ALL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49, 89
initial part of files, outputting . . . . . . . . . . . . . . . . . . 29 LC_COLLATE . . . . . . . . . . . . . . . . . . . . 49, 59, 61, 77, 171
initial tabs, converting . . . . . . . . . . . . . . . . . . . . . . . . . 87 LC_CTYPE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50, 51, 163
inlcr. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 LC_MESSAGES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
inode number, printing . . . . . . . . . . . . . . . . . . . . . . . . . 93 LC_NUMERIC . . . . . . . . . . . . . . . . . . . . . 4, 11, 50, 51, 163
inode usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 LC_TIME . . . . . . . . . . . . . 24, 51, 99, 100, 101, 154, 195
inode usage, dereferencing in du . . . . . . . . . . . . . . . 152 lcase. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
inode, and hard links . . . . . . . . . . . . . . . . . . . . . . . . . 130 lcase, converting to . . . . . . . . . . . . . . . . . . . . . . . . . . 113
inodes, written buffered . . . . . . . . . . . . . . . . . . . . . . . 159 LCASE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
inpck. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 lchown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139, 141
input block size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 leading directories, creating missing . . . . . . . . . . . 118
input encoding, UTF-8. . . . . . . . . . . . . . . . . . . . . . . . 184 leading directory components, stripping . . . . . . . 175
input range to shuffle . . . . . . . . . . . . . . . . . . . . . . . . . . 58 leap seconds . . 146, 195, 196, 201, 203, 246, 247, 250
input settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 left margin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
input tabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
install . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 limiting output of du . . . . . . . . . . . . . . . . . . . . . . . . . . 152
intr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
invocation of commands, modified . . . . . . . . . . . . . 210 line buffered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
iseek. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 line count . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
isig . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 line numbering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
iso9660 file system type . . . . . . . . . . . . . . . . . . . . . . 150 line separator character . . . . . . . . . . . . . . . . . . . . . . . . 36
ISO 8601 date and time of day format . . . . . . . . . 248 line settings of terminal . . . . . . . . . . . . . . . . . . . . . . . 182
ISO 8601 date format . . . . . . . . . . . . . . . . . . . . . . . . . 247 line-breaking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
ISO/IEC 10646 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 line-by-line comparison . . . . . . . . . . . . . . . . . . . . . . . . . 61
ISO9660 file system type . . . . . . . . . . . . . . . . . . . . . . 150 LINES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
ispeed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 link . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
istrip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 links, creating . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
items in date strings . . . . . . . . . . . . . . . . . . . . . . . . . . 245 Linux file system types . . . . . . . . . . . . . . . . . . . . . . . . 150
release of kernel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 security context . . . 95, 110, 119, 120, 123, 134, 135, 190
relpath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
remainder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 seek . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
remote hostname . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 self-backups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
removing characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 SELinux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95, 119, 190
removing empty directories . . . . . . . . . . . . . . . . . . . 137 SELinux context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
removing files after shredding . . . . . . . . . . . . . . . . . 127 SELinux, context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
removing files or directories . . . . . . . . . . . . . . . . . . . 123 SELinux, restoring security context . . . . . . . . . . . 123
removing files or directories (via the unlink syscall) . . . . . . . . . . . . . . . . . . . . . . . . . . 137 SELinux, setting/restoring security context . . . . . . . . . . . . . . 110, 120, 134, 135
removing permissions . . . . . . . . . . . . . . . . . . . . . . . . . 238 send a signal to processes . . . . . . . . . . . . . . . . . . . . . 226
renaming files without copying them . . . . . . . . . . 121 sentences and line-breaking . . . . . . . . . . . . . . . . . . . . 22
repeat output values . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 separator for numbers in seq . . . . . . . . . . . . . . . . . . 235
repeated characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 seq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
repeated lines, outputting . . . . . . . . . . . . . . . . . . . . . . 60 sequence of numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 234
repeated output of a string . . . . . . . . . . . . . . . . . . . . 164 set-group-ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
restricted deletion flag . . . . . . . . . . . . . . . . . . . . . . . . 236 set-group-ID check . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
restricted security context . . . . . . . . . . . . . . . . . . . . . 208 set-user-ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
return, ignoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 set-user-ID check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
return, translating to newline . . . . . . . . . . . . . . . . . 184 setgid. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
reverse sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52, 95 setting permissions. . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
reversing files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 setting the hostname . . . . . . . . . . . . . . . . . . . . . . . . . . 205
rm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 setting the time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
rmdir. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 setuid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
rn format for nl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 setup for color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
ronnabyte, definition of . . . . . . . . . . . . . . . . . . . . . . . . . 5 sh syntax for color setup . . . . . . . . . . . . . . . . . . . . . . 103
root as default owner . . . . . . . . . . . . . . . . . . . . . . . . . 119 sha1sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
root directory, allow recursive destruction . . . . . 124 sha224sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
root directory, allow recursive modification . . . . . . . . . . . . . . . . . . . . . . 140, 142, 143 sha256sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
sha384sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
root directory, disallow recursive destruction . . 124 sha512sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
root directory, disallow recursive modification . . . . . . . . . . . . . 139, 141, 143 SHA-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
SHA-2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
root directory, running a program in a specified . . . . . . . . . . . . . . . . . . . . . . 210 shebang arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
shell utilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 SHELL environment variable, and color . . . . . 97, 103
rprnt. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 shred. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
RTS/CTS flow control . . . . . . . . . . . . . . . . . . . . . . . . 183 shuf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
run commands with bounded time . . . . . . . . . . . . 223 shuffling files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
run with security context . . . . . . . . . . . . . . . . . . . . . 208 SI output . . . . . . . . . . . . . . . . . . . . . . . . . . 4, 95, 149, 153
runcon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 signals, specifying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
running a program in a modified environment . . . . . . . . . . . . . . . . . . . . . . . 211 simple backup method . . . . . . . . . . . . . . . . . . . . . . . . . . 3
SIMPLE_BACKUP_SUFFIX. . . . . . . . . . . . . . . . . . . . . . . . . . 3
running a program in a specified root directory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210 single quotes, and env -S . . . . . . . . . . . . . . . . . . . . . 216
single-column output of files . . . . . . . . . . . . . . . . . . 96
rz format for nl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
size for main memory sorting . . . . . . . . . . . . . . . . . . . 54
size of file to shred . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
size of files, reporting . . . . . . . . . . . . . . . . . . . . . . . . . . 94
size of files, sorting files by . . . . . . . . . . . . . . . . . 95
S
Salz, Rich . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251 skip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
same file check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 sleep. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
sane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186 socket check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
scheduling, affecting . . . . . . . . . . . . . . . . . . . . . . . . . . 219 software flow control . . . . . . . . . . . . . . . . . . . . . . . . . . 184
screen columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
scripts arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 sort field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
seconds since the Epoch. . . . . . . . . . . . . . . . . . . . . . . 195 sort stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49, 54
section delimiters of pages. . . . . . . . . . . . . . . . . . . . . . 14 sort’s last-resort comparison . . . . . . . . . . . . . . . . 49, 54
W X
wc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 xcase. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
week in date strings . . . . . . . . . . . . . . . . . . . . . . . . . . . 249 xdigit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
werase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 xfs file system type . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
XON/XOFF flow control. . . . . . . . . . . . . . . . . . . . . . 184
who . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
who am i . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
whoami . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Y
width, sorting option for ls . . . . . . . . . . . . . . . . . . . . 96
year in date strings . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
word count . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
yes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
working context. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 yesterday . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
working directory, printing . . . . . . . . . . . . . . . . . . . . 182 yesterday in date strings . . . . . . . . . . . . . . . . . . . . . 249
wrap data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 yottabyte, definition of . . . . . . . . . . . . . . . . . . . . . . . . . . 5
wrapping long input lines . . . . . . . . . . . . . . . . . . . . . . 27 Youmans, B. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
writable file check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
write permission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
write permission, symbolic . . . . . . . . . . . . . . . . . . . . 238 Z
write, allowed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 zero-length string check . . . . . . . . . . . . . . . . . . . . . . . 167
wtmp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191, 192 zettabyte, definition of . . . . . . . . . . . . . . . . . . . . . . . . . . 5