BASH Guide - Joseph DeVeau
If your computer is using something other than /bin/bash, you can change
your default shell to Bash (provided of course Bash is installed on your system)
by issuing one of the commands below.
The first sets the shell for a particular user (change username to reflect your
username, i.e. the output of whoami) in the configuration file, /etc/passwd. The
second sets the shell for the currently logged-in user upon login through the use
of that user’s ~/.profile configuration file.
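The original commands are not reproduced here, but a sketch using the usual tools might look like the following. The names usermod and chsh are the standard utilities for this; username is a placeholder for your own username.

```shell
# Inspect the current user's login shell (the last field of the
# passwd entry):
getent passwd "$(id -un)"

# To change the default shell, either edit the /etc/passwd entry
# (as root) with usermod, substituting your own username:
#   sudo usermod --shell /bin/bash username
# or change your own entry with chsh:
#   chsh -s /bin/bash
# Alternatively, have ~/.profile hand the login session over to Bash:
#   command -v bash > /dev/null && exec bash
```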
What a shell is (and isn’t)
At its most basic, a shell, despite nigh-on four decades of development, is
just a command interpreter. Regardless of how complex the commands are, how
convoluted the logic, and how esoteric the references, all a shell allows you to
do is issue commands interactively or non-interactively (more on this later).
A shell is not your computer, operating system, or display. Nor is it your
editor, terminal, mouse, keyboard, or kernel.
When you launch Bash, what you are actually doing is launching a terminal
emulator window with Bash running inside it. Like other programs, you can
interface with the terminal emulator window (and thus Bash) through use of
your mouse and keyboard. Typing and running commands allows you to speak
to your system, be it parts of the operating system such as configuration files or
window managers, or the kernel itself to, say, start, stop, or restart a process.
Using Bash
As mentioned above, Bash allows you to issue commands interactively or
non-interactively.
Bash’s interactive mode is likely the one you are the most familiar with. It
consists of a terminal emulator window with Bash running inside it. When you
type a command, it is written in real time (or near-about) to your terminal and
displayed on your monitor.
Non-interactive mode is simply Bash running in the background. This
does not mean it can’t (or won’t) output anything to the screen, write or read to
or from a file, or anything like that—it can if you tell it to! Non-interactive mode
essentially boils down to a sequential listing of commands to be performed.
This listing of commands is called a Bash script. Those of you with a
background in Microsoft Windows may recognize the terms DOS scripting or
batch file. The two are one and the same, but neither holds a candle to Bash
scripting (which is why Microsoft eventually released PowerShell, and is
currently working to allow native execution of Bash on its OS).
Prompts
There are three major types of prompts. Each begins with a different
character. The dollar sign, $, indicates you are running a Bourne,
POSIX, or Korn shell. A percent symbol, %, means you are running a csh or
zsh shell. And a hash, #, lets you know you are executing commands as root—
be extra careful!
In all likelihood your prompt is much longer than a single symbol. The
default for many of the more popular GNU/Linux distributions is your
username at the machine name you are running on, followed by the current
working directory and then the shell prompt.
Here is an example for the user aries, working on the computer mars-laptop,
which is running a Bash shell, and has a current working directory of ~ (a tilde
expands to /home/username, in this case /home/aries).
For the sake of clarity, usernames and machine names will be omitted from
the remainder of this guide unless required.
Help
Before you go any further, be sure to acquaint yourself with commands that
allow you to lookup help documentation.
Man, short for manual, displays the (surprise, surprise) manual pages for the
argument (more on this later) that immediately follows.
Note that for various commands there exist more than one man page. For
example, the command printf has two man pages (1 and 3).
Help displays documentation for Bash’s builtins and keywords. Here we
access the help pages for the conditional blocks, [[. Both shell built-
ins/keywords and conditional blocks will be discussed in depth further on.
If you need to determine what type a particular command is, you use type.
Whatis prints a short, single line description of the argument that follows.
Adding a switch (a command-specific argument) to man allows it to perform the
same function.
Apropos searches a man page’s short description for the keyword that was
passed to it. Again, adding a switch to man allows it to do the same thing.
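A sketch of these lookup commands is below. The type and help lines run anywhere Bash does; the man, whatis, and apropos lines consult the manual database, which may need to be installed separately, so they are shown commented.

```shell
# type reports what kind of command a name is:
type -t cd     # builtin
type -t ls     # file

# help covers Bash's own builtins and keywords ([[ is quoted here,
# since it is special to the shell):
help type > /dev/null
help '[[' > /dev/null

# man, whatis, and apropos consult the manual database:
#   man printf      # section 1: the printf utility
#   man 3 printf    # section 3: the printf C library function
#   whatis printf   # short one-line description (same as man -f)
#   apropos directory   # keyword search (same as man -k)
```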
Chapter 2: Editors
There are many different editors available, from barebones, terminal-only
editors to full-fledged programming interfaces complete with version control
and automated compiling.
Unless you get into remote administration of your (or other peoples’)
computers, rarely will you encounter a situation that forces you to use one editor
or another. Beyond that, choosing a particular editor is largely a matter of
personal preference. My suggestion is to try many different editors—installation
of a new editor is rather straightforward thanks to modern day package
managers—and pick one you think will work for you (in terms of features,
appeal, etc.). Then stick with it! The longer you use an editor, the deeper
understanding you get of its advantages and pitfalls, its shortcuts, and nuances
(or perhaps nuisances. . .).
One critical item before I dive into the editors themselves: do NOT use
Microsoft Word, Microsoft WordPad, OpenOffice, Libreoffice, or any other
WYSIWYG (what you see is what you get) text editor for scripting or
programming. Behind the pretty façade lurks invisible formatting tags that can
cause major headaches (such as Microsoft Office’s carriage returns). When these
files are read by a GNU/Linux command interpreter, things tend to break
invisibly. Use them at your own peril! If you are on a Windows-based machine
and need an alternative, I highly suggest downloading and installing the free
source code editor, Notepad++.
And now, in no particular order, I present four editors and their basic usage.
As mentioned before, there are many others out there—joe, pico, jed, gvim,
NEdit, eclipse, the list goes on. If you don’t find one you like right away, don’t
be discouraged.
Emacs
Boasting a graphical user interface capable of displaying many simultaneous,
real-time buffers, emacs is a long-time user favorite. Though you can issue
commands through its file menus, a never-ending series of keyboard shortcuts
exist (nearly two thousand as of last count). Many a web comic pokes fun at the
sheer number of shortcuts and the obscure tasks they perform.
Installation is typically performed through a command line package manager
(apologies to non-Debian based users out there, but for simplicity’s sake I had
to pick one package manager; apt/aptitude it is).
Upon opening emacs for the first time, you will get a splash screen. Basic
usage is as simple as typing what you want, then using the file menu to save.
VIM
Originally created as VI, this editor has since been upgraded to VIM (Vi
Improved). VIM is the main rival of emacs in the long-standing Unix editor wars.
No, it is not graphical (unless of course you install gvim), and no, it is not as
pretty, but you would be hard pressed to find a more zealous user base.
Originally written for the days when window managers did not exist and
modems were rated in double- or triple-digit baud rates, vim runs entirely within
a terminal. Like Dwarf Fortress, the ASCII-based Dwarven city management
game, it has a legendary learning curve. Don’t let that discourage you from
trying it, however. The basics are easy to learn, and in the hands of a master, the
results are nothing shy of magical.
To write commands, you will need to enter “insert mode” by hitting [i]. You
can then type using your normal keyboard routines. When ready to save, re-
enter “normal mode” by hitting [ESC]. A few basic commands are shown in the
table below. For more advanced usage, consult the manual or an online “cheat
sheet.”
VIM usage:

Command    Description
i          Enter “insert mode”
[ESC]      Enter “normal mode”
:w         Write file (save)
ZZ         Write file and quit
:q         Quit
:q!        Quit without writing file
Nano
Nano is a good, lightweight terminal editor. Installed by default (if not,
simply follow the previous apt-get command, but exchange vim for nano) on a
number of distributions, it is quite easy to use. All the command shortcuts are
displayed right there on the bottom of the screen for your convenience.
The caret (or hat) preceding the command letter means you should hit
[CTRL] first, followed by the key of your choice, say [X], to quit.
Gedit
The final editor covered is Gedit, which is also available for installation on
Windows-based computers. It is a good, general purpose notepad, command
and script editor, file viewer, and more. As before, if it is not installed on your
system, you can easily install it with your package manager.
Keyboard shortcuts exist, and are shown in grey next to their command in
the appropriate file menus. For the most part, the shortcuts are the same as the
ones you would find on a standard Windows machine—namely [CTRL][S] to
save, [CTRL][Q] to quit, and so on.
Chapter 3: The Basics
Comments
Anything preceded by a hash (#) is a comment, and is not executed by Bash.
Thus, the following produces no output.
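For instance, the first line below produces no output at all, while the second shows that a comment may also trail a command on the same line.

```shell
# This entire line is a comment, so running it produces no output.

echo "Hello"   # everything after the hash is ignored by Bash
```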
Incidentally, because man and whatis are themselves commands, typing the
following into a terminal is perfectly valid.
Arguments
Arguments are everything (and anything) that follows a command.
Arguments are separated by whitespace (spaces, tabs, and newlines). To
solidify this concept, three commands are introduced, mkdir, cd, and touch.
Mkdir is short for “make directory” and takes one or more arguments. Each
argument, separated by whitespace, is interpreted by the command as the
name of a new directory to create. This separation is called word splitting.
Word splitting is a concept crucial to Bash. If you want to perform any task on
something containing any whitespace, you must take word splitting into account
and either quote it or use escape characters, as covered in the subsequent
sections.
In order to move into the test directory we created above, the command cd,
short for “change directory,” is used. It takes exactly one argument—the name
of the directory you wish to move into.
The reason our ls command returns nothing is because there are no files or
directories in it yet!
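The sequence above might look like the following; the directory names are invented for illustration, and a scratch directory is used so nothing is left behind.

```shell
cd "$(mktemp -d)"     # work somewhere disposable

mkdir test red blue   # three arguments: three new directories
cd test               # move into the first one
ls                    # prints nothing: the directory is empty
```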
For those of you with some experience with GNU/Linux systems, you’re
probably saying, “But there are two hidden items in this directory!” And you
would be right. You can verify this by passing an argument (also called a flag) to
ls. That flag is -a (a hyphen followed by a lowercase a). It tells our ls command
not to ignore entries that start with a dot (which are hidden by the operating
system by default).
These “dot directories” are special in that they refer to the current directory
itself (a single dot), and its parent directory (a double dot). Files can also be
“dotted” and hidden. They are then called “dot files.”
As an example, you can move from the current directory to its parent
directory with:
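A short sketch of the dot entries and the parent-directory move:

```shell
cd "$(mktemp -d)"
ls        # appears empty...
ls -a     # ...but -a reveals the dot entries:  .  ..
cd ..     # the double dot moves you to the parent directory
```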
In order to create a file (or three or twelve) you pass them as—you guessed
it!—arguments to our third command, touch, whose job it is to change the last
modified time of a file. However, since our files do not yet exist, touch creates a
new file for each argument.
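The statement in question (a nod to Dr. Seuss) might look like this:

```shell
cd "$(mktemp -d)"
touch one fish two fish red fish blue fish
ls    # five files result: blue, fish, one, red, two
```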
If the output of this seemingly simple statement confuses you, have no fear!
Several things are happening at once. When touch is executed, Bash performs
word splitting to find all the words in the line. They are then passed to touch and
processed one at a time. This means that the file one is created first, followed by
fish, then two, then fish again… wait a minute! Because the file fish was just
created a millisecond ago, instead of creating another file with the same name,
touch performs its duty faithfully and updates the last modified time of the file
fish. Next up is the file red, fish again, blue, then fish yet again. Tricky, huh?
It is important to note that despite the amount of whitespace between
arguments, Bash only uses them to delimit one argument from another, nothing
more.
First, remove the files you previously created in the test directory
with rm.
The asterisk (*), is called a glob and has a special meaning that we will cover
in depth later on. For now, think of it as a special “match everything” character.
Thus, our previous command is telling rm to perform its action (remove files or
directories) on everything it encounters in the current directory. As I’m sure you
suspect, you should be extra careful with this, as there is no undo button.
Let’s try using touch again, this time with spaces.
Notice that the amount of whitespace does not make a lick of difference.
I can hear you thinking, “But what if I want to create a file with spaces in
it?” The answer is to use quotes.
Quotes
To demonstrate how quotes work without creating a mess of files and
directories, the echo command is particularly helpful. Echo works just like an
echo in real life and echoes back whatever arguments you pass to it.
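For example:

```shell
echo Hi, how are you?
# prints: Hi, how are you?
```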
Okay, so that worked just as we wanted it to. Echo took each of our
arguments one at a time and printed them one at a time with a single space in
between. But what if, as in our touch command a minute ago, there is more than
one space?
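Trying it with extra whitespace:

```shell
echo Hi,    how     are    you?
# prints: Hi, how are you?
# the extra spaces are gone: Bash split the line into four words, and
# echo printed them with a single space between each
```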
Though it reads right, it is missing the extra spaces, wanted or not. I’m sure
you can see how this can quickly become a headache if you are not expecting it.
Enter quotes. Quotes turn everything within them into a single argument.
Now instead of reading a word, finding a space, and thinking the next word is a
new argument, Bash treats the quoted section as one single, long argument.
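The same line, quoted:

```shell
echo "Hi,    how     are    you?"
# prints: Hi,    how     are    you?
# the quoted text is a single argument; its whitespace is preserved
```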
You go to open your final semester report only to find it’s no longer there,
but that horrid tps report is!
The problem was that your tps report had a space in the filename. Because
you did not quote it, Bash read in the arguments and performed word splitting:
it separated the arguments by whitespace and operated on them one at a time.
tps was the first argument. Since there was no file named tps in the directory, it
predictably failed. When it went on to the next argument, report, it found the file
present, and deleted it. Thus, you deleted what you wanted to save, and saved
what you wanted to delete.
The proper way is to use quotes.
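The scenario replayed below uses invented file names; the unquoted rm deletes the wrong file, and the quoted one does what was intended.

```shell
cd "$(mktemp -d)"
touch "tps report" report      # the horrid tps report, and your real one

rm tps report 2> /dev/null || true
ls    # only "tps report" remains: word splitting handed rm the two
      # arguments "tps" (no such file) and "report" (your real file!)

rm "tps report"    # quoted: one argument, the file you actually meant
```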
Much better!
Get into a habit of double quoting any string that does, will, or may contain
spaces, tabs, newlines, or special characters. You’ll thank yourself for it later.
Backslash escapes
An “alternative” to quoting is to backslash escape any character that has
special meaning (essentially all non-alphanumeric characters). Though it works,
it is extremely tedious, causes problems if not handled correctly, and is not easily
readable. The example below is one of the very few times you will see escape
characters in this guide.
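An escaped-space example (the file name is invented):

```shell
cd "$(mktemp -d)"
touch my\ semester\ report   # each escaped space loses its special
                             # meaning; exactly one file is created
ls
```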
Strings
Whether you realized it or not, virtually everything you’ve seen thus far is a
string. A string is simply a series of characters (letters, numbers, symbols, and
punctuation marks). Unless explicitly told otherwise, Bash treats virtually
everything as a string.
The name of the command is a string. The arguments that follow are each
strings in their own right. A quoted sentence is a string. Each of the
aforementioned strings can be thought of as a substring to the entire entered
command, arguments and all. Heck, an entire file can be thought of as a single
string.
This is an important concept to grasp, because as powerful as computers
are, they cannot reason. They do not see things as we do.
When we see a sentence like, “Hi, how are you?” we understand it is a
question, and one asking how we are doing.
On the other hand, Bash only sees a single string, “Hi, how are you?”
After word splitting, it sees a sequence of strings, each separated by
whitespace and made up of individual characters.
Bash’s view

String 1:   Hi, how are you?

Word 1   Word 2   Word 3   Word 4
Hi,      how      are      you?
Note that two of the words, Hi, and you?, are complete gibberish if taken
literally—you will not find either in the dictionary. Bash does not care. Per the
rules of its programming, it takes every character it encounters as part of the
same word until it reaches a whitespace, at which point it begins the next word,
and so on to the end of the input. Note that quotation marks are one of the
many special characters in Bash, and have a nonliteral meaning: everything
between them is shielded from word splitting.
This means that the onus is on you, the programmer, the scripter, the
coder, to ensure that whatever you type not only makes sense to you, but more
importantly, makes perfect sense to Bash’s rigid rules. If done incorrectly, the
best result will be that Bash throws an error and ceases to execute your code. At
worst, the syntax you entered will mean something completely different than
what you intended and you will find yourself in deep trouble.
IFS
This section utilizes many concepts presented in later chapters of this guide.
It is recommended you skip it and return once you have completed the guide.
For now, simply be aware that IFS stands for Internal Field Separator. Its
purpose is to figure out where and how to perform word splitting based on a list
of delimiters. The default IFS value (delimiters) are space (‘ ’), horizontal tab (‘\t’),
and newline (‘\n’).
There are a couple ways to check the IFS settings. The first uses printf, the
second od, and the third, set.
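Sketches of all three, assuming the default IFS value:

```shell
# printf's %q format quotes the value unambiguously:
printf '%q\n' "$IFS"

# od -c spells out each byte; the default is space, tab, newline:
printf '%s' "$IFS" | od -An -c

# set lists every shell variable, IFS among them:
set | grep '^IFS='
```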
Setting the IFS value is a matter of passing the special characters you want
as delimiters to IFS.
Here, the IFS value is set to the null string.
This is useful if you want to preserve all whitespace, perhaps while reading
from a file (both loops and the read command are covered in later chapters).
Commas are also common delimiters. Here, the IFS value is set as such.
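A sketch of both settings; the read builtin is borrowed from a later chapter to make the comma delimiter visible.

```shell
# The null string disables word splitting entirely:
#   IFS=''

# A comma makes commas the delimiter:
IFS=','
read -r first second third <<< "red,green,blue"
echo "$second"    # green

# Note that IFS stays modified from here on.
```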
Notice how the IFS value is permanently modified after you set it? This is
typically not useful. In most cases, you want to change the IFS value for a single
command, such as reading input, but leave the value at its default for the
remainder of the script; else your script will behave strangely, as word splitting
will no longer be performed the way you expect.
One option is to set a variable with the current IFS value, and then return it
later.
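The stash-and-restore pattern might look like this (old_ifs is an arbitrary variable name):

```shell
old_ifs=$IFS                         # stash the current value
IFS=','
read -r first second <<< "one,two"   # comma-delimited read
IFS=$old_ifs                         # put the default behavior back
```

Bash can also scope the change to a single command by prefixing the assignment, as in `IFS=',' read -r first second <<< "one,two"`, which leaves the global value untouched.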
See how the IFS value only changed for that single command? That’s exactly
what we wanted.
Note that every delimiter in the IFS value is significant. Thus, if two
delimiters are encountered back to back, they delimit an empty field between
the two.
If not for $PATH, every time you wanted to call ls, you would need to
specify exactly where it was stored:
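For example (the exact path varies by system, so it is looked up here rather than hard-coded):

```shell
type -p ls            # print where the $PATH lookup finds ls

"$(type -p ls)" /     # invoking it by full path needs no $PATH at all
```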
This is important to understand because when you start writing scripts, you
will not be able to call your script by name like you call ls. The reason is because
it will not be stored in one of the places named in your path!
Well, it should not be at any rate. Adding a random folder to your $PATH
makes your system dramatically less secure. It introduces the ability for a
malicious user (or a careless mistake) to wreck your system. Imagine if you
added your script storage folder to the front end of $PATH, then accidentally
called a bulk, non-interactive script you named rm (thus clobbering, or colliding
with the existing binary, rm). Bash would search for the name you gave it (rm),
find your script first (instead of the system binary, rm), then run it without a care
for what it deleted. You can see the chance for misuse and abuse!
If, after all that, you are still bound and determined to add a directory to
your path variable, at least use something consistent like /opt/bin or
/home/username/bin and always place the new path at the end of the current path
(because the PATH environment variable is searched in order, from left to
right).
Builtins
Now that you know what binaries are, a builtin is a piece of cake. As the
name suggests, a builtin is quite literally something built into Bash. Because they
do not require searching the PATH environment variable and then forking off a
separate process to run a binary, they typically execute quite a bit faster.
Examples of shell builtins are: alias, cd, echo, read, and type.
Scripts
Scripts are the focus of this guide. A script is simply a series of commands
stored in a file. Unlike a binary, this file is written in a high-level language, is
directly editable, and does not require compiling. You can think of Bash as the
liaison or interpreter between your high level script and the low level code
required by the system’s processor.
Scripts all begin with the same line, called the shebang:
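A minimal example (the file name hello.sh is arbitrary):

```shell
#!/bin/bash
# hello.sh - everything below the shebang is ordinary Bash
echo "Hello from a script"
```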
When you call your script, the Linux kernel reads the first line, finds the
shebang, and is told that it should look for the liaison/interpreter at the location
/bin/bash.
Despite the many similarities between shells such as dash, sh, bash, zsh,
ksh, and csh, there are differences. If you have written and tested your script
using Bash, do not assume it will work in dash or sh (though many scripts will).
Even though they are not binaries, scripts need to be given executable
permissions before they are run. That is unless you want to call it as an
argument to bash every time you run it.
The way you give executable permission to a script is through use of the
binary, chmod.
You can now run the script by prefixing its name with its directory (because
chances are your current working directory is not in the $PATH variable, right?).
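Putting the pieces together (the script name is invented):

```shell
cd "$(mktemp -d)"
printf '%s\n' '#!/bin/bash' 'echo hello' > myscript.sh

bash myscript.sh      # works even without execute permission
chmod +x myscript.sh  # grant execute permission
./myscript.sh         # now callable by path
```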
Aliases
Aliases are shortened commands. That’s it. Nothing fancy or dramatic.
Aliases are simply a way to save time when having to type a long command over
and over again, or save you from remembering all the switches and arguments a
long command needs. Running alias without any arguments lists all aliases
currently defined for your user.
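The upgrade alias discussed below might be defined like this. Note that defining an alias does not run it, and that scripts must enable alias expansion explicitly.

```shell
shopt -s expand_aliases    # scripts have alias expansion off by default

alias upgrade='sudo apt-get update && sudo apt-get upgrade'

alias upgrade    # prints the definition
type upgrade     # reports upgrade as an alias
```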
Now when upgrade is entered into a terminal, Bash replaces the string upgrade
with the string “sudo apt-get update && sudo apt-get upgrade”. It is that simple. Below
are a few (hopefully) useful aliases.
Standard aliases:
Install a package:
Remove a package:
Note that if you are using rsync, you should exclude temporary filesystems,
removable storage and the like (/proc, /run, /sys, /dev, /media, /mnt, and
/tmp).
Backup using tar:
SSH:
Shutdown:
Reboot:
Functions
Functions are basically just mini scripts. They can handle variables,
commands, arguments, and much more. As your Bash coding skills progress,
your scripts will typically take on the look of a chain of islands (the functions),
each with nice little landing and takeoff strip for planes from the mainland (the
script itself) to deliver, retrieve, and perform tasks on packages (data). Functions
will be covered once script basics have been covered.
Standard streams
There are three standard streams. Each has a specific and distinct purpose.
Stdin is the first, and stands for “standard input.” By default it refers to
input from a keyboard. Inside a terminal running Bash, all typed text commands
are entered via the keyboard, and thus via standard input. The file
descriptor (more on this topic will be covered later on) for standard input is 0
(zero).
Later chapters of this guide cover other modes of input such as reading
from a file, a process, or the output of a previous command. While not entered
via the keyboard, they are redirected to standard input through use of a special
character such as a pipe (|) or multiple less-than signs (<, <<, <<<).
Stdout is the second stream and refers to “standard output.” By default it
outputs to the terminal running Bash. The file descriptor for standard output is
1 (one).
As with stdin, stdout can be redirected to another process or file through use
of a special character such as a pipe (|) or one or more greater-than signs (>,
>>).
Stderr means “standard error.” Whenever a program generates an error,
whether through misuse, mistakes, or even intentionally, any and all error codes
or messages are written to standard error. It is important to note that standard
error and standard output do not always output to the same location unless
explicitly told to do so. While a potential point of confusion at first, this solves
the issue of having to sift through many lines of output for an obfuscated error
message. It has the file descriptor 2 (two).
As with the other standard streams, standard error can be, and often is
(especially when debugging), redirected. Typically it will be redirected
to the same destination as standard output. This is accomplished by pointing the
stderr file descriptor to stdout’s file descriptor. Again, this will be covered much
more thoroughly in later chapters.
Chapter 5: Variables
Variables
A variable is a storage location called by a name referred to as an identifier.
A variable holds a piece of information, called a value.
As in high school math, a variable can be as simple as “X”. You will
probably want them to be a bit more descriptive in your code however, say
fname, short for file name, or fdogn and mdogt for female dog name and male dog type
respectively. The reason for shortening the variables is solely because retyping
surveyed_female_dog_name or observed_male_canine_type is a hassle and waste of time.
There are a few types of variables. In Bash, nearly all are strings; they are
also by far the most common type. Other variable types include arrays and
integers.
Strings
String variables are assigned by passing the value to the identifier with an equals
sign (=).
What Bash does is expand (replace) identifier with the value, then echo the
result. You can see this in action by setting verbose debugging mode in a script
with set -xv like so:
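A sketch of the idea, using an invented file name:

```shell
set -x                        # trace each command after expansion

fname="beach picture.jpg"     # no spaces around the equals sign
echo "$fname"                 # the trace line shows echo being handed
                              # the already-expanded value

set +x                        # tracing off again
```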
Remember a few pages ago when we talked about word splitting? If you
recall, whenever Bash encounters a string that is not quoted, it splits it into
separate words, delimited by whitespace. The first of those words is taken as the
command, and the remaining words are arguments passed to that command.
Thus Bash thinks this,
means search $PATH for the command identifier, then pass it the arguments
= and value for execution one at a time. Of course there is not a command called
identifier, so the above code fails with an error that says as much.
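Side by side, the failing and working forms (identifier and value are placeholders):

```shell
identifier = value 2> /dev/null \
  || echo "that failed: Bash looked for a command named 'identifier'"

identifier=value      # correct: no whitespace around the equals sign
echo "$identifier"    # value
```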
To verify that we stored the string correctly, let’s echo them back.
Looks good! Only problem is, that particular beach picture is terrible, so let’s
delete it.
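The deletion attempt might look like this (the file name is invented):

```shell
cd "$(mktemp -d)"
fname="beach picture.jpg"
touch "$fname"

rm $fname 2> /dev/null || true
# unquoted, $fname word splits into "beach" and "picture.jpg",
# neither of which exists; the real file is untouched
ls    # beach picture.jpg

rm "$fname"    # quoted: one argument, and the file is gone
```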
Hmmm. Not quite the result we were looking for. It all goes back to
how Bash replaces the variable with its value, then performs word splitting. Again,
if we turn on verbose debugging we can see what Bash is actually trying to do.
Our script:
Let’s try that again, this time properly declaring it and then adding five to the
integer (the “+=5” part).
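The declaration and addition might look like this (num is an arbitrary name):

```shell
declare -i num    # declare num as an integer variable
num=5
num+=5            # arithmetic addition, thanks to the integer attribute
echo "$num"       # 10
```

Without the declare -i, num+=5 would instead append the string, yielding 55.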
Much better!
Let also works.
Unless of course you perform the addition inside the let statement.
Once you get into multiple integer variables things get even more
cumbersome.
Are you thoroughly confused yet? I know I am. Do not worry too much
about it; this is why integer variables are rarely used in Bash. We’ll cover
arithmetic expansion in a bit, which is vastly easier to write, read, and use.
Read only
Read only variables are exactly the same as normal variables except, as the
name implies, they are read only. You set a read only variable by passing the -r
flag to declare.
Once set, read only variables are permanent. That is, unset has no effect.
Read only variables stick around until the shell process exits or is killed.
To replace your current instance of Bash with a new one and get rid of read
only variables, you can issue the following:
Shell variables
In addition to setting “normal” variables, Bash provides a few special
variables for your convenience. Not all the shell variables are listed here, but the
most commonly used are.
Common shell variables
Variable Description
$BASH The path to Bash’s binary.
$BASH_VERSION Bash’s version.
$BASH_VERSINFO An array of version information; element 0 holds the major version.
$EDITOR The environment’s default editor.
$EUID The current user’s effective user ID.
$HOME The current user’s home directory path.
$HOSTNAME The machine’s hostname.
$HOSTTYPE The machine’s hosttype.
$IFS The IFS setting. Blank if set as default.
$MACHTYPE The machine type.
$OLDPWD The previous directory you were in.
$OSTYPE The operating system type.
$PATH The command search path.
$PIPESTATUS The exit status of the last completed pipe.
$PPID The process ID of the current shell’s parent
process.
$PS1, …, $PS4 The shell’s (prompts 1 through 4) format.
$PWD The current working directory.
$RANDOM A pseudo random integer between 0 and 2^15 - 1 (32767).
$SECONDS How many seconds the script has been
executing.
$UID The current user’s ID.
There are a couple important things to notice about shell variables. First,
you do not have to manually set, declare, unset, or modify them yourself. They
exist whether you decide to use them or not. Second, shell variables are always
fully uppercase. This helps you avoid clobbering one of them with one of your
own. A good rule of thumb is to always name your variables in lowercase. That
way, whenever you see one in uppercase, you know it is special. Finally, shell
variables and environment variables are not the same. Environment variables
are covered in chapter 13.
For now, the most in-depth example involving shell variables will be a
simple listing of them and their stored values. Once we get into tests and
conditionals they will be incredibly useful. Just think of how much information
you can glean about a logged in user by seeing if their UID is 0, or about a
system by parsing the output of MACHTYPE.
Chapter 6: Arrays
Arrays are used more often than integer variables, but they require
significant accounting to ensure you are setting, reading, replacing, and recalling
the correct array member. My suggestion is to skip this part for now and return
once you have read about special characters, special parameters, and loops.
The way an array works is that it holds key to value mappings. Each indexed
key-value pair is called an element.
Indexed arrays
In this type of array (far and away the most common type), each value is
mapped to a key starting from zero (0). This is called an indexed array.
Graphically, it looks like this:
array
key value
0 value1
1 value2
... ...
N valueN
An array called animals is visualized below.
animals
0 Dog
1 Bear
2 Llama
3 Parrot
4 Otter
Array creation is similar to variable creation and assignation. Like everything
in Bash, there are usually multiple ways of performing the same task. Arrays are
no exception. They can be declared in a couple different manners.
Declaration

array=()            Create array. Declares an empty array called array.
array[0]=value      Create & set array key. Declares the array, array, and
                    sets its first value (index 0) to value.
declare -a array    Create array. Declares an empty array, array.
Once the array has been created, with or without values, it can be modified
using the following syntax.
Storing values

array[i]=value                     Set array value. Sets the ith key of
                                   array to value.
array=(value1 value2 ... valueN)   Set entire array. Sets the entire
                                   array, array, to any number of values,
                                   value1, value2, ... valueN, which are
                                   indexed in order, from zero. Any
                                   previously set array values are lost
                                   unless the array is appended with +=.
array+=(value1 value2 ... valueN)  Append array. Appends the values
                                   value1, value2, ... valueN to the end
                                   of the existing array, array.
array=([i]=value1 [j]=value2 ...   Compound set array values. Sets the
[k]=valueN)                        indexed keys i, j, and k to value1,
                                   value2, ... valueN respectively. Any
                                   previously set array values are lost.

Note that unlike in many other languages, the values are separated by
whitespace, not commas; a comma would simply become part of the value.
Here we create a simple indexed array.
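The creation might look like this; note the whitespace-separated values and the zero-based indexing.

```shell
animals=(Dog Bear Llama Parrot Otter)   # indexed 0 through 4

echo "${animals[0]}"    # Dog
echo "${animals[4]}"    # Otter
```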
Due to the way array creation and assignation works, our attempt to modify
the array actually wiped (unset) our previous value, and reset the entire array as:
animals
0 Dog
1 Bear
2 Llama
3 Parrot
4 Otter
If we want to add the value Cat back into the array, we can append it to the
end.
There are two important things to note here. The first is that we have
accidentally created a sparse array. That is, the animals array spans eight
indexes (remember array elements are indexed from zero), yet holds only seven values.
animals
0 Dog
1 Tiger
2 Llama
3 Parrot
4 Otter
5 Cat
6
7 Chicken
If a loop is used to enumerate the array, key six will be empty! This is
typically not what you want to happen.
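One way the sparse state shown above can arise, sketched with the values from the text:

```shell
animals=(Dog Bear Llama Parrot Otter Cat)
animals[1]=Tiger       # replaces Bear
animals[7]=Chicken     # skips key 6, leaving a hole in the array
echo "${#animals[@]}"  # 7 -- seven values spread over keys 0 through 7
```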
Secondly, if you attempt to set only those two values using the compound
syntax, you will unset the existing array and create a new one that looks like this:
animals
0
1 Tiger
2
3
4
5
6
7 Chicken
That is not what we wanted at all! You could try to append the array instead
and specify keys and their respective values, but appending appends everything.
Let’s “reset” the array to the sparse array we accidentally created before, and
learn how to read values from it.
Parrot is displayed because it corresponds to key 3, not because it is the 3rd
element of the array animals!
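A sketch of the lookup being described (the array contents come from the tables above):

```shell
animals=([0]=Dog [1]=Tiger [2]=Llama [3]=Parrot [4]=Otter [5]=Cat [7]=Chicken)
echo "${animals[3]}"    # Parrot -- the value stored at key 3
```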
Displaying all the elements of animals array can be done in a few ways, each
subtly different. This may be confusing at first because the echoed output of the
methods appear nearly identical.
The only apparent difference above is the extra space between the values
Cat and Chicken (the empty key 6 in our sparse array) when the array is
quoted.
If we use a loop instead, see how the output changes:
Quoting the arrays shows us that there is indeed a difference. The at sign (@)
method results in each individual value being expanded as a separate, quoted word.
The asterisk (*) results in a single long string, with any extra whitespace (i.e. the
empty elements in our sparse array) collapsed.
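Since the original listings are not reproduced here, a sketch of the two quoted loop forms, using the same animals values:

```shell
animals=([0]=Dog [1]=Tiger [2]=Llama [3]=Parrot [4]=Otter [5]=Cat [7]=Chicken)

for a in "${animals[@]}"; do echo "$a"; done   # seven separate lines
for a in "${animals[*]}"; do echo "$a"; done   # one long line
```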
Expanding any number of elements produces a likewise result.
Like parameter expansion, arrays also have metadata that corresponds to the
number of values, indexes, and length. These are especially useful in loops.
Metadata
Usage Description
${#array[i]} Value string length
Expands to the string length of the ith array value
${#array[@]} Array values
${#array[*]} Expands to the number of values in the array.
${!array[@]} Array indexes
${!array[*]} Expands to the indexes in the array.
Notice that Bash makes no special distinction for sparse arrays. That is, even
though key 6 (corresponding to element 7) in the animals array holds an empty
value, it is still treated as an element, just like all the other keys that contain
actual values.
Loops (for, while, and until) can be constructed using standard syntax:
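A sketch combining the metadata expansions with a standard for loop (the exact listing is not shown in this extract):

```shell
animals=(Dog Tiger Llama Parrot Otter)
echo "${#animals[@]}"   # 5 -- number of values
echo "${!animals[@]}"   # 0 1 2 3 4 -- the keys
for key in "${!animals[@]}"; do
    echo "animals[$key] is ${animals[$key]} (length ${#animals[$key]})"
done
```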
Arrays are erased by unsetting them.
Deletion
Usage Description
unset -v array Erase an array
unset -v array[@] Completely erases the array, array.
unset -v array[*]
unset -v array[i] Erase an array value
Erases the ith array value from array.
Associative arrays
Unlike indexed arrays, which use keys that begin at zero and increment
upward by one, associative arrays use string labels to associate with each value.
array
string label value
string1 value1
otherstring2 value2
... ...
anotherstring3 valueN
Because associative arrays are not numerically indexed, they are always
unordered. This is very important! It means that any attempt to retrieve multiple
values from the array will return them in random order (unless each string label
is explicitly specified and ordered in and of itself).
The syntax to create, store, retrieve, and perform other actions on them is
similar to indexed arrays.
Declaration
Usage Description
declare -A array Create array
Declares an empty array, array.
array[str]=value Create & set array key
Sets the element of array labeled str to value. The array must first be declared associative with declare -A.
Storing values
Usage Description
array[str]=value Set array value
Sets the element indexed by str of array to
value.
array=([str1]=value1 [str2]=value2 ... [strN]=valueN) Compound set array values
Sets the elements labeled by the strings, str1, str2, ... strN, to value1, value2, ... valueN respectively. Note the whitespace (not comma) separators. Any previously set array values are lost.
Retrieving values
Usage Description
${array[str]} Expand value
Expands to the element indexed by string
str of array, array.
${array[@]} Mass expand values
“${array[@]}” Expands to all values in the array. If
double quoted, it expands to all values in the
array individually quoted. Output is in
random order.
${array[*]} Mass expand values
“${array[*]}” Expands to all values in the array. If
double quoted, it expands to all values in the
array, quoted as a whole. Output is in random
order.
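A minimal associative-array sketch; the capitals example and its labels are invented for illustration:

```shell
declare -A capitals              # an associative array must be declared first
capitals[France]=Paris
capitals[Japan]=Tokyo
capitals+=([Peru]=Lima)          # += adds without clearing existing elements
echo "${capitals[Japan]}"        # Tokyo
echo "${capitals[@]}"            # all values, in no guaranteed order
```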
Metadata
Usage Description
${#array[str]} Value string length
Expands to the string length of the
element indexed by str
${#array[@]} Array values
${#array[*]} Expands to the number of values in the
array.
${!array[@]} Array indexes
${!array[*]} Expands to the string labels in the array.
Deletion
Usage Description
unset -v array Erase an array
unset -v array[@] Completely erases the array, array.
unset -v array[*]
unset -v array[str] Erase an array value
Erases the element indexed by str from
array.
We start by declaring an array, stringarr, and setting a few values. Echoing the
array back shows us that the values are indeed stored. Notice how they return
in random order? They are not even sorted alphabetically!
Next, we add a string label and value and verify it has indeed been added.
Unsetting and adding values demonstrates how associative array metadata
retrieval works.
Just like indexed arrays, the string labels for associative arrays are easily
retrieved and displayed, first as a standalone list, and second as a paired list with
the stored values.
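A sketch of retrieving labels and paired values; the fruit array and its contents are invented for illustration:

```shell
declare -A fruit=([apple]=red [banana]=yellow [grape]=purple)
echo "${!fruit[@]}"                     # the string labels alone, unordered
for label in "${!fruit[@]}"; do
    echo "$label -> ${fruit[$label]}"   # paired list of labels and values
done
```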
Chapter 7: Special Characters
A few of Bash’s special characters—characters that have other than a literal
meaning—have already been introduced, such as the double quotes and dollar
sign, but many more exist. While not exhaustive, the majority are listed below.
Basic
Character Description
# Comment
Lines beginning with a hash will not be executed.
" " Whitespace
Bash uses whitespace (e.g. spaces, tabs, and newlines)
to perform word splitting.
& Run in background
Causes the preceding command to run in the
background.
; Command separator
Allows for the placement of another command on the
same line.
Logic
Character Description
&& And
Logical and operator. Returns a success if both of the
conditions before and after the operator are true.
|| Or
Logical or operator. Returns a success if either of the
conditions before or after the operator are true.
The logical AND operator is just like that learned in other programming
languages or electronics/computer/logic courses. It performs a second task if
and only if the first successfully completes. Bash knows if a task successfully
completed by examining the command’s exit code. An entire section is devoted
to exit codes later in the guide. For now, all you need to know is that an exit
code is 0 for true, and anything else (typically 1) for false.
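A short sketch of AND chaining (the demo directory name is invented):

```shell
cd "$(mktemp -d)"       # work in an empty scratch directory
mkdir demo && cd demo && echo "inside demo"
# Had mkdir failed (a non-zero exit code), the cd and echo would never run.
```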
The logical OR operator works precisely the opposite of the AND operator.
It performs a second task if and only if the first does not successfully complete (i.e. its
error code does not equal 0).
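The problematic listing is not reproduced in this extract; a reconstruction using the otherdir and myscript names from the text might look like:

```shell
cd "$(mktemp -d)"       # work in an empty scratch directory (sketch scaffolding)
cd otherdir && touch myscript || echo "Could not create myscript" >&2
# otherdir does not exist, so cd fails (exit code 1) and touch is skipped;
# the OR branch then fires even though touch itself was never attempted.
```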
This is not good practice however, as the exit code read is always the last
one. This can cause unexpected behavior.
But wait! Bash never even tried to create the file, myscript. What it did do was
try to change to otherdir, which failed because the directory did not exist. The exit
code was set to 1 (a failure), which caused it to skip the touch command. When it
saw the OR operator, it executed the command there because the exit code was
still set to the last one (the cd failure). It then told us it could not make the new
script file even though it never even tried.
If you are not very careful with your logic operators, this situation will play
out when you least expect it, and you will end up with a royal headache of
unexpected results.
The only time you should string more than one logical operator together is if
they are all ANDs or all ORs. Still, I highly suggest that if you need more than
one logical operator, you use control groups, as demonstrated a few sections from now.
Directory traversal
Character Description
~ Home directory
Represents the current user’s home directory; followed by a
forward slash, it forms paths within that directory (e.g. ~/Documents).
. Current directory
A dot in front of a filename makes it “hidden.” Use ls -
a to view.
A dot directory represents the current working
directory.
A dot separated from a filename by a space sources
(loads) the file.
.. Parent directory
A double dot directory represents the parent directory.
/ Filename separator
A forward slash separates the components of a
filename.
Performs division when used arithmetically.
These are basic, universal commands in any GNU/Linux system.
Quoting
Character Description
\ Escape
Escapes the following special character and causes it to
be treated literally.
Allows for line splitting of a long command.
' ' Full quoting
Special characters within single quotes lose their
meaning and become literal.
Word splitting is not performed on the contents.
" " Partial quoting
Special characters within double quotes lose their
meaning, with the notable exception of parameter
expansion, arithmetic expansion, and command
substitution.
Word splitting is not performed on the contents.
Though quoting has been covered, it is worth restating yet again: if in doubt,
always double quote strings!
A single greater than sign, >, redirects the standard output of a command to
a file, overwriting its contents. A double greater than sign, >>, appends to a file
without overwriting its contents.
The less than sign, <, allows the contents of a file to be read as standard
input.
Here documents allow you to embed blocks of data into your script.
Here strings, on the other hand, take only a single string as input.
A pipe allows you to pipe (send) the standard output of one command to the
standard input of another.
Groups
Character Description
{} Inline group
Commands within the curly braces are treated as a
single command. Essentially a nameless function without
the ability to assign or use local variables. The final
command in an inline group must be terminated with a
semicolon.
() Command group
Commands within are executed as a subshell. Variables
inside the subshell are not visible to the rest of the script.
(()) Arithmetic expression
Within the expression, mathematical operators (+, -, *,
/, >, <, etc.) take on their mathematical meanings (addition,
subtraction, multiplication, division). Used for assigning
variable values and in tests.
Braces, or inline groups, allow all commands within to be treated as a
single command. Here we attempt to change into a directory. If the directory
exists, everything works and the inline group following the OR logic is skipped
entirely. If the directory does not exist, then it is created, cd’d into, and a message
is printed as such.
In this way, inline groups are especially useful for simple error handling in
scripts.
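A sketch of the pattern being described, using an invented directory name:

```shell
dir=/tmp/groupdemo               # hypothetical directory for illustration
cd "$dir" 2>/dev/null || { mkdir -p "$dir" && cd "$dir" && echo "created $dir"; }
pwd                              # /tmp/groupdemo either way
```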
Along the same lines, this is why Bash built-ins execute quicker than their
external counterparts. Built-ins do not require forking off a subprocess,
whereas their external counterparts do.
Finally, variables in a subshell are not visible outside of that subshell—even
to their parent process. This means that any variables in a subshell are essentially
local variables.
It is important to note that Bash only performs integer math. Bash does not
know what longs, doubles, tuples, or anything like that are. Additionally, Bash’s
built-in math functions are limited. If you need square roots, decimal division,
etc., you must use an external utility like bc.
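A brief sketch of integer arithmetic inside (( )):

```shell
(( x = 5 * 4 ))        # assignment inside an arithmetic expression
echo "$x"              # 20
echo "$(( 7 / 2 ))"    # 3 -- integer division; the remainder is dropped
(( x > 10 )) && echo "x is greater than 10"
```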
Command groups and arithmetic expressions can both be expanded. They
then become command substitution and arithmetic expansion respectively.
Both are covered in detail in chapter 10.
Chapter 8: Globs
Globs, or wildcards in Bash, are a lifesaver when matching characters and
strings. Instead of manually entering strings that contain spaces, mixed case
letters, special characters, and others, globs allow you to enter one character and
be done.
Globs
Character Description
? Wildcard (character)
Serves as a match for a single character.
* Wildcard (string)
Serves as a match for any number of characters.
[…] Wildcard (list)
Serves as a match for a list of characters or
ranges.
?
The question mark is the simplest to use. It stands in for a single character,
and means, “match any single character.”
The simplest usage of ? is joining it with some “standard” commands like ls,
cd, rm, cp, and mv.
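The original listings are not included here; a small sketch with invented filenames:

```shell
cd "$(mktemp -d)"                 # scratch directory; the filenames are invented
touch file1.txt file2.txt file10.txt
ls file?.txt                      # matches file1.txt and file2.txt, but not
                                  # file10.txt: ? stands for exactly one character
```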
When you run a command such as echo *, Bash starts by reading in echo as
the command and performing word splitting (as per IFS). Next, Bash expands the
glob, *, matching it against all files in the current directory. After all the matches
are found, Bash sorts them alphanumerically, then replaces the glob (the asterisk)
with the results and passes them to the echo command. The order here is very
important. Did you notice
that Bash performs word splitting before it expands the glob? This ensures that
matching glob patterns will not be split up by your IFS setting and will always be
handled correctly!
Another important point is that globs are implicitly anchored at both ends. A
simple example should demonstrate how this works.
See how n* did not match filename? The reason is because the character, n, is
anchored at the front. This is exactly the same reason why *e failed to match
name_f despite the file containing the letter e—because the letter is anchored at
the end.
It is permissible to use globbing in the middle of a filename, as in the last
two examples, but again, anchoring applies. Since no files began or terminated
with the characters, f and n, respectively, echo simply echoed back your statement.
Changing this behavior is covered in the section called null globs.
Note that neither wildcard can match the forward slash (/) character. The
reason behind this is because the forward slash is the filename separator. If
globs could match this character, you would have no control over directory
recursion.
Here are a few more examples of globbing in action. The switches passed to
the echo command prevent the default trailing newline from being added, and
allow us to manually specify where a newline should go by entering \n.
The question mark and asterisk globs can be used together, in any order, and
in any quantity. First, we use the find command to find any files on the system
that end in “lib” followed by any single character plus “.so”.
Finally, we find only files that begin with “lib” and end with a period plus
any single character.
[…]
Square braces are somewhat of a middle ground between ? and *. They
allow for the matching of any number of individual characters and/or ranges of
characters. Specifying the characters you wish to expand for is done by simple
enumerating them between the braces.
They can stand on their own; however, unless you know the exact length
and/or combination of characters, it is typically helpful to combine them with the
question mark or asterisk glob. The example below lets us know how many binaries
beginning with the letters a or b are in the directory /usr/bin.
You could also list all the binaries that are only two letters:
Adding more than one square brace expansion is acceptable. This command
lists all the two letter binaries in /usr/bin that start with the letters a through m,
and end with the letters m through z.
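A self-contained sketch with invented stand-in filenames (rather than the real /usr/bin):

```shell
cd "$(mktemp -d)"        # scratch directory; the "binaries" are empty stand-ins
touch awk bc cat dd grep
echo [ab]*               # awk bc -- names beginning with a or b
echo [a-f]?              # bc dd -- two-letter names starting a through f
```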
What is going on is that we forgot that square braces can match numbers and
special characters in addition to letters!
Note that the same command could be accomplished by redirecting a cat
output of each square brace expansion into the utility, diff.
The glob would have expanded to itself if there did not exist a file whose
name matched our wildcard expansion, *.txt.
Having the glob expand to itself can be particularly irksome in scripts
where filenames are concerned, because you do not want to perform tasks on
the wildcard expansion itself; chances are your script will error out.
Additionally, counters break when the glob expands to itself. The example
below adds all matching files one at a time to the array, filelist, then outputs the
number of elements in the array.
Notice how the script says there is a file that matches the expansion, but
there really isn’t?
Loops (covered in chapter 12) do not perform as expected either.
Now, when we run our scripts to echo any matching filenames and count
the number of matching files, we receive the correct outputs.
Echo results:
Array element counting:
Loops:
Extended globs
Beyond “normal” pattern matching is something called regular
expressions. A simpler version using slightly different syntax is called
extended globs. Similar to braced lists, […], extended globs allow for the
matching of patterns, not just individual characters.
To enable extended globs, do so via the shell option.
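A sketch, assuming the extglob option is set via the shopt built-in (the filename is invented):

```shell
shopt -s extglob                 # turn extended globs on
f=photo.jpeg                     # a hypothetical filename
[[ $f == *.@(jpg|jpeg|png) ]] && echo "image file"
# @(a|b) matches one of the patterns; ?(..) zero or one;
# *(..) zero or more; +(..) one or more; !(..) anything except
```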
Notice how the parameter, $0, is the called name of the script? If you set the
executable bit on the script and call it like this,
then the parameter becomes ./pnames. You can easily remove the leading
dot-slash using sed.
Substring removal
Usage Description
${parameter:offset:length} Results in a string of length
characters, starting at offset characters. If
length is not specified, takes all
characters after offset. If length is negative,
count backwards.
${parameter#pattern} Searches from front to back of parameter until the first occurrence of pattern is found and deletes the match.
${parameter##pattern} Searches from front to back of parameter until the last occurrence of pattern is found and deletes the match.
${parameter%pattern} Searches from back to front of parameter until the first occurrence of pattern is found and deletes the match.
${parameter%%pattern} Searches from back to front of parameter until the last occurrence of pattern is found and deletes the match.
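A sketch of each removal form, using an invented parameter value:

```shell
file=backup.tar.gz       # a hypothetical parameter value
echo "${file:0:6}"       # backup      (offset 0, length 6)
echo "${file#*.}"        # tar.gz      (shortest match from the front deleted)
echo "${file##*.}"       # gz          (longest match from the front deleted)
echo "${file%.*}"        # backup.tar  (shortest match from the back deleted)
echo "${file%%.*}"       # backup      (longest match from the back deleted)
```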
Another example, this time using user input and writing to a variable.
Remember once more that Bash performs integer math. It does not know
what floats, longs, doubles, or anything like that are.
Thus if your script requires a precision result, you will need to call an
external binary such as bc.
The scale option sets the number of digits after the decimal point, and the -l
switch loads the standard math library. Hopefully this will save you from a few
headaches.
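A sketch of the difference, assuming bc is installed:

```shell
echo "$(( 10 / 3 ))"                # 3 -- Bash truncates integer division
if command -v bc >/dev/null; then   # guard: only run if bc is installed
    echo "scale=3; 10/3" | bc -l    # 3.333
fi
```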
Next up is brace expansion, which is more or less a way of making lists.
Brace expansion
Character Description
{} Brace expansion
Expands the character sequence
within.
The simplest use of brace expansion is to expand a list of letters or numbers.
Brace expansion accepts empty inputs (notice the placement of the comma
in the first brace, which indicates the item preceding it was empty).
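A few sketches of brace expansion, including an empty item:

```shell
echo {1..5}        # 1 2 3 4 5
echo {a..e}        # a b c d e
echo {,re,un}do    # do redo undo -- the leading comma marks an empty item
```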
Though backticks perform the same function as $(..), they should not be
used. They are essentially included in Bash only to support legacy code.
Backticks do not nest without escaping themselves and other special characters,
are difficult to read, and are extremely ugly.
By comparison, $(..) looks nice and neat and does not require any escaping.
Here are a couple examples of setting variables and echoing output that are
much more useful than those shown above.
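A couple of command substitution sketches (the commands chosen are for illustration):

```shell
now=$(date)                              # capture the command's output
echo "The current date is: $now"
parent=$(basename "$(dirname "$PWD")")   # $(..) nests with no escaping
echo "parent directory: $parent"
```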
Chapter 11: Conditional Blocks
A conditional statement is a way to alter the flow of a script based on a
boolean condition—a condition that evaluates to either true or false. In
languages such as C++, the evaluation of a conditional statement becomes
either 1 for true, or 0 for false. In Bash, however, a 0 is true, and any other number is
false (though it is typically a 1). The reason behind this is something
called exit codes.
Exit codes
Exit codes are exactly what they sound like: codes (an integer between 0
and 255 [i.e. 2^8 values]) that indicate the previous command’s termination status.
A 0 indicates successful completion (i.e. a true). A 1 is the default value for an
unsuccessful completion (i.e. a false).
You can check the exit code generated by the last foreground command by
using the special parameter, $?.
Two special commands we can use to verify this functionality are true and
false. From their man pages:
Thus, true always completes with a 0 exit code, and false always completes
with a 1 as its exit code.
Values above 1 all indicate unsuccessful completion, but are typically set
within the script itself. For example, you could set it so that failure to create a
file exits with code 60, while failure to delete a file exits with code 80. This way,
based solely on the script’s exit code, you are able to tell what happened (or
didn’t). This is accomplished with the exit built-in.
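A reconstruction of the idea with a hypothetical filename; the exit codes 60 and 80 follow the text:

```shell
#!/bin/bash
# Hypothetical filename for illustration.
touch /tmp/exitdemo || { echo "could not create file" >&2; exit 60; }
rm /tmp/exitdemo    || { echo "could not delete file" >&2; exit 80; }
```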
In the above example, the double pipes (||, the logical OR operator) cause
the subsequent echo/exit command grouping to execute only if the original
command, either touch or rm, fails. The redirection, >&2, causes the error
message to be printed to stderr, rather than stdout. Both of these concepts will be
covered later in the guide.
Exit then sets the exit code to a value other than the catchall, 1, which
makes debugging and troubleshooting easier.
Many programs use custom error codes as a way to give more information
about what went wrong. Here is an example from grep’s man page:
Beginning with the first keyword, if, the conditional expression is evaluated. If it
is true (its Boolean value is 0), the statements (commands) between the keywords
then and fi are executed (in Bash, multi-line statements like if and case terminate
with their character reverses, fi, and esac). If the expression is false, all the
subsequent commands are skipped and the script continues as normal after the
fi keyword.
A trivial example:
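Since the listing is not reproduced in this extract, a minimal sketch:

```shell
x=5
if [[ $x -gt 3 ]]; then
    echo "x is greater than 3"
fi
```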
What if you want to do something based on the expression being false? If-
else statements are the way to go. All you do is add another keyword, else, with
the statements to be executed in the event of a false evaluation.
Though the Boolean can only carry two values, true or false, multiple
discrete tests can be run one after the other by using a branching if-elif-
else statement.
Borrowing test (double square brackets) from a few pages further on, we
can see this in action. Here, test is checking to see if the variable x is equal (-eq),
greater than (-gt), or less than (-lt) the variable y.
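A sketch using invented values for x and y:

```shell
x=4 y=7
if [[ $x -eq $y ]]; then
    echo "x equals y"
elif [[ $x -gt $y ]]; then
    echo "x is greater than y"
else
    echo "x is less than y"
fi
```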
What happens when the script encounters the case keyword is that it searches
from top to bottom for the first matching pattern. Once found, it executes the
statements (commands) until it reaches the double semicolon (;;). If
the variable does not match any specific pattern, the final case, *), serves as a
catch-all. esac terminates the case.
It is worth restating that case searches in order and only executes the first
matching pattern. Once executed, case breaks (ends, and continues on to the first
line following esac).
Perhaps the most prevalent usage of the case statement is in a standard
startup script.
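A skeleton of such a startup script (the service messages are invented):

```shell
action=start    # would normally come from the first argument, $1
case $action in
    start|restart) echo "Starting service..." ;;
    stop)          echo "Stopping service..." ;;
    status)        echo "Service is running." ;;
    *)             echo "Usage: $0 {start|stop|restart|status}" ;;
esac
```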
Notice that two or more patterns can return the same result by separating
them with a pipe, |. It is also possible to use globs for pattern matching within a
case statement.
Though it appears there are only two arguments in the above test, each
variable is expanded to its set value before being tested. [ does not know how to
handle this without quoting. Additionally, while [[ is limited to four arguments, [
is overburdened with more than two.
[[ does not require escaping special characters (such as the greater than, >,
and less than characters, <). With [, you must escape all special characters.
[[ can utilize logical AND (&&) and OR (||) operators. [ requires the use of
-a for &&, and -o for ||.
[[ can perform pattern matching while [ cannot:
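A sketch of a [[ pattern match, with an invented filename:

```shell
file=notes.txt
[[ $file == *.txt ]] && echo "matches *.txt"
# [ "$file" == *.txt ] would not pattern-match; [ either sees the literal
# string *.txt or whatever the shell glob-expanded it to first.
```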
Recall that $? returns the exit code of the last completed foreground command
(true is 0, false is 1).
Arithmetic tests
Operator Description
X -eq Y True if X is equal to Y
X -ne Y True if X is not equal to Y
X -gt Y True if X is greater than Y
X -lt Y True if X is less than Y
X -le Y True if X is less than or equal to Y
X -ge Y True if X is greater than or equal to Y
All of these tests are self-explanatory.
Note that bc returns 0 if the relation is false and 1 if the relation is true!
String tests
Operator Description
-z STRING True if STRING is empty
-n STRING True if STRING is not empty
STRING1 = STRING2 True if STRING1 is equal to STRING2
STRING1 == True if STRING1 is equal to STRING2
STRING2
STRING1 != STRING2 True if STRING1 is not equal to
STRING2
STRING1 < STRING2 True if STRING1 sorts
lexicographically before STRING2 (must
be escaped if using [)
STRING1 > STRING2 True if STRING1 sorts
lexicographically after STRING2 (must be
escaped if using [)
These string tests can be used in a number of useful ways. Perhaps the most
straightforward is checking to see if a variable is set or not.
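A sketch of the -z and -n tests on a variable:

```shell
unset name
[ -z "$name" ] && echo "name is empty or unset"
name=Bash
[ -n "$name" ] && echo "name is set to $name"
```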
A mighty convenient follow-up is to use read and prompt the user for
input, such as the characters Y or y for yes, and N or n for no. The use of
read will be covered later in the guide. For a quick example, jump down to
regular expression pattern matching.
Note that the = operator is not the same as the -eq operator from the above
arithmetic tests. The equals sign (=) is actually a string test. Thus the test,
This is because the symbols < and > have special meaning in Bash. [ thinks
< is trying to redirect the contents of file rainier to the file denali, while > is
attempting the opposite. The error message and file creation are a direct result
of how those operators work when the filename arguments do not exist.
File tests
All filenames must be quoted if using [.
Operator Description
-e FILE True if FILE exists
-f FILE True if FILE exists and is a regular file
-d FILE True if FILE exists and is a directory
-c FILE True if FILE exists and is a special character
file
-b FILE True if FILE exists and is a special block file
-p FILE True if FILE exists and is a named pipe
(FIFO—“first in, first out”)
-S FILE True if FILE exists and is a socket file
-h FILE True if FILE exists and is a symbolic link
-O FILE True if FILE exists and is effectively owned
by you
-G FILE True if FILE exists and is effectively owned
by your group
-g FILE True if FILE exists and has SGID set
-u FILE True if FILE exists and has SUID set
-r FILE True if FILE exists and is readable by you
-w FILE True if FILE exists and is writable by you
-x FILE True if FILE exists and is executable by you
-s FILE True if FILE exists and is non-empty (size is >
0)
-t FD True if FD (file descriptor) is opened on a
terminal
FILE1 -nt FILE2 True if FILE1 is newer than FILE2
FILE1 -ot FILE2 True if FILE1 is older than FILE2
FILE1 -ef FILE2 True if FILE1 and FILE2 refer to the same
device and inode numbers
The usage of a few of the operators is shown below. The rest, their
functions evident and usages clear, are left as an exercise for the reader.
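A sketch of a few file tests on a scratch file created with mktemp:

```shell
tmp=$(mktemp)                     # a scratch file for illustration
[ -e "$tmp" ] && echo "exists"
[ -f "$tmp" ] && echo "regular file"
[ -s "$tmp" ] || echo "empty (size 0)"
[ -d /tmp ]   && echo "/tmp is a directory"
rm "$tmp"
```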
Logical tests
Operator Description
EXP1 && EXP2 True if both EXP1 and EXP2 are true
(equivalent of -a if using [)
EXP1 || EXP2 True if either EXP1 or EXP2 is true
(equivalent of -o if using [)
Both of these operators were covered previously, albeit in a different
context. There are no changes in the way they operate to mention.
Pattern tests
All pattern tests are [[ only.
Operator Description
STRING = PATTERN True if STRING matches PATTERN
STRING == PATTERN True if STRING matches PATTERN
STRING =~ PATTERN True if STRING matches the regular
expression pattern, PATTERN
Pattern matching is extremely convenient for tasks involving strings which
adhere to a standard or specific use, such as filename extensions.
Regular expressions are incredibly helpful, but their usage is beyond the
scope of this guide.
Later in the guide we will cover traps, which are a way to catch signals
(though not all are catchable) and perform custom actions based on which signal
was received.
While
The while loop is nice and simple. It evaluates a condition and executes the
statements inside the loop until the condition returns a non-true exit code (1 or
higher).
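A minimal while loop sketch (five iterations, incrementing before echoing):

```shell
i=0
while [[ $i -lt 5 ]]; do
    (( i++ ))          # increment first...
    echo "loop $i"     # ...so the first iteration prints "loop 1"
done
```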
Note in the above example that the loop counter started at 0, then
incremented to 1 before echoing the loop number.
If those lines were reversed, the first loop iteration would be iteration 0.
In either case the result is still five loops, with both indexed at 0 (which is
why the script uses -lt 5 and not -eq 5), but the displayed numbers between the
two are different.
Here is a handy script that alerts you when your CPU temperature is getting
too hot. What it does is check the CPU temperature of “Core 0” via the sensors
command (from the lm_sensors package) and see if it is under 80 degrees
Celsius. If it is, it waits ten seconds before trying again. If the CPU temperature
rises above 80, the loop breaks and the lines following are executed.
The ugly line of Bash code that grabs the temperature would be a perfect
use for a function (covered later on in the guide).
Another example is to use [[ and shift to cycle through all the set positional
parameters.
Until
The until loop is exactly the opposite of the while loop. It runs a test and
executes the statements inside the loop until the command returns a true exit
code (0).
This example runs ping once per second and checks to see if it receives any
response from host 10.0.0.209. Once it receives a response, ping returns a true
exit code (0), and the until loop exits. The -n switch on echo suppresses the
trailing newline from being added, and the 1> /dev/null sends all standard
output to /dev/null for a cleaner-looking screen.
The output will look something like this:
The until loop version of the CPU temperature example requires changing
the criteria from less than to greater than.
Cycling through the positional parameters using an until loop looks like this:
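A sketch using stand-in positional parameters set with set --:

```shell
set -- alpha beta gamma        # stand-ins for real positional parameters
until [[ $# -eq 0 ]]; do
    echo "$1"
    shift
done
```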
For
Arguably one of the most common loops (as it has been around forever; the
three-part form dates back to C in the early 1970s) is the for loop, which
utilizes a variable, a condition, and an increment.
Here is a simple example that loops five times, printing the current loop
number each time:
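Since the listing is not reproduced in this extract, a minimal sketch:

```shell
for (( i = 0; i < 5; i++ )); do
    echo "loop number: $i"
done
```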
Note that this style of for loop uses double parentheses. This means that all
the operators inside are arithmetic operators. Unlike tests, where greater than or
equal to is symbolized by -ge, a for loop understands >= in its arithmetic context.
The same goes for > and <, which are seen not as redirection (in [), or string
sorting (in [[), but as greater-than, and less-than.
For loops also work with listed items,
brace expansions,
subshell expansion,
globbing,
arrays,
positional parameters,
and more. As you can see, the possibilities are all but endless.
Miscellaneous
Nesting
Like conditional statements, all loops can be nested, in any order, and using
any combination of while, until, and for loops.
This code uses all three loop types to print, in order, the numbers 000
through 999.
If you are wondering why j correctly displays all digits 0 to 9 despite its test
checking for a less than condition, the reason is the placement of the j
incrementer. Since it is above the k loop, it increments to 9 before the k loop
begins. If the j incrementer were moved below the k loop, then the conditional
statement would have to change to -le.
Break
All loops can be exited via the break statement, in which case the script
immediately following the loop is executed. This is not to be confused with the
exit statement, which exits the script (not just the loop) entirely.
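A short sketch of break in action:

```shell
for (( i = 0; i < 100; i++ )); do
    (( i == 3 )) && break      # leave the loop; execution resumes after done
done
echo "stopped at $i"           # stopped at 3
```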
Continue
Usage of the continue statement allows the current iteration of the loop to be
skipped, and the next to be immediately started.
The following example checks if any files in the current directory contain an
uppercase character. If any do, the filename is converted to all lowercase. If no
uppercase characters are present, the script has no work to do and the loop
continues to the next iteration.
Another example is to create backup copies of all files, skipping those which
have already had backups created.
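A reconstruction of the idea with invented filenames:

```shell
cd "$(mktemp -d)"                    # scratch directory; filenames are invented
touch alpha beta alpha.bak           # alpha already has a backup
for f in *; do
    [[ $f == *.bak ]] && continue    # never back up a backup
    [[ -e $f.bak ]] && continue      # skip files already backed up
    cp "$f" "$f.bak"
done
ls                                   # alpha alpha.bak beta beta.bak
```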
Chapter 13: Input & Output
As you’ve no doubt noticed, when scripting in Bash there is typically no
single “correct” way of doing things. Rather, there are many correct ways, with
the selection almost solely based upon personal preference. Input and output are
no different.
Input is anything that is received or read by a script. Input includes
positional and special parameters, shell and environment variables, files, streams,
pipes, and more.
Output on the other hand, is anything that is produced or written by a
script. Shell and environment variables, files, streams, and pipes are a few
examples.
Parameters
Positional and special parameters were covered earlier in the guide. The
listing of these parameters is repeated here for convenience.
Parameter Description
$0 The called name of the script.
$1, $2, … The arguments passed to the script.
$# The number of positional parameters passed to the
script.
$* All the positional parameters. If double quoted, expands
to a single string containing them all.
$@ All the positional parameters. If double quoted, expands
to a list of separate strings, one per parameter.
$? The exit code of the last completed foreground
command.
$! The process ID of the most recently backgrounded
command.
$$ The process ID of the current shell.
$_ The last argument of the last completed command.
$- The shell options that are set.
Shell variables
Like parameters, shell variables were covered earlier in the guide and are
repeated below for convenience.
Variable Description
$BASH The path to Bash’s binary.
$BASH_VERSION Bash’s version.
$BASH_VERSINFO An array of Bash’s version information;
element 0 is the major version.
$EDITOR The environment’s default editor.
$EUID The current user’s effective user ID.
$HOME The current user’s home directory path.
$HOSTNAME The machine’s hostname.
$HOSTTYPE The machine’s hosttype.
$IFS The IFS setting. Blank if set as default.
$MACHTYPE The machine type.
$OLDPWD The previous directory you were in.
$OSTYPE The operating system type.
$PATH The command search path.
$PIPESTATUS The exit status of the last completed
pipe.
$PPID The process ID of the current shell’s
parent process.
$PS1, …, $PS4 The shell’s (prompts 1 through 4) format.
$PWD The current working directory.
$RANDOM A pseudorandom integer between 0 and
2^15 - 1.
$SECONDS How many seconds the script has been
executing.
$UID The current user’s ID.
Environment variables
Like shell variables, environment variables consist of a name and value
pairing. However, unlike shell variables, which are local and valid in the current
shell instance only, environment variables are inherited by every child process
the shell spawns.
The following table contains a condensed listing of commonly used
environment variables.
Variable Description
BASH_VERSION Bash’s version number.
BROWSER The system’s default web browser.
DISPLAY The hostname, display and screen number
for graphical application display.
EDITOR The default file editor; used as a fallback
when VISUAL is unavailable.
HOSTNAME The system’s name.
HISTFILE The path to the file where command history is
saved.
HOME The current user’s home directory.
IFS The Internal Field Separator’s global setting.
LANG The language settings used by system
applications.
LOGNAME The currently logged in user.
MAIL The storage location for system mail.
MANPATH The storage location for system manpages.
PAGER The utility used to display long text output.
PATH The colon separated search path for
commands.
PS1 The prompt settings.
PWD The current working directory.
SHELL The path to the current user’s shell program.
TERM The current terminal emulation type.
TZ The timezone used by the system clock.
USER The currently logged in user.
VISUAL The default GUI file editor.
To get a full listing of the environment variables set on your system, which
will assuredly have many more than listed above, use the printenv command.
There are a couple important items to note when setting and unsetting
environment variables.
First, you cannot change the environment of a running program. Changing
environment variables only affects the shell’s yet-to-be-spawned child processes. A
wrapper script is a commonly used snippet of code that changes the
environment in some way then executes another program.
Second, all changes to the environment are temporary unless they are
written, typically to one of the following locations:
Because you cannot change a running environment, you must relogin for
your changes to take effect. Alternatively, you can force your system to reread
the modified file (and thus change newly spawned processes) with source.
Using exec is another option that will replace the current terminal instance with
a newly spawned instance.
Standard streams
The three standard streams, stdin, stdout, and stderr, were covered in the
beginning of this guide with a promise to expound upon them later. That time
has come in the form of file descriptors.
File descriptors
File descriptors (FDs), are handles (abstract indicators) that are used to
access an input or output file, stream, pipe, socket, device, network interface,
and so on.
At a high level, file descriptors work by providing a layer of abstraction
between an actual hardware device, such as a keyboard, screen, or cdrom drive,
and a special file created by the kernel for the device, populated by udev, and
stored in the /dev directory. When a call is made by a process to read or write, file
descriptors provide a way for those calls to be routed to the correct place.
The important points to remember are summarized in the following table:
Name Stream Default File
Descriptor
Standard stdin Keyboard 0
input
Standard stdout Screen 1
output
Standard stderr Screen 2
error
File redirection
A common and basic form of redirection is file redirection. Normally, a
program sends its output to stdout (FD 1). With file redirection, the FD for stdout is
pointed to a file, rather than a screen (the default). Output redirection is
accomplished by using a greater-than sign, >.
Normally, echo displays a line of text on stdout. However, we can redirect its
output to a file as shown here:
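A minimal example (the filename myfile is arbitrary):

```shell
echo "Hello, world" > myfile    # nothing appears onscreen
cat myfile                      # Hello, world
```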
When the above command is run, echo no longer displays its text onscreen
because its FD is now pointing to a file called myfile.
It is critical to note that > opens the file and writes to it without care
whether it previously existed or contained anything. If the file did not exist
before the command was run, it is created and written to. If the file existed,
anything it contained is wholly overwritten.
In order to avoid overwriting an existing file’s contents, Bash contains a
doubled version of the output redirection operator, >>. Its purpose is to
append data to the end of an existing file.
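For instance:

```shell
echo "first line"  > myfile     # creates (or truncates) the file
echo "second line" >> myfile    # appends instead of overwriting
cat myfile                      # first line, then second line
```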
Version 1:
Version 2:
Version 3:
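The three invocations being compared are, in sketch form (myfile is an assumed example file):

```shell
# Version 1: cat reads stdin from the keyboard until [CTRL][D]
cat

# Version 2: the filename is passed to cat as an argument
cat myfile

# Version 3: stdin is redirected so that it comes from the file
cat < myfile
```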
Though the first version appears different from the last two, which look
identical, they all operate differently behind the scenes.
In the first version, cat is started, and begins to read from stdin. It then takes
your input and displays it on stdout. Your text is displayed twice for the same
reason commands don’t execute until you hit [ENTER]: your
terminal is giving you the chance to edit (via backspace, delete, arrow keys, etc.)
what you have typed. Once you hit [ENTER] however, your text becomes stdin
for cat, which takes and (re)displays it on stdout. [CTRL][D] is the keyboard
shortcut that signals end-of-file (EOF), which tells cat that stdin has
ended.
In the second method, myfile is passed as an argument to cat. Just like you
have to decide what to do with, and how to handle, positional parameters in
your script, cat must do likewise—it has complete control over what to do and
how to go about doing it. Since the purpose of cat is to print to stdout, it opens
the file, then reads and prints its contents to stdout.
In the third method, cat receives no input from your keyboard, nor from any
arguments. Instead, a file descriptor is opened to the file, myfile, which makes the
contents of the file available for cat as stdin. Thus, when cat reads from stdin, it is
now reading from a file, and not your keyboard. This means that though the
output of version 3 is identical to that of version 2, its method of operation is
actually similar to version 1!
The doubled version of the input redirection operator, <<, is called a here
document, and is covered a little further on.
File descriptor redirection
In the previous section, we used “naked” redirection operators. That is, we
used > and < by themselves, without specifying any particular file descriptor.
This worked for file redirection because the default operations were exactly
what we were looking for. You can however, manually specify which file
descriptor you wish to redirect. This is accomplished by adding the numerical
prefix of the standard stream you wish to redirect.
Redirecting the stdout of command to file would look like this (both variations
function exactly the same):
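For example, with echo standing in for the command:

```shell
echo "hello" > out.txt      # implicit: FD 1
echo "hello" 1> out.txt     # explicit: FD 1, same effect
```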
Similarly, redirecting file to the stdin of command would look like this (again,
the two lines work exactly the same):
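Again in sketch form, with cat standing in for the command:

```shell
cat < out.txt       # implicit: FD 0
cat 0< out.txt      # explicit: FD 0, same effect
```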
Why is adding the stream number useful when the “naked” redirector works
perfectly fine, you ask? Because there is a third file descriptor, stderr! Since the
defaults do not cover stderr, we need to specify it manually.
To demonstrate this, let’s create an empty directory, change to it, and try to
remove a nonexistent file.
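Something along these lines (the directory and file names are arbitrary):

```shell
mkdir testbed && cd testbed
rm nonexistent                   # prints an error message to stderr
rm nonexistent 2> /dev/null      # the error message is discarded
```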
That got rid of the error message. But simply throwing away things you
don’t want to see is not always the best solution! Perhaps a better way would be
to redirect the message to a file you can consult as needed.
While better, it is far from ideal when running a script and a few dozen
(hundred? thousand?) errors or warnings flash by—they would each overwrite
the last! The double redirection operator solves that problem by appending the
error file with each new message.
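In sketch form (errors.log is an assumed name):

```shell
rm nonexistent 2> errors.log     # overwrites errors.log on every run
rm nonexistent 2>> errors.log    # appends each new message instead
```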
For our demonstration however, we are looking for something that is easily
reproducible and does not actually require modifying our system (i.e. installing a
program).
That will work nicely. Linux systems store their configurations in config files,
not all of which are accessible to a non-root user, thus generating many lines of
both “normal” output as well as “error” output. Below is a small snippet of the
output from this find command.
Once the testbed is setup, we try to redirect both output file descriptors.
Not quite what was expected, was it? The problem is that the output from
our stderr stream clobbered and overwrote that of our stdout stream. This was
because both file descriptors were pointing to the same destination. Note that
the file descriptors are read from left to right, and that the same issue will happen
in reverse. That is, two file descriptors pointing to the same input source will
always clobber one another.
The correct way to perform this task is to redirect one file descriptor to
another by pointing our second one (stderr) to our first one (stdout). This is
accomplished by using the ampersand, &, with our file descriptor
redirection.
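A sketch of the correct form; the find invocation and results.txt are assumptions standing in for the book's testbed:

```shell
find /etc -name '*.conf' > results.txt 2>&1
# FD 1 now points to results.txt, and FD 2 is then made a duplicate
# of FD 1, so both streams land in the same file without clobbering
```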
Now, instead of opening two separate file descriptors, only one is opened
for stdout (FD 1), and is subsequently duplicated and placed in stderr (FD 2).
Trying the command in our test bed yields,
It is important to restate that the file descriptors are read from left to right.
Thus trying to duplicate FD 1 and place it in FD 2 before FD 1 is redirected will
not work correctly.
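That mistaken ordering, in sketch form:

```shell
find /etc -name '*.conf' 2>&1 > results.txt
# wrong order: FD 2 duplicates FD 1 while FD 1 still points at the
# terminal, so errors go to the screen and only stdout reaches the file
```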
Here document
One rule in programming is that your code and data should not occupy the
same space. That is, input and output should come from files that are being read
or written to. Think of it this way: opening a script and editing blocks of data
every time you want to use a different set of input is cumbersome, messy, and
allows for the accidental introduction of errors. The caveat is that using files
when all you have are a few words or a line or two is overkill, not to mention a
hassle. Enter here documents.
Here documents are perfect for embedding small amounts of data into a
script.
What happens is the shell reads the << operator and understands it is to
take the following text/characters as input up to the specified delimiter, which
ends the input steam and sends it to stdin of the issued command.
Note that the delimiter cannot contain whitespace. EOF is commonly used,
and stands for End Of File.
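A small example, combining a here document with an output redirect and a parameter:

```shell
cat << EOF > greeting.txt
Hello, $USER!
EOF
# greeting.txt now contains the line with $USER expanded
```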
Two important things occurred above. First, did you notice that you can
include other redirects on the same line as here documents? That is because here
documents are redirects like any other. Second, did you notice that here
documents allow parameter substitution? You can prevent that by quoting your
delimiter.
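For example:

```shell
cat << 'EOF'
Hello, $USER!
EOF
# prints the literal text "Hello, $USER!" because the quoted
# delimiter disables parameter substitution
```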
One final nuance with here documents is that they preserve tabs, and if used
in a script, the delimiter must be at the beginning of the line.
You can tell here documents to ignore tabs by adding a dash, -, to its
operator.
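In sketch form (the indentation below must be actual tab characters, not spaces):

```shell
cat <<- EOF
	This line is indented with a tab.
	EOF
# the leading tabs are stripped from both the data and the delimiter
```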
Notice how it worked on both the delimiter and the data? It is a small
change but it makes a big difference.
The vast majority of the time here documents are used to dump simple
usage documentation to script/program’s user.
Here string
Here strings are very similar to here documents in that they read input
information and send it to a command. Unlike here documents however, they
only work on a single string that immediately follows the here string operator,
<<<.
Remember to quote your string! If you do not, everything past the first
whitespace becomes an argument to your command.
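For example:

```shell
wc -w <<< "one two three"     # quoted: one string of three words, so wc prints 3
```

Unquoted, only "one" would reach stdin, and "two" and "three" would become (nonexistent) file arguments to wc.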
Because they are so much simpler, and can only contain a single string, here
strings are used much more frequently than here documents.
FIFOs
Expanding on the knowledge of file descriptor manipulation to write and
read from files, a special file called a FIFO is introduced. FIFO stands for First
In, First Out, and unlike a normal file, it stores no actual data. Rather, it serves
to redirect input and output from one program to another. FIFOs are created
using the mkfifo command.
Since FIFO stands for First In, First Out, let’s write some data to it, and
then perform an operation on it at the other end.
Here I took a file containing the eight planets, and redirected it to the
FIFO, only to be locked out of my terminal! What happened was the FIFO
blocked. Remember how FIFOs don’t store any data? They really don’t! All
they do is provide a tunnel—a named pipe—to shuttle it from one place to
another. Until a command tries to read from the FIFO, it will remain blocked
forever.
Opening up another terminal and reading from the FIFO allows us to see
this action.
Did you notice how both commands completed at the same time? That’s
because the FIFO realized it was being read from so it unblocked the write
operation and shuttled the data to the read operation. After it finished, it had no
more work to do and closed.
If you don’t want to use multiple terminals, you can still use FIFOs by
simply sending one of the processes—it doesn’t matter which—to the
background by using the ampersand, &. Once in the background, it blocks,
waiting until the FIFO is accessed by a corresponding read or write operation.
To demonstrate, here are the same commands as above, performed in two
different methods: writing first and reading first. Notice that they both work
equally well.
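A sketch of both orderings, using a hypothetical planets.txt and a FIFO named planets.fifo:

```shell
mkfifo planets.fifo

# Method 1: write first, sending the (blocking) write to the background
cat planets.txt > planets.fifo &
grep Mars planets.fifo
wait

# Method 2: read first, sending the (blocking) read to the background
grep Mars planets.fifo &
cat planets.txt > planets.fifo
wait

rm planets.fifo
```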
Though they perform their task admirably, FIFOs are a pain to manage. If a
FIFO is blocked by a read operation, it will remain blocked until it receives a
write operation, or vice versa. You run the risk of having no tasks complete, and
even then, potential task mismatch (perhaps because one failed due to a bug,
bad syntax, etc., and thus skewed the ones following it). Additionally, FIFOs
have file permissions and ownerships just like any other file, require creation and
deletion, and a lot of typing to get them to work.
The much more user friendly alternative is a pipe.
Pipes
Similar to FIFOs, pipes connect the stdout of one command to the stdin of
another, and all without the hassle of dealing with FIFOs. Pipes are created with
a vertical bar, |. Our same planet searching example from the previous section
is now accomplished quite a bit more easily.
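Assuming the same hypothetical planets.txt:

```shell
cat planets.txt | grep Mars    # cat's stdout becomes grep's stdin
```

(For a plain file, grep could of course read the file directly; the pipe is the point here.)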
One final thing to note is that all FIFOs and pipes do is shuttle data from
one place to another. They do not act on it. Thus, parameter substitution does
not work.
The only way in which parameter expansion works is if you use a pipe and
an inline group (remember those?). Note however, that pipes spawn a child shell
each time called, so any parameter modifications made inside the subshell have
no effect on their parent.
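A sketch showing the subshell behavior:

```shell
planet="none"
echo "Mars" | { read -r planet; echo "inside the group: $planet"; }
echo "after the pipe: $planet"    # still "none": the group ran in a subshell
```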
Process substitution
While stringing multiple piped commands to one another in a row certainly
works, each new pipe makes it that much more difficult to follow and
troubleshoot. Enter process substitution. It essentially allows you to pipe the
stdout of multiple commands to another command. Input substitution is
accomplished via the <() syntax, while output substitution is accomplished with
>().
Way back in the guide, command groups were covered. Commands within a
command group are executed in a subshell, where any variable assignments are
temporary.
A quick example would be if you wanted to change directories before
running a single command and not have the rest of your script affected.
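For example:

```shell
( cd /tmp && ls > /dev/null )    # the directory change happens in a subshell
pwd                              # the parent shell is still where it was
```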
Working a bit like FIFOs, and a bit like pipes, process substitution uses
Bash created and handled named pipes to redirect input and output, saving you
from managing the special files that come with FIFOs.
The first example prints the number of lines (which corresponds to words,
in this case) in the “American English” dictionary. The second prints the
number of lines (words) that contain the partial word, “color.” In each case, the
output is followed by the path to the Bash-created file descriptor(s) used by the
process substitution (in this case, /dev/fd/63).
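The dictionary files live at distribution-dependent paths, so here is the same idea sketched with generated data; note the /dev/fd path that wc reports (the exact number may vary):

```shell
wc -l <(printf 'alpha\nbeta\ngamma\n')
# e.g.: 3 /dev/fd/63
```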
Process substitution is not limited to one substitution. diff requires two files
as input, which is easily accomplished in the following example that compares
the “American English” and “British English” dictionaries.
Pipes and other redirections still work like normal:
Output redirection works in the same way. Arguably the most common use
is with the utility tee, which duplicates stdout. Here the contents of /usr/bin are
listed, their output is sent to stdout, duplicated by tee, then grep’ed for apt and dpkg,
which are both written to corresponding file lists.
Another example is to tar a directory (perhaps containing a large number of
small files), send that tar archive to another computer (venus.mydomain.com)
via ssh where it is untarred in place.
Read
Although not a file descriptor, redirector, pipe, or anything like that, the
built-in read is an important part of reading data from stdin, files, or file
descriptors into a variable.
There are a couple of standard usages. Catching user input is the first.
Two useful switches to the read builtin are -p, which prompts the user for an
entry, and -s, which suppresses the onscreen echo of that entry. It is critical to note that all
variables are unencrypted, and therefore unsafe for any task that contains even
remotely sensitive material. Obviously you should never display a password as
done in the below example.
Since read takes in a string, it is possible to split the string based on your IFS
settings and store them in individual variables (or arrays, with the -a array_name
option).
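For instance, splitting a line in /etc/passwd format on colons:

```shell
line="root:x:0:0:root:/root:/bin/bash"
IFS=: read -r user pass uid gid rest <<< "$line"
echo "$user has UID $uid"        # root has UID 0
```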
Read is not limited to catching user input. It can also be used to read in data
from a file.
As an example:
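A common pattern for reading a file line by line (myfile is an assumed example file):

```shell
while IFS= read -r line; do
    echo "Read: $line"
done < myfile
```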
The most basic function is one whose sole purpose is to inform you it is
such.
However if you place the above function in a script, nothing will happen.
That is because functions need to be called by name to execute.
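Putting the definition and the call together, a sketch might read:

```shell
#!/bin/bash
i_am_a_function() {
    echo "I am a function."
}

i_am_a_function        # the call; prints "I am a function."
```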
Though the function definition must precede its call, functions can appear
anywhere a command group could go. Additionally, once a function is defined,
its call is equivalent to a command.
You could nest functions, but I cannot think of a scenario where that would
be useful.
As mentioned, functions have access to a script’s global variables.
Local variables, on the other hand, are only valid in the function, and will
not overwrite variables of the same name in the caller’s namespace. It is good
practice not to have variables of the same name in a script, though loop-
initialized counters such as i, j, and k are a common exception.
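A short sketch of the distinction:

```shell
#!/bin/bash
count=10                      # global

show_count() {
    local count=5             # valid only inside the function
    echo "inside:  $count"
}

show_count                    # inside:  5
echo "outside: $count"        # outside: 10
```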
Check if user is running the script as root (i.e. checking their euid or uid to
see if it is 0):
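One possible version, using EUID:

```shell
if [ "$EUID" -ne 0 ]; then
    echo "This script must be run as root." >&2
    exit 1
fi
```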
Check if all the characters in a string are alphanumeric (this method works
by removing all alphanumeric characters and seeing if anything is left):
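A sketch of that method, using parameter expansion to delete the alphanumeric characters:

```shell
string="abc123"
if [ -z "${string//[[:alnum:]]/}" ]; then
    echo "all alphanumeric"
else
    echo "contains other characters"
fi
```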
Signals
Name Value Effect Shortcut Trappable?
EXIT 0 Exit
SIGHUP 1 Hangup Yes
SIGINT 2 Interrupt [CTRL][C] Yes
SIGQUIT 3 Quit [CTRL][\] Yes
SIGKILL 9 Kill No
SIGTERM 15 Terminate Yes
SIGCONT 18 Continue Yes
SIGSTOP 19 Stop No
SIGTSTP 20 Terminal Stop [CTRL][Z] Yes
Notes:
· A non-trappable signal cannot be caught, blocked, or ignored. In
other words, the kernel shoots down the process, without giving the
process a chance to terminate gracefully.
· “EXIT” is not really a signal. Rather, whenever a script exits, for
any reason, signaled or not, an EXIT trap is run.
Signals can be sent to a process either by hitting the bound keyboard
shortcut, or by using the kill command.
To kill a process, you should start with the least “dangerous” signal. That is,
you should start with the signal that, if supported, allows the process to end
itself gracefully and clean up any temporary files, process, or streams it has open.
If that does not work, try a stronger signal, such as one asking the program
to kill itself.
If that still has no effect, move on to a signal that kills the process outright.
Be aware that killing a process in this manner orphans all sub processes (which
are “adopted” by init, or PID 1), and leaves any and all open files, streams, or
other items in an incomplete state.
Process management
Process management is a systems administration topic and will not be
covered in depth here. The reason it is mentioned at all is because you will need
to know the basics of how to find a process ID, name, and kill said process
when you start into the section on traps.
To find processes, you can issue the jobs, pidof, pgrep, or ps commands.
First, we create a simple sleep process and send it to the background to practice
finding it.
Let’s start with a simple example. Here we create an infinite loop that
outputs a period once per second. Have a second terminal handy before you
start it and try to kill it using the various methods you learned in the previous
section.
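A reconstruction of such a script, consistent with the discussion that follows (a sketch, not the book's exact listing; signal 9 appears only to show that it cannot actually be trapped):

```shell
#!/bin/bash
trap '' HUP INT QUIT KILL TERM     # ignore these signals (KILL cannot be trapped)

while true; do
    echo -n "."
    sleep 1
done

trap - HUP INT QUIT KILL TERM      # restore default signal handling
```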
There are a number of things to notice in the example above. The first is
that you needed to kill it by issuing,
specifies that the script is to trap the signals, HUP, INT, QUIT, KILL, and
TERM, received while executing the code below the trap command, and completely
ignore them (the single quotation marks, ‘ ’, i.e. run no command). This line
could have also been written as:
Note that the attempt to trap and ignore signal 9, SIGKILL, had no effect.
That is because the signal is untrappable.
Finally, the last line in the script,
tells the script to go back to its default handling of the listed signals.
Returning to the default handling of signals is important to remember! If not
included, all the code below the command where you specify the handling of
trapped signals will respond in the same way whether you want them to or not.
Instead of ignoring signals, let’s catch and process them. In this case, we’ll
create a temporary file, populate it with some data, then issue the command to
read input from the user. In a real life context, read would take the form of a
command or series of commands that processes the data in the temporary file.
(Since this is a learning script, rather than create a runaway process or an infinite
loop that would bog down one or more of your system resources, read is used
instead.)
It is important to understand why we did not simply put the script to sleep.
That is because signals are only handled once the current foreground
process terminates. This is a critical concept!
An example should make this clearer. Try killing the script with the kill -2
command.
Did you see how the sleep 60 command had to complete before the trap
executed?
Three important things to mention here:
· First, the reason the example two pages ago did not seem to have
this problem, despite having a sleep command, is because it only slept for
one second! Chances are you did not notice that small a delay between
your kill command and when the trap kicked in.
· Second, trap - INT was not included at the bottom of the script
because there was no more script to execute, thus no reason to return the
handling of the SIGINT trap for the script back to normal. Trap handlers
are local to the script they are included in only; they are not system-wide.
· And third, [CTRL][C] and kill, pkill, killall, etc. work differently!
While the command line kills send the specified signal to the PID, the
keyboard SIGINT sends it to the process’ group ID, which contains the sleep
command.
If you want your script to not have to wait for the foreground process to
end before handling traps, the process needs to be placed in the background. A
command to wait on the most recent process sent to the background is how
this is accomplished (wait looks for a process status change, such as an incoming
signal). A trap can then clean up this background task before exiting—exactly
what traps were designed to do!
See how now when you issue the command,
the script catches the signal, runs the trap function (recall that function calls
are commands, and thus valid traps!), and kills the background sleep process
before exiting?
One final note will help your programs play well with others’, especially
when dealing with loops. It is good practice to kill your script with the same
signal issued to it. Thus instead of issuing exit 1, what we should have been
doing this whole time is resetting the trap handlers and killing the script with the
same signal we trapped! This allows the caller to see that the script was killed,
and did not, say, exit on an error. Correcting this mistake in the last example
would yield the friendly code as shown below.
Now when we run (and kill) the script, its exit code properly reflects how it
terminated!
I’ll end this chapter with one super helpful script for any Linux user that has
to bring down a service in order to perform a task. Normally, you would have to
notice that the script failed, and then manually restart it, right? What happens if
you had to bring down the network to run a script while remoted in and your
script did not successfully complete and bring it back up? Traps are your answer.
Example Scripts
Common Programming Interview Scripts
A common “weed out” question in a programming interview is to write a
script that performs a certain, simple task which demonstrates knowledge of the
basic principles of programming. This section contains several common such
programming interview questions. Unless noted, all properties are determined
via a brute force search and are not optimized for speed, efficiency, CPU usage,
etc.
Factor
Determine all the factors and other properties of a given number.
First up, read the number from the user. Then, using an iterative for loop,
determine the factors of the number and add them to an array. This is a brute
force search, so by only searching up to number/2, the program's run time is cut
in half (since there are no factors above this limit).
Next, print the properties of the number:
· Print the factors to the screen by running through the factors
array one at a time.
· Calculate and display the sum of the factors.
· Check the number of factors to determine if the number is prime.
· Check the sum of the factors to determine if the number is
perfect.
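A sketch implementing those steps (the prompt wording and output format are assumptions):

```shell
#!/bin/bash
read -p "Enter a number: " number

# Collect the factors (brute force, searching only up to number/2).
factors=()
for (( i = 1; i <= number / 2; i++ )); do
    (( number % i == 0 )) && factors+=( "$i" )
done

# Sum the factors.
sum=0
for f in "${factors[@]}"; do
    (( sum += f ))
done

echo "Factors of $number: ${factors[*]}"
echo "Sum of factors: $sum"

# A prime's only factor at or below number/2 is 1.
(( ${#factors[@]} == 1 )) && echo "$number is prime"

# A perfect number equals the sum of its proper divisors.
(( sum == number )) && echo "$number is perfect"
```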
Sample output:
Factorial
Calculate and display all factorials from 1 until a given number.
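One way to sketch it:

```shell
#!/bin/bash
read -p "Enter a number: " n
fact=1
for (( i = 1; i <= n; i++ )); do
    (( fact *= i ))
    echo "$i! = $fact"
done
```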
Sample output:
Fizz Buzz
Arguably the most common interview question.
Given all the numbers 1 through 100:
· If the number is a multiple of 3, print "Fizz" to the screen
· If the number is a multiple of 5, print "Buzz" to the screen
· If neither of the above is true, print the number to the screen
It is important to realize that there are a few special cases:
· What happens when the number is a multiple of 3 and 5?
· What would happen if you include 0 on the list?
Version 1: Conditional statement
Iterate from 1 to 100, then for each number, check for multiples using if test
statements. Note that the multiple-of-both case must be checked first, else for
multiples of both 3 and 5 the chain will stop after the first true test.
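A sketch of that version; printing "FizzBuzz" for multiples of both is the common convention:

```shell
#!/bin/bash
for (( n = 1; n <= 100; n++ )); do
    if (( n % 3 == 0 && n % 5 == 0 )); then
        echo "FizzBuzz"                  # the special case, checked first
    elif (( n % 3 == 0 )); then
        echo "Fizz"
    elif (( n % 5 == 0 )); then
        echo "Buzz"
    else
        echo "$n"
    fi
done
```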
Maximum value
Given a user-inputted series of numbers, find the maximum value.
This question is simple if you understand the concept behind arrays (which
you can see the basics of in the "Reverse a Sentence" post).
The basic idea is to read the user input into an array with each number in
the input separated into its own element in the array (the read -a numbers_array
line).
Once the numbers are in the array, we want to iterate through each of the
elements and check if the current element is larger than the current maximum
element. If it is, then we want to override the previous maximum with the new
maximum. Note that it is important that the initial maximum value is either not
specified or set as one of the elements in the array at the start. This prevents
errors if the user number list range does not encompass the initial maximum
value.
If you comment out the lines (#echo "Current number/max) near the
bottom, you can watch as the script compared the numbers and sets the
maximum value.
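A sketch along the lines described (prompt wording is an assumption):

```shell
#!/bin/bash
read -p "Enter a series of numbers: " -a numbers_array

max=${numbers_array[0]}                  # seed the maximum with the first element
for number in "${numbers_array[@]}"; do
    #echo "Current number: $number, current max: $max"
    (( number > max )) && max=$number
done
echo "Maximum value: $max"
```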
Sample output:
Minimum value
Given a user-inputted series of numbers, find the minimum value.
This question is simple if you understand the concept behind arrays (which
you can see the basics of in the "Reverse a Sentence" post).
The basic idea is to read the user input into an array with each number in
the input separated into its own element in the array (the read -a numbers_array
line).
Once the numbers are in the array, we want to iterate through each of the
elements and check if the current element is smaller than the current minimum
element. If it is, then we want to override the previous minimum with the new
minimum. Note that it is important that the initial minimum value is either not
specified or set as one of the elements in the array at the start. This prevents
errors if the user number list range does not encompass the initial minimum
value.
If you comment out the lines (#echo "Current number/min) near the
bottom, you can watch as the script compared the numbers and sets the
minimum value.
Sample output:
Power towers
Calculate and display all numbers from a given base to its raised power.
Sample output:
Then create the list of integers using seq (I use sort to ensure later
commands accept the data correctly).
Step 2/3
Initially, let p equal 2, the first prime number. Then, starting at the number
2, count up in increments of 2 and mark those numbers off the list.
To do this, I started at 3, then incremented by 2 until the upper_search_limit
was reached, creating a list of all odd numbers, to which I added the number
2. I actually didn't mark them off the list yet, but will only check these numbers
to create new lists later (thus avoiding creating a list for each and every even
number, since no even number can be a prime).
Remember that the only numbers crossed off the list are the ones that start
after the initial "seed" number, hence 2 and 3 both stay in the passes below (and
the reason for the square ( $i * $i ) in the code above — study the Wikipedia
page, especially the animation on the right to see that the first number crossed
off for each new number is always the square of the "seed" number).
Also remember that I used sort to ensure the lists were handled properly by
comm.
What happens:
Note that while the code has found all prime numbers from 2 to our upper
search limit in 2 passes, it still will run through the loop 8 more times for the
"seed" numbers, 5, 7, 9, 11, 13, 15, 17, 19! While this is horribly inefficient for
such a small scope, ensuring, say, multiples of 5, 7 and 9 are crossed off is
essential for checking primes up to 100, and so on for higher search limits.
Step 5
Display the results
This is pretty self-explanatory.
Script
Now for the final code itself.
Output
And, voilà!, all the primes up to the search limit (100 in this case).
Processing time
If you wrap the script in a time command ( time {..} ), you can see how long
it takes to sieve for each prime list to the upper search limit.
Remainder
Given a numerator and denominator, calculate the remainder.
Modulus method
This question is trivial if you use the modulus (%) operator. Simply read the
user input, calculate and display the modulus and you are finished.
I added one simple check to ensure that the denominator is not equal to
zero so that the division can occur without error.
The final line simply shows the arithmetic expression (i.e. a = q*d + r ).
One note: when calculating the remainder for negative numbers there are
two possible answers, which I did not bother to code.
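As a sketch of the idea (the function name and argument-based interface are my own, not the book's):

```shell
#!/bin/bash
# remainder NUMERATOR DENOMINATOR
# Prints the remainder and the identity a = q*d + r.
remainder() {
    local a=$1 d=$2
    if (( d == 0 )); then
        echo "The denominator cannot be zero." >&2
        return 1
    fi
    echo "$a % $d = $(( a % d ))   (check: $a = $(( a / d ))*$d + $(( a % d )))"
}

remainder 17 5    # 17 % 5 = 2   (check: 17 = 3*5 + 2)
```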
Formula method
If for some reason you don't (or can't) use the modulus operator, you can
always fall back on the arithmetic expression to calculate the remainder.
The method below exploits the way Bash does math: it only does integer
math unless otherwise specified. Thus, 9/5=1, not 1.8. Using this, it is easy to
write an expression that calculates the remainder by simply subtracting the
largest whole multiple.
Again, this code suffers from the same issue as the one above: it only
calculates one of the two possible remainders for negative numbers, and its only
error check is that the denominator is not 0.
Sample output:
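A sketch of that fallback (again with an illustrative function name): since Bash truncates integer division, (a / d) * d is the largest whole multiple of d, and subtracting it leaves the remainder.

```shell
#!/bin/bash
remainder_formula() {
    local a=$1 d=$2
    if (( d == 0 )); then
        echo "The denominator cannot be zero." >&2
        return 1
    fi
    # 9 / 5 is 1 in integer math, so 9 - (9 / 5) * 5 = 9 - 5 = 4
    echo $(( a - (a / d) * d ))
}

remainder_formula 9 5     # 4
remainder_formula 17 5    # 2
```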
Reverse a sentence
Given a sentence as input, reverse the words but not their individual letters.
Thus, the sentence "One two three" would result in "three two One".
Using an array is the only sensible way to accomplish this. If you don't, you'll
have a headache of a time scanning through for whitespace, storing the
characters until the next whitespace and repeating until the end of the sentence
where you display it onscreen.
In my solution, I read in a string passed to the script into an array (the -a
sentence_array part) as arguments separated by whitespace (i.e. a standard
sentence). Note that I don't need to specify IFS=' ' since by default, IFS
separates by whitespace (spaces, tabs and newlines).
Using a for loop, I simply iterate through the array from the last element
to the first. Since all whitespace is used as a delimiter, the array will be full (i.e.
not sparse) and will display with only one space between the words when
echoed back.
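A sketch of the approach (the input sentence is hard-coded here for illustration; the book reads it in from the script's input):

```shell
#!/bin/bash
sentence="One two three"

# Word-split the sentence into an array on the default IFS.
read -ra sentence_array <<< "$sentence"

# Walk the array from the last element back to the first.
reversed=""
for (( i = ${#sentence_array[@]} - 1; i >= 0; i-- )); do
    reversed+="${sentence_array[i]} "
done
echo "${reversed% }"    # three two One
```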
Sample output:
Reverse a string
Given a random string as input, write a script which reverses it character-
for-character.
Cheating
For loop
While loop
Sample output:
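Sketches of the three approaches named above (assuming the rev utility is available for the first):

```shell
#!/bin/bash
str="Hello"

# Cheating: hand the job to the rev utility.
rev <<< "$str"                  # olleH

# For loop: index the string back to front with ${str:i:1}.
reversed=""
for (( i = ${#str} - 1; i >= 0; i-- )); do
    reversed+="${str:i:1}"
done
echo "$reversed"                # olleH

# While loop: peel one character off the end each pass.
reversed2=""
copy=$str
while [ -n "$copy" ]; do
    reversed2+="${copy: -1}"    # last character (note the space)
    copy=${copy%?}              # drop it
done
echo "$reversed2"               # olleH
```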
Sierpinski
Calculate and display all iterations of the Sierpinski formula from 1 to the
given iteration.
Sample output:
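The book's listing is not reproduced here, but one common Bash approach (an assumption on my part about the intended formula) draws the Sierpinski triangle via Pascal's triangle mod 2: the binomial coefficient C(i, j) is odd exactly when i AND j equals j.

```shell
#!/bin/bash
iterations=3
rows=$(( 1 << iterations ))    # 2^iterations rows

triangle=""
for (( i = 0; i < rows; i++ )); do
    printf -v pad '%*s' $(( rows - i )) ''    # leading spaces centre the row
    row=$pad
    for (( j = 0; j <= i; j++ )); do
        # C(i, j) is odd exactly when (i & j) == j
        (( (i & j) == j )) && row+='* ' || row+='  '
    done
    triangle+=$row$'\n'
done
printf '%s' "$triangle"
```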
Quick Reference
The purpose of this section is to provide a rapid way to look up the basic
Bash syntax covered in this guide.
Basics
Prompts
Comments
Arguments
The command and everything following are subject to word splitting.
Aliases
Functions
Help
Man pages
Help pages
Whatis
Apropos
Variables
Strings
Expansion
Integer
Read only
Shell
Variable Description
$BASH The path to Bash’s binary.
$BASH_VERSION Bash’s version.
$BASH_VERSINFO An array of version information; element
0 holds Bash’s major version.
$EDITOR The environment’s default editor.
$EUID The current user’s effective user ID.
$HOME The current user’s home directory path.
$HOSTNAME The machine’s hostname.
$HOSTTYPE The machine’s hosttype.
$IFS The IFS setting. Blank if set as default.
$MACHTYPE The machine type.
$OLDPWD The previous directory you were in.
$OSTYPE The operating system type.
$PATH The command search path.
$PIPESTATUS An array of exit statuses from the last
completed pipeline.
$PPID The process ID of the current shell’s parent
process.
$PS1, …, $PS4 The shell’s (prompts 1 through 4) format.
$PWD The current working directory.
$RANDOM A pseudorandom integer between 0 and 2^15 - 1.
$SECONDS How many seconds the script has been
executing.
$UID The current user’s ID.
Environment
Variable Description
BASH_VERSION Bash’s version number.
BROWSER The system’s default web browser.
DISPLAY The hostname, display and screen number
for graphical application display.
EDITOR The default file editor. VISUAL falls back to
this if it fails.
HOSTNAME The system’s name.
HISTFILE The path to the file where command history is
saved.
HOME The current user’s home directory.
IFS The Internal Field Separator’s global setting.
LANG The language settings used by system
applications.
LOGNAME The currently logged in user.
MAIL The storage location for system mail.
MANPATH The storage location for system manpages.
PAGER The utility used to display long text output.
PATH The colon separated search path for
commands.
PS1 The prompt settings.
PWD The current working directory.
SHELL The path to the current user’s shell program.
TERM The current terminal emulation type.
TZ The timezone used by the system clock.
USER The currently logged in user.
VISUAL The default GUI file editor.
Arrays, indexed
Structure
array
key value
0 value1
1 value2
... ...
N valueN
Declaration
Usage Description
array=() Create array
Declares an empty array called array.
array[0]=value Create & set array key
Declares array, array, and sets its first value
(index 0) to value.
declare -a array Create array
Declares an empty array, array.
Storing values
Usage Description
array[i]=value Set array value
Sets the ith key of array to value.
array=(value1 value2 . . . Set entire array
valueN) Sets the entire array, array, to any
number of values, value1, value2, . . .
valueN, separated by whitespace (not
commas) and indexed in order, from zero.
Any previously set array values are lost
unless the array is appended with +=
array+=(value1 value2 . . . Append array
valueN) Appends the existing array, array,
with the values value1, value2, . . . valueN.
array=([i]=value1 Compound set array values
[j]=value2 . . . [k]=valueN) Sets the indexed keys, i, j, and k, to
value1, value2, . . . valueN respectively. Any
previously set array values are lost.
Retrieving values
Usage Description
${array[i]} Expand value
Expands to the ith value of indexed
array, array. Negative numbers count
backwards from the highest indexed key.
${array[@]} Mass expand values
“${array[@]}” Expands to all values in the array. If
double quoted, it expands to all values in
the array individually quoted.
${array[*]} Mass expand values
“${array[*]}” Expands to all values in the array. If
double quoted, it expands to all values in
the array quoted as a whole.
${array[@]:offset:num} Subarray expansion
“${array[@]:offset:num}” Expands to num of the array’s values,
beginning at offset. The same quoting rules
as above apply.
${array[*]:offset:num} Subarray expansion
“${array[*]:offset:num}” Expands to num of the array’s values,
beginning at offset. The same quoting rules
as above apply.
Metadata
Usage Description
${#array[i]} Value string length
Expands to the string length of the ith array value
${#array[@]} Array values
${#array[*]} Expands to the number of values in the array.
${!array[@]} Array indexes
${!array[*]} Expands to the indexes in the array.
Deletion
Usage Description
unset -v array Erase an array
unset -v Completely erases the array, array.
array[@]
unset -v array[*]
unset -v array[i] Erase an array value
Erases the ith array value from array.
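The forms above in action (values are illustrative; the negative index needs Bash 4.3 or later):

```shell
#!/bin/bash
array=(alpha bravo charlie delta)

echo "${array[1]}"       # bravo
echo "${array[-1]}"      # delta (counts back from the end)
echo "${#array[@]}"      # 4
echo "${array[@]:1:2}"   # bravo charlie

array+=(echo)            # append a fifth value at index 4
unset -v 'array[0]'      # erase one value; the array is now sparse
echo "${!array[@]}"      # 1 2 3 4
```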
Arrays, associative
Structure
array
string label value
string1 value1
otherstring2 value2
... ...
anotherstring3 valueN
Declaration
Usage Description
declare -A array Create array
Declares an empty array, array.
array[str]=value Create & set array key
Declares array, array, and sets the element
labeled by string str to value.
Storing values
Usage Description
array[str]=value Set array value
Sets the element indexed by str of array to
value.
array=([str1]=value1 Compound set array values
[str2]=value2 . . . Sets the elements indexed by strings str1,
[str3]=valueN) str2, and str3 to value1, value2, . . . valueN
respectively. Elements are separated by
whitespace, not commas. Any previously set
array values are lost.
Retrieving values
Usage Description
${array[str]} Expand value
Expands to the element indexed by string
str of array, array.
${array[@]} Mass expand values
“${array[@]}” Expands to all values in the array. If
double quoted, it expands to all values in the
array individually quoted. Output is in
random order.
${array[*]} Mass expand values
“${array[*]}” Expands to all values in the array. If
double quoted, it expands to all values in the
array, quoted as a whole. Output is in random
order.
Metadata
Usage Description
${#array[str]} Value string length
Expands to the string length of the
element indexed by str
${#array[@]} Array values
${#array[*]} Expands to the number of values in the
array.
${!array[@]} Array indexes
${!array[*]} Expands to the string labels in the array.
Deletion
Usage Description
unset -v array Erase an array
unset -v array[@] Completely erases the array, array.
unset -v array[*]
unset -v array[str] Erase an array value
Erases the element indexed by str from
array.
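Likewise for associative arrays (keys and values are illustrative; note that the compound assignment requires the declare -A first, and elements are separated by whitespace):

```shell
#!/bin/bash
declare -A color
color=([apple]=red [banana]=yellow [lime]=green)
color[sky]=blue              # add one more element

echo "${color[banana]}"      # yellow
echo "${#color[@]}"          # 4

unset -v 'color[sky]'        # erase a single element
echo "${#color[@]}"          # 3
```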
Special characters
Basic
Character Description
# Comment
Lines beginning with a hash will not be executed.
" " Whitespace
Bash uses whitespace (e.g. spaces, tabs, and newlines)
to perform word splitting.
& Run in background
Causes the preceding command to run in the
background.
; Command separator
Allows for the placement of another command on the
same line.
Logic
Character Description
&& And
Logical and operator. Returns a success if both of the
conditions before and after the operator are true.
|| Or
Logical or operator. Returns a success if either of the
conditions before or after the operator are true.
Directory traversal
Character Description
~ Home directory
Represents the home directory, and the current user’s
home directory when followed by a forward slash.
. Current directory
A dot in front of a filename makes it “hidden.” Use
ls -a to view.
A dot directory represents the current working
directory.
A dot separated from a filename by a space sources
(loads) the file.
.. Parent directory
A double dot directory represents the parent directory.
/ Filename separator
A forward slash separates the components of a
filename.
Quoting
Character Description
\ Escape
Escapes the following special character and causes it to
be treated literally.
Allows for line splitting of a long command.
'' Full quoting
Special characters within single quotes lose their
meaning and become literal.
Word splitting is not performed on the contents.
"" Partial quoting
Special characters within double quotes lose their
meaning, with the notable exception of parameter
expansion, arithmetic expansion, and command
substitution.
Word splitting is not performed on the contents.
Redirection
Character Description
> Output redirection
Redirects standard output, typically to a file.
>> Output redirection
Redirects and appends standard output, typically to a
file.
< Input redirection
Redirects standard input, typically for reading in a file’s
contents.
<< Here document
Reads in strings of data until a specified delimiter is
encountered.
<<< Here string
Reads in a single string of data immediately following.
| Pipe
Redirects the standard output of a command to the
standard input of another command.
Groups
Character Description
{} Inline group
Commands within the curly braces are treated as a
single command. Essentially a nameless function without
the ability to assign or use local variables. The final
command in an inline group must be terminated with a
semicolon.
() Command group
Commands within are executed as a subshell. Variables
inside the subshell are not visible to the rest of the script.
(()) Arithmetic expression
Within the expression, mathematical operators (+, -, *,
/, >, <, etc.) take on their mathematical meanings (addition,
subtraction, multiplication, division). Used for assigning
variable values and in tests.
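A quick illustration of the three groupings:

```shell
#!/bin/bash
# Inline group: runs in the current shell (note the spaces inside the
# braces and the final semicolon), so the assignment survives.
{ x=5; echo "x is $x"; }     # x is 5

# Command group: runs in a subshell, so the assignment does not survive.
( y=7 )
echo "y is ${y:-unset}"      # y is unset

# Arithmetic expression: C-style math, usable in assignments and tests.
(( z = x * 2 ))
echo "z is $z"               # z is 10
```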
Globs
Character Description
? Wildcard (character)
Serves as a match for a single character.
* Wildcard (string)
Serves as a match for any number of characters.
[…] Wildcard (list)
Serves as a match for a list of characters or
ranges.
Null globs
Extended globs
Glob Description
?(list) Matches exactly zero or one of the listed patterns.
*(list) Matches any number of the listed patterns.
@(list) Matches exactly one of the listed patterns.
+(list) Matches at least one of the listed patterns.
!(list) Matches anything but the listed patterns.
Parameters
Positional
Parameter Description
$0 The called name of the script.
$1, $2, … $9 The arguments 1 through 9 passed to the
script.
${10}, ${11}, . . . The arguments 10 and higher.
Special
Parameter Description
$# The number of positional parameters passed to the
script.
$* All the positional parameters. If double quoted, expands
to a single string containing them all.
$@ All the positional parameters. If double quoted, each
parameter expands as a separate quoted word.
$? The exit code of the last completed foreground
command.
$! The process ID of the last command started in the
background.
$$ The process ID of the current shell.
$_ The last argument of the last completed command.
$- The shell options that are set.
Expansion
Usage Description
${parameter} Append
Allows for additional characters beyond the end
curly brace to be appended to parameter after
substitution.
${#parameter} String length
The number of characters in parameter.
Assigning values
Usage Description
${parameter:=value} Assign default value
${parameter=value} Set parameter to value if parameter is not set,
then substitute parameter.
${parameter:+value} Use alternate value
${parameter+value} Substitute nothing for parameter if parameter
is not set, else substitute value.
${parameter:-value} Use default value
${parameter-value} Substitute value for parameter if parameter is
not set, else substitute parameter.
Substring removal
Usage Description
${parameter:offset:length} Results in a string of length
characters, starting at offset characters. If
length is not specified, takes all
characters after offset. If length is
negative, the result ends that many
characters from the end of parameter.
${parameter#pattern} Deletes the shortest match of
pattern from the front of parameter.
${parameter##pattern} Deletes the longest match of
pattern from the front of parameter.
${parameter%pattern} Deletes the shortest match of
pattern from the back of parameter.
${parameter%%pattern} Deletes the longest match of
pattern from the back of parameter.
Brace expansion
Character Description
{} Brace expansion
Expands the character sequence
within.
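For example:

```shell
#!/bin/bash
echo {1..5}           # 1 2 3 4 5
echo {a..e}           # a b c d e
echo file{A,B}.txt    # fileA.txt fileB.txt
```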
Command substitution
Character Description
`` Command substitution
The command within the backticks is run and its
output substituted in place of the entire backtick
grouping.
$( ) Command substitution
Exactly the same as backticks above, but provides
nesting and better readability.
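Nesting is where $( ) shines:

```shell
#!/bin/bash
# Backticks and $( ) give the same result for a single level:
now=`echo today`
also=$(echo today)

# But $( ) nests without any escaping:
outer=$(echo "inner is $(echo nested)")
echo "$outer"    # inner is nested
```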
Conditional blocks
Exit codes
Exit Code Description Example
0 Successful execution. true
1 Unsuccessful execution catchall. false
2 Incorrect use of a shell builtin ls --option
126 Command cannot execute /dev/null
127 Command not found (typically a toucj
typo)
128 Incorrect exit code argument exit 2.718
128+num Fatal error signal “num” kill -9 pid
# num = 137
kill -15 pid
# num = 143
130 Script killed with Control-C [CTRL]-[c]
255* Exit code out of range exit 666
If
If-else
If-elif-else
If-else nesting
Case
Select
Tests
Types
Conditional Description
! Negate
Negates a test or exit status.
If issued from the command line, invokes Bash’s
history mechanism.
[] Test
The expression between the brackets is tested and
results in a true/false output. This is a shell builtin. If
not concerned with compatibility with the sh shell, the
double square bracket test is much more flexible.
[[ ]] Test
Tests the expression between the brackets but
provides many more options, features, and flexibility
than single square brackets. It is a shell keyword. Tests
will be covered later.
Arithmetic
Operator Description
X -eq Y True if X is equal to Y
X -ne Y True if X is not equal to Y
X -gt Y True if X is greater than Y
X -lt Y True if X is less than Y
X -le Y True if X is less than or equal to Y
X -ge Y True if X is greater than or equal to Y
For comparison, the equivalent operators in bc:
[[ operator bc equivalent
-eq ==
-ne !=
-gt >
-lt <
-le <=
-ge >=
Strings
Operator Description
-z STRING True if STRING is empty
-n STRING True if STRING is not empty
STRING1 = STRING2 True if STRING1 is equal to STRING2
STRING1 == True if STRING1 is equal to STRING2
STRING2
STRING1 != STRING2 True if STRING1 is not equal to
STRING2
STRING1 < STRING2 True if STRING1 sorts
lexicographically before STRING2 (must
be escaped if using [)
STRING1 > STRING2 True if STRING1 sorts
lexicographically after STRING2 (must be
escaped if using [)
Files
Operator Description
-e FILE True if FILE exists
-f FILE True if FILE exists and is a regular file
-d FILE True if FILE exists and is a directory
-c FILE True if FILE exists and is a special character
file
-b FILE True if FILE exists and is a special block file
-p FILE True if FILE exists and is a named pipe
(FIFO—“first in, first out”)
-S FILE True if FILE exists and is a socket file
-h FILE True if FILE exists and is a symbolic link
-O FILE True if FILE exists and is effectively owned
by you
-G FILE True if FILE exists and is effectively owned
by your group
-g FILE True if FILE exists and has SGID set
-u FILE True if FILE exists and has SUID set
-r FILE True if FILE exists and is readable by you
-w FILE True if FILE exists and is writable by you
-x FILE True if FILE exists and is executable by you
-s FILE True if FILE exists and is non-empty (size is >
0)
-t FD True if FD (file descriptor) is opened on a
terminal
FILE1 -nt FILE2 True if FILE1 is newer than FILE2
FILE1 -ot FILE2 True if FILE1 is older than FILE2
FILE1 -ef FILE2 True if FILE1 and FILE2 refer to the same
device and inode numbers
Logical
Operator Description
EXP1 && EXP2 True if both EXP1 and EXP2 are true
(equivalent of -a if using [)
EXP1 || EXP2 True if either EXP1 or EXP2 is true
(equivalent of -o if using [)
Patterns
Operator Description
STRING = PATTERN True if STRING matches PATTERN
STRING == PATTERN True if STRING matches PATTERN
STRING =~ PATTERN True if STRING matches the regular
expression pattern, PATTERN
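Inside [[ ]], an unquoted right-hand side of == is treated as a glob pattern, a quoted one is literal, and =~ takes a regular expression:

```shell
#!/bin/bash
str="script.sh"

[[ $str == *.sh ]]         && echo "glob matched"        # glob matched
[[ $str == "*.sh" ]]       || echo "literal, no match"   # literal, no match
[[ $str =~ ^[a-z]+\.sh$ ]] && echo "regex matched"       # regex matched
```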
Character classes
Class Description
alnum Alphanumeric characters. Equivalent to specifying both
alpha and digit, or [0-9A-Za-z].
alpha Alphabetic characters. Equivalent to [A-Za-z].
ascii ASCII characters.
blank Space and tab.
cntrl Control characters. These are the characters with octal
codes 000 through 037, and 177.
digit Digits. Equivalent to [0-9].
graph Visible characters. Equivalent to alnum and punct.
lower Lowercase characters. Equivalent to [a-z].
print Visible characters and spaces. Equivalent to alnum, punct, and
space.
punct Punctuation characters.
!@#$%^&*(){}[]_+-,.:;<>=\|/'“`~?
space Whitespace characters. Tab, newline, vertical tab, form
feed, carriage return, and space.
upper Uppercase characters. Equivalent to [A-Z].
word Alphanumeric characters plus “_”.
xdigit Hexadecimal digits. Equivalent to [0-9A-Fa-f].
Miscellaneous
Operator Description
! TEST Inverts the result of TEST
( TEST ) Groups a TEST (changes precedence of tests, must be
escaped if using [)
-o OPT True if shell option, OPT, is set
-v VAR True if variable, VAR, is set
Loops
While
Until
For
File descriptors
Standard streams
Name Stream Default File Descriptor
Standard input stdin Keyboard 0
Standard output stdout Screen 1
Standard error stderr Screen 2
Redirection
stdout of command to file:
Duplication
stdout duplicated and placed in stderr:
Redirection
Write
Append
Read
Here document
Here string
FIFO
PIPE
Process substitution
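Sketches of the main forms, using a throwaway temp file:

```shell
#!/bin/bash
tmp=$(mktemp)

echo "first"  > "$tmp"      # write (truncates the file)
echo "second" >> "$tmp"     # append
read -r line < "$tmp"       # read; grabs the first line
echo "$line"                # first

# Duplication: 2>&1 sends stderr to wherever stdout currently points.
ls /no/such/path > /dev/null 2>&1 || echo "error hidden"

# Here string and process substitution:
wc -w <<< "three little words"       # 3
diff <(echo same) <(echo same) && echo "identical"

rm -f "$tmp"
```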
Traps
Signals
Name Value Effect Shortcut Trappable?
EXIT 0 Exit Yes
SIGHUP 1 Hangup Yes
SIGINT 2 Interrupt [CTRL]-[C] Yes
SIGQUIT 3 Quit [CTRL]-[\] Yes
SIGKILL 9 Kill No
SIGTERM 15 Terminate Yes
SIGCONT 18 Continue Yes
SIGSTOP 19 Stop No
SIGTSTP 20 Terminal stop [CTRL]-[Z] Yes
Set
Reset
Ignore
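Setting, ignoring, and resetting a trap (the cleanup handler is illustrative):

```shell
#!/bin/bash
tmpfile=$(mktemp)

# Set: run cleanup on exit, and a custom handler on [CTRL]-[C].
cleanup() { rm -f "$tmpfile"; }
trap cleanup EXIT
trap 'echo "interrupted" >&2; exit 130' INT

# Ignore: an empty command string discards the signal entirely.
trap '' QUIT

# Reset: a lone dash restores the default action.
trap - QUIT
```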
A Seattle transplant who was born and raised in Minnesota, Joe has worked
in various I.T. roles for the past decade. An avid outdoorsman, you can usually
find him and his black lab, Ranger, exploring nature and wilderness areas.