Bash Cheat Sheet by Tomi Mester
I originally created this cheat sheet for the participants of my 6-week online data
science course. I have since decided to open-source it and make it available to
everyone who wants to learn bash and the command line.
It's designed to give you a meaningful structure but also to let you add your own
notes (that's why the empty boxes are there). It starts from the absolute basics
(cd ..) and includes everything that you will need as a junior data analyst/
scientist (commands, scripting, automations, etc.).
The ideal use case of this cheat sheet is that you print it in color and keep it next to
you while you are learning and practicing Bash on your computer.
Enjoy!
Cheers,
Tomi Mester
THE PROMPT
When working in the command line, at the beginning of every line you will see the
prompt.
Example:
tomi@data36-learn-server:~$
[your notes]
ls
Lists all the content of your current working directory.
cd
You move from your current working directory to your user's home directory.
Note: cd stands for change directory.
cd test_dir
You move to the test_dir directory - if test_dir exists in your current working
directory.
cd ..
Moves up one folder.
cd -
Moves to the folder where you previously were.
mkdir one_more_dir
Creates a new directory called one_more_dir.
cp test_file.txt test_file_copy.txt
Copies your original test_file.txt and names the new copy test_file_copy.txt.
cp test_file.txt one_more_dir/test_file.txt
Copies your original test_file.txt into the one_more_dir directory - and keeps its
original name. (Only works if one_more_dir is in your current working directory.)
mv test_file.txt one_more_dir/test_file.txt
Moves your original test_file.txt into the one_more_dir directory.
(Only works if one_more_dir is in your current working directory.)
mv test_file.txt test_file_some_new_name.txt
Renames your original test_file.txt to test_file_some_new_name.txt.
rm test_file.txt
Removes the test_file.txt file - if it exists.
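The file commands above can be tried safely in a throwaway directory. This is a minimal sketch, assuming mktemp is available on your system. The sandbox variable is hypothetical, and the > redirection is only used here to create a small file to play with.

```shell
# Work in a scratch directory so nothing important can be lost.
sandbox=$(mktemp -d)                          # hypothetical scratch location
cd "$sandbox"

echo "some text" > test_file.txt              # create a small file to practice on
mkdir one_more_dir                            # create a new directory
cp test_file.txt test_file_copy.txt           # copy under a new name
cp test_file.txt one_more_dir/test_file.txt   # copy into the directory
mv test_file.txt test_file_some_new_name.txt  # rename the original
rm test_file_copy.txt                         # delete the copy for good

ls                # shows one_more_dir and test_file_some_new_name.txt
ls one_more_dir   # shows test_file.txt
```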
Note: Be careful! In bash, if you delete a file, it will be deleted for good. There's no
"Trash" folder or "Restore" button in case you remove something by accident.
#2
You can use the up (↑) and down (↓) arrows on your keyboard to bring back your
previous commands.
#3
history
Brings back all your previously typed-in commands and prints them to your screen.
#4
Hit the TAB key (→) on your keyboard to auto-complete your typed-in text (e.g.
commands or file names).
PRINTING
echo "Hello, World!"
Prints Hello, World! to your screen.
cat some_data.csv
Prints the entire contents of the some_data.csv file to your screen.
head some_data.csv
Prints the first ten rows of the some_data.csv file to your screen.
tail some_data.csv
Prints the last ten rows of the some_data.csv file to your screen.
COUNTING
wc some_data.csv
Counts the number of lines, words and bytes in some_data.csv - and prints
this information to the screen.
wc -l some_data.csv
Counts only the number of lines in some_data.csv. (Note: -l stands for line.)
wc -w some_data.csv
Counts only the number of words in some_data.csv. (Note: -w stands for word.)
wc -c some_data.csv
Counts only the number of bytes in some_data.csv. (Note: for plain ASCII text this
equals the number of characters. Use wc -m to count characters.)
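To see head, tail and wc in action without a real data file, you can generate one. This is a small sketch, assuming seq and mktemp are available. The data variable and the 25-line file are made up for illustration.

```shell
data=$(mktemp)                 # hypothetical scratch file
seq 1 25 > "$data"             # fill it with 25 lines: 1, 2, ..., 25

first=$(head "$data")          # first ten lines (1 to 10)
last=$(tail "$data")           # last ten lines (16 to 25)
lines=$(wc -l < "$data")       # reading via < makes wc print only the number
lines=$((lines))               # strip the padding that some systems add

echo "$lines"                  # prints 25
```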
OPTIONS
By using an option, you can modify what your original command does. Most
commands have plenty of options. You can apply them by:
1. typing your original command (e.g. wc)
2. then a space and a dash ( -)
3. then the option name (e.g. w)
General syntax:
command -option
A few examples:
wc -w some_data.csv
On its own, wc counts the number of lines, words and bytes in some_data.csv.
But with the -w option, it only counts words.
ls -l
ls lists all the content of your current working directory.
But with the -l option, it also displays additional information for the given files and
directories (e.g. size, date of last modification, owner, permissions).
man wc
Opens the manual for the wc command. There you'll find a general description of
the command and the descriptions for all the options it has.
When you are finished, you can exit the manual by hitting the Q key on your
keyboard.
PRINTING TO FILE
In complex data projects, you don't want to print results to your terminal screen.
You want to save them into files, so you can reuse them later.
Examples:
head some_data.csv > first_ten_rows.csv
Takes the first ten rows of the some_data.csv file and prints them into the file
called first_ten_rows.csv. (If the first_ten_rows.csv file didn't exist before, it will be
created. If it did exist, then it will be overwritten.)
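The overwrite behavior described above is easy to check on a scratch file. The sketch below also uses >>, which is not covered in this cheat sheet: it appends to a file instead of overwriting it. The out variable is hypothetical.

```shell
out=$(mktemp)                  # hypothetical output file
echo "first run"  > "$out"     # creates the file (or overwrites it)
echo "second run" > "$out"     # overwrites: only "second run" remains
echo "third run" >> "$out"     # >> appends instead of overwriting

cat "$out"
# prints:
# second run
# third run
```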
FILTERING
cut -d';' -f1 some_data.csv
Instead of the whole file, it prints only the first column of the some_data.csv file.
The field separator (aka delimiter) between the columns is a semicolon (;).
sort -n some_data.csv
Sorts the some_data.csv file by numerical order.
sort -r some_data.csv
Sorts the some_data.csv file by alphabetical order (by default) and in reverse.
sort -r -n some_data.csv
Sorts the some_data.csv file by numerical order and in reverse.
sort -u some_data.csv
Sorts the some_data.csv file and removes duplicates. (Note: -u stands for unique.)
uniq some_data.csv
Collapses repeated lines that follow each other into a single line.
Note: sort -u removes all duplicated rows. uniq removes only those duplicated
rows that follow each other.
uniq -c some_data.csv
Collapses repeated lines that follow each other into a single line and counts the
number of occurrences.
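The difference between uniq and sort -u shows up clearly on a tiny file. This is a sketch with made-up, semicolon-separated data. The variable names are hypothetical.

```shell
data=$(mktemp)                                  # hypothetical scratch file
printf '%s\n' 'b;2' 'a;1' 'b;2' 'b;2' 'a;1' > "$data"

first_col=$(cut -d';' -f1 "$data")   # b, a, b, b, a (one per line)
adjacent=$(uniq "$data")             # b;2, a;1, b;2, a;1 (only the neighboring b;2 pair is merged)
all_unique=$(sort -u "$data")        # a;1, b;2 (every duplicate is gone)

echo "$adjacent"
echo "$all_unique"
```

Because uniq only looks at neighboring lines, it is usually run on sorted input.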
To get things done more simply and cleanly, you can chain commands with pipes.
A pipe takes the output of your command and redirects it directly into your next
command.
Examples:
head -50 some_data.csv | tail -10
Takes the first 50 rows of the some_data.csv file. The pipe redirects this output into
tail, which prints the last 10 rows of those first 50 rows. As a result, you will have
the rows from 41 to 50.
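Pipes can chain more than two commands. The sketch below first repeats the rows 41 to 50 example on generated data, then counts how often each value occurs in a made-up list. Here head -n 50 is the explicit spelling of head -50, and all variable names are hypothetical.

```shell
data=$(mktemp)                                   # hypothetical scratch file
seq 1 100 > "$data"                              # 100 lines: 1 ... 100

rows=$(head -n 50 "$data" | tail -n 10)          # lines 41 to 50

# Frequency count: sort so equal values are neighbors, count them,
# sort the counts in reverse numerical order, keep the top line.
top=$(printf '%s\n' b a b c b a | sort | uniq -c | sort -r -n | head -n 1)

echo "$rows"
echo "$top"                                      # the most frequent value: b, seen 3 times
```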
BASH SCRIPTING
mcedit
Opens a text editor in bash.
(If you haven't set up mcedit yet, go here: https://data36.com/server-setup)
In the editor, you can create a bash script by simply adding commands listed one
after another (the same commands that you would type on the command line).
When you run this script, the commands in it will be executed one by one.
You can create various bash scripts (even hundreds of lines long) and even Python
or SQL scripts the same way.
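As an illustration, here is one possible three-command script and a way to run it. This is a sketch, not the example from the original course material: the file name demo_script.sh and its contents are assumptions, and the file is written with a here-document so the example is self-contained (in practice you would type the lines in your editor).

```shell
workdir=$(mktemp -d)             # hypothetical scratch directory
cd "$workdir"

# Write the script file; each line is a command you could also type by hand.
cat > demo_script.sh <<'EOF'
#!/usr/bin/env bash
echo "Script started"
date +%Y-%m-%d
echo "Script finished"
EOF

output=$(bash demo_script.sh)    # run the script and capture what it prints
echo "$output"
```

You can also make the file executable with chmod +x demo_script.sh and then start it as ./demo_script.sh.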
AUTOMATIONS
You can schedule your scripts (and your bash commands) to run automatically on
your remote server every day, every hour, even every minute.
To edit your schedule, run crontab -e. It will open a text file which starts with a
detailed manual of how it works.
After the manual, you can add your actual tasks to schedule.
One line is one task and you have to define 6 parameters in each line:
• The minute of the hour
• The hour of the day
• The day of the month
• The month
• The day of the week
• The command you want to run.
The first five parameters define the when - the 6th defines the what.
Example:
01 00 * * * /home/demo_user/demo_script.sh
If you have this line in your crontab, it will run your demo_script.sh script (that's
located in the /home/demo_user/ folder) every day, one minute after midnight.
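One way to see how the six parameters line up is to split the example line into named pieces. This sketch only takes the line apart with read and does not schedule anything. The variable names are hypothetical.

```shell
line='01 00 * * * /home/demo_user/demo_script.sh'

# read splits on whitespace; the last variable receives the rest of the line.
read -r minute hour day_of_month month day_of_week task <<EOF
$line
EOF

echo "minute=$minute hour=$hour"                      # minute=01 hour=00
echo "day_of_month=$day_of_month month=$month"        # both are *, meaning "every"
echo "day_of_week=$day_of_week"                       # also *
echo "task=$task"                                     # the command to run
```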
VARIABLES
In bash, you can assign a value to a variable as simply as:
variable_name=variable_value
Note: If you assign a new value to a variable that you have used before, it will
overwrite your previous value.
Examples:
a=100
b='hello, world'
c=true
d=0.75
You can refer to any of these variables by typing a $ sign and the variable name
itself.
Examples:
echo $a
Prints 100.
echo $b
Prints hello, world.
echo $c
Prints true.
echo $d
Prints 0.75.
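Two details are worth trying out: there must be no spaces around the = sign, and the kind of quotes matters when you use a variable inside text. This is a short sketch. The greeting and literal variables are made up for illustration.

```shell
a=100
a=200                        # assigning again overwrites the previous value
b='hello, world'

# Writing  b = 'hello'  (with spaces) would try to run a command called b.
greeting="Message: $b"       # double quotes: $b is replaced by its value
literal='Message: $b'        # single quotes: the text is kept as typed

echo "$a"                    # prints 200
echo "$greeting"             # prints Message: hello, world
echo "$literal"              # prints Message: $b
```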
IF STATEMENTS
If statements are great for evaluating a condition and taking certain action(s)
based on the result.
Example:
a=10
b=20
if [[ $a == $b ]]
then
  echo 'yes'
else
  echo 'no'
fi
Prints no, because 10 and 20 are not equal.
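For comparing numbers, the test operators -lt (less than), -eq (equal) and -gt (greater than) are often more suitable than ==, which compares text. The sketch below also uses elif, which is not covered in this cheat sheet. The result variable is hypothetical.

```shell
a=10
b=20
if [ "$a" -lt "$b" ]
then
  result='a is smaller'
elif [ "$a" -eq "$b" ]
then
  result='a and b are equal'
else
  result='a is larger'
fi
echo "$result"    # prints a is smaller
```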
WHILE LOOPS
While loops are great for repeating one (or more) commands as long as a
condition holds.
Example:
i=0
while [ $i -lt 10 ]
do
  echo $i
  i=$((i + 1))
done
Prints the numbers from 0 to 9, one per line.
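A loop becomes more useful when it accumulates a result. This is a minimal sketch that adds up the numbers from 1 to 5. The total variable is made up for illustration.

```shell
i=1
total=0
while [ "$i" -le 5 ]         # repeat while i is less than or equal to 5
do
  total=$((total + i))       # add the current number to the running total
  i=$((i + 1))               # move on to the next number
done
echo "$total"                # prints 15
```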
date
Prints the current time and date.
(Example output: Sat Aug 10 23:21:55 UTC 2019)
date +%D
Prints the current date in mm/dd/yy format. (Example output: 08/10/19)
date +%Y-%m-%d
Prints the current date in yyyy-mm-dd format. (Example output: 2019-08-10)
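A common use of the yyyy-mm-dd format is putting the date into a file name, for instance in a scheduled script. This is a sketch, and the file name pattern is hypothetical.

```shell
today=$(date +%Y-%m-%d)                  # store today's date in a variable
backup_name="some_data_${today}.csv"     # hypothetical file name pattern
echo "$backup_name"                      # e.g. some_data_2019-08-10.csv
```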
awk
awk is another great command line tool for text processing and data cleaning.
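Two typical awk one-liners, as a rough sketch on made-up, semicolon-separated data: printing one column and summing another. The -F option sets the field separator. The variable names are hypothetical.

```shell
data=$(mktemp)                                        # hypothetical scratch file
printf '%s\n' 'apple;3' 'pear;5' 'plum;2' > "$data"

names=$(awk -F';' '{ print $1 }' "$data")                      # first column
total=$(awk -F';' '{ sum += $2 } END { print sum }' "$data")   # sum of the second column

echo "$names"
echo "$total"                                         # prints 10
```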
CREATED BY
Tomi Mester from Data36.com
Tomi Mester is a data analyst and researcher. He's worked for Prezi, iZettle and
several smaller companies as an analyst/consultant. He's the author of the Data36
blog where he writes posts and tutorials on a weekly basis about data science,
A/B testing, online research and coding. He's an O'Reilly author and presenter at
TEDxYouth, Barcelona E-commerce Summit, Stockholm Analytics Day and more.
WHERE TO GO NEXT
Find company workshops, online tutorials and online video courses on my website:
https://data36.com