Backups
I. Objective
The objective of this lab assignment is to introduce students to features of common
backup utilities, and to perform a few basic experiments to determine the effects of
verification, compression and encryption on backups. These procedures assume the
student has access to an accompanying spreadsheet to aid in the analysis. This
spreadsheet highlights the cells that require input.
Note that these experiments are unscientific because there are too few test cases (such
as a wide range of backup sizes) and too little variety in backup destinations (such as
external hard disk vs. internal hard disk vs. tape). These tests also ignore the differences
between a VM and a physical machine. Despite these limitations, it is hoped
that good observations and hypotheses can be made.
A. Boot your Linux system or VM. If necessary, log in and then open a
terminal window and cd to the labtainer/labtainer-student directory. The pre-
packaged Labtainer VM will start with such a terminal open for you. Then
start the lab:
labtainer backups2
Note that the terminal displays the paths to three files on your Linux host.
On most Linux systems, these are links that you can right click on and select “Open
Link”. If you choose to edit the lab report on a different system, you are
responsible for copying the completed report back to the displayed path on your
Linux system before using “stoplab” to stop the lab for the last time.
The first basic use of the tar command is the creation of an archive file, which is often
referred to as a tarball. Tarballs are similar to zip files, which were popularized on
Microsoft Windows. Zip files and tarballs are different in at least two ways: 1) they use
different file formats; and 2) zip compresses the archive while tar does not compress
by default.
In this exercise you will not be backing up the root file system. Instead you will work
with a separate file system mounted on /lab_mnt that contains a copy of the /usr/bin
directory.
As shown below, use the du command (which stands for disk usage) to determine
how much data exists under the /lab_mnt hierarchy. The “-s” option stands for
summarize (print a single total), and the “-b” option reports the size in bytes.
du -sb /lab_mnt
Notation #1: How many bytes did the du command report? [Enter the number
into the highlighted/shaded part of spreadsheet]
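The following is a minimal sketch of what du -sb counts, assuming GNU du and a scratch directory made up for illustration (workdir, demo, a.bin and b.bin are not part of the lab). It suggests why the reported total can be somewhat larger than the sum of the file contents: the directories themselves also occupy bytes.

```shell
# Build a scratch tree with files of known sizes (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/demo/sub"
head -c 1000 /dev/zero > "$workdir/demo/a.bin"
head -c 2500 /dev/zero > "$workdir/demo/sub/b.bin"

# du -sb prints one summary line: the size in bytes, a tab, then the path.
du_bytes=$(du -sb "$workdir/demo" | cut -f1)

# For comparison, count only the bytes stored inside regular files.
file_bytes=$(find "$workdir/demo" -type f -exec cat {} + | wc -c)

echo "du -sb reports: $du_bytes bytes"
echo "file contents:  $file_bytes bytes"
rm -rf "$workdir"
```

The size of the difference depends on the file system, so treat the du figure as an approximation of the data to be backed up.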
Execute the following commands to create a tarball from all the files and directories
in the file system mounted on /lab_mnt. In the tar command, ‘c’ stands for create,
‘v’ stands for verbose, and ‘f’ stands for file (the name of the tarball to write).
cd /lab_mnt
tar -cvf /tmp/lab_mnt.tar *
3. List some of the metadata for the tarball you just created:
ll /tmp/lab_mnt.tar
Notation #2: How big is the tarball (in bytes)? The number of bytes is given just
before the date. [Enter the number in the spreadsheet.]
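One reason the tarball size differs from the du figure is that the tar format stores a header block for each file and pads file data to 512-byte blocks. The sketch below illustrates this on a single 1000-byte file; the scratch names (workdir, demo, a.bin) are assumptions for illustration only.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/demo"
head -c 1000 /dev/zero > "$workdir/demo/a.bin"

# c = create, f = name of the tarball; run from inside the tree, as in the lab.
(cd "$workdir/demo" && tar -cf "$workdir/demo.tar" *)

# Size of the resulting tarball in bytes.
tar_bytes=$(wc -c < "$workdir/demo.tar")
echo "payload: 1000 bytes, tarball: $tar_bytes bytes"
rm -rf "$workdir"
```

For a tree with many small files, this per-file overhead can add up to a noticeable share of the tarball.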
To see that the tarball contains more than just the content of files, do the following to
list the contents of the tarball:
tar -tvf /tmp/lab_mnt.tar
In the above command, the ‘t’ option stands for list, ‘v’ still stands for verbose, and
‘f’ still stands for file. Taken together, the command means, “list the contents of the
/tmp/lab_mnt.tar tarball in a verbose manner”.
The output shows that in addition to the file data, the tarball also has the metadata
necessary to restore the file to a prior state.
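As a small illustration of that metadata, the sketch below builds a throwaway tarball and lists it verbosely. The names workdir, demo and hello.txt are made up for the example; the exact column layout of the listing varies between tar implementations.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/demo"
echo "hello" > "$workdir/demo/hello.txt"
chmod 644 "$workdir/demo/hello.txt"
(cd "$workdir" && tar -cf demo.tar demo)

# t = list, v = verbose, f = name of the tarball.
# Each line shows permissions, owner, size and modification time before the path.
listing=$(tar -tvf "$workdir/demo.tar")
echo "$listing"
rm -rf "$workdir"
```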
A very, very important task after creating a backup is to verify that the backup is
good, especially when using some kinds of “unreliable” media, such as tape or optical
discs. The tar command comes with two options for verifying a tarball, which cause
tar to read each file from the tarball and then compare it to the original file on the
disk. To test one of these verification features, first do the following to cause a
change to the metadata for an existing file:
touch usr/bin/base64
When you then compare the tarball against the files on disk using tar's difference
option, the output highlights the change in m-time. Note that this
difference option can be used at any time in the future to compare the contents of a
tarball to the existing state of the files on disk.
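The comparison can be tried safely on a scratch directory first. This is a sketch that assumes GNU tar, whose ‘d’ option (long form --diff or --compare) reports differences between archive members and the files on disk; it is not necessarily the exact command used in the lab, and the names workdir, demo and hello.txt are hypothetical.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/demo"
echo "hello" > "$workdir/demo/hello.txt"
(cd "$workdir" && tar -cf demo.tar demo)

# Change only the metadata: set the m-time to a fixed date in the past.
touch -t 200001010000 "$workdir/demo/hello.txt"

# d = compare tarball members with the files on disk.
# tar exits non-zero when it finds a difference, so carry on regardless.
diff_report=$(cd "$workdir" && LC_ALL=C tar -df demo.tar 2>&1) || true
echo "$diff_report"
rm -rf "$workdir"
```

The file content is unchanged here, so only the modification time is reported as different.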
Execute the following command to tell the kernel to drop any cached files in RAM so
we can get repeatable, consistent times. [If we want to measure performance, then we
want to make sure that every invocation of tar does the same amount of work.]
This time you will omit the verbose output and use the time command (see footnote 1)
to measure how long it takes to finish:
Notation #3: Referring to the real time (see footnote 2) that was displayed as part of
the output of the above command, how many seconds did it take to create the
tarball? [Enter the number in the spreadsheet.]
8. Clean up.
rm /tmp/lab_mnt.tar
Footnote 1: A previous lab showed one way of measuring the passage of time. The time
command is a more accurate way of measuring it.
Footnote 2: The user and sys times refer to the amount of CPU time used by the
application and by the kernel, respectively. The real time refers to the actual passage
of time, which we will use in this exercise.
This time, use dump to back up the file system mounted on /lab_mnt, as shown
below. The resulting file is called a dump file. [The “-0” below (the number zero)
means full backup.]
Notation #4: How many seconds did it take to create the dump file? [Enter the
number in the spreadsheet.]
Notation #6: How big is the dump file? [Enter the number in the spreadsheet.]
3. Back up the data to a remote archive server. We will use ssh to send the results of the
dump command to a remote server named “archive”.
Assuming you were able to reach the archive server, exit your ssh session:
exit
Pipe the dump command through ssh to the remote archive server:
time (dump -0 -f - /lab_mnt | ssh student@archive "cat > lab_mnt.dump")
Notation #5: How many seconds did it take to create the remote dump file? [Enter
the number in the spreadsheet.]
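The pipeline above relies on “-f -”, which writes the backup to standard output so that another program can receive it. The sketch below shows the same pattern locally, as an analogy only: tar stands in for dump and a plain cat stands in for the ssh session, and all file names are made up.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/demo"
echo "hello" > "$workdir/demo/hello.txt"

# "-f -" sends the archive to standard output; the receiving side
# (a local cat here, in place of ssh ... "cat > file") stores it.
(cd "$workdir" && tar -cf - demo) | (cat > "$workdir/received.tar")

# Confirm that the received copy is a readable archive.
received_listing=$(tar -tf "$workdir/received.tar")
echo "$received_listing"
rm -rf "$workdir"
```

In the lab, the receiving cat runs on the archive server, so the data crosses the network on its way to the file.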
Simulate the loss of a file by deleting it:
rm /lab_mnt/usr/bin/cheese
Notation #7: How many seconds did it take to verify the local dump file? [Enter
the number in the spreadsheet.]
Note that, near the end of its output, the restore command reported that it detected a
missing file.
Notation #8: How many seconds did it take to verify the remote dump file? [Enter
the number in the spreadsheet.]
Note that, near the end of its output, the restore command reported that it detected a
missing file.
4. Restore the file from the dump file after first changing directory into the file
system:
cd /lab_mnt
restore -i -f /tmp/lab_mnt.dump
This will put you into an interactive session with the restore command, as you can
see from the “restore>” prompt. At the prompt, enter the following command to
add the file we need restored to a list of files to be restored. [If we wanted to restore
several files we would continue to add them in the same way.]
add usr/bin/cheese
Now enter the command to copy from the dump file to its original location:
extract
ll usr/bin/cheese
1. Do the following to measure how long it takes to create and compress the dump file
(where the ‘z’ option indicates compression).
cd /lab_mnt
time dump -0 -z -f /tmp/lab_mnt.dump.z usr/bin
Notation #9: How many seconds did it take to create and compress the dump file?
[Enter the number in the spreadsheet.] Note that it should have taken
longer to dump and compress than to just dump. If it took you less time,
then you may have skipped the step to clear the cache.
Notation #10: How big is the compressed dump file? [Enter the number in the
spreadsheet.]
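How much space compression saves depends heavily on the data. The sketch below uses tar with gzip (‘z’) in place of dump, purely as an illustration, on a file of zero bytes that compresses far better than typical binaries; the file names are hypothetical.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/demo"
# 100,000 zero bytes: highly compressible, unlike most real program files.
head -c 100000 /dev/zero > "$workdir/demo/zeros.bin"

# Create the same archive twice: once plain, once gzip-compressed (z).
(cd "$workdir" && tar -cf plain.tar demo && tar -czf packed.tar.gz demo)

plain_bytes=$(wc -c < "$workdir/plain.tar")
packed_bytes=$(wc -c < "$workdir/packed.tar.gz")
echo "uncompressed: $plain_bytes bytes, compressed: $packed_bytes bytes"
rm -rf "$workdir"
```

Expect a much smaller saving on the contents of /lab_mnt than on this artificial file.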
rm /tmp/lab_mnt.dump.z
1. Encrypt the dump file using the long command given below (see footnote 3). [The
encryption key is supplied on the command line as “hi”.]
time (openssl enc -aes-256-cbc -k hi -in /tmp/lab_mnt.dump > /tmp/lab_mnt.dump.aes256)
Notation #11: How many seconds did it take to encrypt the dump file? [Enter
the number in the spreadsheet.]
Notation #12: How big is the encrypted dump file? [Enter the number in the
spreadsheet.]
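An encrypted backup is only useful if it can be decrypted again. The sketch below is a small round trip on a scratch file, assuming the same cipher and passphrase as above and the openssl “-d” option for decryption; the file names are made up. It also prints the encrypted size, which is slightly larger than the input because openssl adds a salt header and pads the data to the cipher's block size.

```shell
workdir=$(mktemp -d)
head -c 1000 /dev/zero > "$workdir/plain.bin"

# Encrypt as in the lab; "hi" is the lab's throwaway passphrase.
# stderr is discarded only to hide key-derivation warnings in newer versions.
openssl enc -aes-256-cbc -k hi -in "$workdir/plain.bin" \
    > "$workdir/plain.bin.aes256" 2>/dev/null

# Decrypt with -d, using the same cipher and passphrase.
openssl enc -d -aes-256-cbc -k hi -in "$workdir/plain.bin.aes256" \
    > "$workdir/roundtrip.bin" 2>/dev/null

enc_bytes=$(wc -c < "$workdir/plain.bin.aes256")
if cmp -s "$workdir/plain.bin" "$workdir/roundtrip.bin"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "plain: 1000 bytes, encrypted: $enc_bytes bytes, round trip: $roundtrip"
rm -rf "$workdir"
```

A passphrase given on the command line is visible to other users of the system, so this form is suitable for a lab exercise only.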
rm /tmp/lab_mnt.dump
rm /tmp/lab_mnt.dump.aes256
Footnote 3: There is no universally accepted way of adding a file suffix to show that
a file has been encrypted with openssl, nor which cipher was used, but it seems wise to
use something that gives an indication of how to decrypt it later, which explains the
use of “aes256” at the end of the file name.
cd /lab_mnt
time dump -0 -z -f /tmp/lab_mnt.dump usr/bin
Notation #13: How many seconds did it take to dump and compress the data?
[Enter the number in the spreadsheet.]
Notation #14: How many seconds did it take to verify the compressed data? [Enter
the number in the spreadsheet.]
Notation #15: How many seconds did it take to encrypt the compressed dump file?
[Enter the number in the spreadsheet.]
IX. Clean up
After completing the lab report using the supplied template, save the spreadsheet and go
to the terminal on your Linux system that was used to start the lab and type:
stoplab
If you modified the lab report and/or spreadsheet on a different system, you must copy
that completed file into the directory path displayed when you started the lab, and you
must do that before typing “stoplab”. When you stop the lab, the system will display a
path to the zipped lab results on your Linux system.
X. Submission
Provide the zip file to your instructor, e.g., via the Sakai site.
cd Change directory
cd location
With no “location”, you will be taken to your home directory.
find Find an object with a given kind of attribute. The basic use of find
is to list all the files and directories in a given hierarchy:
find directorypath -print
man Manual
man command
Displays the manual page for the given “command”. To see
another page press the space bar. To see one more line press the
Enter key. To quit before reaching the end of the file enter ‘q’.
touch Change the modification date on the given file. If the file does not
exist, it will be created.
touch filename
wc Count the lines, words and bytes in a given text file. Given the “-l”
option (for “lines”), it will return only the number of lines in a text file:
wc -l filename
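These commands are often combined. As a sketch, the following counts the objects (files and directories) in a scratch hierarchy by piping find into wc; the directory and file names are made up for illustration.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/demo/sub"
# touch creates the files because they do not exist yet.
touch "$workdir/demo/a.txt" "$workdir/demo/sub/b.txt"

# find prints one path per line; wc -l counts those lines.
object_count=$(find "$workdir/demo" -print | wc -l)
echo "objects under demo: $object_count"
rm -rf "$workdir"
```

The count includes the starting directory itself, so two directories and two files give a total of four.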