Linux System Administration For Researchers: Chapter 9: Filesystems
Chapter 9: Filesystems
Part 1: Partitions
Disk Geometry:
Disks are made of stacks of spinning
platters, each surface of which is read by
an independent read head.
Originally, the position of a piece of data
on a disk was given by the coordinates
C,H and S, for Cylinder, Head and
Sector.
[Figure label: Block, or Track Sector]
The CHS coordinate system began with floppy disks, where the
(c,h,s) values really told you where to find the data. Some
reasons CHS doesn't really tell you where the data is on a
modern hard disk:
Partitions:
Sometimes, it's useful to split up a disk into smaller pieces, called
partitions. Some motivations for this are:
The operating system may not be able to use storage devices as
large as the whole disk.
You may want to prevent one part of your storage from filling up the
whole disk.
IDE/PATA Disks:
These disks are represented by files named /dev/hd[a-z]. The disk
names will be:
     Start         End       Blocks   Id  System
         1          13       104391   83  Linux
        14       19452    156143767+  8e  Linux LVM

(The Id and System columns give the Partition Type.)
Near the top, you can see the number of heads, sectors and cylinders. These
may not represent physical reality, but they're the way the disk presents itself
to the operating system.
Fdisk reports the size of each partition in 1024-byte blocks. The two
partitions above are about 102 MiB (107 MB) and about 149 GiB (160 GB).
The + sign on the size of the second partition means that its size isn't
an integer number of 1024-byte blocks.
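As a rough illustration of how the block counts and the + sign arise, here is a small Python sketch. The function names are hypothetical, and the sector counts (208782 and 312287535) are inputs chosen to match the block counts in the listing above; this is not fdisk's actual code.

```python
def fdisk_blocks(sectors, sector_bytes=512):
    """Size as a count of 1024-byte blocks, with '+' if a partial block is left over."""
    total_bytes = sectors * sector_bytes
    blocks, remainder = divmod(total_bytes, 1024)
    return f"{blocks}{'+' if remainder else ''}"


def human_size(sectors, sector_bytes=512):
    """Same size expressed in decimal GB and binary GiB."""
    total_bytes = sectors * sector_bytes
    return f"{total_bytes / 10**9:.1f} GB ({total_bytes / 2**30:.1f} GiB)"


# An even number of 512-byte sectors fills whole 1024-byte blocks;
# an odd number leaves half a block over, which is shown as '+'.
for sectors in (208782, 312287535):
    print(fdisk_blocks(sectors), human_size(sectors))
```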
The start and end values are in units of cylinders, by default. You can use
the -u switch to cause fdisk to display start and end in terms of 512-byte
track sectors.
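The relationship between cylinder units and sector units can be sketched in Python. This assumes a geometry of 255 heads and 63 sectors per track, which is a common way for disks to present themselves but is an assumption here, not a value taken from the listing. The function names are hypothetical.

```python
HEADS = 255              # assumed geometry (not taken from the listing)
SECTORS_PER_TRACK = 63   # assumed geometry (not taken from the listing)


def chs_to_sector(cylinder, head, sector, heads=HEADS, spt=SECTORS_PER_TRACK):
    """Convert (0-based cylinder, 0-based head, 1-based sector) to a 0-based sector number."""
    if not 0 <= head < heads:
        raise ValueError("head out of range")
    if not 1 <= sector <= spt:
        raise ValueError("sector out of range")
    return (cylinder * heads + head) * spt + (sector - 1)


def cylinder_start(fdisk_cylinder, heads=HEADS, spt=SECTORS_PER_TRACK):
    """First sector of a cylinder, using fdisk's 1-based cylinder numbering."""
    return chs_to_sector(fdisk_cylinder - 1, 0, 1, heads, spt)


# Under the assumed geometry, a partition that begins at cylinder 14
# begins at this 512-byte sector:
print(cylinder_start(14))
```

With this geometry each cylinder holds 255 x 63 = 16065 sectors, so cylinder boundaries fall on multiples of 16065.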
Partition Types:
Here's the list of partition types that fdisk knows about. The most
common ones are highlighted.
 0  Empty             80  Old Minix         be  Solaris boot
 1  FAT12             81  Minix / old Lin   bf  Solaris
 2  XENIX root        82  Linux swap / So   c1  DRDOS/sec (FAT
 3  XENIX usr         83  Linux             c4  DRDOS/sec (FAT
 4  FAT16 <32M        84  OS/2 hidden C:    c6  DRDOS/sec (FAT
 5  Extended          85  Linux extended    c7  Syrinx
 6  FAT16             86  NTFS volume set   da  Non-FS data
 7  HPFS/NTFS         87  NTFS volume set   db  CP/M / CTOS / .
 8  AIX               88  Linux plaintext   de  Dell Utility
 9  AIX bootable      8e  Linux LVM         df  BootIt
 a  OS/2 Boot Manag   93  Amoeba            e1  DOS access
 b  W95 FAT32         94  Amoeba BBT        e3  DOS R/O
 c  W95 FAT32 (LBA)   9f  BSD/OS            e4  SpeedStor
 e  W95 FAT16 (LBA)   a0  IBM Thinkpad hi   eb  BeOS fs
 f  W95 Ext'd (LBA)   a5  FreeBSD           ee  EFI GPT
10  OPUS              a6  OpenBSD           ef  EFI (FAT-12/16/
11  Hidden FAT12      a7  NeXTSTEP          f0  Linux/PA-RISC b
12  Compaq diagnost   a8  Darwin UFS        f1  SpeedStor
14  Hidden FAT16 <3   a9  NetBSD            f4  SpeedStor
16  Hidden FAT16      ab  Darwin boot       f2  DOS secondary
17  Hidden HPFS/NTF   b7  BSDI fs           fd  Linux raid auto
18  AST SmartSleep    b8  BSDI swap         fe  LANstep
1b  Hidden W95 FAT3   bb  Boot Wizard hid   ff  BBT
1c  Hidden W95 FAT3

Other type ids in the list: 1e 24 39 3c 40 41 42 4d 4e 4f 50 51 52 53 54
55 56 5c 61 63 64 65 70 75
Note that the partition type is just a label. You can put
anything you want into a partition of any type. The
partition type designation just provides the operating
system with clues about what to expect.
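To make the "just a label" point concrete, here is a Python sketch that reads the type byte out of a classic MBR-style partition table. It assumes the usual layout (four 16-byte entries starting at byte 446 of the first sector, with the type id in byte 4 of each entry and a 0x55 0xAA signature at the end). The helper names are hypothetical, and the sector is built in memory rather than read from a real disk.

```python
import struct

ENTRY = "<B3sB3sII"   # boot flag, CHS start, type id, CHS end, start sector, sector count


def make_entry(bootable, type_id, start, size):
    """Build one 16-byte partition entry (CHS fields left as zeros for simplicity)."""
    flag = 0x80 if bootable else 0x00
    return struct.pack(ENTRY, flag, b"\0\0\0", type_id, b"\0\0\0", start, size)


def parse_partition_table(sector):
    """Return the four primary partition entries found in a 512-byte first sector."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid partition-table sector")
    parts = []
    for i in range(4):
        raw = sector[446 + 16 * i: 446 + 16 * (i + 1)]
        flag, _, type_id, _, start, size = struct.unpack(ENTRY, raw)
        parts.append({"bootable": flag == 0x80, "type": type_id,
                      "start": start, "size": size})
    return parts


# A synthetic sector with a Linux (83) and a Linux LVM (8e) partition.
entries = (make_entry(True, 0x83, 63, 208782)
           + make_entry(False, 0x8E, 208845, 312287535)
           + bytes(32))
sector = bytes(446) + entries + b"\x55\xaa"

for number, part in enumerate(parse_partition_table(sector), start=1):
    print(number, format(part["type"], "x"), part["start"], part["size"])
```

The type id is a single byte stored next to the start and size; nothing in the table checks that the partition's contents match it.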
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-9726, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-9726, default 9726): +40G

   d   Delete a partition
   t   Change a partition's type
   q   Quit without saving changes
   w   Write the new partition table and exit
Note: In fdisk, the term primary partition means one that's not an
extended partition.

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        4864    39070048+  83  Linux
     Start         End      Blocks   Id  System
         1        4864    39070048+  83  Linux
      4865        9726    39054015   83  Linux
     Start         End      Blocks   Id  System
         1        4864    39070048+  83  Linux
      4865        9726    39054015   82  Linux swap / Solaris
[root@demo ~]#
Note that this command should be used very carefully, since it will
(without asking for confirmation) wipe out any existing partition table
on the disk. The content of hda.out looks like this:
# partition table of /dev/hda
unit: sectors

/dev/hda1 : start=       63, size=   208782, Id=83, bootable
/dev/hda2 : start=   208845, size=312287535, Id=8e
/dev/hda3 : start=        0, size=        0, Id= 0
/dev/hda4 : start=        0, size=        0, Id= 0
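A dump like hda.out is plain text, so it is easy to inspect with a short script. The following Python sketch assumes the dump has one line per partition with start=, size= and Id= fields, as in the content of hda.out described above; the function name and regular expression are illustrative only.

```python
import re

DUMP = """\
# partition table of /dev/hda
unit: sectors

/dev/hda1 : start=       63, size=   208782, Id=83, bootable
/dev/hda2 : start=   208845, size=312287535, Id=8e
/dev/hda3 : start=        0, size=        0, Id= 0
/dev/hda4 : start=        0, size=        0, Id= 0
"""

LINE = re.compile(
    r"^(\S+)\s*:\s*start=\s*(\d+),\s*size=\s*(\d+),\s*Id=\s*([0-9a-fA-F]+)(,\s*bootable)?"
)


def parse_dump(text):
    """Return one dict per partition line; comment and unit lines are skipped."""
    parts = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            device, start, size, type_id, boot = match.groups()
            parts.append({"device": device, "start": int(start), "size": int(size),
                          "type": int(type_id, 16), "bootable": boot is not None})
    return parts


for part in parse_dump(DUMP):
    if part["size"]:   # unused table slots have size 0
        end = part["start"] + part["size"] - 1
        print(part["device"], part["start"], end, format(part["type"], "x"))
```

Since the unit is sectors, the first partition's start plus its size gives the start of the second partition.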
Part 3: Filesystem Structure
What is a Filesystem?
A filesystem is a way of organizing data on a block device. The filesystem
organizes data into files, each of which has a name and other metadata
attributes. These files are grouped into hierarchical directories, making it
possible to locate a particular file by specifying its name and directory path.
Some of the metadata typically associated with each file are:
Timestamps, recording file creation or modification times.
Ownership, specifying a user or group to whom the file belongs.
Permissions, specifying who has access to the file.
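These metadata attributes can be inspected from a script. The following is a minimal Python sketch using the standard os.stat call on a temporary file; the describe function is a hypothetical helper, and the permission value 0o640 is chosen only for illustration.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Collect the timestamp, ownership and permission metadata of one file."""
    info = os.stat(path)
    return {
        "modified": time.ctime(info.st_mtime),       # timestamp
        "owner_uid": info.st_uid,                    # ownership (user)
        "group_gid": info.st_gid,                    # ownership (group)
        "permissions": stat.filemode(info.st_mode),  # e.g. -rw-r-----
    }


with tempfile.NamedTemporaryFile() as tmp:
    os.chmod(tmp.name, 0o640)   # owner read/write, group read, others nothing
    print(describe(tmp.name))
```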
Linux originally used the minix filesystem, from the operating system of the
same name, but quickly switched to what was called the Extended
Filesystem (in 1992) followed by an improved Second Extended
Filesystem (in 1993). The two latter filesystems were developed by French
software developer Remy Card.
The Second Extended Filesystem (ext2) remained the standard Linux
filesystem until the early 2000s, when it was supplanted by the Third
Extended Filesystem (ext3), written by Scottish software developer
Stephen Tweedie. Recently, this has been superseded by ext4,
developed by Ted Ts'o.
[Figure: the filesystem is laid out as Block Group 0, Block Group 1, ...
Block Group N. Each block group holds: Superblock, All Group Descriptors,
Data Bitmap, Inode Bitmap, Inode Table, Data Blocks.]
Superblocks:
The ext2/ext3/ext4 filesystem as a whole is described in a chunk of data
called the Superblock. The superblock contains:
a name for the filesystem (a label),
the size of the filesystem's block groups,
timestamps showing when the filesystem was last mounted,
a flag saying whether it was unmounted cleanly,
a number showing the amount of unused space in the filesystem,
and other information. The group descriptors are so important that copies of the
group descriptors for every block group are stored in each block group. Normally,
the operating system only uses the descriptors stored in block group 0 for all block
groups, but if a filesystem is damaged or has not been cleanly unmounted, it's
possible to verify the filesystem's integrity and repair damage by using the other
copies.
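To give a feel for what the superblock holds, here is a Python sketch that decodes a few fields from a synthetic superblock built in memory. The byte offsets follow the commonly documented ext2 on-disk layout (for example, the magic number 0xEF53 at offset 56 and the label at offset 120), but treat them as assumptions of this sketch. The function names and the demo values are hypothetical.

```python
import struct

EXT2_MAGIC = 0xEF53
STATE_CLEAN = 0x0001   # set when the filesystem was unmounted cleanly


def build_demo_superblock():
    """Create a 1024-byte synthetic superblock with made-up values."""
    sb = bytearray(1024)
    struct.pack_into("<I", sb, 4, 1_000_000)    # total blocks
    struct.pack_into("<I", sb, 12, 750_000)     # free blocks
    struct.pack_into("<II", sb, 20, 0, 2)       # first data block, log2(block size / 1024)
    struct.pack_into("<I", sb, 32, 32768)       # blocks per group
    struct.pack_into("<H", sb, 56, EXT2_MAGIC)  # magic number
    struct.pack_into("<H", sb, 58, STATE_CLEAN) # state flags
    sb[120:125] = b"/boot"                      # label (16 bytes, zero padded)
    return bytes(sb)


def parse_superblock(sb):
    """Decode a handful of superblock fields into a dict."""
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3/ext4 superblock")
    (blocks,) = struct.unpack_from("<I", sb, 4)
    (free_blocks,) = struct.unpack_from("<I", sb, 12)
    first_data, log_size = struct.unpack_from("<II", sb, 20)
    (per_group,) = struct.unpack_from("<I", sb, 32)
    (state,) = struct.unpack_from("<H", sb, 58)
    label = sb[120:136].split(b"\0", 1)[0].decode("ascii", "replace")
    groups = (blocks - first_data + per_group - 1) // per_group   # round up
    return {"label": label, "block_size": 1024 << log_size,
            "free_blocks": free_blocks, "block_groups": groups,
            "clean": bool(state & STATE_CLEAN)}


print(parse_superblock(build_demo_superblock()))
```

On a real filesystem the superblock would be read from the block device rather than constructed, and tools such as dumpe2fs do this decoding for you.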
The Journal:
Although ext2, ext3 and ext4 are very similar, ext3/4 have one
important feature that ext2 lacks: journaling. We say that ext3/4 are
journaled filesystems because, instead of writing data directly into
data blocks, the filesystem drivers first write a list of tasks into a
journal. These tasks describe any changes that need to be made to
the data blocks.
The operating system then periodically looks at the journal to see if
there are any tasks that need doing. These tasks are then done, in
order, and each completed task is marked as done in the journal.
If the computer crashes, the journal is examined at the next reboot to
see if there were any outstanding tasks that needed to be done. If so,
they're done. Any garbled information left at the end of the journal is
ignored and cleared.
Journaling makes it much quicker to check the integrity of a filesystem
after a crash, since only a few items in the journal need to be looked
at. In contrast, when a computer with an ext2 filesystem crashes, the
operating system needs to scan the entire filesystem looking for problems.
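The journaling idea described above can be sketched as a toy model in Python. This is only an illustration of the principle (log first, apply later, replay after a crash, ignore a garbled tail); it is not how the ext3/ext4 journal is actually implemented, and all names are hypothetical.

```python
import zlib


def make_entry(block, data):
    """A journal task: write `data` to data block number `block`."""
    return {"block": block, "data": data, "length": len(data),
            "crc": zlib.crc32(data), "done": False}


def is_intact(entry):
    """An entry is garbled if its data no longer matches its recorded length or checksum."""
    return (len(entry["data"]) == entry["length"]
            and zlib.crc32(entry["data"]) == entry["crc"])


def replay(journal, disk):
    """Apply outstanding tasks in order; drop any garbled entries at the end."""
    applied = 0
    for index, entry in enumerate(journal):
        if not is_intact(entry):
            del journal[index:]   # ignore and clear the garbled tail
            break
        if not entry["done"]:
            disk[entry["block"]] = entry["data"]
            entry["done"] = True
            applied += 1
    return applied


# Simulated crash: the first task was completed, the second was not,
# and the third was only partly written to the journal.
disk = {7: b"hello"}
journal = [make_entry(7, b"hello"), make_entry(9, b"world"), make_entry(11, b"partial")]
journal[0]["done"] = True
journal[2]["data"] = b"par"

print(replay(journal, disk), sorted(disk))
```

Only the journal is examined during recovery, which is why the work needed after a crash depends on the number of outstanding tasks rather than on the size of the filesystem.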
         Max file size    Max filesystem size
ext3     2 TB             16 TB
ext4     16 TB            1 EB
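The limits quoted for ext3 and ext4 are all powers of two, and can be reproduced with a little arithmetic. The Python sketch below assumes a 4 KiB block size and the commonly cited field widths (32-bit and 48-bit block numbers); these widths are assumptions of the sketch rather than something stated in this chapter.

```python
KIB = 2**10
TIB = 2**40
EIB = 2**60
BLOCK_SIZE = 4 * KIB   # assumed block size

limits = {
    # 32-bit block numbers address 2**32 blocks in the whole filesystem
    "ext3 filesystem": 2**32 * BLOCK_SIZE,
    # 48-bit block numbers address 2**48 blocks
    "ext4 filesystem": 2**48 * BLOCK_SIZE,
    # a file's size counted in 512-byte units with a 32-bit counter
    "ext3 file": 2**32 * 512,
    # 32-bit block numbers within a single file
    "ext4 file": 2**32 * BLOCK_SIZE,
}

for name, size in limits.items():
    if size >= EIB:
        print(f"{name}: {size // EIB} EiB")
    else:
        print(f"{name}: {size // TIB} TiB")
```

Strictly speaking these are binary units (TiB and EiB); slides and documentation often write them as TB and EB.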
Give it this label.
Create it on this partition.
Note that the command above will format (or re-format) the
designated partition without asking for any confirmation. Please
make sure you point it at the partition you really want to format.
The filesystem label can be any text you choose, but usually the label
is chosen to be the same as the name of the location at which you
expect to mount the filesystem. For example, a filesystem intended
to be mounted at /boot would probably be created with
-L/boot. For the / and /boot filesystems, this should always be
done, but it's good practice for other filesystems, too.
You can see this plus block group information by using the dumpe2fs command.
Checking a Filesystem:
[root@demo ~]# fsck /dev/sdb1
If a computer loses power unexpectedly, the filesystems on its disks may be left in
an untidy state. The filesystem check (fsck) command looks at ext2/ext3/ext4
filesystems and tries to find and repair damage. Fsck should only be run on
unmounted filesystems, since repairing a mounted filesystem can damage it.
Each filesystem's superblock contains a flag saying whether the filesystem was
cleanly unmounted. If it was, fsck just exits without doing anything further.
If the filesystem wasn't cleanly unmounted, fsck checks it. Under ext3/ext4, fsck
first just looks at the journal and completes any outstanding operations, if possible.
If this works, then fsck exits.
If the ext3/ext4 journal is damaged, or if this is an ext2 filesystem, fsck scans the
filesystem for damage. It does this primarily by looking for inconsistencies between
the various copies of the superblock and block group descriptors. If
inconsistencies are found, fsck tries to resolve them, using various strategies.
The filesystem's superblock also contains a mount count, maximum mount
count, last check date and check interval. If the mount count exceeds the
maximum, a scan of the filesystem is forced even if it was cleanly unmounted. If
the time since the last check date exceeds the check interval, a scan is also forced.
Both of these forced checks can be disabled, by using tune2fs.
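The two forced-check rules can be written down as a small decision function. This Python sketch mirrors the wording above (mount count exceeding the maximum, time since last check exceeding the interval); the exact comparisons used by the real tools are not asserted here, and treating a value of zero or less as "disabled" is an assumption of the sketch.

```python
import time

DAY = 24 * 60 * 60


def forced_check_reason(mount_count, max_mount_count, last_check, check_interval, now=None):
    """Return why a cleanly unmounted filesystem would still be scanned, or None.

    Times are in seconds since the epoch; check_interval is in seconds.
    """
    if now is None:
        now = time.time()
    # Assumption: a maximum or interval of zero or less means "check disabled".
    if max_mount_count > 0 and mount_count > max_mount_count:
        return "mount count exceeded"
    if check_interval > 0 and now - last_check > check_interval:
        return "check interval exceeded"
    return None


# Mounted 31 times with a maximum of 30: a scan is forced.
print(forced_check_reason(31, 30, last_check=0, check_interval=0, now=100))
# Last checked 181 days ago with a 180-day interval: a scan is forced.
print(forced_check_reason(5, 30, last_check=0, check_interval=180 * DAY, now=181 * DAY))
```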
Fsck is actually a wrapper that calls a different type-specific filesystem
checker for each type of filesystem that it knows about.
Filesystem    Mount Point   Type     Options    dump Flag   fsck Order
/dev/sda1     /             ext3     defaults   1           1
LABEL=/boot   /boot         ext3     defaults   1           2
devpts        /dev/pts      devpts   mode=620   0           0
tmpfs         /dev/shm      tmpfs    defaults   0           0
proc          /proc         proc     defaults   0           0
sysfs         /sys          sysfs    defaults   0           0
/dev/sda2     swap          swap     defaults   0           0
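The six columns of the table can be read programmatically. The Python sketch below parses text in that whitespace-separated six-field layout and lists the filesystems in the order given by the fsck Order column, treating 0 as "do not check". The class and function names are hypothetical, and the sample text simply repeats the rows of the table.

```python
from typing import List, NamedTuple

SAMPLE = """\
/dev/sda1    /         ext3    defaults  1 1
LABEL=/boot  /boot     ext3    defaults  1 2
devpts       /dev/pts  devpts  mode=620  0 0
tmpfs        /dev/shm  tmpfs   defaults  0 0
proc         /proc     proc    defaults  0 0
sysfs        /sys      sysfs   defaults  0 0
/dev/sda2    swap      swap    defaults  0 0
"""


class Entry(NamedTuple):
    filesystem: str
    mount_point: str
    fstype: str
    options: List[str]
    dump_flag: int
    fsck_order: int


def parse_table(text):
    """Parse six-field lines, skipping blank lines and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        dump_flag = int(fields[4]) if len(fields) > 4 else 0
        fsck_order = int(fields[5]) if len(fields) > 5 else 0
        entries.append(Entry(fields[0], fields[1], fields[2],
                             fields[3].split(","), dump_flag, fsck_order))
    return entries


def fsck_sequence(entries):
    """Mount points to check, lowest fsck order first; order 0 means never."""
    checked = [e for e in entries if e.fsck_order > 0]
    return [e.mount_point for e in sorted(checked, key=lambda e: e.fsck_order)]


print(fsck_sequence(parse_table(SAMPLE)))
```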
[Figure: volume group VolGroup00 pools the physical extents (PE) of disks
sda and sdb; logical volume LogVol00 is built from those extents.]
Finally, we can mount the logical volume just as we'd mount a partition:
[root@demo ~]# mount /dev/VolGroup01/LogVol00 /data
If you move a disk to a different computer that already has a volume group
with the same name, you may need to use the UUID of the volume groups to
rename one of them. Use vgrename for this.
The End
Thank You!