04-ch4-ELEC462.pptx
(ELEC462)
Focus on File Systems:
Writing pwd
Dukyun Nam
HPC Lab@KNU
Contents
● Introduction
● Understanding Directories
● Writing pwd
Introduction (cont.)
● A hard disk is a stack of metal platters
○ Each platter has a magnetically responsive coating
○ A disk stores files, file info, and dirs in a tree structure
● How does this stack of spinning metal appear to be a tree of files,
properties, and directories?
○ This chapter shows how files are physically organized on a disk,
with some hands-on practice
● Question) What is the internal structure of the file system?
○ Write pwd
Introduction (cont.)
● pwd reports your current location within the directory tree.
● System calls to be studied in this chapter:
○ mkdir, rmdir, chdir, link, unlink, rename, symlink
● Besides, we will learn:
○ How directories are connected,
○ How cat/pwd works, and
○ Mounting file systems
A User’s View of the File System
● Build the tree with this sequence of commands
○ Can you draw a tree of directories?
$ mkdir demodir
$ cd demodir
$ pwd
/home/yourname/demodir
$ mkdir b oops
$ mv b c
$ rmdir oops
$ cd c
$ mkdir d1 d2
$ cd ../..
$ mkdir demodir/a
A User’s View of the File System (cont.)
● Directory Commands:
○ mkdir: creates a directory with a specified name
○ rmdir: removes a directory
○ mv: renames or moves directories (or files)
○ cd: moves from one to another directory
● File Handling Commands:
○ ‘..’ : parent directory.
○ Example:
$ cd ../..
A User’s View of the File System (cont.)
● Create some files in this directory tree
○ Where are files?
A User’s View of the File System (cont.)
● A snapshot of a filesystem
○ x and xlink are called “links”
A User’s View of the File System (cont.)
● Tree commands
○ ls -R: lists the items of an entire tree
■ Directories + Subdirectories
○ chmod -R: changes permission bits of files
■ The “-R” option applies the change “recursively” to all files in subdirectories
○ du: disk usage command
■ It reports the number of disk blocks used by a specified directory
○ find: searches a directory and all its subdirectories for files that match
the description given on the command line
■ e.g., find . -name textbook
A User’s View of the File System (cont.)
● The internal structure of the system imposes no limit on the depth
of a directory tree
○ Directories can be nested to practically any depth, until disk space or inodes run out
● What’s going to happen if you run the following script on Linux?
A User’s View of the File System (cont.)
● Additional commands
○ tree: a recursive directory listing program
■ List contents of directories in a tree-like format
○ find with “2>/dev/null”
■ > file: redirect stdout to file
■ 1> file: redirect stdout to file
■ 2> file: redirect stderr to file
■ /dev/null: the null device; it accepts any input and throws it away
■ examples
● find . -name keyword -print
● find . -name keyword -print 2> /dev/null
Internal Structure of the Unix File System
● Abstraction zero: from platters to partitions
○ A hard disk can contain one or more logical regions called partitions
○ A disk can store a huge amount of data, so it can be divided into partitions
■ Partitions create separate regions within a larger entity for different purposes.
● e.g., on Windows: “C:\”, “D:\”, …
● cf. country vs. provinces (or states) / cities / villages
■ We can treat each partition as a separate (but virtual) disk
Internal Structure of the Unix File System (cont.)
● Abstraction two: from an array of blocks to three regions
○ A file system stores file contents, file properties (owner, date, etc.), and
directories that hold those files
○ Divide the array of blocks into three sections
Internal Structure of the Unix File System (cont.)
● Three regions
○ 1) The Superblock
■ Contains the information about the organization of the file system itself.
● e.g. the size of each area, the location of unused data blocks, …
○ 2) The Inode Table
■ Each file has a set of properties (or, metadata): size, user ID of the owner, and last
modification time
■ Those properties are recorded in a struct called an inode
■ All inodes are the same size, and the inode table is simply an array of those structs
○ 3) The Data Area
■ The actual contents of files are kept in this section
■ All blocks on the disk are the same size
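The properties kept in an inode can be inspected from a program with stat(). The following is a minimal sketch, not code from the lecture; the helper name show_inode_properties is made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: print a few of the properties the kernel keeps
 * in the inode of `path`. Returns 0 on success, -1 if stat() fails. */
int show_inode_properties(const char *path)
{
    struct stat info;

    assert(path != NULL);
    if (stat(path, &info) == -1) {
        perror(path);
        return -1;
    }
    printf("inode number : %llu\n", (unsigned long long)info.st_ino);
    printf("size (bytes) : %lld\n", (long long)info.st_size);
    printf("owner user ID: %ld\n", (long)info.st_uid);
    printf("link count   : %ld\n", (long)info.st_nlink);
    printf("last modified: %s", ctime(&info.st_mtime)); /* ctime() adds '\n' */
    return 0;
}
```

Calling show_inode_properties(".") prints the properties of the current directory, which has an inode like any other file.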
The File System in Practice: Creating a File
● Internal structure of a file
○ Example) $ who > userlist
Four Main Operations in Creating a New File
● 1) Store Properties
○ The file has properties.
○ The kernel locates a free inode.
○ The kernel gets inode number 47.
○ The kernel records information about the file in this inode.
● 2) Store Data
○ The file has contents; this file requires 3 blocks of storage.
○ The kernel locates 3 free blocks: 627, 200, 992.
○ Each chunk of bytes is copied from the kernel buffers to these 3 blocks.
Four Main Operations in Creating a New File (cont.)
● 3) Record Allocation
○ The contents of this file are in blocks 627, 200, and 992, in that order
○ The kernel records that sequence of block numbers in the disk allocation
section of the inode
○ The disk allocation section is an array of block numbers
● 4) Add Filename to Directory
○ The kernel adds the entry (47, userlist) to the directory
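The four operations can be mimicked with a toy, in-memory model. This is only a sketch under simplifying assumptions: all names (toy_fs, toy_create, toy_lookup) and sizes are invented, there is a single directory, and the allocator simply takes the first free inode and blocks, so it will not reproduce the numbers 47, 627, 200, and 992 used above.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE      512   /* toy value, not a real disk's block size */
#define NUM_BLOCKS      64
#define NUM_INODES      16
#define BLOCKS_PER_FILE 8
#define NAME_LEN        32

struct toy_inode {                  /* one slot of the inode table */
    int  in_use;
    long size;
    int  nblocks;
    int  blocks[BLOCKS_PER_FILE];   /* disk allocation list, in order */
};

struct toy_entry {                  /* one directory entry */
    int  inode_num;
    char name[NAME_LEN];
};

struct toy_fs {
    struct toy_inode inodes[NUM_INODES];            /* inode table */
    char             data[NUM_BLOCKS][BLOCK_SIZE];  /* data area */
    int              block_used[NUM_BLOCKS];        /* free-block bookkeeping */
    struct toy_entry dir[NUM_INODES];               /* the only directory */
    int              nentries;
};

void toy_init(struct toy_fs *fs) { memset(fs, 0, sizeof *fs); }

static int free_blocks(const struct toy_fs *fs)
{
    int i, n = 0;
    for (i = 0; i < NUM_BLOCKS; i++)
        if (!fs->block_used[i]) n++;
    return n;
}

/* Create a file; returns its inode number, or -1 if it does not fit. */
int toy_create(struct toy_fs *fs, const char *name, const char *buf, long len)
{
    int needed, ino, b, i;
    struct toy_inode *ip;

    assert(fs != NULL && name != NULL && buf != NULL);
    needed = (int)((len + BLOCK_SIZE - 1) / BLOCK_SIZE);
    if (len < 0 || needed > BLOCKS_PER_FILE || needed > free_blocks(fs) ||
        strlen(name) >= NAME_LEN || fs->nentries >= NUM_INODES)
        return -1;

    /* 1) Store properties: locate a free inode and record the file info. */
    for (ino = 0; ino < NUM_INODES && fs->inodes[ino].in_use; ino++)
        ;
    if (ino == NUM_INODES) return -1;
    ip = &fs->inodes[ino];
    ip->in_use = 1;
    ip->size = len;
    ip->nblocks = 0;

    for (i = 0, b = 0; i < needed; i++) {
        long chunk = len - (long)i * BLOCK_SIZE;
        if (chunk > BLOCK_SIZE) chunk = BLOCK_SIZE;
        while (fs->block_used[b]) b++;       /* 2) locate a free block ... */
        fs->block_used[b] = 1;
        memcpy(fs->data[b], buf + (long)i * BLOCK_SIZE, (size_t)chunk); /* ... and copy data */
        ip->blocks[ip->nblocks++] = b;       /* 3) record the allocation in the inode */
    }

    /* 4) Add the (inode number, filename) entry to the directory. */
    fs->dir[fs->nentries].inode_num = ino;
    strcpy(fs->dir[fs->nentries].name, name);
    fs->nentries++;
    return ino;
}

/* Look a name up in the directory; returns its inode number or -1. */
int toy_lookup(const struct toy_fs *fs, const char *name)
{
    int i;
    for (i = 0; i < fs->nentries; i++)
        if (strcmp(fs->dir[i].name, name) == 0)
            return fs->dir[i].inode_num;
    return -1;
}
```

With 512-byte blocks, a 1200-byte file needs three blocks, so its inode ends up with a three-element allocation list, mirroring the userlist example.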
The File System in Practice: How Directories Work
● A directory is a “special kind of file,” containing a list of names of
files
Inodes and Big Files
● Recording data block allocation for a file with 14 blocks
Understanding Directories
● Internally, a directory is a file that contains a list of pairs:
○ Filename and inode number
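The pairs stored in a directory can be read with opendir() and readdir(). A minimal sketch follows; the function name list_dir_pairs is an assumption made for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: print every (inode number, filename) pair stored
 * in directory `dirname`. Returns the number of entries, or -1 on error. */
int list_dir_pairs(const char *dirname)
{
    DIR *dir;
    struct dirent *entry;
    int count = 0;

    assert(dirname != NULL);
    dir = opendir(dirname);
    if (dir == NULL) {
        perror(dirname);
        return -1;
    }
    while ((entry = readdir(dir)) != NULL) {
        printf("%10llu  %s\n",
               (unsigned long long)entry->d_ino, entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

The output resembles that of ls -ia: one line per link, including the entries "." and "..".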
Understanding Directories (cont.)
● Diagram with most of the inode numbers filled in
Understanding Directories (cont.)
● The real meaning of “A file is in a directory.”
○ “File x is in directory a” means there is a ‘link’ to inode 402 in the directory called a.
○ The filename attached to that link is x.
○ The directory d1 also contains a link to inode 402; that link is named xlink.
○ These two links to inode 402 (demodir/a/x and demodir/c/d1/xlink) refer to
the same file.
● In short, directories contain “references” to files.
○ Each of these references is called a “link.”
○ The contents of the file are in “data blocks.”
○ The properties of the file are recorded in a “struct in the inode table.”
○ The inode number and a (link) name are stored in a directory.
Understanding Directories (cont.)
● The real meaning of “A directory contains a subdirectory.”
○ e.g., a (inode 277) contained in demodir
■ The kernel installs in every directory an entry for its own inode, called “.”.
Commands and System Calls for Directory Trees
(cont.)
● rmdir: the command to delete a
directory
○ Uses rmdir()
○ rmdir(): removes a directory node
from a directory tree
■ The directory must be EMPTY.
■ If the directory itself is not used by any
other process, then the inode and data
are freed.
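The "must be EMPTY" rule is easy to observe. The sketch below is illustrative only: rmdir_demo is a made-up name, and it assumes a writable /tmp for its scratch directory.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if rmdir() behaved as described above. */
int rmdir_demo(void)
{
    char top[] = "/tmp/rmdir_demoXXXXXX";   /* assumes /tmp is writable */
    char oops[64], note[64];
    int fd, result;

    if (mkdtemp(top) == NULL) return -1;
    snprintf(oops, sizeof oops, "%s/oops", top);
    snprintf(note, sizeof note, "%s/oops/note", top);
    if (mkdir(oops, 0755) == -1) return -1;
    fd = open(note, O_CREAT | O_WRONLY, 0644);
    if (fd == -1) return -1;
    close(fd);

    /* The directory still holds an entry, so rmdir() must refuse. */
    result = rmdir(oops);
    assert(result == -1 && (errno == ENOTEMPTY || errno == EEXIST));

    /* Once the directory is empty, removal succeeds. */
    if (unlink(note) == -1) return -1;
    result = rmdir(oops);
    assert(result == 0);

    return rmdir(top);
}
```

POSIX allows either ENOTEMPTY or EEXIST for the failing call, which is why the check accepts both.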
Commands and System Calls for Directory Trees
(cont.)
● rm: the command to remove entries
from a directory
○ Uses unlink()
○ unlink(): deletes a directory entry
■ Decrements the link count for the corresponding
inode
■ If the link count for the inode becomes zero, the
data blocks and inode are freed
■ unlink may not be used to unlink directories
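The effect of unlink() on the link count can be watched through st_nlink. This is a sketch, not the lecture's code: unlink_demo and link_count are invented names, /tmp is assumed writable, and the last step relies on the usual Unix behavior that the data stays readable while a process still holds the file open.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Link count recorded in the inode that `path` names, or -1 on error. */
static long link_count(const char *path)
{
    struct stat info;
    return stat(path, &info) == -1 ? -1 : (long)info.st_nlink;
}

/* Hypothetical demo: returns 0 if unlink() behaved as described above. */
int unlink_demo(void)
{
    char top[] = "/tmp/unlink_demoXXXXXX";  /* assumes /tmp is writable */
    char x[64], y[64], buf[5];
    int fd, result;

    if (mkdtemp(top) == NULL) return -1;
    snprintf(x, sizeof x, "%s/x", top);
    snprintf(y, sizeof y, "%s/y", top);
    fd = open(x, O_CREAT | O_RDWR, 0644);
    if (fd == -1 || write(fd, "hello", 5) != 5) return -1;
    if (link(x, y) == -1) return -1;
    assert(link_count(y) == 2);             /* two names, one inode */

    result = unlink(top);                   /* directories are refused */
    assert(result == -1);                   /* EISDIR on Linux, EPERM in POSIX */

    if (unlink(x) == -1) return -1;
    assert(link_count(y) == 1);             /* count decremented: 2 -> 1 */
    if (unlink(y) == -1) return -1;         /* last name gone: 1 -> 0 */

    /* The file is still open here, so its data has not been freed yet. */
    if (lseek(fd, 0, SEEK_SET) == -1 || read(fd, buf, 5) != 5) return -1;
    assert(memcmp(buf, "hello", 5) == 0);

    close(fd);                              /* now inode and blocks can go */
    return rmdir(top);
}
```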
Commands and System Calls for Directory Trees
(cont.)
● ln: the command to create a link
to a file
○ Uses link()
○ link(): makes a new link to an
inode
■ The new link contains the inode
number of the original link and has the
name specified
■ If there is already a link with the new
name, link will fail
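Both properties of link() can be checked with stat(): the two names share one inode, and a second attempt with the same new name fails. A minimal sketch, with the made-up name link_demo and a scratch directory under /tmp (assumed writable):

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if link() behaved as described above. */
int link_demo(void)
{
    char top[] = "/tmp/link_demoXXXXXX";    /* assumes /tmp is writable */
    char x[64], xlink[64];
    struct stat a, b;
    int fd, result;

    if (mkdtemp(top) == NULL) return -1;
    snprintf(x, sizeof x, "%s/x", top);
    snprintf(xlink, sizeof xlink, "%s/xlink", top);
    fd = open(x, O_CREAT | O_WRONLY, 0644);
    if (fd == -1) return -1;
    close(fd);

    /* The new link carries the inode number of the original link. */
    if (link(x, xlink) == -1) return -1;
    if (stat(x, &a) == -1 || stat(xlink, &b) == -1) return -1;
    assert(a.st_ino == b.st_ino && a.st_dev == b.st_dev);
    assert(a.st_nlink == 2);

    /* A link named xlink already exists, so this call must fail. */
    result = link(x, xlink);
    assert(result == -1 && errno == EEXIST);

    unlink(x);
    unlink(xlink);
    return rmdir(top);
}
```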
Commands and System Calls for Directory Trees
(cont.)
● mv: the command to change the
name or location of a file or
directory
○ Uses rename()
○ The basic logic of rename:
■ Copy the original link to new name and/or
location
■ Then delete original link
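The basic logic can be written out with link() and unlink(). This sketch illustrates the two steps only; it is not how the kernel implements rename(), which does the job in one call and, unlike link(), also handles directories. The names rename_by_links and rename_demo are invented, and /tmp is assumed writable.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* The two-step logic for a regular file. Returns 0 on success, -1 on error. */
int rename_by_links(const char *from, const char *to)
{
    if (link(from, to) == -1) return -1;    /* copy the link to the new name */
    if (unlink(from) == -1) return -1;      /* then delete the original link */
    return 0;
}

/* Hypothetical demo: the inode stays the same, only the name changes. */
int rename_demo(void)
{
    char top[] = "/tmp/rename_demoXXXXXX";  /* assumes /tmp is writable */
    char b[64], c[64], d[64];
    struct stat before, after;
    int fd;

    if (mkdtemp(top) == NULL) return -1;
    snprintf(b, sizeof b, "%s/b", top);
    snprintf(c, sizeof c, "%s/c", top);
    snprintf(d, sizeof d, "%s/d", top);
    fd = open(b, O_CREAT | O_WRONLY, 0644);
    if (fd == -1) return -1;
    close(fd);
    if (stat(b, &before) == -1) return -1;

    if (rename_by_links(b, c) == -1) return -1;
    if (stat(c, &after) == -1) return -1;
    assert(after.st_ino == before.st_ino);  /* same file, new name */
    assert(access(b, F_OK) == -1);          /* old name is gone */

    if (rename(c, d) == -1) return -1;      /* the real system call */
    if (stat(d, &after) == -1) return -1;
    assert(after.st_ino == before.st_ino);

    unlink(d);
    return rmdir(top);
}
```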
Commands and System Calls for Directory Trees
(cont.)
● How rename works, why rename exists
Commands and System Calls for Directory Trees
(cont.)
● Advantages of having rename()
○ Makes it possible to rename or relocate directories “safely.”
■ In the old days, regular users were not allowed to link and unlink directories, so they had
no way of renaming directories
Commands and System Calls for Directory Trees
(cont.)
● cd: the command to change the current directory of a process
○ Uses chdir()
○ Internally, the process keeps a variable that stores the inode number of the current directory
○ When you “change into a new directory,” you just change the value of that variable
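One way to see this from user space: after chdir(path), the entry "." refers to the same inode that path referred to. The sketch below uses an invented helper name, chdir_matches_inode, and restores the original directory with fchdir() before returning.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: change into `path` and compare the inode of "."
 * with the inode of `path`. Returns 1 if they match, 0 if not, -1 on error.
 * The original working directory is restored before returning. */
int chdir_matches_inode(const char *path)
{
    struct stat target, dot;
    int saved, same = -1;

    assert(path != NULL);
    if (stat(path, &target) == -1) return -1;
    saved = open(".", O_RDONLY);            /* remember where we started */
    if (saved == -1) return -1;

    if (chdir(path) == 0) {
        if (stat(".", &dot) == 0)
            same = (dot.st_ino == target.st_ino &&
                    dot.st_dev == target.st_dev);
        if (fchdir(saved) == -1)            /* go back */
            same = -1;
    }
    close(saved);
    return same;
}
```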
Understanding pwd
● Type pwd
○ What can you see?
○ Let’s first get into the ‘d2’ directory; then do pwd
How pwd Works
● Follow the links and read the directories
How pwd Works (cont.)
● The algorithm is a repetition of these three steps:
○ Step 1: Find the inode number for “.”, call it n
■ use stat()
○ Step 2: Do “chdir ..”
■ use chdir()
○ Step 3: Find the name for the link with inode n
■ use opendir(), readdir(), closedir()
○ Repeat until you reach the top of the tree
● Q1) How do we know when we reach the top of the tree?
● Q2) How do we print the directory names in the correct order?
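The three steps, together with one possible answer to Q1 and Q2, can be sketched as follows. This is not the spwd.c from the lecture: simple_pwd and name_of are made-up names, the path is built in a buffer instead of being printed, and the starting directory is restored at the end. It also departs from the outline in one respect: instead of comparing inode numbers alone, it calls lstat() on each entry and compares device numbers too, an assumption made so that the walk keeps working across mount points.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define PATH_BUF 4096   /* assumed large enough for this sketch */

/* Step 3: search the current directory for the link to `target`. */
static int name_of(const struct stat *target, char *name, size_t size)
{
    DIR *dir = opendir(".");
    struct dirent *entry;
    struct stat info;
    int found = -1;

    if (dir == NULL) return -1;
    while (found == -1 && (entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        if (lstat(entry->d_name, &info) == 0 &&
            info.st_ino == target->st_ino && info.st_dev == target->st_dev &&
            strlen(entry->d_name) < size) {
            strcpy(name, entry->d_name);
            found = 0;
        }
    }
    closedir(dir);
    return found;
}

/* Write the absolute path of the current directory into `path`.
 * Returns 0 on success, -1 on error. */
int simple_pwd(char *path, size_t size)
{
    char name[256], tmp[PATH_BUF];
    struct stat here, parent;
    int start, status = -1;

    assert(path != NULL && size >= 2);
    start = open(".", O_RDONLY);            /* so we can come back later */
    if (start == -1) return -1;
    path[0] = '\0';
    for (;;) {
        /* Step 1: find the inode of "." */
        if (stat(".", &here) == -1 || stat("..", &parent) == -1) break;
        /* Q1: at the top of the tree, "." and ".." are the same inode */
        if (here.st_ino == parent.st_ino && here.st_dev == parent.st_dev) {
            if (path[0] == '\0') snprintf(path, size, "/");
            status = 0;
            break;
        }
        if (chdir("..") == -1) break;                          /* Step 2 */
        if (name_of(&here, name, sizeof name) == -1) break;    /* Step 3 */
        /* Q2: prepend each name, so the root-most name ends up first */
        if (snprintf(tmp, sizeof tmp, "/%s%s", name, path) >= (int)sizeof tmp)
            break;
        if (strlen(tmp) >= size) break;
        strcpy(path, tmp);
    }
    if (fchdir(start) == -1) status = -1;
    close(start);
    return status;
}
```

Prepending to a buffer is one way to get the order right; a recursive version that prints each name after the recursive call returns would work equally well.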
Writing pwd
● Simplified version (spwd.c)
Writing pwd (cont.)
Writing pwd (cont.)
Multiple File Systems: A Tree of Trees
● Each partition has its own file system tree
○ Another file system can be attached to a subdirectory of
the root file system
○ The kernel associates a directory in disk 1’s file system (the one holding the root)
with a pointer to the root of disk 2’s file system
○ Of course, we can detach (unmount)
Partition B from Partition A
■ e.g., external HDD / USB
Mount Points
Mount (cont.)
● Single directory hierarchy and mount points
○ On Linux, as on other UNIX systems, all files from all file systems reside
under a single directory tree
■ At the base of this tree is the root
directory, / (slash)
■ Other file systems are mounted
under the root directory and appear
as subtrees within the overall hierarchy
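A common heuristic for spotting a mount point is to compare the device number of a directory with that of its parent. The sketch below is an assumption-laden illustration, not part of the lecture: is_mount_point is a made-up name, and the test cannot detect a bind mount of a directory from the same file system.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if `dirpath` looks like a mount point,
 * 0 if it does not, and -1 on error. */
int is_mount_point(const char *dirpath)
{
    char parent_path[4096];
    struct stat dir, parent;

    assert(dirpath != NULL);
    if (snprintf(parent_path, sizeof parent_path, "%s/..", dirpath)
            >= (int)sizeof parent_path)
        return -1;
    if (lstat(dirpath, &dir) == -1) return -1;
    if (!S_ISDIR(dir.st_mode)) return 0;    /* only directories qualify */
    if (stat(parent_path, &parent) == -1) return -1;

    if (dir.st_dev != parent.st_dev)        /* parent is on another file system */
        return 1;
    if (dir.st_ino == parent.st_ino)        /* "." equals "..": the root itself */
        return 1;
    return 0;
}
```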
Duplicate Inode Numbers and Cross-Device Links
(cont.)
● Hard links (that we’ve discussed so far)
○ Pointers that connect directories into a tree
○ Pointers that link filenames to the files themselves
■ Cannot point to inodes in other file systems
■ Note that even root cannot make hard links to directories in other file systems.
■ Inode numbers are unique only within one file system, so a hard link in one file
system can never refer to a file in another file system, in either direction.
● Solution
○ Let’s use another type of link supported by Unix/Linux, the soft (symbolic) link
○ That way, from the current file system we can reach a file in another file system,
even when inode numbers are duplicated across the two
Symbolic Links
● “Symbolic” (soft) links: “ln -s”
○ Refers to a file by “name” not by “inode” number
○ Similar to a shortcut in that it is a path name contained in a file
● This symbolically linked file (users) behaves like the original file (whoson)
○ It’s not the original file, though!
Symbolic Links (cont.)
● Symbolic links: created and read with the symlink and readlink system calls
○ May “span” file systems, as they don’t store the inode of the original file; they just
keep a reference to the original file by name
○ May “point to” directories (across different file systems)
○ Still have problems; under the following conditions, the symbolic link breaks or
refers to the wrong file:
■ If the file system containing the original file (pointed to by a symbolic link) is removed (or
unmounted)
■ If the original file name is changed
■ If a different file with that name is installed
○ But it’s OK, though, as
■ Soft links can be checked for lost references or infinite loops in which the links point to
parent directories.
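The "by name, not by inode" behavior can be observed with symlink(), readlink(), lstat(), and stat(). The following sketch reuses the names users and whoson from the earlier example; symlink_demo itself is an invented name, and /tmp is assumed writable.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the symbolic link behaved as described. */
int symlink_demo(void)
{
    char top[] = "/tmp/symlink_demoXXXXXX"; /* assumes /tmp is writable */
    char whoson[64], users[64], moved[64], target[64];
    struct stat info;
    ssize_t len;
    int fd, result;

    if (mkdtemp(top) == NULL) return -1;
    snprintf(whoson, sizeof whoson, "%s/whoson", top);
    snprintf(users, sizeof users, "%s/users", top);
    snprintf(moved, sizeof moved, "%s/moved", top);
    fd = open(whoson, O_CREAT | O_WRONLY, 0644);
    if (fd == -1) return -1;
    close(fd);

    /* The link stores the name "whoson" as text, not an inode number. */
    if (symlink("whoson", users) == -1) return -1;
    if (lstat(users, &info) == -1) return -1;   /* lstat: the link itself */
    assert(S_ISLNK(info.st_mode));
    len = readlink(users, target, sizeof target - 1);
    if (len == -1) return -1;
    target[len] = '\0';                         /* readlink adds no '\0' */
    assert(strcmp(target, "whoson") == 0);

    /* stat() follows the link and reaches the original file. */
    result = stat(users, &info);
    assert(result == 0 && S_ISREG(info.st_mode));

    /* Change the original file name: the link now dangles. */
    if (rename(whoson, moved) == -1) return -1;
    result = stat(users, &info);
    assert(result == -1 && errno == ENOENT);

    unlink(users);
    unlink(moved);
    return rmdir(top);
}
```

A failing stat() combined with a succeeding lstat() is a simple way to detect such a lost reference.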
Summary