
CSI3131 Mod 9 File Sys

Operating System

Uploaded by

John

Module 9: File-Systems

Reading: Chapter 10 and 11

Objectives
 File System Interface (Chap 10)
 Explain the function of file systems.
 Describe the interfaces to file systems.
 Discuss file-system design tradeoffs, including access methods, file
sharing, file locking, and directory structures.
 Explore file system protection.
 File System Implementation (Chap 11)
 Describe the details of implementing local file systems and directory
structures.
 Describe the implementation of remote file systems.
 Discuss block allocation and free-block algorithms and trade-offs.
1
[Figure: module outline. File Systems: Files (Structure, Operations, Types, Access Methods: Sequential, Direct, Indexed), Directories, Implementation.]
2
File-System Interface
What is a file system?
 Data organized in files
 Files organized in directories, plus all the associated machinery

What are the main File-System concepts?


 Files
 Directories

What do we expect from a File-System?


 Efficient and convenient access methods
 Convenient way to organize data - Directory structure
 Transparent handling of several devices - File-System
Mounting
 File Sharing
 File Protection
3
File Concept
What is a file?
 Named collection of related information
 Can be seen as collection of records
 Basic unit for storing data

What is the structure of a file?


 Simple structure - sequence of words, bytes
 Simple record structure
 Lines
• Fixed length
• Variable length
 Complex Structures
 Formatted document
 Relocatable load file (executable)
4
File Structure
At the lowest level, a file is simply a collection of bytes/words.
Interpretation of the file’s contents is up to the
 Operating system
 Program accessing the file
 Combination – some OS support different file types

Unix:
 Treats files as a sequence of bytes, it is up to the
user/program to interpret file’s content
Windows:
 Recognizes several file types (using the file extension) and
automatically launches/suggests the program that knows
how to interpret the files of this type
Databases:
 Have specific needs, usually create and interpret their own database files (e.g. indexed files)

5
File Attributes
What needs to be remembered about each file?
 Name – only information kept in human-readable form
 Identifier – unique tag (number) identifies file within file
system
 Type – needed for systems that support different types
 Location – pointer to file location on device
 Size – current file size
 Protection – controls who can do reading, writing, executing
 Time, date, and user identification – data for protection,
security, and usage monitoring
Where is all this information kept?
 in the directory structure, we will talk about that later

6
File Operations
File can be seen as an abstract data type supporting operations:
 Basic Operations
 Create
 Write: Pointer for write position
 Read: Read pointer
 Reposition within file (seek)
 Truncate (clear): sets file size to zero, but maintains all
attributes
 Delete: Releases space

 Other Operations
 Rename,
 Copy
 Change attributes: owner, permissions, etc.
 …

7
File Operations
Do we need to specify with each file operation the file
name and a location within the file (i.e. for read or
write)?
 No
 File Descriptor: we ask the file system to open the
file and get a file descriptor (file handle)
 The file operations (read/write/seek) then use this
handle to identify the file
 File Pointer: there is also a file pointer maintained
that points to the current location within the file, to
be used for the next read or write
 Why?
 Would be inefficient to do otherwise
8
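The descriptor and file-pointer mechanism from this slide can be sketched with Python's `os` module, which wraps the POSIX-style `open`/`read`/`lseek`/`close` calls. This is only an illustration, not part of the original slides; the file name `demo.txt` and the function name are made up for the example.

```python
import os
import tempfile


def demo_descriptor_and_pointer():
    """Open a file once, then read and seek using only the descriptor."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"abcdefghij")

        fd = os.open(path, os.O_RDONLY)   # the directory search happens once, here
        try:
            first = os.read(fd, 3)        # reads 3 bytes; the file pointer moves to 3
            second = os.read(fd, 3)       # continues from the file pointer
            os.lseek(fd, 0, os.SEEK_SET)  # reposition (seek) back to the start
            again = os.read(fd, 3)        # same bytes as the first read
        finally:
            os.close(fd)                  # release the open-file structures
        return first, second, again
```

Note that neither the file name nor an offset is passed to `os.read`: the descriptor identifies the file and the file pointer supplies the position.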
Open Files
 open(fileName, mode)
 Searches the directory structure to find the record
containing the file attributes for file fileName, then loads
this record into memory and returns a descriptor that
references it
 Mode specifies additional information, i.e. open for read
only, …
 close(file_descriptor)
 Clears the structures kept about this open file
 What needs to be in the structure for each open file?
 File pointer: pointer to the location of the next read/write
• Per file or per process? Per process
 Location on the disc (to know where to physically
read/write the data)
 File-open count
• Per file or per process? Per file
 Access rights
• Per file or per process? Both
9
Open Files
 What needs to be in the structure for each open file?
 File pointer
 Location on the disc
 Access rights
 Can a file be opened by several processes
simultaneously?
 Yes. Do we maintain a separate in-memory file attribute record for each process that opens the file?
• Yes – for the file pointer
• No – for the location on the disc
 If several processes open the same file, and then some of them close it, how do we know when we can release the record from memory?
• Keep a count of the processes that have the file open
 Locks for controlling concurrent access to the file
10
Unix Kernel I/O Structures

11
Open File Locking
 Provided by some operating systems and
file systems
 Mediates access to a file
 Mandatory or advisory:
 Mandatory – access is denied depending on
locks held and requested
 Advisory – processes can find status of
locks and decide what to do
 Reader-Writer Locks
 Recall synchronization concepts

12
File Types

 Certain operating systems use the extension of the file to identify its type.
 Microsoft: an executable file must have the extension .EXE, .COM, or .BAT (otherwise the OS refuses execution).
 The type is not defined for certain OS’s
 UNIX: the extension is used (and recognized) by applications only.
 For other OS’s, the type is an attribute
 MAC-OS: the file has an attribute that contains the name of the program that generated it (ex: a WordPerfect document)
13
File Types – Name, Extension

14
File Access Methods
 The access method is determined by the logical structure
 OS on « mainframes » usually support many access methods
(one per file type)
 Because they support many file types (e.g. indexed)
 But many modern OS (Unix, Linux, MS-DOS…) basically
support very few access methods (ex: sequential and direct)
since all files are of the same type (ex: byte streams)
 Ex: Data Base Management Systems (DBMS) almost
always support indexed access of files even when the
OS does not support this.
 In that case the DBMS uses the basic random-access
support of the OS to provide indexed file access to the
DBMS user
 But the same DBMS on a mainframe OS would directly use the indexed file access method provided by the OS
15
Sequential Access
 The most common method and the simplest method
 Records can be accessed one after the other in their sequential
order of storage
 Based on the tape model
 The operation read_next() reads the next record and advances the pointer to the next record
 Operation write_next() appends a record at the end
 We may also come back to the beginning (rewind)
 It is sometimes possible to jump n records (backward or forward)

16
Direct Access
 Based on disk model containing blocks of data
 A direct access file consists of a set of logical blocks
 Their size is the same as those of physical blocks
 They are numbered from 0 to k (for a file of k+1 blocks)
• But logical blocks are positioned on arbitrarily-chosen physical
blocks on disks according to the file allocation method used (see
later).
 Each logical block can be directly (and independently)
accessed
 Each logical block consists of R records of the same size
 Hence, we may have internal fragmentation
 The first logical block contains the first R records. The next group of R records is in the second logical block, and so on…
 Access to record number N (counting from 0) is done by first extracting logical block number ⌊N/R⌋ from secondary memory and bringing it into main memory
 This is the logical block containing the desired record
17
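The record-to-block arithmetic from this slide can be written out as a small helper. This is a sketch that assumes records and logical blocks are both numbered from 0; the function name `locate_record` is made up for the illustration.

```python
def locate_record(n, records_per_block):
    """Return (logical block number, position within that block) for record n.

    Assumes records and blocks are numbered from 0 and that every
    logical block holds exactly records_per_block records.
    """
    block = n // records_per_block          # integer division: floor(N / R)
    position_in_block = n % records_per_block
    return block, position_in_block
```

For example, with R = 4 records per block, record 10 lives in logical block 2, at position 2 inside that block.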
Direct and Sequential Access

 Not all OS’s offer both sequential and direct access
 Easy to simulate sequential access with
direct access
• Maintain a pointer cp that indicates the
current position in a file.
 The reverse is very difficult and inefficient.

18
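Simulating sequential access on top of direct access only needs the current-position pointer cp mentioned on this slide. The sketch below is an assumption-laden illustration: a Python list stands in for a direct-access file, and the class name `SequentialReader` is invented for the example.

```python
class SequentialReader:
    """Sequential access simulated on top of a direct-access file.

    blocks is any indexable sequence standing in for a direct-access
    file: blocks[i] is a direct read of logical block i.
    """

    def __init__(self, blocks):
        self.blocks = blocks
        self.cp = 0  # current position, as on the slide

    def read_next(self):
        """Direct read at cp, then advance cp; None at end of file."""
        if self.cp >= len(self.blocks):
            return None
        data = self.blocks[self.cp]
        self.cp += 1
        return data

    def rewind(self):
        """Go back to the beginning of the file."""
        self.cp = 0
```

The reverse direction is harder because a purely sequential device would have to read through every earlier block to reach block i.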
Indexed Files
 It is a direct access method performed by using an index
 Heavily used by DBMS
 We need to use two files (per data file):
 A relative file : a direct access file containing the data (in
logical blocks)
 An index file : containing the indexes
 An index consists of a key and a pointer
 The pointer has for value the logical block number
containing the record identified by the key
 The key consists of one of the fields in a record of the relative file. Its value provides a unique identifier for each record.
• It is not permitted to have two different records of the same
file having the same key value
• Example of key: social insurance number
19
Indexed Files (cont.)
[Figure: an index file entry (Social Security 222 333 444, logical record number) points to the record in the relative file (Social Security 222 333 444, Last Name Smith, First Name Bob, Age 45).]

 The index file is an ordered list (by key value)


 We simply perform a binary search whenever the index file can
fit entirely into main memory
 If not, then we can create an index for the index file!
• The primary index file is first consulted to find the logical block
of the secondary index file containing the desired key value
• That logical block is then searched to find the logical block
containing the desired record
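The binary search over an in-memory index can be sketched as follows. This is an illustration only: the index is modelled as a Python list of (key, logical block number) pairs sorted by key, and the function name and sample keys are hypothetical.

```python
import bisect


def lookup_block(index, key):
    """Binary-search a sorted index for key.

    index is a list of (key, logical_block_number) pairs ordered by key.
    Returns the logical block number, or None if the key is absent.
    """
    keys = [k for k, _ in index]
    i = bisect.bisect_left(keys, key)   # leftmost position where key could be
    if i < len(keys) and keys[i] == key:
        return index[i][1]
    return None
```

When the index does not fit in memory, the same search can be applied first to a primary index whose pointers lead to blocks of the secondary index, as the slide describes.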
20
Indexed Files (continued)

An index can also lead to a point close to the
desired record.
 The keys in the index file contain references (pointers) to certain important points in the relative file (e.g., beginning of names that start with the letter S, beginning of the Smiths, beginning of serial numbers that start with 8)
 The index file provides the means to access rapidly
a point in the relative file; the search then continues
from that point.
 Such a file could also index another index file, i.e.
serves as the primary file described in the previous
slide.

21
Example of Indexing

Points to the start of the Smiths (there are many)


The relative file must support both direct access and sequential access.
22
[Figure: module outline, now at Directories: Organization, Operations, Directory Tree, Mounting, Protection.]
23
Directory Structure
 A collection of nodes containing information about all files

[Figure: a directory whose nodes reference the files F1, F2, F3, F4, …, Fn.]

Both the directory structure and the files reside on disk

24
Typical Organization of a File-System

25
Information in a Directory

 File names
 Type
 Location on disk (or other device)
 Current length
 Maximum length
 Date last accessed
 Date last modified
 Owner
 Protection
26
Operations Performed on Directory

What are the operations performed on a directory?
 Search for a file
 Create a file
 Delete a file
 Rename a file
 List a directory
 Create/Delete/Rename directory
 Traverse the file system
27
Why do we use directories?
 Efficiency – locating a file quickly
 Naming – convenient to users
 Two users can have same name for
different files
 The same file can have several different
names
 Grouping – logical grouping of files by
properties, (e.g., all Java programs, all
games, …)

Each directory holds directory entries for the files it contains
 We will discuss this in detail when we talk about file system implementations
28


Single-Level Directory
 A single directory for all users

Any problems?
With many files it becomes a total mess
Naming conflicts between users

Primitive and not practical
29
Two-Level Directory
 Separate directory for each user

 Path name
 Can have the same file name for different users
 Efficient searching
 No grouping capability

30
Tree-Structured Directories

31
Tree-Structured Directories (Cont)

 Efficient searching

 Grouping Capability

 Useful concept: current/working directory


 cd /spell/mail/prog
 type list

32
Tree-Structured Directories (Cont)
 Absolute or relative path name
 Creating a new file is done in current directory
 Delete a file
rm <file-name>
 Creating a new subdirectory is done in current
directory
mkdir <dir-name>
Example: if in current directory /mail
mkdir count
[Figure: directory mail now contains prog, copy, prt, exp, count.]
Deleting “mail” ⇒ deleting the entire subtree rooted at “mail”


33
Acyclic-Graph Directories
 Idea: Have shared subdirectories and files

34
Acyclic-Graph Directories (Cont.)
 How do we achieve sharing?
 Have a special directory entry, not containing the file
attributes, but containing the name of another file
• Called symbolic (soft) link
 Having two different directory entries that reference the
same file attribute information (i.e. the file control block –
FCB)
• Called hard link (FCB contains a count of links to the file)

 What happens when a file being referenced by a symbolic link is deleted?
 Dangling pointer
 What happens when a file referenced by a hard link is deleted?
 Only that directory entry is removed and the link count is decremented; the file can still be used via the remaining hard links
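The difference between the two kinds of links can be observed on a POSIX-like system with Python's `os.link` and `os.symlink`. This is a sketch under that assumption (symbolic links may need extra privileges on other systems); the file names and the function name are invented for the example.

```python
import os
import tempfile


def demo_links():
    """Create a hard link and a symbolic link, then delete the original name."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "data.txt")   # hypothetical names
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")
        with open(target, "w") as f:
            f.write("hello")

        os.link(target, hard)      # second directory entry for the same file
        os.symlink(target, soft)   # entry that stores the name of another file
        links_before = os.stat(target).st_nlink  # link count kept with the file

        os.remove(target)          # removes one directory entry only

        with open(hard) as f:      # the file is still reachable via the hard link
            hard_content = f.read()
        # The symbolic link still exists but its target name no longer resolves.
        soft_dangling = os.path.islink(soft) and not os.path.exists(soft)
        return links_before, hard_content, soft_dangling
```

After the original name is removed, the data survives through the hard link while the symbolic link is left dangling.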

35
General Graph Directory

36
General Graph Directory (Cont.)
 Adding links to subdirectory leads to many complications,
we really want to avoid it
 The (easily searched) tree structure of the directory is destroyed
• When searching the directory, can search sub-directories
multiple times.
 Cycles in the graph can occur.
• When searching the directory, can end up with looping
(infinite searching).
 When a link to a directory is removed, how do we release the files contained in the subdirectory (the link count may still be nonzero)?
 So how do we avoid cycles, or cope with them?
 Allow only links to files, not subdirectories
 Every time a new link to a subdirectory is added use a
cycle detection algorithm to determine whether it is OK
• Complex and costly.
 Allow only symbolic links to subdirectories and do not
follow symbolic links
 Use garbage collection to clean up the directory when a
subdirectory is deleted (this is expensive).
37
Combining Several File Systems
File system
 Directory tree residing on the particular
device/partition
Why combining?
 Several HD partitions, floppy/ZIP disks, CDROM,
network disks
 Uniform view/access to them
How?
 Mount a file system into particular node of the
directory tree
 Windows: 2 level system – automatically mounts
into drive letters
 Unix: explicit mount operation, can mount
anywhere
38
File System Mounting

 A file system must be mounted before it can be accessed
 An unmounted file system (e.g. Fig. 11-11(b)) is mounted at a mount point

39
(a) Existing. (b) Unmounted Partition

40
Mount Point

41
Protection
 File owner/creator should be able to control:
 what can be done to the file
 by whom

 Types of access
 Read
 Write
 Execute
 Append
 Delete
 List (list names and attributes in a subdirectory)

42
Access Lists and Groups - UNIX
 Mode of access: read, write, execute
 Three classes of users
                      RWX
a) owner access   7 ⇒ 111
b) group access   6 ⇒ 110
c) public access  1 ⇒ 001
 Ask manager to create a group (unique name), say G, and add
some users to the group.
 For a particular file (say game) or subdirectory, define an
appropriate access.
owner group public

chmod 761 game

Attach a group to a file


chgrp G game
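The octal digits given to chmod map to the RWX bits one class at a time. The helper below is an illustrative sketch (the name `describe_mode` is made up); it only decodes the digits and does not change any file.

```python
def describe_mode(octal_digits):
    """Decode a three-digit chmod mode such as "761" into rwx strings."""
    classes = ("owner", "group", "public")
    flags = (("r", 4), ("w", 2), ("x", 1))  # bit values of R, W and X
    result = {}
    for name, digit in zip(classes, octal_digits):
        bits = int(digit, 8)
        result[name] = "".join(f if bits & mask else "-" for f, mask in flags)
    return result
```

For `chmod 761 game` this gives the owner rwx, the group rw-, and everyone else --x, matching the 111 / 110 / 001 table on the slide.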
43
[Figure: module outline, now at Implementation: Structure (On Disk, In Memory), Directory Implementation, VFS, Allocation (Contiguous, Extent, Linked, FAT, Indexed, Other), Free Space Management, Efficiency and Performance (Caching), Consistency and Recovery.]
44
Layered File System Implementation

[Figure: file-system layers, from top to bottom]
High-level data management, mounted volumes, directories…
Translates logical file address (offset) to physical address (block on disc), file allocation, free space management
Generic commands to the device driver to read/write blocks
Device driver

45
On-Disc File System Structures
What FS data are stored on the disc?
 Secondary memory is subdivided into blocks and each I/O operation is performed in terms of blocks (to be studied later).

 Directory structure
 Information about each file
 File control blocks/inodes (FCB)
 Per volume (partition)
 Boot control block
• Contains the code for starting the OS (can be empty)
 Volume control block (also called the
superblock)
• # of blocks, size of blocks, # of free blocks, pointer to
free blocks
46
In-Memory File System Structures

What FS data are stored in main memory?


 Information about each mounted volume
 Cache of the FS data, esp. directory
structure
 System-wide open file table
 FCB of each open file, file locks
 Per-process open file table
 File pointer, access rights

47
In-Memory File System Structures

48
[Figure: Implementation outline: Structure (On Disk, In Memory), Directory Implementation, VFS, Allocation, Free Space Management, Efficiency and Performance, Consistency and Recovery.]
49
Virtual File System

50
Virtual File Systems
 Virtual File Systems (VFS) provide an object-oriented way
of implementing file systems.
 The API is to the VFS interface, rather than any specific
type of file system.

Example:
 In Linux there are four main object types:
 inode (info about individual file)
 file (open file)
 superblock (entire file system of one volume)
 dentry (individual directory entry)
 Each of them is required to implement the needed
methods, but different devices/FSs can implement them
differently

51
Unix Kernel I/O Structures

52
Directory Implementation
What does a directory contain?
 The directory consists of a collection of entries that associate
“names” to the files (and also subdirectories) represented by the
FCB’s
 Must be organized in a tree-structure
 Hard links – entries that reference FCB’s
 Soft links – entries that reference other entries.

How to store the directory entries for the files within one directory?
 Linear list of file names with pointer to the data blocks.
 simple to program
 Problems: time-consuming to perform search

 Hash Table – linear list with hash data structure.


 decreases directory search time
 Problems: fixed size

Note: As each opening of a file involves many directory searches, they have to be really efficient.
53
Directory Implementation
What should a directory entry contain?
 file name (must be)
 + all other info in the FCB (DOS)
 or pointer to the FCB/inode containing the info
 or symbolic link – pointer to another directory entry
 might have a flag indicating that this is a mounting
point, and a pointer to the entry in the mount table
with the info about the volume mounted here

Note: DOS used a fixed-length field (8+3 bytes) within the directory entry to store the filename+extension
54
[Figure: Implementation outline (repeated), now at Allocation.]
55
Allocation Methods
Before we start:
 The disc is a block-oriented device; the basic unit of access is a sector of (almost always) 512 bytes
 The FS might, for efficiency reasons, group several sectors into blocks (e.g. of size 8 KB) and use those as basic units
 The notion of a cluster is equivalent to the block.
How to store a file on a disc?
 Good news:
 The ideas from memory management approaches
are being reused
 Bad news:
 There are differences
So, how to allocate a space on disc for a file?
 Contiguous allocation
 Linked allocation
 Indexed allocation
56
Contiguous Allocation
Idea: Each file occupies a set of contiguous blocks on
the disk
How much to allocate when a file is created?
 difficult to guess in general
 too little – difficult to grow, too much – wasting space
How to grow a file?
 into preallocated free space, free adjacent blocks, relocate into bigger hole of free blocks
What information is needed to allow translation of the logical to physical address?
 initial block and size, in the FCB
57
Contiguous Allocation
Benefits:
 Simple to implement
 physical block = start block + ⌊logical address / block size⌋
 offset = logical address modulo block size
 Efficient random access
 Good locality of reference

Drawbacks:
 Wasteful of space (fragmentation)
 Files cannot easily grow
 Cannot easily add to data in the middle of
the file
 Needs compaction
58
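The address translation for contiguous allocation is a single division. The sketch below assumes the logical address is a byte offset from the start of the file and that the FCB supplies the start block; the function name and the sample numbers are made up.

```python
def contiguous_translate(start_block, logical_address, block_size):
    """Translate a byte offset within a file to (physical block, offset in block).

    Assumes the file occupies consecutive blocks beginning at start_block.
    """
    physical_block = start_block + logical_address // block_size
    offset = logical_address % block_size
    return physical_block, offset
```

For a file starting at block 14 with 512-byte blocks, byte 1300 falls in physical block 16 at offset 276, without reading any other block. This is why random access is efficient under this scheme.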
Contiguous Allocation Example

59
Extent-Based Systems

 Many newer file systems (e.g. Veritas File System) use a modified contiguous allocation scheme
 Extent-based file systems allocate disk
blocks in extents
 An extent is a contiguous set of blocks on
disk
 Extents are allocated to files
 A file consists of one or more extents.
 Analogous to segments in memory
management
60
Linked Allocation
Idea: Each block of the file contains a pointer pointing to the
next block of that file. The directory entry of the file points
to the first block of the file.
Benefits:
 simple
 no external fragmentation
 easy to grow files (a free block anywhere on the disc is
linked to the end of the file)
Drawbacks:
 What happens if you want to reach the end of the file?
 Has to read it all, to follow the pointers
 Very inefficient random access
 Wasted space for pointers
 Using larger block size helps
 Poor locality of reference (the file might be spread over all the disc, so even reading it sequentially might be inefficient)
61
Linked Allocation

62
Linked Allocation + FAT
Idea:
 Store the pointers to the next block separately in a special
table (called File Allocation Table - FAT)
 One table for all files is sufficient
Benefits:
 The table or at least significant part of it might fit into
memory
 Seeking within the file now means traversing the pointers in
the memory, not on the disc
Note: Used in DOS, OS/2, and other Windows OS’s
FAT size vs cluster size:
 Disc capacity = number of FAT entries × cluster size
 With large discs, FAT16 (2^16 entries) must use large clusters
 Lots of internal fragmentation with small files
 FAT32 solves this problem (but the FAT itself is quite beefy)
63
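Following a file through the FAT is a pointer chase that stays in memory. The sketch below is a simplified model, not the on-disc FAT format: the table is a Python list, 0 marks a free block, and the end-of-file marker `END_OF_FILE = -1` is an assumption chosen for the illustration.

```python
END_OF_FILE = -1  # assumed end-of-chain marker for this sketch
FREE = 0          # an entry of 0 means the block is free


def file_blocks(fat, start_block):
    """Return the list of blocks of a file by following the FAT chain."""
    blocks = []
    block = start_block
    while block != END_OF_FILE:
        blocks.append(block)
        block = fat[block]   # the FAT entry holds the number of the next block
    return blocks


def free_blocks(fat, first_data_block=2):
    """List the free blocks of the data region by scanning the FAT."""
    return [i for i in range(first_data_block, len(fat)) if fat[i] == FREE]
```

Seeking to the k-th block of a file means following k entries of the table, which is cheap when the table (or the relevant part of it) is cached in memory.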
File-Allocation Table

64
Exercise
How does the FAT look, assuming each block can hold only 4 entries and only the following files are in the system?
File A = abcdefghijk
File B = ABCDEFGHIJKLMNOP
File C = 
Assume the FAT is stored in blocks 0..3. Block numbers start at 2 in the Data Region of the file system.

DISC (disc block: contents, data-region block number)
0..3: FAT
4: , 2
5: efgh, 3
6: ijk, 4
7: MNOP, 5
8: (empty), 6
9: abcd, 7
10: EFGH, 8
11: IJKL, 9
12: ABCD, 10
13: (empty), 11
14: , 12
15: (empty), 13

Directory entries:
File A: start block:      length:
File B: start block:      length:
File C: start block:      length:

65
Indexed allocation – similar to paging

 All pointers to blocks (block numbers) are placed in a table (index block).
[Figure: index table]
66
Indexed Allocation
Idea: Use an index block to store all the block pointers of the file

67
Indexed Allocation (Cont.)
Benefits:
 easy random access
 no external fragmentation
Drawbacks:
 need index table blocks
 poor locality of reference
What if the file needs more blocks than the index block can
reference?
 e.g. a 512-byte block with 256 entries: 256 × 512 bytes ⇒ maximum file size of 128 Kbytes
 Implement Index Table as
 linked list of blocks
• poor for random access
 hierarchical index tables
• too much overhead for small files
 combined scheme
68
Hierarchical Indexed Allocation

[Figure: outer index → index tables → file blocks]

69
Indexed Allocation Used in Unix
 Each directory entry contains the address of an index node
called inode
 The inode contains info about the type of file, its
restrictions,…, and 13 addresses (block numbers)
 The first 10 addresses point directly to the first 10 blocks of the file
 If the file needs more blocks, then…
• The 11th address (single indirect) points to a block containing pointers to up to 256 additional blocks
 If the file needs more blocks, then…
• The 12th address (double indirect) points to a block
containing pointers to 256 blocks of additional pointers
 If the file needs more blocks, then…
• Triple indirection (see next figure)
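The combined scheme can be explored numerically. The sketch below uses the figures from this slide (10 direct addresses, 256 pointers per index block) and assumes logical blocks are numbered from 0; the function and constant names are invented for the illustration.

```python
DIRECT = 10          # direct addresses in the inode (from the slide)
PTRS_PER_BLOCK = 256  # pointers held by one index block (from the slide)


def indirection_level(logical_block):
    """Say which inode address is used to reach a logical block (0-based)."""
    if logical_block < DIRECT:
        return "direct"
    logical_block -= DIRECT
    if logical_block < PTRS_PER_BLOCK:
        return "single indirect"
    logical_block -= PTRS_PER_BLOCK
    if logical_block < PTRS_PER_BLOCK ** 2:
        return "double indirect"
    logical_block -= PTRS_PER_BLOCK ** 2
    if logical_block < PTRS_PER_BLOCK ** 3:
        return "triple indirect"
    raise ValueError("block number exceeds the maximum file size")


def max_file_blocks():
    """Largest number of data blocks one inode can address."""
    return DIRECT + PTRS_PER_BLOCK + PTRS_PER_BLOCK ** 2 + PTRS_PER_BLOCK ** 3
```

Small files are served entirely by the direct addresses, so they pay no index-block overhead, while large files grow through the indirect levels.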
70
Combined Scheme – UNIX inode

71
Exercise
Same exercise, but with simple indexed allocation.
File A = abcd efgh ijk
File B = ABCD EFGH IJKL MNOP
File C = 

DISC (disc block: contents)
0: (empty)
1: (empty)
2: (empty)
3: (empty)
4: 
5: efgh
6: ijk
7: MNOP
8: (empty)
9: abcd
10: EFGH
11: IJKL
12: ABCD
13: (empty)
14: 
15: (empty)

How do the index tables of files A, B, C look? (We also have to choose where on disc those tables are located.)
File A index table:      at block:
File B index table:      at block:
File C index table:      at block:
72
Exercise 2 – Allocating Larger Files
Assume each index block can hold at most 4 index entries.
The blocks containing file A: 4..12, 62..65, 35 (14 blocks)
How is file A allocated?
Hierarchical allocation:
directory, in block 20: Name = A, master index block = 17
Block 17: 2,1,13,14
Block 2:
Block 1:
Block 13:
Block 14:

How many blocks need to be accessed to read position 46 in the file?
Which blocks will be accessed?
Assume 4 bytes/block. 46 = 11×4 + 2
73
Exercise 2 – Representing Larger Files
Assume each index block can hold at most 4 index entries.
The blocks containing file A: 4..12, 62..65, 35 (14 blocks)
How is file A allocated?
Unix-like allocation:
directory, in block 20: Name = A, inode block = 17
inode, in block 17: ?,?,1,2
Block 1:
Block 2: 13,14,15,16
Block 13:
Block 14:
Block 15:
Block 16:
74
Indexed Allocation with variable-size
portions
Directory

75
Free-Space Management
 When creating/growing a file, we need free blocks.
 How to find free blocks in a fast and efficient manner?
 Have a data structure storing them:
• Bit Vector (Map)
• Ex: Windows 2000, MacOS, Linux…
• Linked List
• Ex: Unix SVR4, Windows 9x, MS-DOS
76
Bit Vector (or Bit Map)
 Each existing block is represented by a bit
 The bit is 1 if the block is free
 The bit is 0 if the block is allocated to a file
 The number of bits of the vector = number of existing
blocks
 Example of Bit Vector where blocks 3, 4, 5, 9, 10, 15, 16 are
free:
 00011100011000011…
 If the whole bit vector fits into memory, we can locate the first 1, representing a free block, reasonably fast
 First find the first non-zero byte
 Then the first 1 in that byte
• Most processors provide HW support for that
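The two-step search (first non-zero byte, then first 1 bit in it) can be sketched as below. The bit order is an assumption made for this example: block 0 is the most significant bit of byte 0, so that the bytes read left to right like the bit string on the slide. The function name is made up.

```python
def first_free_block(bitmap):
    """Return the number of the first free block, or None if none is free.

    bitmap is a bytes object; a 1 bit means free, a 0 bit means allocated.
    Assumed bit order: block 0 is the most significant bit of byte 0.
    """
    for byte_index, byte in enumerate(bitmap):
        if byte != 0:                       # step 1: first non-zero byte
            for bit in range(8):            # step 2: first 1 bit in that byte
                if byte & (0x80 >> bit):
                    return byte_index * 8 + bit
    return None
```

With the slide's example vector 00011100 01100001 1…, the first free block found is block 3. Skipping whole zero bytes (or words) is what makes the scan fast when most of the disc is allocated.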
77
Bit Vector (or Bit Map)
Can we expect to have the whole bit vector in memory?
Let’s see:
 160 GB hard drive, 512-byte blocks = 320 Mbits = 40 Mbytes
 Ouch!
 With 16 KB blocks, it’s still over 1 Mbyte
 Sequential search in the bitmap for the first non-zero word will not do any more; we need smarter algorithms/data structures.
Can the bitmap of the free blocks be computed from the directory/file
mapping information?
 Yes, but we need to traverse the whole structure, whatever was
not used is free.
 So, is the bit vector stored only in memory or is it stored also
on HD?
• Well, do you want to scan the whole FS on each boot?
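The bitmap-size estimate on this slide can be reproduced with a few lines of arithmetic. The sketch assumes binary units (1 GB = 2^30 bytes, 16 KB = 2^14 bytes), which is what makes the slide's round figures come out exactly; the function name is invented.

```python
def bitmap_bytes(disk_bytes, block_bytes):
    """Size in bytes of a bit vector with one bit per disc block."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8   # round up to whole bytes


GIB = 2 ** 30  # assumed meaning of "GB" for this estimate
KIB = 2 ** 10

small_blocks = bitmap_bytes(160 * GIB, 512)       # 512-byte blocks
large_blocks = bitmap_bytes(160 * GIB, 16 * KIB)  # 16 KB blocks
```

Here `small_blocks` is 40 × 2^20 bytes (40 Mbytes) and `large_blocks` is 1.25 × 2^20 bytes, in line with the "still over 1 Mbyte" remark.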
78
Linked List
Idea:
 Each free block points to the next free
block
 A designated place on the disc stores the
first free block (also cached in memory)
Bad news:
 Scanning the free blocks is very costly
(lots of disc I/O)
Good news:
 Usually we just want a free block – we get
it immediately from the head of the list,
just have to move the head
Problem:
 Want several blocks
Solution:
 Have a list of runs of contiguous free blocks: head -> (2,4) -> (8,6) -> (17,2) -> (25,3)
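Allocating several blocks from such a list can be sketched as a first-fit scan. Two assumptions are made here that the slide does not state: each pair is read as (first block, number of blocks), and the first run that is large enough is used. The function name is made up.

```python
def allocate(runs, count):
    """Allocate count contiguous blocks from a list of free runs (first fit).

    runs is a list of (start_block, length) pairs and is updated in place.
    Returns the allocated block numbers, or None if no run is large enough.
    """
    for i, (start, length) in enumerate(runs):
        if length >= count:
            if length == count:
                del runs[i]                                # run fully consumed
            else:
                runs[i] = (start + count, length - count)  # shrink the run
            return list(range(start, start + count))
    return None
```

With the list from the slide, a request for 5 blocks skips the run (2,4) and is served from (8,6), leaving a single free block behind at 13.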
79
FAT
What structure do we need for
representing free blocks if we use FAT?
 The FAT itself works a bit as a bit map:
 An entry containing 0 means that this
block is empty.

80
[Figure: Implementation outline (repeated), now at Efficiency and Performance.]
81
Efficiency and Performance
How much of the disc’s memory is wasted/unusable?
 Why would some disc space be wasted?
 Internal fragmentation -> want smaller clusters
 FAT/inodes/bitmaps -> want larger clusters
 Fixed width directory entries -> small = too limiting, large = waste
 Overhead for supporting variable-width fields (e.g. for file names)

 What happens if all meta-data information is at the beginning of the disc?
 Accessing/creating file involves lots of disc head movement
 You want to have the meta-data (inode, directory entry) about the
file close to it
 What kind of data to keep?
 Writing into file means updating not only the file itself, but also
meta-data: last modification time
 Should we maintain last access time?
• Also reading would involve write into meta-data

82
Efficiency and Performance
What is a universal solution for slow memory? (Note that the
disc is secondary memory)
 Caching
Ideas:
 Dedicate part of the main memory for caching frequently
used blocks (exploits temporal locality)
 Can also use read-ahead, betting that the file is accessed
sequentially (exploits spatial locality)
 Might use free-behind (releasing the just-read block as soon as the next block is requested), again betting on sequential access
 Unified Virtual Memory, unified buffer cache

Other speedup techniques:


 Synchronous vs asynchronous writes

83
Implementing Caching
Caching for what?
 open/read/write accesses
(buffer cache)
 memory-mapped files (page
cache)
Can have separate caches for
each of them
 but leads to double caching
and inefficiencies
Better to have a unified cache for
both disk blocks and pages
 Use the page cache for both -
unified virtual memory
 Unified buffer cache

84
Consistency and Recovery
When should we write the meta-data to the disc?
 Whenever they change (i.e. if file is modified,
update the inode with the new modification time)
 Slow (additional disc access, possibly involving
head movement)
 Wasteful (lots of writes, do it only once when the
write burst finishes)
 On system shutdown
 Minimizes disc traffic
 But what if abnormal (power off, crash)
termination?
• Inconsistent data on the disc

85
Consistency and Recovery
What to do?
 Write the meta-data less frequently
 Always writing on each read/write is unacceptable from
performance point of view
 Still might need to always write the file creation/rename/growth
 But that means we must be able to deal with the inconsistencies

Consistency checking
 compares the data in the directory structure/FAT/other meta-data
structures with the data blocks on the disc and tries to detect and
fix inconsistencies
 fsck in UNIX, chkdsk in DOS

Making back-ups is always a good idea


 Not always 100% possible to recover the data
 Especially if HW failure

86
Example: Free-Space Management
Special care must be taken with the order of operations
 First write the meta-data change, then perform the
change, then update the in-memory tables
Consider free-space management using bitmap
 We want to have the bitmap in the memory for fast
searches and updates
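 Cannot allow a situation where, for block[i], bit[i] = 1 in memory and bit[i] = 0 on disk (in this example bit[i] = 1 means allocated, the opposite of the convention on the bit-vector slide)
• If a crash comes, the allocation will be lost: the disk still marks the block as free
 Solution:
• Set bit[i] = 1 on disk
• Allocate block[i]
• Set bit[i] = 1 in memory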

87
Log Structured File Systems
What do we want to achieve with log structured FS?
 To implement crash-tolerant file system in an efficient way.
Idea:
 Log structured (or journaling) file systems record each update to
the file system as a transaction
 All transactions are written to a log on the disc
 Efficient write, as the log file is accessed sequentially
 A transaction is considered committed once it is written to the
log
 However, the file system data on the disc may not yet be
updated
 The transactions in the log are asynchronously written to the file
system
 When the file system is modified, the transaction is removed
from the log
 What to do after a system crash?
 Read the log and perform all transactions remaining there
88
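The commit, apply, and recover cycle described on this slide can be modelled in a few lines. This is a toy sketch, not how any particular file system implements its journal: a Python list stands in for the on-disc log, a dictionary stands in for the file system data, and every name is invented for the illustration.

```python
class JournaledStore:
    """Toy model of a journal: updates are logged first, applied later."""

    def __init__(self):
        self.log = []    # committed transactions not yet applied
        self.data = {}   # stands in for the file system structures on disc

    def commit(self, key, value):
        """A transaction counts as committed once it is in the log."""
        self.log.append((key, value))

    def apply_one(self):
        """Apply the oldest logged transaction, then remove it from the log."""
        if self.log:
            key, value = self.log[0]
            self.data[key] = value   # update the file system first
            self.log.pop(0)          # only then drop the log record

    def recover(self):
        """After a crash: perform every transaction still in the log."""
        while self.log:
            self.apply_one()
```

Because a record is removed only after its update has been applied, a crash between the two steps leaves the record in the log and recovery simply applies it again. That is safe in this sketch because the updates are plain overwrites.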
[Figure: complete module outline (summary): Files, Directories, Implementation.]
89
