Chapter 5-OS-FileSystems

Uploaded by

Shubham Gupta
Chapter 5

File Systems
Files
⮚ Processes (threads), address spaces, and files are the most
important concepts in an OS

⮚ Files are logical units of information created by processes


– Each file can be thought of as a kind of address space

⮚ A file is a collection of related information that is
recorded on secondary (non-volatile) storage such as
magnetic disks, optical disks, and tapes.
Files contd..
⮚ A file also serves as the medium through which a program
receives its input and delivers its output.

⮚ A file is a sequence of bits, bytes, or records whose meaning is
defined by the file creator and user.

⮚ Every file has a logical location at which it is stored and
from which it is retrieved.
File system

⮚ The file system is the part of the operating system that is
responsible for file management.
⮚ It provides a mechanism for storing files and for accessing
their contents, including both data and programs.
⮚ Some operating systems, for example Ubuntu, treat
everything as a file.
⮚ The file system manages files: how they are structured, named,
accessed, used, protected, implemented, etc.
File naming
• Files are an abstraction mechanism
⮚ To store information on the disk and read it back
⮚ When a process creates a file, it gives the file a name;
and the file can be accessed by the name
• Two-part file name
⮚ File extension: indicates characteristics of the file
⮚ In Unix, file extensions are just a convention; the C compiler is
an exception
⮚ In Windows, file extensions specify which program
“owns” that extension; when a file is double-clicked, the program
assigned to its extension is launched
File Naming

Figure 4-1. Some typical file extensions.


File Structure

Figure 4-2. Three kinds of files. (a) Byte sequence.
(b) Record sequence. (c) Tree.
File Type

File type refers to the ability of the operating system to distinguish
different types of files, such as text files, binary files, and source files.
Operating systems like MS-DOS and UNIX have the following types of
files:
Regular Files:
– ASCII files or binary files
– An ASCII file consists of lines of text; it can be displayed and printed

Character Special File
– It represents a hardware device that reads or writes data character by
character, such as a mouse or a printer.
File Type
Ordinary files
These files store user information.
They may contain text, executable programs, or databases.
The user can perform operations on them such as add, delete, and modify.
Directory Files
A directory contains files and other related information about those
files. It is basically a folder that holds and organizes multiple files.
Special Files
These files are also called device files. They represent physical devices
such as printers, disks, networks, and flash drives.
File Access

• File descriptor

– A file descriptor is a small integer representing a kernel-managed
object that a process may read from or write to

– Every process has a private space of file descriptors
starting at 0

– By convention, 0 is standard input, 1 is standard output,
and 2 is standard error.
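
A minimal sketch of working with file descriptors, assuming Python's os module as a thin wrapper around the open, write, read, and close system calls. The function name fd_round_trip and the scratch file demo.txt are illustrative assumptions, not part of the slides.

```python
import os
import tempfile

def fd_round_trip(data):
    """Write data through one descriptor, then read it back through another."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.txt")   # hypothetical scratch file
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        os.write(fd, data)                   # the kernel object is named by an integer
        os.close(fd)
        fd = os.open(path, os.O_RDONLY)
        echoed = os.read(fd, len(data))
        os.close(fd)
    return fd, echoed

fd, echoed = fd_round_trip(b"hello")
print(fd, echoed)   # fd is a small integer, usually above 2 because 0-2 are taken
```

The descriptor returned by os.open is only meaningful inside the process that opened it, which is what "private space of file descriptors" means in practice.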
File Access Methods
Access methods determine how files are accessed and read into memory.
Some systems support only a single access method, while
other operating systems support several.

1. Sequential Access
•Data is accessed one record right after another, in order.
•A read command moves the file pointer ahead by one record.
•A write command allocates space for the record and moves the
pointer to the new end of file.
•This method is reasonable for tape.
File Access Methods contd..

2. Direct Access
⮚This method is useful for disks.
⮚The file is viewed as a numbered sequence of blocks or
records.
⮚There are no restrictions on which blocks are read or written; this
can be done in any order.
⮚The user now says "read n" rather than "read next".
⮚"n" is a number relative to the beginning of the file, not
an absolute physical disk location.
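
The difference between "read next" and "read n" can be sketched as follows. This is only an illustration: the block size of 4 bytes and the in-memory file are assumptions chosen to keep the demo small.

```python
import io

BLOCK_SIZE = 4  # hypothetical block size in bytes, kept tiny for the demo

def read_next(f):
    """Sequential access: return the block at the pointer and advance past it."""
    return f.read(BLOCK_SIZE)

def read_n(f, n):
    """Direct access: jump to block n, counted from the beginning of the file."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)

f = io.BytesIO(b"AAAABBBBCCCCDDDD")   # stands in for a file of four blocks
first = read_next(f)        # block 0
second = read_next(f)       # block 1, because the pointer moved ahead
last = read_n(f, 3)         # block 3, without touching block 2
again = read_n(f, 0)        # any order is allowed
```

Note that n is a logical block number within the file; translating it to a physical disk location is left to the file system.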
File Access Methods contd..

3. Indexed Sequential Access


It is built on top of sequential access.
It uses an index to position the file pointer before records are read.
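
One common way to realize this idea is a sparse index over records sorted by key: the index positions the pointer, and a short sequential scan finds the record. The sketch below assumes that design; the record layout, the STEP value, and the function name lookup are all hypothetical.

```python
import bisect

# Records sorted by key, as they would be laid out sequentially in the file.
records = [(k, f"rec{k}") for k in range(0, 100, 5)]   # keys 0, 5, ..., 95

STEP = 4  # hypothetical: one index entry for every 4th record
index_keys = [records[i][0] for i in range(0, len(records), STEP)]

def lookup(key):
    """Use the index to pick a starting position, then scan sequentially."""
    slot = bisect.bisect_right(index_keys, key) - 1
    if slot < 0:
        return None                      # key is smaller than every indexed key
    start = slot * STEP
    for k, value in records[start:start + STEP]:
        if k == key:
            return value
    return None

found = lookup(45)
missing = lookup(46)
```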
File Attributes

A file has a name and data. In addition, the system stores meta
information about it, such as the creation date and time, current size, last
modified date, etc.
All this information is called the attributes of the file.
File Attributes contd..
Here are some important file attributes used in operating systems:
Name: It is the only information stored in a human-readable form.
Identifier: Every file is identified within the file system by a unique
tag number, known as its identifier.
Location: Points to the file's location on the device.
Type: This attribute is required for systems that support various
types of files.
Size: The current size of the file.
Protection: This attribute assigns and controls the access rights for
reading, writing, and executing the file.
Time, date and security: This information is used for protection,
security, and usage monitoring.
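
As a rough illustration, several of these attributes can be read with Python's os.stat. Mapping "Identifier" to the inode number (st_ino) is an assumption that fits Unix-like systems; the helper name describe and the file notes.txt are hypothetical.

```python
import os
import tempfile
import time

def describe(path):
    """Collect a few of the attributes listed above for one file."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": st.st_ino,                 # inode number on Unix-like systems
        "size": st.st_size,                      # current size in bytes
        "protection": oct(st.st_mode & 0o777),   # permission bits only
        "modified": time.ctime(st.st_mtime),     # last modification time
    }

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "notes.txt")
    with open(path, "w") as f:
        f.write("hello")
    attrs = describe(path)

print(attrs)
```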
File Attributes

Figure 4-4a. Some possible file attributes.


File Operations
The most common system calls relating to files:

• Create
• Delete
• Open
• Close
• Read
• Write
• Append
• Seek
• Get Attributes
• Set Attributes
• Rename
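
Most of these operations have direct counterparts in Python's standard library, which in turn calls the underlying system calls. The following is a sketch with made-up file names, not a complete tour of the list.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    old = os.path.join(d, "a.txt")       # hypothetical file names
    new = os.path.join(d, "b.txt")

    with open(old, "wb") as f:           # Create + Open; Close happens on exit
        f.write(b"hello")                # Write
    with open(old, "ab") as f:           # Append
        f.write(b" world")
    with open(old, "rb") as f:
        f.seek(6)                        # Seek to byte offset 6
        tail = f.read()                  # Read from there to the end

    size = os.stat(old).st_size          # Get Attributes
    os.rename(old, new)                  # Rename
    renamed = os.path.exists(new) and not os.path.exists(old)
    os.remove(new)                       # Delete
    deleted = not os.path.exists(new)
```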
What is a directory?

⮚ A directory can be defined as a listing of related files on the
disk. The directory may store some or all of the file attributes.
⮚ To get the benefit of different file systems on different
operating systems, a hard disk can be divided into a number of
partitions of different sizes.
⮚ The partitions are also called volumes or mini disks.
⮚ Each partition must have at least one directory in which all the
files of the partition can be listed.
⮚ A directory entry is maintained for each file in the directory; it
stores all the information related to that file.
What is a directory?

⮚ A directory can be viewed as a file that contains the metadata
of a group of files.
Structures of Directory
A directory is a container that holds files and folders. It
organizes files and folders in a hierarchical manner.
Single-level directory
• Simplest directory structure.
• All files are contained in the same directory, which makes it easy to
support and understand.
• Limitations arise when the number of files increases or when
the system has more than one user.
Two-level directory
⮚ A single-level directory often leads to confusion of file names
among different users.
⮚ The solution to this problem is to create a separate directory
for each user.
⮚ In the two-level directory structure, each user has their
own user file directory (UFD).
Tree Structured directory
⮚ Once we have seen a two-level directory as a tree of height 2,
the natural generalization is to extend the directory structure to
a tree of arbitrary height.
⮚ This generalization allows users to create their own
subdirectories and to organize their files accordingly.
Directory Operations

System calls for managing directories:

• Create
• Delete
• Opendir
• Closedir
• Readdir
• Rename
• Link
• Unlink
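
A sketch of these directory operations using Python's os module, assuming a file system that supports hard links (os.link may fail where it does not). All directory and file names are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    docs = os.path.join(root, "docs")
    os.mkdir(docs)                                    # Create
    a = os.path.join(docs, "a.txt")
    b = os.path.join(docs, "b.txt")
    with open(a, "w") as f:
        f.write("x")
    os.link(a, b)                                     # Link: second name, same file

    with os.scandir(docs) as it:                      # Opendir ... Closedir
        names = sorted(entry.name for entry in it)    # Readdir, one entry at a time

    os.unlink(b)                                      # Unlink removes one name
    os.unlink(a)
    papers = os.path.join(root, "papers")
    os.rename(docs, papers)                           # Rename
    os.rmdir(papers)                                  # Delete (directory must be empty)
    remaining = os.listdir(root)
```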
Permissions on the file and directory
The permissions are R, W, and X, which govern reading, writing, and executing a
file or directory. They are assigned separately to three classes of users: the owner,
the group, and others.
Users: File Owner, Group Owner, Everyone Else
Permissions: Read Permission, Write Permission, Execute Permission
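
On Unix-like systems these nine permissions are commonly stored as nine bits, three per class of user, and written in octal. The helper below decodes such a value; the name rwx_string and the sample mode 0o754 are illustrative assumptions.

```python
def rwx_string(mode):
    """Render the nine permission bits as owner/group/others triples."""
    out = ""
    for shift in (6, 3, 0):                 # owner, group, others
        bits = (mode >> shift) & 0o7
        out += "r" if bits & 4 else "-"
        out += "w" if bits & 2 else "-"
        out += "x" if bits & 1 else "-"
    return out

perms = rwx_string(0o754)
print(perms)   # rwxr-xr--
```

Here the owner may read, write, and execute; the group may read and execute; everyone else may only read.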
File System Implementation

• Users:
– How files are named, what operations are allowed on them,
what the directory tree looks like

• Implementors:
– How files and directories are stored, how disk space is
managed, and how to make everything work efficiently and
reliably
Master Boot Record (MBR)
⮚ The master boot record (MBR) is the information stored in the first sector of a
hard disk. It describes how and where the operating system is located
on the hard disk so that it can be loaded into RAM and booted.
⮚ The MBR is sometimes called the master partition table because it includes a
partition table that locates every partition on the hard disk.
⮚ The MBR also includes a program that reads the
boot sector of the partition that contains the operating system.
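
The slides do not give the byte layout, so the sketch below assumes the classic PC MBR format: a 512-byte sector with four 16-byte partition entries starting at offset 446 and the signature bytes 0x55 0xAA at the end. The sample sector is fabricated purely for the demo, and parse_mbr is a hypothetical helper name.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # partition table follows the boot code
ENTRY_SIZE = 16

def parse_mbr(sector):
    """Return the non-empty partition table entries of an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # status, 3 CHS bytes skipped, type, 3 CHS bytes skipped, first LBA, sector count
        status, ptype, first_lba, n_sectors = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:                       # type 0 marks an unused entry
            partitions.append({
                "bootable": status == 0x80,
                "type": ptype,
                "first_lba": first_lba,
                "sectors": n_sectors,
            })
    return partitions

# Fabricated sector with one bootable partition, for illustration only.
sector = bytearray(SECTOR_SIZE)
sector[446:462] = struct.pack("<B3xB3xII", 0x80, 0x83, 2048, 204800)
sector[510:512] = b"\x55\xaa"
partitions = parse_mbr(bytes(sector))
```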
File System Layout
⮚ Because main memory is volatile, it holds no program when the
computer is turned on, so the CPU cannot start by running code from
main memory. Instead, the CPU first runs a special
program called the BIOS, which is stored in ROM.
⮚ The BIOS contains code that makes the CPU access the very
first sector of the hard disk, which is the MBR. The MBR contains a partition table for all
the partitions of the hard disk.
⮚ Since the MBR records where the operating system
is stored and also contains a program that can read the boot
sector of that partition, the CPU uses this information
to load the operating system into main memory.
File System Layout
Superblock: contains all the key parameters about a file
system; it is read into memory when the computer is booted or when the file system is first used

Figure 4-9. A possible file system layout.


Directory Implementation

⮚ There are a number of algorithms by which directories
can be implemented. The choice of an appropriate
directory implementation algorithm may significantly affect the
performance of the system.

⮚ The directory implementation algorithms are classified according
to the data structure they use. Two
algorithms are mainly used these days.
1. Linear List

⮚ In this algorithm, all the files in a directory are maintained as a
singly linked list. Each entry contains pointers to the data
blocks assigned to the file and a pointer to the next file in the directory.

⮚ When a new file is created, the entire list is checked to see whether
the new file name matches an existing file name.

⮚ If it does not, the file can be created at the beginning or
at the end of the list. Searching for a unique name is therefore a big
concern, because traversing the whole list takes time.
1. Linear List contd..

⮚ The list needs to be traversed for every operation
(creation, deletion, updating, etc.) on the files, so the
system becomes inefficient.
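
The linear-list scheme can be sketched as a singly linked list of directory entries. The class and method names below are illustrative choices; a real file system would store the entries in disk blocks rather than Python objects.

```python
class Entry:
    """One directory entry: file name, its data blocks, and the next entry."""
    def __init__(self, name, blocks, next_entry=None):
        self.name = name
        self.blocks = blocks
        self.next = next_entry

class LinearDirectory:
    def __init__(self):
        self.head = None

    def lookup(self, name):
        node = self.head
        while node is not None:          # worst case: the whole list is walked
            if node.name == name:
                return node
            node = node.next
        return None

    def create(self, name, blocks):
        if self.lookup(name) is not None:    # uniqueness check scans the list
            raise FileExistsError(name)
        self.head = Entry(name, blocks, self.head)   # insert at the beginning

    def delete(self, name):
        prev, node = None, self.head
        while node is not None and node.name != name:
            prev, node = node, node.next
        if node is None:
            raise FileNotFoundError(name)
        if prev is None:
            self.head = node.next
        else:
            prev.next = node.next

d = LinearDirectory()
d.create("a.txt", [4, 7])
d.create("b.txt", [9])
```

Every operation starts with a scan from the head, which is the inefficiency the slide describes.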
2. Hash Table

⮚ To overcome the drawbacks of the singly linked list implementation
of directories, an alternative approach uses a hash table.
This approach uses the hash table together with the linked
lists.

⮚ A key-value pair is generated for each file in the directory and
stored in the hash table. The key is determined by applying
the hash function to the file name, while the value points to the
corresponding file stored in the directory.
2. Hash Table contd..
⮚ Searching now becomes efficient because the entire list is not
searched on every operation. Only the hash table entries are checked using the key,
and if an entry is found, the corresponding file is fetched using the value.
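
A minimal sketch of a hashed directory, assuming a fixed number of buckets with a short chain per bucket to handle collisions. Python's built-in hash stands in for the file system's hash function, and the class name HashDirectory is hypothetical.

```python
class HashDirectory:
    def __init__(self, n_buckets=8):
        # Each bucket holds a short chain of (name, entry) pairs.
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, name):
        # The key is obtained by hashing the file name.
        return self.buckets[hash(name) % len(self.buckets)]

    def create(self, name, entry):
        bucket = self._bucket(name)
        if any(n == name for n, _ in bucket):
            raise FileExistsError(name)
        bucket.append((name, entry))

    def lookup(self, name):
        # Only one bucket is searched, not the whole directory.
        for n, entry in self._bucket(name):
            if n == name:
                return entry
        return None

d = HashDirectory()
d.create("a.txt", {"blocks": [4, 7]})
d.create("b.txt", {"blocks": [9]})
hit = d.lookup("b.txt")
miss = d.lookup("c.txt")
```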
Contiguous Allocation
⮚ If blocks are allocated to a file in such a way that all the logical blocks of the
file occupy contiguous physical blocks on the hard disk, the allocation
scheme is known as contiguous allocation.
⮚ In the accompanying image, there are three files in the directory. The starting block
and the length of each file are listed in the table. The table shows
that contiguous blocks are assigned to each file according to its needs.
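
With contiguous allocation, finding a physical block is a single addition. The directory table below is invented for illustration (the original image is not available in this transcript), as are the file names.

```python
# Hypothetical directory table: file name -> (starting block, length in blocks)
directory = {
    "file_a": (0, 3),
    "file_b": (6, 5),
    "file_c": (14, 2),
}

def physical_block(name, logical_block):
    """Map a logical block of a file to its physical block on disk."""
    start, length = directory[name]
    if not 0 <= logical_block < length:
        raise IndexError("block outside the file")
    return start + logical_block

block = physical_block("file_b", 2)   # 6 + 2
```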
Linked List Allocation
⮚ Each file is treated as a linked list of disk blocks.
⮚ The disk blocks allocated to a particular file need not be contiguous
on the disk.
⮚ Each disk block allocated to a file contains a pointer to the next disk
block allocated to the same file.
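
The following sketch models the disk as a dictionary in which every block stores its data together with the number of the next block. The block numbers and contents are made up; the point is that reaching logical block n requires following n pointers.

```python
# Hypothetical disk: block number -> (data, next block number or None)
disk = {
    4: ("A0", 7),
    7: ("A1", 2),
    2: ("A2", 10),
    10: ("A3", None),
}

def read_logical_block(first_block, n):
    """Follow n pointers from the first block, then return that block's data."""
    block = first_block
    for _ in range(n):
        block = disk[block][1]       # each hop costs one disk read
        if block is None:
            raise IndexError("block outside the file")
    return disk[block][0]

data = read_logical_block(4, 2)      # visits blocks 4, 7, 2
```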
File Allocation Table
⮚ The main disadvantage of linked list allocation is that
random access to a particular block is not provided. In order to
access a block, we need to access all the blocks before it.

⮚ The file allocation table (FAT) overcomes this drawback of linked list
allocation. In this scheme, a file allocation table is maintained,
which gathers all the disk block links. The table has one entry for
each disk block and is indexed by block number.

⮚ The file allocation table needs to be cached in order to reduce the
number of head seeks. The head then no longer needs to traverse all
the preceding disk blocks in order to reach a later block.
File Allocation Table contd..

⮚ It simply accesses the file allocation table, reads the
desired block entry from there, and accesses that block.
⮚ This is how random access is accomplished using the FAT.
⮚ It is used by MS-DOS and pre-NT Windows versions.
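
A small sketch of the idea: the links live in one table indexed by block number, so the chain can be followed in memory without reading the data blocks. The table size, the END marker, and the sample chain are assumptions for the demo.

```python
END = -1                      # hypothetical end-of-file marker
fat = [None] * 16             # one entry per disk block, indexed by block number
fat[4], fat[7], fat[2], fat[10] = 7, 2, 10, END   # one file: 4 -> 7 -> 2 -> 10

def nth_block(first_block, n):
    """Find the physical block holding logical block n using only the table."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == END:
            raise IndexError("block outside the file")
    return block

def chain(first_block):
    """List every block of the file in order."""
    blocks = []
    block = first_block
    while block != END:
        blocks.append(block)
        block = fat[block]
    return blocks

target = nth_block(4, 3)
all_blocks = chain(4)
```

The chain is still followed link by link, but only within the cached table, which is why random access works without being especially fast.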
File Allocation Table contd..

Advantages
⮚Uses the whole disk block for data.
⮚A bad disk block doesn't cause all successive blocks to be lost.
⮚Random access is provided, although it is not very fast.
⮚Only the FAT needs to be traversed in each file operation.

Disadvantages
⮚Each disk block needs a FAT entry.
⮚The FAT may be very big, depending on the number of FAT entries.
⮚The number of FAT entries can be reduced by increasing the block size, but this also
increases internal fragmentation.
Inode
⮚ In UNIX-based operating systems, each file is indexed by an inode.
⮚ Inodes are special disk blocks that are created when the
file system is created. The number of files or directories a file system can hold depends
on the number of inodes in the file system.

An inode includes the following information:
⮚Attributes (permissions, time stamps, ownership details, etc.) of the file
⮚A number of direct pointers, which point to the first 12 blocks of
the file.
⮚A single indirect pointer, which points to an index block. It is used if the file cannot be
indexed entirely by the direct pointers.
Inode contd..
⮚ A double indirect pointer, which points to a disk block that holds
pointers to index blocks.
⮚ The double indirect pointer is used if the file is too big to be indexed entirely by the direct
pointers and the single indirect pointer.
⮚ A triple indirect pointer, which points to a disk block that holds pointers.
Each of those pointers points to a disk block that also holds
pointers, and each of these in turn points to an index block that contains
the pointers to the file blocks.
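
To see how the four kinds of pointers divide up a file, the sketch below works out which pointer covers a given logical block. It assumes 12 direct pointers as on the slide and, hypothetically, 1024 pointers per index block (for example 4 KB blocks with 4-byte pointers); the function name locate is also an assumption.

```python
def locate(n, ptrs_per_block=1024, n_direct=12):
    """Return the pointer level for logical block n and the indexes to follow."""
    if n < n_direct:
        return ("direct", (n,))
    n -= n_direct
    if n < ptrs_per_block:
        return ("single", (n,))
    n -= ptrs_per_block
    if n < ptrs_per_block ** 2:
        return ("double", divmod(n, ptrs_per_block))
    n -= ptrs_per_block ** 2
    if n < ptrs_per_block ** 3:
        i, rest = divmod(n, ptrs_per_block ** 2)
        j, k = divmod(rest, ptrs_per_block)
        return ("triple", (i, j, k))
    raise ValueError("file too large for this inode layout")

first_indirect = locate(12)            # first block past the direct pointers
first_double = locate(12 + 1024)       # first block needing the double indirect pointer
```

Under these assumptions, small files are reached with no extra disk reads for index blocks, while each further level adds one more read.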
I-nodes

An example i-node.
Free Space Management
A file system is responsible for allocating free blocks to files, so it
has to keep track of all the free blocks on the disk.
There are two main approaches to managing the free blocks on the disk.

1. Bit Vector
⮚In this approach, the free space list is implemented as a bit map (bit vector). It
contains one bit per block.
⮚If the block is free, its bit is 1; otherwise it is 0.
⮚Initially all the blocks are free, so every bit in the bit vector
is 1.
⮚As space allocation proceeds, the file system allocates blocks to
files and sets the corresponding bits to 0.
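
A minimal sketch of the bit vector, following the slide's convention that 1 means free and 0 means allocated. A Python list of integers stands in for the packed bits, and the first-fit search in allocate is an assumption about the allocation policy.

```python
class BitVector:
    def __init__(self, n_blocks):
        self.bits = [1] * n_blocks        # initially every block is free

    def allocate(self):
        """Find the first free block, mark it allocated, and return its number."""
        for block, bit in enumerate(self.bits):
            if bit == 1:
                self.bits[block] = 0
                return block
        raise OSError("no free blocks")

    def free(self, block):
        self.bits[block] = 1

bv = BitVector(4)
a = bv.allocate()
b = bv.allocate()
bv.free(a)
c = bv.allocate()      # the freed block is found again first
```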
Free Space Management

2. Linked List
⮚This is another approach to free space management. It
links together all the free blocks and keeps a pointer in
the cache that points to the first free block.
⮚All the free blocks on the disk are thus chained together by
pointers.
⮚Whenever a block is allocated, the free block before it is linked
to the free block after it.
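
The free list can be sketched with a head pointer and a table of next-free links. In this simplified version blocks are always taken from the head of the list, which is an assumption; the class name FreeList and the block numbers are made up.

```python
class FreeList:
    def __init__(self, free_blocks):
        self.next_free = {}       # free block -> next free block (or None)
        self.head = None          # pointer to the first free block
        for block in reversed(free_blocks):
            self.next_free[block] = self.head
            self.head = block

    def allocate(self):
        """Take the first free block and advance the head to the next one."""
        if self.head is None:
            raise OSError("no free blocks")
        block = self.head
        self.head = self.next_free.pop(block)
        return block

    def free(self, block):
        """Return a block by linking it in front of the current head."""
        self.next_free[block] = self.head
        self.head = block

fl = FreeList([3, 8, 5])
x = fl.allocate()
y = fl.allocate()
fl.free(x)
z = fl.allocate()
```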
Contiguous Allocation

Figure 4-10. (a) Contiguous allocation of disk space for 7 files.
(b) The state of the disk after files D and F have been removed.
Linked List Allocation

Figure 4-11. Storing a file as a linked list of disk blocks.


Linked List Allocation Using a Table in Memory

Figure 4-12. Linked list allocation using a file allocation table
in main memory.
