Unit IV: OS
FILE MANAGEMENT SYSTEM
File Systems
The file system is the part of the operating system which is responsible for file
management. It provides a mechanism to store data and to access file contents,
including data and programs. Some operating systems treat everything as a file;
Ubuntu (Linux) is one example.
o File Structure
Whenever a file gets deleted from the hard disk, free space is created on the
disk. There can be many such spaces, which need to be recovered in order to
reallocate them to other files.
The major concern about a file is deciding where to store it on the hard disk.
There are various disk scheduling algorithms, which will be covered later in this tutorial.
A file may or may not be stored within only one block. It can be stored in
non-contiguous blocks on the disk, so we need to keep track of all the blocks
on which parts of the file reside.
Secondary Storage
• Secondary storage is less expensive when compared to primary memory such as RAM.
• The speed of secondary storage is also lower than that of primary storage.
• Hence, data which is accessed less frequently is kept in secondary storage.
• A few examples are magnetic disks, magnetic tapes, removable thumb drives, etc.
A magnetic disk contains several platters. Each platter is divided into circular
tracks. The length of the tracks near the centre is less than the length of the
tracks farther from the centre. Each track is further divided into sectors.
Tracks at the same distance from the centre form a cylinder. A read-write head is used
to read data from a sector of the magnetic disk.
The speed of the disk is measured in two parts:
• Transfer rate: the rate at which data moves from the disk to the computer.
• Random access time: the sum of the seek time and the rotational latency.
Seek time is the time taken by the arm to move to the required track. Rotational
latency is the time taken by the required sector to rotate under the read-write head.
Even though the disk is physically arranged as sectors and tracks, the data is
logically addressed as an array of blocks of fixed size. The size of a
block can be 512 or 1024 bytes. Each logical block is mapped to a sector on the
disk, sequentially. In this way, each sector on the disk has a logical address.
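As a minimal sketch in C of this mapping, assuming hypothetical geometry values (sectors per track and number of heads; real drives report their own), a logical block number can be translated into a cylinder/head/sector address:

#include <stdio.h>

/* Hypothetical disk geometry -- illustration only. */
#define SECTORS_PER_TRACK 63
#define HEADS             16   /* read-write heads = recording surfaces */

/* Map a logical block number (LBA) to cylinder/head/sector (CHS). */
static void lba_to_chs(unsigned long lba,
                       unsigned long *cyl, unsigned long *head, unsigned long *sec)
{
    *cyl  = lba / (HEADS * SECTORS_PER_TRACK);
    *head = (lba / SECTORS_PER_TRACK) % HEADS;
    *sec  = (lba % SECTORS_PER_TRACK) + 1;  /* sectors traditionally numbered from 1 */
}

int main(void)
{
    unsigned long c, h, s;
    lba_to_chs(2048, &c, &h, &s);
    printf("logical block 2048 -> cylinder %lu, head %lu, sector %lu\n", c, h, s);
    return 0;
}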
Sequential Access
Most operating systems access files sequentially. In other words, most files need to
be accessed sequentially by the operating system.
In sequential access, the OS reads the file word by word. A pointer is maintained which
initially points to the base address of the file. If the user wants to read the first word of the
file, the pointer provides that word to the user and increases its value by one word.
This process continues till the end of the file.
Modern systems do provide the concepts of direct access and indexed access, but
the most used method is sequential access, because most files, such as
text files, audio files and video files, need to be accessed sequentially.
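A minimal sketch in C of sequential access: the file position pointer starts at the beginning of the file and advances automatically with every read. The file name used here is only an example.

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("example.txt", "rb");   /* hypothetical file name */
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }

    int ch;
    /* Each fgetc() returns the byte at the current position and advances
       the position by one -- the essence of sequential access. */
    while ((ch = fgetc(fp)) != EOF) {
        putchar(ch);
    }

    fclose(fp);
    return 0;
}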
Direct Access
Direct access is mostly required in the case of database systems. In most cases, we need
filtered information from the database, and sequential access can be very slow and
inefficient for that.
Suppose every block of the storage stores 4 records and we know that the record we
need is stored in the 10th block. In that case, sequential access will not be used,
because it would have to traverse all the preceding blocks in order to reach the needed record.
Direct access gives the required result even though the operating system
has to perform some complex tasks, such as determining the desired block number.
However, that is generally implemented in database applications.
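A minimal sketch in C, assuming fixed-size records and a hypothetical record size and file name, showing how the block (and byte offset) holding a given record can be computed and read directly with fseek, without touching the preceding blocks:

#include <stdio.h>

#define BLOCK_SIZE        512
#define RECORD_SIZE       128                             /* hypothetical record size */
#define RECORDS_PER_BLOCK (BLOCK_SIZE / RECORD_SIZE)      /* = 4 records per block    */

int main(void)
{
    FILE *fp = fopen("records.dat", "rb");   /* hypothetical data file */
    if (fp == NULL) { perror("fopen"); return 1; }

    unsigned long record_no = 40;                              /* record we want       */
    unsigned long block_no  = record_no / RECORDS_PER_BLOCK;   /* lands in block 10    */
    long offset = (long)(record_no * RECORD_SIZE);

    char record[RECORD_SIZE];
    /* Jump straight to the record instead of reading everything before it. */
    if (fseek(fp, offset, SEEK_SET) == 0 &&
        fread(record, RECORD_SIZE, 1, fp) == 1) {
        printf("record %lu read directly from block %lu\n", record_no, block_no);
    }

    fclose(fp);
    return 0;
}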
Indexed Access
If a file can be sorted on any of its fields, then an index can be assigned to a group of
certain records, and a particular record can be accessed through its index. The index is
essentially the address of a record in the file.
With indexed access, searching a large database becomes very quick and easy, but we
need some extra space in memory to store the index values.
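A minimal sketch in C, assuming a small in-memory index (all keys and offsets here are hypothetical), showing how an index entry maps a key directly to the record's address (offset) in the file:

#include <stdio.h>

/* One index entry: a key and the offset of the record that carries it. */
struct index_entry {
    int  key;
    long offset;    /* byte address of the record inside the data file */
};

/* Hypothetical index, kept sorted on the key. */
static struct index_entry index_table[] = {
    { 101, 0 }, { 205, 128 }, { 342, 256 }, { 467, 384 }
};

/* Return the record address for a key, or -1 if the key is not indexed. */
static long lookup(int key)
{
    for (size_t i = 0; i < sizeof index_table / sizeof index_table[0]; i++)
        if (index_table[i].key == key)
            return index_table[i].offset;
    return -1;
}

int main(void)
{
    printf("record with key 342 is at offset %ld\n", lookup(342));
    return 0;
}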
Directory Structure
What is a directory?
A directory can be defined as a listing of the related files on the disk. The directory may
store some or all of the file attributes.
To get the benefit of different file systems on different operating systems, a hard
disk can be divided into a number of partitions of different sizes. The partitions are
also called volumes or minidisks.
Each partition must have at least one directory in which all the files of the partition can
be listed. A directory entry is maintained for each file in the directory, and it stores all the
information related to that file.
A directory can be viewed as a file which contains the metadata of a bunch of files.
Single-Level Directory
Advantages
1. Implementation is very simple.
2. If the sizes of the files are very small, then searching becomes faster.
3. File creation, searching and deletion are very simple since we have only one directory.
Two-Level Directory
In this structure, every operating system maintains a variable such as PWD which contains
the present directory name (the present user's name) so that searching can be done appropriately.
Each user has their own directory and cannot enter another user's directory. However, a
user has permission to read the root's data but cannot write to or modify it. Only the
administrator of the system has complete access to the root directory.
Tree-Structured Directory
Searching is more efficient in this directory structure, and the concept of the current
working directory is used. A file can be accessed by two types of path, either relative or
absolute. An absolute path is the path of the file with respect to the root directory of the
system, while a relative path is the path with respect to the current working directory of
the system. In tree-structured directory systems, the user is given the privilege to create
files as well as directories.
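For example, with a hypothetical layout, /home/alice/notes/todo.txt is an absolute path starting from the root directory, while, if the current working directory is /home/alice, the same file can be reached with the relative path notes/todo.txt.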
File System Structure
A file system provides efficient access to the disk by allowing data to be stored, located
and retrieved in a convenient way. A file system must be able to store a file, locate
the file and retrieve the file.
Most operating systems use a layering approach for every task, including file
systems. Every layer of the file system is responsible for some of its activities.
The file system is divided into the following layers, from the logical file system at the
top down to I/O control at the bottom; the functionality of each layer is described below.
o When an application program asks for a file, the first request is directed to the
logical file system. The logical file system contains the metadata of the file and the
directory structure. If the application program doesn't have the required
permissions for the file, then this layer throws an error. The logical file system also
verifies the path to the file.
o Generally, files are divided into various logical blocks. Files are to be stored on the
hard disk and retrieved from the hard disk, and the hard disk is divided into various
tracks and sectors. Therefore, in order to store and retrieve the files, the logical
blocks need to be mapped to physical blocks. This mapping is done by the file
organization module, which is also responsible for free space management.
o Once the file organization module has decided which physical block the application
program needs, it passes this information to the basic file system. The basic file
system is responsible for issuing the commands to I/O control in order to fetch
those blocks.
o I/O control contains the code by which it can access the hard disk. This code is
known as the device driver. I/O control is also responsible for handling
interrupts.
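A minimal sketch in C (all function names and the block mapping here are hypothetical) of how a read request could flow down through the four layers described above:

#include <stdio.h>

/* I/O control: the device-driver level that actually touches the hardware. */
static void io_control_read(int physical_block)
{
    printf("  [I/O control] issuing driver command for physical block %d\n",
           physical_block);
}

/* Basic file system: issues generic block-read commands to I/O control. */
static void basic_file_system_read(int physical_block)
{
    printf("  [basic FS] request block %d\n", physical_block);
    io_control_read(physical_block);
}

/* File-organization module: maps logical blocks to physical blocks
   (a trivial hypothetical mapping: the file starts at physical block 100). */
static void file_organization_read(int logical_block)
{
    int physical_block = 100 + logical_block;
    printf("  [file org] logical block %d -> physical block %d\n",
           logical_block, physical_block);
    basic_file_system_read(physical_block);
}

/* Logical file system: checks metadata/permissions before anything else. */
static void logical_fs_read(const char *path, int logical_block, int has_permission)
{
    printf("[logical FS] read request for %s\n", path);
    if (!has_permission) {
        printf("[logical FS] permission denied\n");
        return;
    }
    file_organization_read(logical_block);
}

int main(void)
{
    logical_fs_read("/home/user/data.txt", 3, 1);  /* hypothetical path */
    return 0;
}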
Allocation Methods
There are various methods which can be used to allocate disk space to files.
Selection of an appropriate allocation method significantly affects the performance
and efficiency of the system. The allocation method determines how the disk is
utilized and how the files are accessed.
Contiguous Allocation
If the blocks are allocated to a file in such a way that all the logical blocks of the file
get contiguous physical blocks on the hard disk, then such an allocation scheme is known
as contiguous allocation.
For example, consider a directory with three files. The starting block and the length of
each file are recorded in the directory table, and contiguous blocks are assigned to each
file as per its need.
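A minimal sketch in C (the directory entries below are hypothetical) showing how contiguous allocation turns a logical block number into a physical block with simple arithmetic, which is what makes random access cheap:

#include <stdio.h>

/* A contiguous-allocation directory entry: start block and length. */
struct dir_entry {
    const char *name;
    int start;   /* first physical block of the file */
    int length;  /* number of contiguous blocks      */
};

/* Hypothetical directory contents. */
static struct dir_entry directory[] = {
    { "mail",  0, 3 },
    { "list", 14, 4 },
    { "file", 19, 6 },
};

/* Physical block = start + logical block, as long as it lies inside the file. */
static int physical_block(const struct dir_entry *e, int logical_block)
{
    if (logical_block < 0 || logical_block >= e->length)
        return -1;   /* out of range */
    return e->start + logical_block;
}

int main(void)
{
    printf("logical block 2 of \"list\" is physical block %d\n",
           physical_block(&directory[1], 2));   /* 14 + 2 = 16 */
    return 0;
}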
Advantages
1. It is simple to implement.
2. We get excellent read performance.
3. It supports random access into files.
Disadvantages
1. The disk will become fragmented.
2. It may be difficult for a file to grow.
Linked List Allocation
In linked list allocation, each file is stored as a linked list of disk blocks; each block holds
a pointer to the next block of the file, so the blocks need not be contiguous.
Disadvantages
1. Random access is not provided.
2. Pointers require some space in the disk blocks.
3. None of the pointers in the linked list must be broken, otherwise the file will get
corrupted.
4. The blocks need to be traversed one by one.
The File Allocation Table (FAT) overcomes this drawback of linked list allocation. In this scheme, a
file allocation table is maintained which gathers all the disk block links. The table has
one entry for each disk block and is indexed by block number.
The file allocation table needs to be cached in order to reduce the number of head seeks.
The head no longer needs to traverse all the disk blocks in order to access one
successive block.
It simply accesses the file allocation table, reads the desired block entry from there and
accesses that block. This is how random access is accomplished using the
FAT. It is used by MS-DOS and pre-NT Windows versions.
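A minimal sketch in C (the table contents and block numbers are hypothetical) of following a file's chain through a file allocation table, where each entry holds the number of the next block and a sentinel marks the end of the file:

#include <stdio.h>

#define FAT_SIZE 16
#define FAT_EOF  -1    /* sentinel marking the last block of a file */

/* Hypothetical FAT: fat[i] holds the block that follows block i in its file. */
static int fat[FAT_SIZE];

int main(void)
{
    /* A file whose directory entry says "start block = 4"
       occupies blocks 4 -> 7 -> 2 -> 10 on the disk. */
    fat[4]  = 7;
    fat[7]  = 2;
    fat[2]  = 10;
    fat[10] = FAT_EOF;

    printf("blocks of the file:");
    for (int b = 4; b != FAT_EOF; b = fat[b])
        printf(" %d", b);        /* prints: 4 7 2 10 */
    printf("\n");
    return 0;
}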
Advantages
1. The whole disk block is used for data.
2. A bad disk block doesn't cause all successive blocks to be lost.
3. Random access is provided, although it is not too fast.
4. Only the FAT needs to be traversed in each file operation.
Disadvantages
1. Each disk block needs a FAT entry.
2. The FAT size may be very big depending upon the number of FAT entries.
3. The number of FAT entries can be reduced by increasing the block size, but this also
increases internal fragmentation.
Free Space Management
1. Bitmap
A bit vector is maintained with one bit for each disk block. If the block is empty, then the
bit is 1; otherwise it is 0. Initially all the blocks are empty, therefore each bit in the bitmap
vector contains 1.
As space allocation proceeds, the file system starts allocating blocks to the files and
setting the respective bits to 0.
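A minimal sketch in C (block count and helper names are hypothetical) of a bit-vector free-space manager: a set bit means the block is free, and allocation clears the bit of the first free block found:

#include <stdio.h>
#include <string.h>

#define NUM_BLOCKS 64

/* One bit per disk block: 1 = free, 0 = allocated. */
static unsigned char bitmap[NUM_BLOCKS / 8];

static int  is_free(int block)  { return (bitmap[block / 8] >> (block % 8)) & 1; }
static void set_used(int block) { bitmap[block / 8] &= ~(1u << (block % 8)); }

/* Find the first free block, mark it used, and return its number. */
static int allocate_block(void)
{
    for (int b = 0; b < NUM_BLOCKS; b++) {
        if (is_free(b)) {
            set_used(b);
            return b;
        }
    }
    return -1;   /* disk full */
}

int main(void)
{
    memset(bitmap, 0xFF, sizeof bitmap);   /* initially every block is free       */
    set_used(0);                           /* pretend block 0 holds the boot record */

    printf("allocated block %d\n", allocate_block());  /* prints 1 */
    printf("allocated block %d\n", allocate_block());  /* prints 2 */
    return 0;
}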
2. Linked List
It is another approach to free space management. This approach suggests linking
together all the free blocks and keeping a pointer in the cache which points to the first
free block.
Therefore, all the free blocks on the disk are linked together with pointers.
Whenever a block gets allocated, its previous free block is linked to its next free
block.
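A minimal sketch in C (the node structure is hypothetical and kept in memory for illustration; in a real file system the "next" pointer lives inside the free disk block itself) of a free list: the head pointer names the first free block, and allocation simply takes the head and advances it:

#include <stdio.h>
#include <stdlib.h>

/* An in-memory node standing in for one free disk block. */
struct free_block {
    int block_no;
    struct free_block *next;
};

static struct free_block *free_list_head = NULL;

/* Put a block on the front of the free list. */
static void release_block(int block_no)
{
    struct free_block *b = malloc(sizeof *b);
    b->block_no = block_no;
    b->next = free_list_head;
    free_list_head = b;
}

/* Take the first free block off the list. */
static int allocate_block(void)
{
    if (free_list_head == NULL)
        return -1;                    /* no free space */
    struct free_block *b = free_list_head;
    int block_no = b->block_no;
    free_list_head = b->next;
    free(b);
    return block_no;
}

int main(void)
{
    release_block(7);
    release_block(3);                                  /* free list: 3 -> 7 */
    printf("allocated block %d\n", allocate_block());  /* 3 */
    printf("allocated block %d\n", allocate_block());  /* 7 */
    return 0;
}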
Directory Implementation
There are a number of algorithms by which directories can be implemented, and the
selection of an appropriate directory implementation algorithm may significantly affect
the performance of the system.
The directory implementation algorithms are classified according to the data structure
they use. There are mainly two algorithms which are used these days.
1. Linear List
In this algorithm, all the files in a directory are maintained as a singly linked list. Each file
entry contains the pointers to the data blocks which are assigned to it and a pointer to the
next file in the directory.
Characteristics
1. When a new file is created, the entire list is checked to see whether the new file
name matches an existing file name. If it doesn't, the file can be created at the
beginning or at the end of the list. Therefore, searching for a unique name is a big
concern, because traversing the whole list takes time.
2. The list needs to be traversed for every operation (creation, deletion,
updating, etc.) on the files, therefore the system becomes inefficient.
2. Hash Table
To overcome the drawbacks of the singly linked list implementation of directories, there is an
alternative approach, the hash table. This approach suggests using a hash table along
with the linked lists.
A key-value pair is generated for each file in the directory and stored in the hash
table. The key can be determined by applying a hash function to the file name, while
the value points to the corresponding file stored in the directory.
Searching now becomes efficient, because the entire list is not searched on every
operation. Only the hash table entries are checked using the key, and if an
entry is found, the corresponding file is fetched using the value.
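A minimal sketch in C (the hash function, table size and file names are hypothetical) of a hash-table directory: the file name is hashed to a bucket, and each bucket chains the entries that collide:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 8

struct dir_entry {
    char name[32];
    int  start_block;           /* where the file's data begins  */
    struct dir_entry *next;     /* chain for colliding names     */
};

static struct dir_entry *table[TABLE_SIZE];

/* Simple illustrative hash over the file name. */
static unsigned hash(const char *name)
{
    unsigned h = 0;
    while (*name)
        h = h * 31 + (unsigned char)*name++;
    return h % TABLE_SIZE;
}

static void add_file(const char *name, int start_block)
{
    struct dir_entry *e = malloc(sizeof *e);
    strncpy(e->name, name, sizeof e->name - 1);
    e->name[sizeof e->name - 1] = '\0';
    e->start_block = start_block;
    unsigned h = hash(name);
    e->next = table[h];
    table[h] = e;
}

static struct dir_entry *lookup(const char *name)
{
    for (struct dir_entry *e = table[hash(name)]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;   /* no such file in this directory */
}

int main(void)
{
    add_file("notes.txt", 14);
    add_file("song.mp3", 27);
    struct dir_entry *e = lookup("notes.txt");
    if (e != NULL)
        printf("%s starts at block %d\n", e->name, e->start_block);
    return 0;
}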
Disk Scheduling
As we know, a process needs two types of time, CPU time and I/O time. For I/O, it
requests the operating system to access the disk.
However, the operating system must be fair enough to satisfy each request and, at the
same time, it must maintain the efficiency and speed of process execution.
The technique that the operating system uses to determine which request is to be
satisfied next is called disk scheduling.
Seek Time
Seek time is the time taken to position the disk arm on the specified track where the
read/write request will be satisfied.
Rotational Latency
It is the time taken by the desired sector to rotate to the position where it can be
accessed by the R/W head.
Transfer Time
It is the time taken to transfer the data.
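As a small illustration with hypothetical numbers: if the average seek time is 5 ms, the disk rotates at 7200 RPM (so one rotation takes about 8.33 ms and the average rotational latency is about half of that, 4.17 ms), and transferring the requested block takes 0.1 ms, then the average access time is roughly 5 + 4.17 + 0.1 = about 9.3 ms.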
Swapping
The purpose of swapping in an operating system is to access the data present on the
hard disk and bring it into RAM so that the application programs can use it. The thing to
remember is that swapping is used only when the data is not present in RAM.
Although the process of swapping affects the performance of the system, it helps to run
larger processes and more than one process. This is the reason why swapping is also
referred to as memory compaction.
The concept of swapping is divided into two further concepts: swap-in and swap-out.
o Swap-out is the method of removing a process from RAM and adding it to the hard
disk.
o Swap-in is the method of removing a program from the hard disk and putting it back
into the main memory (RAM).
Example: Suppose the user process's size is 2048 KB and it is a standard hard disk where
swapping has a data transfer rate of 1 Mbps. Now we will calculate how long it will take
to transfer it from main memory to secondary memory.
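Assuming, as is common in such examples, that the 1 Mbps rate is read as 1024 KB per second, the time is:
Transfer time = process size / transfer rate = 2048 KB / 1024 KB per second = 2 seconds (2000 milliseconds).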
Advantages of Swapping
1. It helps the CPU to manage multiple processes within a single main memory.
2. It helps to create and use virtual memory.
3. Swapping allows the CPU to perform multiple tasks simultaneously. Therefore,
processes do not have to wait very long before they are executed.
4. It improves the main memory utilization.
Disadvantages of Swapping
1. If the computer system loses power, the user may lose all information related to
the program in case of substantial swapping activity.
2. If the swapping algorithm is not good, the composite method can increase the
number of page faults and decrease the overall processing performance.
Note:
o In a single tasking operating system, only one process occupies the user program
area of memory and stays in memory until the process is complete.
o In a multitasking operating system, a situation arises when all the active
processes cannot fit in the main memory at the same time; then a process is swapped
out of the main memory so that other processes can enter it.