
UNIT IV
FILE MANAGEMENT SYSTEM
File Systems
The file system is the part of the operating system responsible for file
management. It provides mechanisms to store data and to access file contents,
including both data and programs. Some operating systems, such as the
Linux-based Ubuntu, treat everything as a file.

The file system takes care of the following issues:

o File Structure

We have seen various data structures in which a file can be
stored. The task of the file system is to maintain an optimal file
structure.

o Recovering Free Space

Whenever a file is deleted from the hard disk, free space is
created on the disk. Many such gaps may exist, and they need to be
recovered so that they can be reallocated to other files.

o Disk Space Assignment to Files

A major concern is deciding where to store files on the hard
disk. There are various allocation methods for this, which are
covered later in this unit.

o Tracking Data Location

A file may not fit within a single block; it can be stored in
non-contiguous blocks on the disk. We need to keep track of all the
blocks on which parts of the file reside.

Secondary Storage and Disk Scheduling Algorithms

Secondary storage devices are those whose memory is non-volatile, meaning the
stored data remains intact even when the system is turned off. Here are a few
things worth noting about secondary storage.

• Secondary storage is also called auxiliary storage.

• Secondary storage is less expensive than primary memory such as RAM.

• The speed of secondary storage is also lower than that of primary storage.

• Hence, data which is accessed less frequently is kept in secondary storage.

• A few examples are magnetic disks, magnetic tapes, removable thumb drives, etc.

Magnetic Disk Structure


In modern computers, most secondary storage is in the form of magnetic disks.
Hence, knowing the structure of a magnetic disk is necessary to understand how
data on the disk is accessed by the computer.

Structure of a magnetic disk

A magnetic disk contains several platters. Each platter is divided into circular
tracks. Tracks near the centre are shorter than tracks farther from the centre.
Each track is further divided into sectors, as shown in the figure.
Tracks at the same distance from the centre across all platters form a cylinder.
A read-write head is used to read data from a sector of the magnetic disk.
The speed of the disk is measured in two parts:

• Transfer rate: the rate at which data moves from the disk to the computer.

• Random access time: the sum of the seek time and the rotational latency.

Seek time is the time taken by the arm to move the head to the required track.
Rotational latency is the time taken by the required sector to rotate to a
position under the read-write head.
Even though the disk is physically arranged as sectors and tracks, the data is
logically addressed as an array of fixed-size blocks. The size of a block can be
512 or 1024 bytes. Each logical block is mapped sequentially to a sector on the
disk, so each sector on the disk has a logical address.
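The sequential mapping from logical block addresses to physical disk geometry can be sketched as below. This is a minimal model, assuming the classic simplified geometry in which blocks are numbered sector by sector within a track and track by track within a cylinder; real drives hide their geometry behind the block interface.

```python
def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Map a logical block address (LBA) to a (cylinder, head, sector)
    triple under the simplified sequential layout. Sectors are
    conventionally numbered starting at 1; cylinders and heads at 0."""
    cylinder = lba // (heads_per_cylinder * sectors_per_track)
    head = (lba // sectors_per_track) % heads_per_cylinder
    sector = (lba % sectors_per_track) + 1
    return cylinder, head, sector
```

For example, with a hypothetical geometry of 16 heads and 63 sectors per track, block 0 maps to (0, 0, 1), and block 63 starts the second track of the same cylinder.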

File Access Methods


Let's look at various ways to access files stored in secondary memory.

Sequential Access

Most operating systems access files sequentially; in other words, most files
need to be accessed sequentially by the operating system.
In sequential access, the OS reads the file word by word. A pointer is
maintained which initially points to the base address of the file. When the user
reads the first word of the file, the pointer provides that word and then
advances by one word. This process continues till the end of the file.

Modern systems do provide direct access and indexed access, but the most used
method is sequential access, because most files, such as text files, audio
files, and video files, need to be accessed sequentially.
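The pointer-based scheme described above can be sketched as a generator. This is a toy model: real file systems operate on bytes and disk blocks, not Python lists.

```python
def read_sequential(records):
    """Yield records one at a time, advancing a read pointer,
    mirroring how the OS reads a sequential file word by word."""
    pointer = 0              # initially points at the file's base
    while pointer < len(records):
        yield records[pointer]
        pointer += 1         # advance by one record after each read
```

Iterating over `read_sequential(["alpha", "beta", "gamma"])` returns the records strictly in file order, which is exactly the access pattern sequential files support.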

Direct Access
Direct access is mostly required in database systems. In most cases we need
filtered information from the database, and sequential access can be very slow
and inefficient there.

Suppose every block of storage holds 4 records and we know that the record we
need is stored in the 10th block. Sequential access would be inefficient here
because it would traverse all the preceding blocks to reach the needed record.

Direct access gives the required result directly, even though the operating
system has to perform some extra work, such as determining the desired block
number. It is generally used in database applications.
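The "determining the desired block number" step is simple arithmetic when records are fixed-size, as in the 4-records-per-block example above. A minimal sketch, assuming zero-based record and block numbering:

```python
def locate_record(record_number, records_per_block=4):
    """Compute which block holds a record, and its offset within
    that block, under direct access with fixed-size records packed
    records_per_block to a block (zero-based numbering)."""
    block = record_number // records_per_block
    offset = record_number % records_per_block
    return block, offset
```

With 4 records per block, record 39 is the last record of block 9 and record 40 opens block 10; the OS can then seek straight to that block without touching the earlier ones.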
Indexed Access
If a file can be sorted on any of its fields, then an index can be assigned to a
group of records, and a particular record can be accessed through its index. The
index is simply the address of a record in the file.

With indexed access, searching a large database becomes quick and easy, but some
extra space in memory is needed to store the index.
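The space-for-speed trade-off above can be sketched with a dictionary standing in for the index; the record layout and field names here are made up for illustration.

```python
def build_index(records, key_field):
    """Build an index mapping each record's key to its position
    (its 'address' in the file); the extra memory buys O(1) lookup
    instead of a sequential scan."""
    return {rec[key_field]: pos for pos, rec in enumerate(records)}

# hypothetical file of records sorted on the 'id' field
records = [{"id": 3, "name": "a"}, {"id": 7, "name": "b"}, {"id": 9, "name": "c"}]
index = build_index(records, "id")
```

Looking up `index[7]` returns the record's address (position 1) directly, so the record itself can be fetched without scanning the file.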

Directory Structure
What is a directory?
A directory can be defined as a listing of related files on the disk. The
directory may store some or all of the file attributes.

To get the benefit of different file systems on different operating systems, a
hard disk can be divided into a number of partitions of different sizes. The
partitions are also called volumes or minidisks.

Each partition must have at least one directory in which all the files of the
partition can be listed. A directory entry is maintained for each file in the
directory, storing all the information related to that file.

A directory can be viewed as a file which contains the metadata of a bunch of
files.

Every directory supports a number of common operations on files:


1. File Creation
2. Search for the file
3. File deletion
4. Renaming the file
5. Traversing Files
6. Listing of files

Single Level Directory


The simplest method is to have one big list of all the files on the disk. The
entire system contains only one directory, which lists all the files present in
the file system, with one entry per file.

This type of directory can be used in a simple system.

Advantages
1. Implementation is very simple.
2. If the number of files is small, searching is fast.
3. File creation, searching, and deletion are very simple since there is only one directory.

Two Level Directory


In a two-level directory system, we can create a separate directory for each
user. There is one master directory which contains a separate directory
dedicated to each user. For each user, there is a different directory at the
second level, containing that user's files. The system doesn't let a user enter
another user's directory without permission.
Characteristics of a two-level directory system
1. Each file has a path name of the form /user-name/file-name.
2. Different users can have files with the same name.
3. Searching becomes more efficient as only one user's list needs to be traversed.
4. Files of the same kind cannot be grouped into a single directory for a
particular user.

The operating system maintains a variable, PWD, which contains the present
working directory (including the current user name) so that searching can be
done appropriately.

Tree Structured Directory


In a tree-structured directory system, any directory entry can be either a file
or a subdirectory. This structure overcomes the drawbacks of the two-level
directory system: similar kinds of files can now be grouped in one directory.

Each user has their own directory and cannot enter another user's directory. A
user has permission to read the root's data but cannot write or modify it; only
the system administrator has complete access to the root directory.

Searching is more efficient in this directory structure. The concept of a
current working directory is used, and a file can be accessed by two types of
path: relative or absolute.

An absolute path is the path of a file with respect to the root directory of the
system, while a relative path is the path with respect to the current working
directory. In tree-structured directory systems, the user is given the privilege
to create files as well as directories.
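The difference between the two path types can be illustrated with Python's posixpath module. This assumes POSIX-style path names, and the directory names are made up:

```python
import posixpath

cwd = "/home/alice/projects"     # hypothetical current working directory
relative = "docs/notes.txt"      # addressed from the working directory
absolute = posixpath.join(cwd, relative)
# `absolute` names the same file, addressed from the root directory
```

Both paths reach the same file; the relative form is only meaningful once the current working directory is known, while the absolute form starts at the root and is unambiguous.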
File System Structure
A file system provides efficient access to the disk by allowing data to be
stored, located, and retrieved in a convenient way. A file system must be able
to store a file, locate it, and retrieve it.

Most operating systems use a layered approach for every task, including file
systems. Every layer of the file system is responsible for some of the
activities.

The image shown below elaborates how the file system is divided into different
layers, and the functionality of each layer.
o When an application program asks for a file, the request is first directed to
the logical file system. The logical file system contains the metadata of the
file and the directory structure. If the application program doesn't have the
required permissions for the file, this layer throws an error. The logical
file system also verifies the path to the file.
o Generally, files are divided into various logical blocks. Files are stored on
and retrieved from the hard disk, which is divided into tracks and sectors.
Therefore, to store and retrieve files, the logical blocks must be mapped to
physical blocks. This mapping is done by the file organization module, which
is also responsible for free space management.
o Once the file organization module has decided which physical block the
application program needs, it passes this information to the basic file
system. The basic file system is responsible for issuing the commands to I/O
control to fetch those blocks.
o I/O control contains the code through which the hard disk can be accessed.
This code is known as the device driver. I/O control is also responsible for
handling interrupts.

Allocation Methods
There are various methods which can be used to allocate disk space to files.
Selection of an appropriate allocation method significantly affects the
performance and efficiency of the system. An allocation method determines how
the disk is utilized and how files are accessed.

The following methods can be used for allocation.


1. Contiguous Allocation.
2. Extents
3. Linked Allocation
4. Clustering
5. FAT
6. Indexed Allocation
7. Linked Indexed Allocation
8. Multilevel Indexed Allocation
9. Inode

We will discuss three of the most used methods in detail.

Contiguous Allocation
If blocks are allocated to a file in such a way that all the logical blocks of
the file get contiguous physical blocks on the hard disk, the allocation scheme
is known as contiguous allocation.

In the image shown below, there are three files in the directory. The starting
block and the length of each file are mentioned in the table, and we can check
in the table that contiguous blocks are assigned to each file as per its need.
Advantages
1. It is simple to implement.
2. It gives excellent read performance.
3. It supports random access into files.

Disadvantages
1. The disk becomes fragmented.
2. It may be difficult for a file to grow.
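Finding a contiguous run of free blocks is the core step of this scheme. A minimal first-fit sketch over a free-block bitmap (one assumed strategy; an allocator could equally use best-fit or worst-fit):

```python
def allocate_contiguous(bitmap, length):
    """First-fit search for `length` contiguous free blocks.
    bitmap[i] is True when block i is free. Marks the run allocated
    and returns its starting block, or None when no run fits."""
    run = 0
    for i, free in enumerate(bitmap):
        run = run + 1 if free else 0
        if run == length:
            start = i - length + 1
            for j in range(start, i + 1):
                bitmap[j] = False     # mark the whole run as allocated
            return start
    return None                       # external fragmentation: no run big enough
```

Repeated allocations pack files one after another from the start of the disk, which is exactly why the scheme reads fast but fragments as files are deleted and grown.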

Linked List Allocation


Linked list allocation solves all the problems of contiguous allocation. In
linked list allocation, each file is considered as a linked list of disk blocks.
The disk blocks allocated to a particular file need not be contiguous on the
disk. Each disk block allocated to a file contains a pointer which points to the
next disk block of the same file.
Advantages
1. There is no external fragmentation with linked allocation.
2. Any free block can be utilized in order to satisfy the file block requests.
3. File can continue to grow as long as the free blocks are available.
4. Directory entry will only contain the starting block address.

Disadvantages
1. Random access is not provided.
2. Pointers consume some space in each disk block.
3. If any pointer in the linked list is broken, the rest of the file is lost,
so the file gets corrupted.
4. Each block must be traversed in turn to reach a later block.
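Reading a linked-allocated file means chasing the per-block pointers, which the following sketch models with a dictionary mapping block numbers to (data, next-block) pairs; the block numbers used are made up.

```python
def read_linked_file(blocks, start):
    """Follow per-block 'next' pointers to read a file's blocks in
    order. `blocks` maps block number -> (data, next_block); the
    chain ends where next_block is None. Note there is no way to
    reach block k without first visiting every earlier block."""
    data, block = [], start
    while block is not None:
        payload, nxt = blocks[block]
        data.append(payload)
        block = nxt
    return data
```

The directory only needs to record the starting block (9 in the test below); everything else is discovered by traversal, which is precisely why random access is slow.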

File Allocation Table


The main disadvantage of linked list allocation is that random access to a
particular block is not provided: in order to access a block, we need to
traverse all its previous blocks.

The File Allocation Table (FAT) overcomes this drawback of linked list
allocation. In this scheme, a file allocation table is maintained which gathers
all the disk block links. The table has one entry for each disk block and is
indexed by block number.

The file allocation table needs to be cached in order to reduce the number of
head seeks. The head then doesn't need to traverse all the disk blocks in order
to reach the next block of a file.

It simply accesses the file allocation table, reads the desired block entry from
there, and accesses that block. This is how random access is accomplished using
FAT. FAT is used by MS-DOS and pre-NT Windows versions.
Advantages
1. The whole disk block is used for data, since the pointers live in the table.
2. A bad disk block doesn't cause all successive blocks to be lost.
3. Random access is provided, although it's not too fast.
4. Only the FAT needs to be traversed in each file operation.

Disadvantages
1. Each disk block needs a FAT entry.
2. The FAT can be very big, depending on the number of FAT entries.
3. The number of FAT entries can be reduced by increasing the block size, but
that also increases internal fragmentation.
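The key difference from plain linked allocation is that the chain now lives in the table, so walking it touches only the cached FAT, never the data blocks. A minimal sketch (block numbers and the -1 end-of-file marker are illustrative assumptions):

```python
def fat_chain(fat, start, end_marker=-1):
    """Walk a File Allocation Table: fat[block] gives the next block
    of the file, and end_marker flags the last block. The whole walk
    reads only the (cached) table, not the disk blocks themselves."""
    chain, block = [], start
    while block != end_marker:
        chain.append(block)
        block = fat[block]
    return chain
```

Given a directory entry recording start block 217, the file's full block list falls out of the table lookup alone, after which any block in the chain can be read directly.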

Free Space Management


A file system is responsible for allocating free blocks to files; therefore it
has to keep track of all the free blocks present on the disk. There are mainly
two approaches by which the free blocks on the disk are managed.
1. Bit Vector
In this approach, the free space list is implemented as a bit map vector, with
one bit representing each block.

If a block is free, its bit is 1; otherwise it is 0. Initially all blocks are
free, so every bit in the bit map vector is 1.

As space allocation proceeds, the file system allocates blocks to files and sets
the corresponding bits to 0.
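A sketch of the bit-vector approach, keeping the convention used above (1 = free, 0 = allocated) and representing the vector as a plain Python list of bits:

```python
def first_free(bitmap):
    """Scan the bit vector for the first free block (bit == 1)."""
    for i, bit in enumerate(bitmap):
        if bit == 1:
            return i
    return None                 # disk is full

def allocate_block(bitmap):
    """Allocate the first free block by clearing its bit to 0;
    returns the block number, or None when no block is free."""
    i = first_free(bitmap)
    if i is not None:
        bitmap[i] = 0
    return i
```

Real implementations pack the bits into machine words so a whole word of zeros can be skipped in one comparison, but the logic is the same.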

2. Linked List
This is another approach to free space management. It links together all the
free blocks and keeps a pointer in the cache which points to the first free
block.

Thus all the free blocks on the disk are linked together with pointers.
Whenever a block gets allocated, its previous free block is linked to its next
free block.
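Allocation under this scheme is just popping the head of the list, as the sketch below shows; here a dictionary stands in for the on-disk "next free block" pointers, and the block numbers are made up.

```python
def allocate_from_free_list(next_free, head):
    """Pop the head of a linked free-block list. `next_free` maps each
    free block to the next free block (None at the end of the list);
    `head` is the cached pointer to the first free block. Returns the
    allocated block and the new head of the free list."""
    if head is None:
        return None, None              # no free blocks left
    new_head = next_free.pop(head)     # unlink the allocated block
    return head, new_head
```

Unlike the bit vector, this needs no scan to find a free block, but enumerating or coalescing free space requires walking the whole chain.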

Directory Implementation
There are a number of algorithms by which directories can be implemented, and
the choice of directory implementation algorithm may significantly affect the
performance of the system.

Directory implementation algorithms are classified according to the data
structure they use. There are mainly two algorithms in use these days.

1. Linear List
In this algorithm, all the files in a directory are maintained as a singly
linked list. Each entry contains pointers to the data blocks assigned to its
file and to the next file in the directory.

Characteristics

1. When a new file is created, the entire list is checked to see whether the
new file name matches an existing file name. If it doesn't, the file can be
created at the beginning or at the end of the list. Searching for a unique
name is therefore a big concern, because traversing the whole list takes
time.
2. The list needs to be traversed for every operation (creation, deletion,
updating, etc.) on the files, so the system becomes inefficient.
2. Hash Table
To overcome the drawbacks of the singly linked list implementation of
directories, there is an alternative approach: the hash table. This approach
uses a hash table along with the linked lists.

A key-value pair is generated and stored in the hash table for each file in the
directory. The key is determined by applying a hash function to the file name,
and the value points to the corresponding file entry in the directory.

Searching now becomes efficient because the entire list is not searched on every
operation. Only the hash table is probed using the key, and if an entry is
found, the corresponding file is fetched using the value.
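A sketch of the hash-table directory, with Python's dict playing the role of the hash table (its built-in hashing and collision handling stand in for the hash function described above); the file names are made up.

```python
def make_directory(filenames):
    """Build a hash-table directory: each file name keys an entry
    whose value is the position of the file's record, so lookup
    avoids scanning a linear list."""
    return {name: entry for entry, name in enumerate(filenames)}

def lookup(directory, name):
    """O(1) average-case lookup; returns None when the file is absent."""
    return directory.get(name)
```

Creation and deletion become dict insertions and removals, each touching only the hashed bucket rather than the whole directory list.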
Disk Scheduling
As we know, a process needs two types of time: CPU time and I/O time. For I/O,
it requests the operating system to access the disk.

The operating system must be fair enough to satisfy each request, and at the
same time it must maintain the efficiency and speed of process execution.

The technique the operating system uses to determine which request to satisfy
next is called disk scheduling.

Let's discuss some important terms related to disk scheduling.

Seek Time
Seek time is the time taken to move the disk arm to the track where the
read/write request will be satisfied.

Rotational Latency
It is the time taken by the desired sector to rotate to a position under the
read/write head.

Transfer Time
It is the time taken to transfer the data.

Disk Access Time


Disk access time is given as,

Disk Access Time = Seek Time + Rotational Latency + Transfer Time

Disk Response Time


It is the average time spent by each request waiting for its I/O operation to be serviced.

Purpose of Disk Scheduling


The main purpose of a disk scheduling algorithm is to select a request from the
queue of I/O requests and decide when that request will be processed.

Goal of Disk Scheduling Algorithm


o Fairness
o High throughput
o Minimal travel time of the head
Disk Scheduling Algorithms
The list of various disk scheduling algorithms is given below. Each algorithm
has advantages and disadvantages, and the limitations of each algorithm led to
the evolution of the next.

o FCFS scheduling algorithm


o SSTF (shortest seek time first) algorithm
o SCAN scheduling
o C-SCAN scheduling
o LOOK Scheduling
o C-LOOK scheduling

Disk Scheduling Algorithms


On a typical multiprogramming system, there will usually be multiple disk access
requests pending at any point in time, so those requests must be scheduled to
achieve good efficiency. Disk scheduling is similar to process scheduling. Some
of the disk scheduling algorithms are described below.

First Come First Serve


This algorithm serves requests in the order in which they arrive. Let's take an
example where the queue holds requests with the following cylinder numbers:
98, 183, 37, 122, 14, 124, 65, 67
Assume the head is initially at cylinder 56. The head moves in the given order in the
queue i.e., 56→98→183→...→67.
Shortest Seek Time First (SSTF)
Here the request closest to the current head position is chosen first. Consider
the previous example, where the disk queue is:
98, 183, 37, 122, 14, 124, 65, 67
Assume the head is initially at cylinder 56. The closest cylinder to 56 is 65,
then the next nearest is 67, then 37, then 14, and so on.
SCAN algorithm
This algorithm is also called the elevator algorithm because of its behavior.
First the head moves in one direction (say backward) and covers all the
requests in its path. Then it reverses and covers the remaining requests in the
other direction. This behavior is similar to that of an elevator. Let's take the
previous example:
98, 183, 37, 122, 14, 124, 65, 67
Assume the head is initially at cylinder 56. The head moves in the backward
direction and accesses 37 and 14. Then it reverses direction and accesses the
remaining cylinders as they come in its path.
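The FCFS and SSTF orderings above can be compared by total head movement. A minimal sketch, using the example queue with the head starting at cylinder 56 (SCAN is omitted because its total also depends on whether the head runs to the edge of the disk before reversing):

```python
def fcfs_seek(requests, head):
    """Total head movement when requests are served in arrival order."""
    total = 0
    for r in requests:
        total += abs(head - r)
        head = r
    return total

def sstf_seek(requests, head):
    """Total head movement when the closest pending request is
    always served next (shortest seek time first)."""
    pending, total = list(requests), 0
    while pending:
        nearest = min(pending, key=lambda r: abs(head - r))
        total += abs(head - nearest)
        head = nearest
        pending.remove(nearest)
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
```

On this queue, FCFS moves the head 637 cylinders in total while SSTF needs only 233, which is why SSTF generally gives better throughput (at the risk of starving far-away requests).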

Swapping in Operating System


Swapping is a memory management scheme in which a process can be temporarily
swapped from main memory to secondary memory so that main memory can be made
available to other processes. It is used to improve main memory utilization. The
place in secondary memory where a swapped-out process is stored is called swap
space.

The purpose of swapping is to access data present on the hard disk and bring it
into RAM so that application programs can use it. The thing to remember is that
swapping is used only when the data is not present in RAM.
Although swapping affects the performance of the system, it helps run larger
processes, and more of them, concurrently. For this reason swapping is also
sometimes described as a technique for memory compaction.

The concept of swapping is divided into two operations: swap-in and swap-out.

o Swap-out removes a process from RAM and places it on the hard disk.
o Swap-in removes a process from the hard disk and puts it back into main
memory (RAM).

Example: Suppose the user process's size is 2048 KB and the hard disk used for
swapping has a data transfer rate of 1 MB per second. Let's calculate how long
it takes to transfer the process from main memory to secondary memory.

1. User process size is 2048 KB.
2. Data transfer rate is 1 MB/s = 1024 KB/s.
3. Time = process size / transfer rate
4.      = 2048 / 1024
5.      = 2 seconds
6.      = 2000 milliseconds
7. Taking both swap-out and swap-in into account, the process takes 4000
milliseconds.
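The arithmetic above generalizes to any process size and transfer rate; a minimal sketch, taking the rate in KB per second (1024 KB/s, as the example's arithmetic implies) and ignoring seek and rotational latency:

```python
def swap_time_ms(process_kb, rate_kb_per_s):
    """Time to move a process one way between RAM and swap space,
    in milliseconds (transfer time only; seek and rotational
    latency are ignored for simplicity)."""
    return process_kb / rate_kb_per_s * 1000

one_way = swap_time_ms(2048, 1024)   # the 2-second transfer from the example
round_trip = 2 * one_way             # swap-out plus swap-in
```

Doubling the process size or halving the transfer rate doubles the swap time, which is why swapping large processes over a slow disk hurts responsiveness.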

Advantages of Swapping
1. It helps the CPU manage multiple processes within a single main memory.
2. It helps create and use virtual memory.
3. Swapping allows the CPU to work on multiple tasks concurrently, so
processes do not have to wait very long before they are executed.
4. It improves main memory utilization.

Disadvantages of Swapping
1. If the computer system loses power during substantial swapping activity,
the user may lose all information related to the program.
2. If the swapping algorithm is not good, the overall scheme can increase the
number of page faults and decrease overall processing performance.

Note:

o In a single-tasking operating system, only one process occupies the user
program area of memory and stays in memory until it completes.
o In a multitasking operating system, when all the active processes cannot fit
in main memory together, a process is swapped out of main memory so that
other processes can enter it.
