OS R22 2-2 UNIT-5

UNIT - V

File System Interface and Operations

What is a File System?


A file system is a method an operating system uses to store, organize, and
manage files and directories on a storage device. Some common types of file
systems include:
1. FAT (File Allocation Table): An older file system used by older versions of
Windows and other operating systems.
2. NTFS (New Technology File System): A modern file system used by
Windows. It supports features such as file and folder permissions,
compression, and encryption.
3. ext (Extended File System): A file system commonly used on Linux and
Unix-based operating systems.
4. HFS (Hierarchical File System): An older file system used by Mac OS; its
successor HFS+ was the macOS default before APFS.
5. APFS (Apple File System): A new file system introduced by Apple for their
Macs and iOS devices.
File Attributes
• Different OSes keep track of different file attributes, including:
o Name - Some systems give special significance to names, and particularly to
extensions ( .exe, .txt, etc. ), and some do not. Some extensions are
significant to the OS ( .exe ), and others only to certain applications ( .jpg ).
o Identifier ( e.g. inode number )
o Type - Text, executable, other binary, etc.
o Location - where the file's data resides on the storage device.
o Size
o Protection
o Time & Date
o User ID
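
The attributes listed above can be inspected from a program. The following is a minimal sketch in Python, assuming a POSIX-like system; the helper name `describe` and the file name `notes.txt` are made up for illustration, and other systems may report some fields differently.

```python
import os
import stat
import tempfile
import time

def describe(path):
    """Collect the attributes the OS keeps for `path` (hypothetical helper)."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": st.st_ino,                      # inode number
        "type": "directory" if stat.S_ISDIR(st.st_mode) else "regular file",
        "size": st.st_size,                           # in bytes
        "protection": stat.filemode(st.st_mode),      # e.g. -rw-r--r--
        "modified": time.ctime(st.st_mtime),          # time & date
        "user_id": st.st_uid,                         # owner
    }

with tempfile.TemporaryDirectory() as d:
    demo_path = os.path.join(d, "notes.txt")
    with open(demo_path, "w") as f:
        f.write("hello")
    info = describe(demo_path)
    for key, value in info.items():
        print(f"{key:>10}: {value}")
```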

File Operations

• The file ADT supports many common operations:
o Creating a file
o Writing a file
o Reading a file
o Repositioning within a file
o Deleting a file
o Truncating a file
• Most OSes require that files be opened before access and closed after all
access is complete. Normally the programmer must open and close files
explicitly, but some rare systems open the file automatically at first access.
Information about currently open files is stored in an open file table,
containing for example:
o File pointer - records the current position in the file, for the next read or
write access.
o File-open count - how many times the file has been opened ( simultaneously
by different processes ) and not yet closed. When this counter reaches zero the
entry can be removed from the table.
o Disk location of the file.
o Access rights
• Some systems provide support for file locking.
o A shared lock is for reading only.
o An exclusive lock is for writing as well as reading.
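
The six operations above map directly onto ordinary library calls. The sketch below uses Python's built-in file objects; the file name `demo.txt` and the variable names are only illustrative. The file object plays the role of an open-file-table entry, since it carries the current file pointer.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")

# Create and write: mode "wb" creates the file if it does not exist.
with open(path, "wb") as f:
    f.write(b"0123456789")

# Read, reposition, truncate: the open file tracks the current position.
with open(path, "rb+") as f:
    first = f.read(3)     # file pointer moves from 0 to 3
    f.seek(7)             # reposition without reading
    tail = f.read()       # read from offset 7 to the end
    f.truncate(5)         # keep only the first 5 bytes

size_after_truncate = os.path.getsize(path)

# Delete: remove the directory entry, then the scratch directory.
os.remove(path)
os.rmdir(workdir)
exists_after_delete = os.path.exists(path)
```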
File Types

File type | Usual extension | Function
Executable | exe, com, bin | Ready-to-run machine-language program
Object | obj, o | Compiled, machine language, not linked
Source code | c, java, pas, asm, a | Source code in various languages
Batch | bat, sh | Commands to the command interpreter
Text | txt, doc | Textual data, documents
Word processor | wp, tex, rtf, doc | Various word-processor formats
Archive | arc, zip, tar | Related files grouped into one compressed file
Multimedia | mpeg, mov, rm | Audio/video information
Markup | xml, html, tex | Textual data and documents
Library | lib, a, so, dll | Libraries of routines for programmers
Print or view | gif, pdf, jpg | Format for printing or viewing an ASCII or binary file

File Access Methods

Let's look at various ways to access files stored in secondary memory.

Sequential Access

Most files are accessed sequentially. In sequential access, the OS reads the
file word by word, in order. A pointer is maintained which initially points to
the base address of the file. When the user reads the first word of the file,
the pointer supplies that word and then advances by one word. This process
continues until the end of the file.

Modern systems also provide direct access and indexed access, but sequential
access remains the most used method because most files, such as text, audio,
and video files, are naturally processed from beginning to end.

Direct Access

Direct access is mostly required by database systems. In most cases we need
only selected records from the database, and sequential access can be very slow
and inefficient for that.

Suppose every block of storage holds 4 records and we know that the record we
need is stored in the 10th block. Sequential access is a poor choice here,
because it must read through all the preceding blocks to reach the needed
record.

Direct access returns the required record immediately, at the cost of some extra
work such as computing the desired block number. It is generally used in
database applications.
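
The difference can be sketched with fixed-size records. The 16-byte record size, the record contents, and the function names below are assumptions made for illustration; only the "4 records per block, record in the 10th block" setup comes from the text.

```python
import os
import tempfile

RECORD_SIZE = 16          # bytes per record (assumed)
RECORDS_PER_BLOCK = 4     # as in the example above
BLOCK_SIZE = RECORD_SIZE * RECORDS_PER_BLOCK

def make_record(n):
    """Build a fixed-size record padded with dots."""
    return f"record-{n:04d}".encode().ljust(RECORD_SIZE, b".")

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    for n in range(48):   # 12 blocks of 4 records each
        f.write(make_record(n))

def read_sequential(path, target):
    """Read records in order until record number `target` is reached."""
    with open(path, "rb") as f:
        for reads in range(1, target + 2):
            rec = f.read(RECORD_SIZE)
    return rec, reads

def read_direct(path, block, slot):
    """Compute the byte offset and jump to it with a single seek."""
    with open(path, "rb") as f:
        f.seek(block * BLOCK_SIZE + slot * RECORD_SIZE)
        return f.read(RECORD_SIZE), 1

# First record of the 10th block: block index 9, record number 36.
seq_rec, seq_reads = read_sequential(path, 36)
dir_rec, dir_reads = read_direct(path, block=9, slot=0)
os.remove(path)
```

Both calls return the same record, but the sequential version performs 37 reads where the direct version performs one.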

Indexed Access

If a file can be sorted on one of its fields, an index can be built over the
records. A particular record can then be accessed through its index entry. The
index entry is nothing but the address of a record in the file.
With indexed access, searching a large database becomes very quick and easy, but
extra space is needed in memory to store the index.
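
A minimal sketch of the idea, assuming fixed-size records held in an in-memory file: the index is a dictionary from key to byte offset. The record contents and the `lookup` helper are hypothetical.

```python
import io

RECORD_SIZE = 20  # assumed fixed record size in bytes

# Hypothetical records, keyed by user name.
records = [("alice", "admin"), ("bob", "guest"), ("carol", "staff")]

data_file = io.BytesIO()
index = {}  # key -> byte offset of the record (the extra space the index costs)
for key, value in records:
    index[key] = data_file.tell()
    data_file.write(f"{key}:{value}".encode().ljust(RECORD_SIZE))

def lookup(key):
    """Use the index to seek straight to the record instead of scanning."""
    data_file.seek(index[key])
    return data_file.read(RECORD_SIZE).decode().rstrip()

result = lookup("carol")
```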

Directory Structure
Storage Structure

• A disk can be used in its entirety for a file system.
• Alternatively, a physical disk can be broken up into multiple partitions,
slices, or mini-disks, each of which becomes a virtual disk and can have its
own filesystem ( or be used for raw storage, swap space, etc. ).
• Or, multiple physical disks can be combined into one volume, i.e. a larger
virtual disk, with its own filesystem spanning the physical disks.

1) Single-level directory:

The single-level directory is the simplest directory structure. In it, all files
are contained in the same directory, which makes it easy to support and
understand.
A single-level directory has a significant limitation, however, when the number
of files increases or when the system has more than one user. Since all the
files are in the same directory, they must have unique names. If two users both
call their dataset test, the unique-name rule is violated.
2) Two-level directory:
As we have seen, a single-level directory often leads to confusion of file names
among different users. The solution to this problem is to create a separate
directory for each user.
In the two-level directory structure, each user has their own user file
directory (UFD). The UFDs have similar structures, but each lists only the files
of a single user. The system's master file directory (MFD) is indexed by user
name; it is searched whenever a user logs in or starts a job, and each entry
points to that user's UFD.

Two-Level Directory Structure

Advantages:
• Two or more files can have the same name as long as they belong to different
users, which is very helpful when there are multiple users.
• Users are isolated from one another, which prevents a user from accessing
other users' files.
• Searching for a file is easy, because only the user's own UFD is searched.
Disadvantages:
• The same isolation that provides security also prevents users from sharing
files with one another.
• Users can create their own files, but they cannot create subdirectories.
• The structure does not scale well, because a user cannot group files of the
same type together.
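
The two-level scheme can be modelled as a dictionary of dictionaries. This is only a toy sketch; the class `TwoLevelDirectory` and its method names are invented for illustration and do not correspond to any real file system API.

```python
class TwoLevelDirectory:
    """Toy model: the MFD maps user names to UFDs; a UFD maps names to data."""

    def __init__(self):
        self.mfd = {}                       # master file directory

    def add_user(self, user):
        self.mfd.setdefault(user, {})       # one UFD per user

    def create(self, user, name, contents=""):
        ufd = self.mfd[user]                # only this user's UFD is searched
        if name in ufd:
            raise FileExistsError(f"{user} already has a file named {name}")
        ufd[name] = contents

    def read(self, user, name):
        return self.mfd[user][name]

directory = TwoLevelDirectory()
directory.add_user("alice")
directory.add_user("bob")
directory.create("alice", "test", "alice's data")
directory.create("bob", "test", "bob's data")    # same name, no clash
```

Names only have to be unique within one UFD, which is why both users can own a file called test. There is no way to express a subdirectory or a file shared between UFDs, matching the disadvantages listed above.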
3) Tree Structure/ Hierarchical Structure:
The tree directory structure is the one most commonly used on personal
computers. Users can create both files and subdirectories, which was not
possible in the previous directory structures.
This directory structure resembles an upside-down tree, with the root directory
at the top. The root contains the directories of all users. Users can create
subdirectories and store files in their own directories.
A user does not have access to the root directory data and cannot modify it.
Even in this structure, a user does not have access to other users' directories.
The figure of the tree directory shows how each user's directory contains files
and subdirectories.

Tree/Hierarchical Directory Structure

Advantages:
• This directory structure allows subdirectories inside a directory.
• Searching is easier.
• Sorting files into important and unimportant ones becomes easier.
• This structure is more scalable than the other two directory structures
explained above.
Disadvantages:
• Because a user is not allowed to access other users' directories, files cannot
be shared among users.
• Because users can create subdirectories, searching may become complicated
as the number of subdirectories grows.
• Users cannot modify the root directory data.
• A file that does not fit in one directory may have to be placed in another
one.
4) Acyclic Graph Structure:
None of the three directory structures above allows one file to be accessed from
multiple directories. A file or subdirectory can be reached only through the
directory that contains it, not from any other directory.
This problem is solved in the acyclic graph directory structure, where a file in
one directory can be accessed from multiple directories. In this way, files can
be shared between users. The structure lets multiple directories point to a
particular directory or file with the help of links.
In the figure of this structure, a file is shared between multiple users. If any
user makes a change, it is visible to all users who share the file.

Acyclic Graph Structure

Advantages:
• Sharing of files and directories is allowed between multiple users.
• Searching becomes easier, because a file can be reached by more than one path.
• Flexibility is increased, as multiple users have sharing and editing access.
Disadvantages:
• Because of its complex structure, this directory structure is difficult to
implement.
• Users must be very cautious when editing or deleting a file, because the file
is accessed by multiple users.
• To delete a file permanently, all references to it must be deleted.
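
On UNIX-like systems, hard links are one way to get this behaviour. The sketch below assumes a POSIX file system that supports `os.link`; the directory names `alice` and `bob` and the file name `report.txt` are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    alice_dir = os.path.join(root, "alice")
    bob_dir = os.path.join(root, "bob")
    os.mkdir(alice_dir)
    os.mkdir(bob_dir)

    original = os.path.join(alice_dir, "report.txt")
    shared = os.path.join(bob_dir, "report.txt")

    with open(original, "w") as f:
        f.write("v1")

    os.link(original, shared)                 # second directory entry, same file
    links_after_share = os.stat(original).st_nlink

    with open(shared, "a") as f:              # bob edits through his link
        f.write(" + bob's edit")
    with open(original) as f:                 # alice sees the change
        seen_by_alice = f.read()

    os.unlink(original)                       # removes only one reference
    still_readable = os.path.exists(shared)
    links_after_unlink = os.stat(shared).st_nlink
```

The file's data is released only when the last link is removed, which is the "delete all the references" point made above.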

Protection in File System

Computer systems store a lot of users' information, and one objective of the
operating system is to keep that data safe from improper access. Protection can
be provided in a number of ways. For a single-user laptop, we might provide
protection by locking the computer in a desk drawer or file cabinet. Multi-user
systems need other mechanisms.

Types of Access:

Files that other users can reach need protection; files that no other user can
access need none. Protection mechanisms provide controlled access by limiting
the types of file access that can be made. Whether access is granted depends on
several factors, one of which is the type of access requested. Several different
types of operations can be controlled:

• Read – Reading from a file.
• Write – Writing or rewriting the file.
• Execute – Loading the file into memory and executing it.
• Append – Writing new information at the end of the existing file.
• Delete – Deleting the file and freeing its space for other data.
• List – Listing the name and attributes of the file.
Other operations, such as renaming, copying, and editing an existing file, can
also be controlled. There are many protection mechanisms. Each has its own
advantages and disadvantages and must be appropriate for the intended
application.
Access Control:

Different users may need different types of access to a file or directory. The
most general way to make access depend on the identity of the user is to
associate with each file and directory an access-control list (ACL), which
specifies user names and the types of access allowed for each user. The main
problem with access lists is their length. If we want to allow everyone to read
a file, we must list all the users with read access. This technique has two
undesirable consequences:

• Constructing such a list may be a tedious and unrewarding task, especially if
we do not know in advance the list of users in the system.
• The directory entry, previously of fixed size, must now be of variable size,
which complicates space management.

These problems can be resolved by using a condensed version of the access list.
To condense the access-control list, many systems recognize three
classifications of users in connection with each file:

• Owner – The user who created the file.
• Group – A set of users who share the file and need similar access.
• Universe – All other users in the system.

The most common recent approach is to combine access-control lists with the
more general owner, group, and universe access-control scheme. For example,
Solaris uses the three categories of access by default but allows access-control
lists to be added to specific files and directories when more fine-grained access
control is desired.
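
On UNIX-like systems the owner/group/universe scheme is stored as nine permission bits, three per class. The sketch below assumes a POSIX system; the `allowed` helper and the two lookup tables are hypothetical names introduced here, not part of any library.

```python
import os
import stat
import tempfile

# Bit layout assumed here: rwx for owner, then group, then universe.
CLASS_SHIFT = {"owner": 6, "group": 3, "universe": 0}
ACCESS_BIT = {"read": 4, "write": 2, "execute": 1}

def allowed(mode, user_class, access):
    """Check one access type for one user class in a permission mode."""
    return bool((mode >> CLASS_SHIFT[user_class]) & ACCESS_BIT[access])

fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o640)                 # owner: rw-, group: r--, universe: ---
mode = os.stat(path).st_mode
symbolic = stat.filemode(mode)        # symbolic form of the same bits
os.remove(path)

owner_can_write = allowed(mode, "owner", "write")
group_can_write = allowed(mode, "group", "write")
universe_can_read = allowed(mode, "universe", "read")
```

Nine bits per file replace an arbitrarily long access list, which is the condensation described above.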

Other Protection Approaches:

Access can also be controlled by associating a password with each file. If the
passwords are chosen randomly and changed often, this scheme may be effective
in limiting access to a file.

The use of passwords has a few disadvantages:

• The number of passwords that a user must remember may become large, which
makes the scheme impractical.
• If one password is used for all the files, then once it is discovered, all files
are accessible; protection is on an all-or-none basis.

File System Structure

A file is a collection of related information. The file system resides on
secondary storage and provides efficient and convenient access to the disk by
allowing data to be stored, located, and retrieved.

File system implementation in an operating system refers to how the file system
manages the storage and retrieval of data on a physical storage device such as a
hard drive, solid-state drive, or flash drive. The file system implementation
includes several components, including:

1. File System Structure: The file system structure refers to how the files and
directories are organized and stored on the physical storage device. This
includes the layout of the file system's data structures, such as the
directory structure, file allocation table, and inodes.

2. File Allocation: The file allocation mechanism determines how files are
allocated on the storage device. This can include allocation techniques such as
contiguous allocation, linked allocation, indexed allocation, or a combination
of these techniques.

3. Data Retrieval: The file system implementation determines how the data is
read from and written to the physical storage device. This includes strategies
such as buffering and caching to optimize file I/O performance.

4. Security and Permissions: The file system implementation includes features
for managing file security and permissions. This includes access control lists
(ACLs), file permissions, and ownership management.

5. Recovery and Fault Tolerance: The file system implementation includes
features for recovering from system failures and maintaining data integrity.
This includes techniques such as journaling and file system snapshots.

File system implementation is a critical aspect of an operating system as it
directly impacts the performance, reliability, and security of the system.
Different operating systems use different file system implementations based on
the specific needs of the system and the intended use cases. Some common file
systems used in operating systems include NTFS and FAT in Windows, and ext4
and XFS in Linux.
The file system is organized into many layers:

1. I/O control level – Device drivers act as an interface between the devices
and the OS and transfer data between disk and main memory. A driver takes a
block number as input and produces low-level, hardware-specific instructions
as output.

2. Basic file system – It issues general commands to the device driver to read
and write physical blocks on disk. It also manages the memory buffers and
caches. A buffer holds the contents of a disk block, and the cache stores
frequently used file system metadata.

3. File organization module – It has information about files, the location of
files, and their logical and physical blocks. Logical blocks are numbered from
0 to N, and these numbers do not match the physical block addresses, so this
module translates between the two. It also includes the free-space manager,
which tracks unallocated blocks.

4. Logical file system – It manages metadata, i.e. all details about a file
except its actual contents. It maintains the file structure via file control
blocks. A file control block (FCB) has information about a file: owner, size,
permissions, and location of the file contents.
File Allocation Methods

The allocation methods define how files are stored in disk blocks. There are
three main disk space or file allocation methods:
• Contiguous Allocation
• Linked Allocation
• Indexed Allocation

The main idea behind these methods is to provide:
• Efficient disk space utilization.
• Fast access to the file blocks.
All three methods have their own advantages and disadvantages, as discussed
below.

1. Contiguous Allocation

In this scheme, each file occupies a contiguous set of blocks on the disk. For
example, if a file requires n blocks and is given block b as the starting
location, then the blocks assigned to the file will be: b, b+1, b+2, ..., b+n-1.
This means that given the starting block address and the length of the file (in
terms of blocks required), we can determine the blocks occupied by the file.
The directory entry for a file with contiguous allocation contains
• Address of starting block
• Length of the allocated portion.
The file ‘mail’ in the following figure starts from block 19 with length = 6
blocks. Therefore, it occupies blocks 19, 20, 21, 22, 23, 24.
Advantages:

• Both sequential and direct access are supported. For direct access, the
address of the kth block (counting from k = 0) of a file that starts at block b
can easily be obtained as (b+k).

• Access is extremely fast, since the number of seeks is minimal because the
file blocks are contiguous.

Disadvantages:

• This method suffers from both internal and external fragmentation, which
makes it inefficient in terms of disk space utilization.

• Increasing the file size is difficult, because it depends on contiguous free
space being available next to the file at that moment.
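
The arithmetic for contiguous allocation is small enough to write out. The function names below are hypothetical; the start block 19 and length 6 are the values given for the file 'mail'.

```python
def contiguous_blocks(start, length):
    """All blocks of a file described by a (start, length) directory entry."""
    return list(range(start, start + length))

def kth_block(start, length, k):
    """Direct access: the kth block (k counted from 0) is simply start + k."""
    if not 0 <= k < length:
        raise IndexError("block index outside the file")
    return start + k

mail_blocks = contiguous_blocks(19, 6)
mail_block_3 = kth_block(19, 6, 3)
```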

2. Linked List Allocation

In this scheme, each file is a linked list of disk blocks which need not be
contiguous. The disk blocks can be scattered anywhere on the disk.
The directory entry contains a pointer to the starting and the ending file block.
Each block contains a pointer to the next block occupied by the file.

The file ‘jeep’ in the following image shows how the blocks are randomly
distributed. The last block (25) contains -1, indicating a null pointer; it does
not point to any other block.
Advantages:
• This is very flexible in terms of file size. File size can be increased easily,
since the system does not have to look for a contiguous chunk of disk space.
• This method does not suffer from external fragmentation, which makes it
relatively better in terms of disk space utilization.
Disadvantages:

• Because the file blocks are distributed randomly on the disk, a large number
of seeks is needed to access every block individually. This makes linked
allocation slower.

• It does not support random or direct access. We cannot directly access the
blocks of a file. Block k of a file can be reached only by traversing k blocks
sequentially from the starting block of the file via the block pointers.

• The pointers required in linked allocation incur some extra overhead.
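
The cost of reaching block k can be seen in a small sketch. The pointer chain below is assumed for illustration; only the final block 25 holding -1 is taken from the text, and the names `NEXT` and `kth_block` are hypothetical.

```python
# Assumed chain for illustration: each key is a disk block, each value is the
# pointer stored in that block. -1 marks the end of the file.
NEXT = {9: 16, 16: 1, 1: 10, 10: 25, 25: -1}

def kth_block(start, k):
    """Follow k pointers from the start block; return (block, pointers followed)."""
    block = start
    hops = 0
    for _ in range(k):
        block = NEXT[block]
        if block == -1:
            raise IndexError("block index past the end of the file")
        hops += 1
    return block, hops

third = kth_block(9, 3)
last = kth_block(9, 4)
```

Every hop stands for reading one disk block just to learn where the next one is, which is why direct access is not practical with this method.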

3. Indexed Allocation

In this scheme, a special block known as the index block contains the pointers
to all the blocks occupied by a file. Each file has its own index block. The ith
entry in the index block contains the disk address of the ith file block. The
directory entry contains the address of the index block, as shown in the image:
Advantages:

• This supports direct access to the blocks occupied by the file and therefore
provides fast access to the file blocks.

• It overcomes the problem of external fragmentation.

Disadvantages:

• The pointer overhead for indexed allocation is greater than for linked
allocation.

• For very small files, say files that span only 2-3 blocks, indexed allocation
still keeps one entire block (the index block) for the pointers, which is
inefficient in terms of disk space utilization. In linked allocation, by
contrast, we lose the space of only 1 pointer per block.
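
Both points, direct access and the overhead for small files, can be illustrated with a short sketch. The 512-byte block size, the 4-byte pointer size, and the block numbers are assumptions chosen for the example.

```python
BLOCK_SIZE = 512      # bytes per disk block (assumed)
POINTER_SIZE = 4      # bytes per block pointer (assumed)

# Hypothetical index block: entry i holds the disk address of file block i.
index_block = [9, 16, 1, 10, 25]

def kth_block(index_block, k):
    """Direct access: one lookup in the index block, no chain to follow."""
    if not 0 <= k < len(index_block):
        raise IndexError("block index outside the file")
    return index_block[k]

def pointer_overhead_bytes(file_blocks):
    """Space spent on pointers for a file of `file_blocks` data blocks."""
    return {
        "indexed": BLOCK_SIZE,                  # one whole index block
        "linked": file_blocks * POINTER_SIZE,   # one pointer per data block
    }

block_3 = kth_block(index_block, 3)
small_file_overhead = pointer_overhead_bytes(3)
```

Under these assumptions a 3-block file spends 512 bytes on its index block but only 12 bytes on pointers with linked allocation.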

Free space management in Operating System

A file system is responsible for allocating free blocks to files, so it has to
keep track of all the free blocks on the disk. There are four main approaches
to managing the free blocks on the disk.

1. Bitmap or Bit vector – A bitmap or bit vector is a series of bits where each
bit corresponds to a disk block. The bit can take two values, 0 and 1: 0
indicates that the block is allocated and 1 indicates a free block. The given
instance of disk blocks in Figure 1 (where green blocks are allocated) can be
represented by a bitmap of 16 bits as: 0000111000000110.
Advantages –

• Simple to understand.

• Finding the first free block is efficient. It requires scanning the words (a
group of 8 bits) in the bitmap for a non-zero word. (A 0-valued word has all
bits 0.) The first free block is then found by scanning for the first 1 bit in
the non-zero word.

2. Linked List – In this approach, the free disk blocks are linked together,
i.e. a free block contains a pointer to the next free block. The block number
of the very first free disk block is stored at a separate location on disk and
is also cached in memory.

In Figure 2, the free space list head points to Block 5, which points to Block
6, the next free block, and so on. The last free block contains a null pointer
indicating the end of the free list. A drawback of this method is the I/O
required to traverse the free space list.

3. Grouping – This approach stores the addresses of free blocks in the first
free block. The first free block stores the addresses of some, say n, free
blocks. Out of these n blocks, the first n-1 blocks are actually free and the
last block contains the addresses of the next n free blocks. An advantage of
this approach is that the addresses of a group of free disk blocks can be found
easily.

4. Counting – This approach stores the address of the first free disk block and
the number n of free contiguous disk blocks that follow the first block. Every
entry in the list contains:

1. Address of first free disk block

2. A number n
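
Two of these approaches can be sketched on the 16-bit bitmap from the bitmap example. The bitmap is kept as a string for readability, which is an assumption of this sketch (a real implementation would use machine words), and the function names are hypothetical.

```python
BITMAP = "0000111000000110"   # from the text: 1 = free, 0 = allocated

def first_free(bitmap, word_size=8):
    """Skip all-zero words, then find the first 1 bit in the non-zero word."""
    for start in range(0, len(bitmap), word_size):
        word = bitmap[start:start + word_size]
        if "1" in word:                        # non-zero word found
            return start + word.index("1")
    return None                                # no free block

def counting_entries(bitmap):
    """Counting: (first free block, n free blocks that follow it) per run."""
    entries = []
    i = 0
    while i < len(bitmap):
        if bitmap[i] == "1":
            start = i
            while i < len(bitmap) and bitmap[i] == "1":
                i += 1
            entries.append((start, i - start - 1))
        else:
            i += 1
    return entries

first = first_free(BITMAP)
runs = counting_entries(BITMAP)
```

Here blocks are numbered from 0, so the first free block is block 4, and the two free runs (blocks 4-6 and 13-14) compress to two counting entries.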
File System CALLS

1. creat()
2. open()
3. close()
4. read()
5. write()
6. lseek()
7. stat()
8. ioctl()
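
Most of these calls are exposed almost unchanged by Python's `os` module, which makes for a compact sketch on a POSIX-like system. Python has no separate `os.creat`; `os.open` with `O_CREAT` is used instead, and `ioctl()` is left out because it is device-specific. The file name `syscalls.txt` is illustrative.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "syscalls.txt")

# open(): O_CREAT creates the file, as creat() would.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)

written = os.write(fd, b"hello, file system")   # write()
os.lseek(fd, 7, os.SEEK_SET)                    # lseek(): move to offset 7
chunk = os.read(fd, 4)                          # read(): 4 bytes from there
size = os.fstat(fd).st_size                     # status of the open file
os.close(fd)                                    # close()

size_by_path = os.stat(path).st_size            # stat(): status by name

os.unlink(path)                                 # delete the file
os.rmdir(workdir)                               # remove the directory
```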

UNIX system call | Description | Windows API call | Description

Process Control
fork() | Create a new process. | CreateProcess() | Create a new process.
exit() | Terminate the current process. | ExitProcess() | Terminate the current process.
wait() | Make a process wait until its child processes terminate. | WaitForSingleObject() | Wait for a process or thread to terminate.
exec() | Execute a new program in a process. | CreateProcess() or ShellExecute() | Execute a new program in a new process.
getpid() | Get the unique process ID. | GetCurrentProcessId() | Get the unique process ID.

File Management
open() | Open a file (or device). | CreateFile() | Open or create a file or device.
close() | Close an open file (or device). | CloseHandle() | Close an open object handle.
read() | Read from a file (or device). | ReadFile() | Read data from a file or input device.
write() | Write to a file (or device). | WriteFile() | Write data to a file or output device.
lseek() | Change the read/write location in a file. | SetFilePointer() | Set the position of the file pointer.
unlink() | Delete a file. | DeleteFile() | Delete an existing file.
rename() | Rename a file. | MoveFile() | Move or rename a file.

Directory Management
mkdir() | Create a new directory. | CreateDirectory() | Create a new directory.
rmdir() | Remove a directory. | RemoveDirectory() | Remove an existing directory.
chdir() | Change the current directory. | SetCurrentDirectory() | Change the current directory.
stat() | Get file status. | GetFileAttributesEx() | Get extended file attributes.
fstat() | Get status of an open file. | GetFileInformationByHandle() | Get file information using a file handle.
link() | Create a link to a file. | CreateHardLink() | Create a hard link to an existing file.
symlink() | Create a symbolic link to a file. | CreateSymbolicLink() | Create a symbolic link.
Assignment Questions

1. What is a File System?

2. Discuss in detail about different file access methods.

3. What are the operations on a directory?

4. Explain File Allocation Methods

5. Discuss Free space management in Operating System.
