OS R22 2-2 UNIT-5
File Operations
File type   Usual extension    Function
Object      obj, o             Compiled, machine language, not linked
Batch       bat, sh            Commands to the command interpreter
Library     lib, a, so, dll    Libraries of routines for programmers
Sequential Access
Most files are accessed sequentially, and sequential access is the method that
most operating systems support by default.
In sequential access, the OS reads the file word by word. A pointer is maintained
which initially points to the base address of the file. When the user reads the
first word of the file, the pointer supplies that word and then advances by one
word. This process continues until the end of the file.
Modern systems also provide direct access and indexed access, but sequential
access remains the most widely used method because many files, such as text
files, audio files, and video files, are naturally processed from beginning to end.
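The pointer behaviour described above can be sketched in a few lines. This is only an illustration: the 4-byte word size and the in-memory file are assumptions made for the example, not properties of any particular OS.

```python
import io

def read_sequentially(f, word_size=4):
    """Yield fixed-size words from the current position to the end of the file."""
    while True:
        word = f.read(word_size)   # read at the file pointer
        if not word:               # an empty result means end of file
            break
        yield word                 # the pointer has already advanced past this word

# An in-memory file stands in for a file on disk (assumption for the demo).
f = io.BytesIO(b"ABCDEFGHIJ")
words = list(read_sequentially(f))
position_at_end = f.tell()
```

Each call to `read` returns the next word and moves the pointer forward, so no word can be reached without passing over the ones before it.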
Direct Access
Direct access is mostly required in database systems. In most cases, we need
only filtered information from the database, and sequential access can be very
slow and inefficient for this.
Suppose every block of the storage stores 4 records and we know that the record
we need is stored in the 10th block. Sequential access is a poor choice here
because it has to traverse the first nine blocks before reaching the needed
record.
Direct access reads the required block immediately, although the operating
system has to perform some extra work, such as determining the desired block
number. It is generally used in database applications.
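A minimal sketch of the block example above, assuming fixed-size records so that a block number can be turned into a byte offset. The 16-byte record size, the record labels, and the function name `read_block` are made up for illustration.

```python
import io

RECORD_SIZE = 16          # bytes per record (assumed for illustration)
RECORDS_PER_BLOCK = 4     # as in the example above
BLOCK_SIZE = RECORD_SIZE * RECORDS_PER_BLOCK

def read_block(f, block_number):
    """Jump straight to a block (numbered from 0) and return its records."""
    f.seek(block_number * BLOCK_SIZE)          # no earlier block is read
    data = f.read(BLOCK_SIZE)
    return [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]

# Build a 12-block in-memory file whose records are labelled rec-0 ... rec-47.
contents = b"".join(f"rec-{n}".encode().ljust(RECORD_SIZE) for n in range(48))
f = io.BytesIO(contents)
tenth_block = [r.strip() for r in read_block(f, 9)]   # the 10th block has index 9
```

The only extra work compared with sequential access is the offset calculation, which corresponds to "determining the desired block number" in the text.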
Indexed Access
If a file can be sorted on one of its fields, an index can be built over groups
of records, and a particular record can then be located through its index entry.
The index is nothing but the address of a record in the file.
With indexed access, searching a large database becomes quick and easy, but
extra space is needed in memory to store the index.
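As a rough sketch of the idea, the index below maps a record key to the byte offset of that record. The comma-separated record layout and the names `build_index` and `lookup` are assumptions for the example.

```python
import io

def build_index(f):
    """Map each record's key (the text before the first comma) to its byte offset."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()
        line = f.readline()
        if not line:
            break
        key = line.split(b",", 1)[0]
        index[key] = offset
    return index

def lookup(f, index, key):
    """Fetch one record through the index, without scanning the file."""
    f.seek(index[key])
    return f.readline().rstrip(b"\n")

f = io.BytesIO(b"101,Asha\n205,Ravi\n309,Meena\n")   # hypothetical records
index = build_index(f)
record = lookup(f, index, b"205")
```

The dictionary is the "extra space in memory" the text mentions: it grows with the number of indexed records.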
Directory Structure
Storage Structure
1) Single-level directory:
The single-level directory is the simplest directory structure. In it, all files are
contained in the same directory which makes it easy to support and understand.
A single-level directory has a significant limitation, however, when the number
of files increases or when the system has more than one user. Since all the files
are in the same directory, they must have unique names. If two users both call
their data file test, the unique-name rule is violated.
2) Two-level directory:
As we have seen, a single-level directory often leads to confusion of file names
among different users. The solution to this problem is to create a separate
directory for each user.
In the two-level directory structure, each user has their own user file directory
(UFD). The UFDs have similar structures, but each lists only the files of a single
user. The system’s master file directory (MFD) is searched whenever a user logs
in or a user job starts; each MFD entry points to that user’s UFD.
Advantages:
Different users can have files with the same name, which is very helpful when
there are multiple users.
It provides security, since a user cannot access another user’s files.
Searching for files is easy, because only one user’s directory has to be searched.
Disadvantages:
The isolation that provides security also means that a user cannot share files
with other users.
Users can create their own files, but they cannot create subdirectories.
It does not scale well, because a user cannot group files of the same type
together.
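The two-level structure can be modelled as a dictionary of dictionaries. This is a toy sketch, not how any real file system stores directories; the class and method names are hypothetical.

```python
class TwoLevelDirectory:
    def __init__(self):
        self.mfd = {}                       # master file directory: user -> UFD

    def add_user(self, user):
        self.mfd[user] = {}                 # each user gets an empty UFD

    def create(self, user, name, contents):
        ufd = self.mfd[user]
        if name in ufd:
            raise FileExistsError(name)     # names must be unique within one UFD only
        ufd[name] = contents

    def read(self, user, name):
        return self.mfd[user][name]         # lookup is confined to the user's own UFD

d = TwoLevelDirectory()
d.add_user("alice")
d.add_user("bob")
d.create("alice", "test", "alice's data")
d.create("bob", "test", "bob's data")       # same name, different UFD: no conflict
```

Because `read` only ever looks inside one UFD, the model shows both the name-collision fix and the sharing limitation: there is no path from one user's directory to another's.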
3) Tree Structure/ Hierarchical Structure:
The tree directory structure is the one most commonly used on personal
computers. Users can create both files and subdirectories, which the previous
directory structures did not allow.
This directory structure resembles an upside-down tree, with the root directory
at the top. The root contains the directories of all users. Users can create
subdirectories and store files in their own directories.
A user does not have access to the root directory data and cannot modify it.
Even in this structure, a user does not have access to other users’ directories.
The figure below shows the tree directory structure, with files and
subdirectories in each user’s directory.
Advantages:
This directory structure allows subdirectories inside a directory.
Searching is easier.
Sorting files into important and unimportant ones becomes easier.
This structure is more scalable than the other two directory structures
described above.
Disadvantages:
Because a user is not allowed to access other users’ directories, file sharing
among users is prevented.
Because users can create subdirectories, searching may become complicated as
the number of subdirectories grows.
Users cannot modify the root directory data.
If files do not fit into one directory, they may have to be placed in other
directories.
4) Acyclic Graph Structure:
None of the three directory structures above allows one file to be accessed from
multiple directories. A file or subdirectory can be reached only through the
directory that contains it, not from any other directory.
The acyclic graph directory structure solves this problem: a file in one
directory can be accessed from multiple directories, so files can be shared
between users. Multiple directories point to a particular directory or file with
the help of links.
The figure below illustrates this, with a file shared between multiple users. If
any user makes a change, it is visible to every user who shares the file.
Advantages:
Sharing of files and directories is allowed between multiple users.
Searching becomes easy.
Flexibility is increased, because multiple users can share and edit the same
file.
Disadvantages:
Because of its complex structure, this directory structure is difficult to
implement.
Users must be careful when editing or deleting a file, because the file is
accessed by multiple users.
To delete a file permanently, all references to it must be deleted.
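Hard links are one common way to realise this kind of sharing. The sketch below uses Python's `os.link` and assumes the temporary directory lives on a file system that supports hard links; the directory and file names are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    dir_a = os.path.join(root, "user_a")
    dir_b = os.path.join(root, "user_b")
    os.mkdir(dir_a)
    os.mkdir(dir_b)

    original = os.path.join(dir_a, "report.txt")
    shared = os.path.join(dir_b, "report.txt")
    with open(original, "w") as f:
        f.write("draft")

    os.link(original, shared)                  # second directory entry, same file
    links_when_shared = os.stat(original).st_nlink

    with open(shared, "a") as f:               # a change made through one name...
        f.write(" v2")
    with open(original) as f:
        seen_by_a = f.read()                   # ...is visible through the other

    os.remove(original)                        # removes one reference only
    still_there = os.path.exists(shared)
    links_after_delete = os.stat(shared).st_nlink
```

The link count plays the role of the reference count: the file's data is released only when the last reference is removed, which is why every reference has to be deleted to remove the file permanently.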
Types of Access:
Files that other users can access directly need protection; files that are not
accessible to other users do not require any. Protection mechanisms provide
controlled access by limiting the types of file access that can be made. Whether
access is granted to a user depends on several factors, one of which is the type
of access requested. Several different types of operations can be controlled,
including read, write, execute, append, delete, and list.
Different users may need different types of access to a file or directory. The
general way to implement identity-dependent access is to associate with each
file and directory an access-control list (ACL) that specifies user names and
the types of access allowed for each user. The main problem with access lists is
their length. If we want to allow everyone to read a file, we must list all the
users with read access. This technique has two undesirable consequences:
Constructing such a list may be a tedious task, especially if the list of users
in the system is not known in advance.
The directory entry, previously of fixed size, now needs to be of variable size,
which complicates space management.
These problems can be resolved by using a condensed version of the access list.
To condense the access-control list, many systems recognize three
classifications of users in connection with each file:
Owner – The user who created the file is the owner.
Group – A group is a set of users who have similar needs and share the same
file.
Universe – All other users in the system fall under the category called
universe.
The most common recent approach is to combine access-control lists with the
more general owner, group, and universe access-control scheme. For example,
Solaris uses the three categories of access by default but allows access-control
lists to be added to specific files and directories when more fine-grained access
control is desired.
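The combined scheme can be sketched as a small permission check. This is a toy model only: the rule that an ACL entry takes precedence over the three categories is an assumption of this sketch, not a description of Solaris, and all names and permission strings are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class FileEntry:
    owner: str
    group: set
    perms: dict                                   # category -> string such as "rw-"
    acl: dict = field(default_factory=dict)       # optional per-user entries

def allowed(entry, user, access):
    """Return True if `user` may perform `access` ('r', 'w' or 'x')."""
    if user in entry.acl:                         # assumed rule: a per-user ACL entry wins
        return access in entry.acl[user]
    if user == entry.owner:
        category = "owner"
    elif user in entry.group:
        category = "group"
    else:
        category = "universe"
    return access in entry.perms[category]

book = FileEntry(
    owner="sara",
    group={"jim", "dawn"},
    perms={"owner": "rwx", "group": "rw-", "universe": "r--"},
    acl={"guest": "rw"},                          # fine-grained exception for one user
)
```

Three short permission strings cover every user in the system, which is the condensation the text describes; the ACL dictionary is consulted only for the few users who need something different.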
Access can also be controlled by associating a password with each file. If the
passwords are chosen randomly and changed often, this scheme may be effective
in limiting access to a file. However, it has disadvantages:
The number of passwords that a user needs to remember may become large, making
the scheme impractical.
If one password is used for all the files, then once it is discovered, all files are
accessible; protection is on an all-or-none basis.
File system implementation in an operating system refers to how the file system
manages the storage and retrieval of data on a physical storage device such as a
hard drive, solid-state drive, or flash drive. The file system implementation
includes several components, including:
1. File System Structure: The file system structure refers to how the files and
directories are organized and stored on the physical storage device. This
includes the layout of file system data structures such as the directory
structure, file allocation table, and inodes.
2. File Allocation: The file allocation mechanism determines how files are
allocated on the storage device. This can include allocation techniques such as
contiguous allocation, linked allocation, indexed allocation, or a combination
of these techniques.
3. Data Retrieval: The file system implementation determines how the data is
read from and written to the physical storage device. This includes strategies
such as buffering and caching to optimize file I/O performance.
A file system is generally structured in layers, listed here from the lowest
level upward:
1. I/O control level – Device drivers act as an interface between devices and
the OS and transfer data between disk and main memory. A driver takes a block
number as input and produces low-level, hardware-specific instructions as
output.
2. Basic file system – It issues generic commands to the device driver to read
and write physical blocks on disk. It also manages the memory buffers and
caches: a buffer block holds the contents of a disk block, and the cache stores
frequently used file system metadata.
3. File organization module – It has information about files, their location,
and their logical and physical blocks. Logical blocks are numbered from 0 to N,
and these numbers do not match the physical block addresses, so this module
translates between them. It also includes the free-space manager, which tracks
unallocated blocks.
4. Logical file system – It manages metadata, i.e., all details about a file
except the actual contents of the file. It maintains file structure via file
control blocks. A file control block (FCB) has information about a file: owner,
size, permissions, and location of the file contents.
File Allocation Methods
The allocation methods define how the files are stored in the disk blocks. There
are three main disk space or file allocation methods.
Contiguous Allocation
Linked Allocation
Indexed Allocation
1. Contiguous Allocation
In this scheme, each file occupies a contiguous set of blocks on the disk. For
example, if a file requires n blocks and is given block b as the starting
location, then the blocks assigned to the file are b, b+1, b+2, ..., b+n-1. This
means that given the starting block address and the length of the file (in
blocks), we can determine the blocks occupied by the file.
The directory entry for a file with contiguous allocation contains:
Address of the starting block
Length of the allocated portion
The file ‘mail’ in the following figure starts at block 19 with length = 6
blocks. Therefore, it occupies blocks 19, 20, 21, 22, 23, and 24.
Advantages:
Both sequential and direct access are supported. For direct access, the address
of the kth block of a file that starts at block b is simply (b+k), counting k
from 0.
It is extremely fast, since contiguous allocation of file blocks keeps the
number of seeks minimal.
Disadvantages:
This method suffers from both internal and external fragmentation, which makes
it inefficient in terms of disk space utilization.
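The address arithmetic for contiguous allocation is small enough to check directly. The sketch below reuses the ‘mail’ numbers from the text (start 19, length 6); the function names are illustrative only.

```python
def blocks_of(start, length):
    """All disk blocks occupied by a file stored contiguously."""
    return list(range(start, start + length))

def kth_block(start, length, k):
    """Disk address of the k-th block of the file, counting k from 0."""
    if not 0 <= k < length:
        raise IndexError("block index outside the file")
    return start + k                      # direct access: a single addition

mail_blocks = blocks_of(19, 6)
third_block = kth_block(19, 6, 2)
```

The directory entry's two fields (start and length) are the only inputs needed, which is why both sequential and direct access are cheap here.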
2. Linked Allocation
In this scheme, each file is a linked list of disk blocks, which need not be
contiguous. The disk blocks can be scattered anywhere on the disk.
The directory entry contains pointers to the first and the last block of the
file. Each block contains a pointer to the next block occupied by the file.
The file ‘jeep’ in the following image shows how the blocks are scattered across
the disk. The last block (25) contains -1, a null pointer indicating that no
further block follows.
Advantages:
It is very flexible in terms of file size. A file can grow easily, since the
system does not have to look for a contiguous chunk of free blocks.
This method does not suffer from external fragmentation, which makes it
relatively better in terms of disk space utilization.
Disadvantages:
Because the file blocks are scattered across the disk, a large number of seeks
is needed to access every block individually. This makes linked allocation
slower.
It does not support random or direct access. Block k of a file can be reached
only by following the block pointers sequentially from the starting block of the
file through the k blocks before it.
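The cost of reaching block k under linked allocation can be made concrete with a pointer table. In the sketch below only the final block (25, holding -1) comes from the text; the other block numbers in the chain are made up for illustration.

```python
# next_block[b] is the pointer stored inside disk block b; -1 marks the end.
# Hypothetical chain: 9 -> 16 -> 1 -> 10 -> 25
next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: -1}

def kth_block(next_block, start, k):
    """Follow k pointers from the starting block.

    Returns (block reached, number of blocks that had to be read).
    """
    block, reads = start, 1
    for _ in range(k):
        block = next_block[block]         # the pointer is only known after reading the block
        if block == -1:
            raise IndexError("file has fewer blocks than requested")
        reads += 1
    return block, reads

first = kth_block(next_block, 9, 0)
fourth = kth_block(next_block, 9, 3)
last = kth_block(next_block, 9, 4)
```

Reaching block k costs k+1 block reads in this model, compared with a single address calculation under contiguous allocation.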
3. Indexed Allocation
In this scheme, a special block known as the Index block contains the pointers to
all the blocks occupied by a file. Each file has its own index block. The ith entry
in the index block contains the disk address of the ith file block. The directory
entry contains the address of the index block as shown in the image:
Advantages:
This supports direct access to the blocks occupied by the file and therefore
provides fast access to the file blocks.
Disadvantages:
The pointer overhead for indexed allocation is greater than for linked
allocation.
For very small files, say files that span only 2-3 blocks, indexed allocation
still reserves one entire block (the index block) for pointers, which is
inefficient in terms of disk space utilization. In linked allocation, by
contrast, we lose the space of only one pointer per block.
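A short sketch of both points: direct lookup through the index block, and the overhead comparison for small files. The block numbers, the 512-byte block size, and the 4-byte pointer size are assumptions chosen for the example.

```python
BLOCK_SIZE = 512       # bytes per disk block (assumed)
POINTER_SIZE = 4       # bytes per block pointer (assumed)

# Hypothetical index block: entry i holds the disk address of file block i.
index_block = [9, 16, 1, 10, 25]

def ith_block(index_block, i):
    """Direct access: one lookup in the index block, no chain to follow."""
    if not 0 <= i < len(index_block):
        raise IndexError("block index outside the file")
    return index_block[i]

def indexed_overhead(n_blocks):
    return BLOCK_SIZE                    # one whole index block, however small the file

def linked_overhead(n_blocks):
    return n_blocks * POINTER_SIZE       # one pointer inside each data block

fourth = ith_block(index_block, 3)
small_file = (indexed_overhead(3), linked_overhead(3))
```

For a 3-block file under these assumed sizes, the index block costs 512 bytes while linked allocation spends 12 bytes on pointers, which is the inefficiency the text describes.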
A file system is responsible for allocating free blocks to files, so it has to
keep track of all the free blocks on the disk. The main approaches for managing
free blocks are:
1. Bitmap or Bit vector – Each disk block is represented by one bit: 1 if the
block is free, 0 if it is allocated. Advantages:
Simple to understand.
Finding the first free block is efficient. It requires scanning the words
(groups of 8 bits) in the bitmap for a non-zero word (a 0-valued word has all
bits 0). The first free block is then found by scanning for the first 1 bit in
the non-zero word.
2. Linked List – In this approach, the free disk blocks are linked together,
i.e., a free block contains a pointer to the next free block. The block number
of the very first free disk block is stored at a separate location on disk and
is also cached in memory.
In Figure-2, the free space list head points to Block 5, which points to Block
6, the next free block, and so on. The last free block contains a null pointer
indicating the end of the free list. A drawback of this method is the I/O
required to traverse the free space list.
3. Grouping – This approach stores the addresses of free blocks in the first
free block. The first free block stores the addresses of n free blocks. Of
these n blocks, the first n-1 are actually free, and the last one contains the
addresses of the next n free blocks. An advantage of this approach is that the
addresses of a group of free disk blocks can be found easily.
4. Counting – This approach stores the address of the first free disk block and
the number n of free contiguous disk blocks that follow it. Every entry in the
list contains:
1. The address of the first free disk block
2. A number n
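Two of the approaches above lend themselves to a short sketch: the bitmap scan for the first free block, and the counting representation. The conventions here are assumptions for illustration: a 1 bit means free, the leftmost bit of each 8-bit word is the lowest-numbered block, and a counting entry's count includes the first block itself.

```python
BITS_PER_WORD = 8

def first_free_block(words):
    """Bitmap scan: return the number of the first free block (bit 1), or None."""
    for word_index, word in enumerate(words):
        if word == 0:                                  # every block in this word is allocated
            continue
        for offset in range(BITS_PER_WORD):
            mask = 1 << (BITS_PER_WORD - 1 - offset)   # leftmost bit = lowest block (assumed)
            if word & mask:
                return word_index * BITS_PER_WORD + offset
    return None

def to_counting_entries(free_blocks):
    """Counting: collapse free block numbers into (first_block, run_length) entries."""
    entries = []
    for block in sorted(free_blocks):
        if entries and block == entries[-1][0] + entries[-1][1]:
            first, count = entries[-1]
            entries[-1] = (first, count + 1)           # extend the current run
        else:
            entries.append((block, 1))                 # start a new run
    return entries

bitmap = [0b00000000, 0b00001110]                      # blocks 12, 13, 14 are free
first_free = first_free_block(bitmap)
entries = to_counting_entries([5, 6, 7, 8, 12, 13, 20])
```

Skipping whole zero words is what makes the bitmap search efficient, and the counting list stays short whenever free blocks come in long contiguous runs.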
File System CALLS
1. creat()
2. open()
3. close()
4. read()
5. write()
6. lseek()
7. stat()
8. ioctl()
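Most of these calls are exposed almost unchanged by Python's `os` module, which makes for a compact demonstration. This is a sketch rather than C code against the UNIX interface itself: `creat()` is emulated with the equivalent `open()` flags, `ioctl()` is device-specific and is not shown, and the file name is made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")

    # creat(path, mode) behaves like open() with these three flags.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    written = os.write(fd, b"hello world")     # write(): returns the bytes written
    os.close(fd)                               # close()

    fd = os.open(path, os.O_RDONLY)            # open()
    os.lseek(fd, 6, os.SEEK_SET)               # lseek(): move the file pointer to byte 6
    tail = os.read(fd, 5)                      # read(): up to 5 bytes from there
    os.close(fd)

    size = os.stat(path).st_size               # stat(): metadata looked up by path
```

Every call except `stat()` works on a file descriptor returned by `open()`, which is the small integer the kernel uses to identify the open file.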
UNIX SYSTEM CALLS  DESCRIPTION                                               WINDOWS API CALLS             DESCRIPTION
Process Control
wait()             Make a process wait until its child processes terminate.  WaitForSingleObject()         Wait for a process or thread to terminate.
File Management
write()            Write to a file (or device).                              WriteFile()                   Write data to a file or output device.
rename()           Rename a file.                                            MoveFile()                    Move or rename a file.
Directory Management
rmdir()            Remove a directory.                                       RemoveDirectory()             Remove an existing directory.
fstat()            Get status of an open file.                               GetFileInformationByHandle()  Get file information using a file handle.
link()             Create a link to a file.                                  CreateHardLink()              Create a hard link to an existing file.