File Management
1. Create operation:
This operation creates a new file in the file system. It is one of the
most basic operations performed on the file system. To create a new file
of a particular type, the associated application program calls the file
system. The file system allocates space for the file and, because it
knows the format of the directory structure, adds an entry for the new
file to the appropriate directory.
2. Open operation:
Opening is one of the most common operations performed on a file. Once a
file has been created, it must be opened before any file processing
operations can be performed on it. To open a file, the user provides its
name. The program then invokes the open system call, which passes the
file name to the file system.
3. Write operation:
This operation writes information into a file. A write system call is
issued that specifies the name of the file and the length of the data to
be written. When data is written past the current end of the file, the
file length increases by the corresponding amount, and the file pointer
is repositioned after the last byte written.
4. Read operation:
This operation reads the contents of a file. A read pointer is
maintained by the OS, pointing to the position up to which the data has
been read.
5. Re-position or Seek operation:
The seek system call repositions the file pointer from its current
position to a specific place in the file, i.e. forward or backward,
depending on the user's requirement. This operation is generally
supported by file management systems that provide direct access files.
6. Delete operation:
Deleting a file removes all the data stored in it and frees the disk
space it occupied. To delete the specified file, the directory is
searched. When the directory entry is located, all the associated file
space and the directory entry are released.
7. Truncate operation:
Truncating a file deletes its contents but keeps its attributes. The file
itself is not deleted: it remains in the directory, while the information
stored inside it is discarded.
8. Close operation:
When the processing of a file is complete, it should be closed so that
all the changes are made permanent and all the resources it occupied are
released. Closing a file deallocates all the internal descriptors that
were created when the file was opened.
9. Append operation:
This operation adds data to the end of the file.
10. Rename operation:
This operation changes the name of an existing file.
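The operations listed above can be sketched with Python's os module, which
wraps the underlying system calls. This is only an illustration under
stated assumptions: the file names notes.txt and renamed.txt and the helper
demo_file_operations are made up, and everything happens inside a temporary
directory.

```python
import os
import tempfile

def demo_file_operations(directory):
    """Walk through the basic file operations on a scratch file."""
    path = os.path.join(directory, "notes.txt")  # hypothetical file name

    # Create + open: O_CREAT adds the directory entry, open returns a descriptor.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)

    # Write: the file grows and the file pointer moves past the last byte written.
    os.write(fd, b"hello world")

    # Seek: reposition the file pointer to the start of the file.
    os.lseek(fd, 0, os.SEEK_SET)

    # Read: returns up to 5 bytes and advances the pointer by that amount.
    first = os.read(fd, 5)

    # Truncate: drop everything after byte 5, keep the file and its attributes.
    os.ftruncate(fd, 5)

    # Close: release the descriptor created by open.
    os.close(fd)

    # Append: with O_APPEND every write lands at the end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    os.write(fd, b"!!")
    os.close(fd)

    # Rename: only the directory entry changes, not the data.
    renamed = os.path.join(directory, "renamed.txt")
    os.rename(path, renamed)
    with open(renamed, "rb") as f:
        final = f.read()

    # Delete: remove the directory entry and free the space.
    os.remove(renamed)
    return first, final, os.path.exists(renamed)

with tempfile.TemporaryDirectory() as scratch:
    result = demo_file_operations(scratch)
```

Here result holds the bytes returned by the read, the file contents after
the truncate and append, and whether the file still exists after the delete.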
File Type
File type refers to the ability of the operating system to distinguish
different types of files, such as text files, source files and binary
files. Many operating systems support several types of files. Operating
systems such as MS-DOS and UNIX have the following types of files:
Ordinary files
These are the files that contain user information.
These may contain text, databases or executable programs.
The user can apply various operations on such files like add,
modify, delete or even remove the entire file.
Sequential Access
Most operating systems access files sequentially. In other words, most
files need to be accessed sequentially by the operating system.
In sequential access, the OS reads the file word by word. A pointer is
maintained which initially points to the base address of the file. If the
user wants to read the first word of the file, the pointer provides that
word to the user and increases its value by 1 word. This process
continues till the end of the file.
Modern systems do provide direct access and indexed access, but the most
used method is sequential access because most files, such as text files,
audio files and video files, need to be accessed sequentially.
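A minimal sketch of the word-by-word loop described above, using an
in-memory byte stream in place of a real file. The 4-byte word size and the
name read_sequentially are assumptions chosen for illustration.

```python
import io

WORD_SIZE = 4  # bytes per word; an assumed value for illustration

def read_sequentially(stream, word_size=WORD_SIZE):
    """Yield (pointer, word) pairs from the start of the stream to its end."""
    pointer = 0  # starts at the base address of the file
    while True:
        word = stream.read(word_size)
        if not word:      # end of file reached
            break
        yield pointer, word
        pointer += 1      # advance by one word

words = list(read_sequentially(io.BytesIO(b"ABCDEFGHIJ")))
```

The last word may be shorter than the others when the file length is not a
multiple of the word size.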
Direct Access
Direct access is mostly required in database systems. In most cases, we
need filtered information from the database, and sequential access can be
very slow and inefficient in such cases.
Suppose every block of the storage stores 4 records and we know that the
record we need is stored in the 10th block. In that case, sequential
access is a poor choice because it has to traverse all the preceding
blocks in order to reach the needed record.
Direct access gives the required result directly, although the operating
system has to perform some extra work such as determining the desired
block number. It is generally implemented in database applications.
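The block calculation in the example can be sketched as follows. The 4
records per block come from the text; the 16-byte record size, the 0-based
numbering and the helper names locate and read_record are assumptions.

```python
import io

RECORD_SIZE = 16        # bytes per record (assumed)
RECORDS_PER_BLOCK = 4   # from the example in the text
BLOCK_SIZE = RECORD_SIZE * RECORDS_PER_BLOCK

def locate(record_number):
    """Return (block_number, byte_offset) for a 0-based record number."""
    block_number = record_number // RECORDS_PER_BLOCK
    offset_in_block = (record_number % RECORDS_PER_BLOCK) * RECORD_SIZE
    return block_number, block_number * BLOCK_SIZE + offset_in_block

def read_record(stream, record_number):
    """Seek straight to the record instead of reading everything before it."""
    _, byte_offset = locate(record_number)
    stream.seek(byte_offset)
    return stream.read(RECORD_SIZE)

# A fake file of 48 records, each holding its own number padded to 16 bytes.
data = b"".join(str(n).encode().ljust(RECORD_SIZE, b".") for n in range(48))
record = read_record(io.BytesIO(data), 38)
```

With 0-based numbering, record 38 falls in block 9, which is the 10th block,
and is reached with a single seek.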
Indexed Access
An index can be built over a group of records. A particular record can
then be accessed through its index entry, which is nothing but the
address of the record in the file.
With indexed access, searching in a large database becomes very quick and
easy, but we need some extra space in memory to store the index.
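As a rough sketch, an index can be modelled as a dictionary that maps a
record key to the byte offset of that record. The comma-separated record
layout, the sample data and the names build_index and lookup are
assumptions made for this example.

```python
import io

def build_index(stream):
    """Scan the file once and map each record key to its byte offset."""
    index = {}
    stream.seek(0)
    while True:
        offset = stream.tell()
        line = stream.readline()
        if not line:
            break
        key = line.split(b",", 1)[0]
        index[key] = offset   # the index entry is the record's address
    return index

def lookup(stream, index, key):
    """Jump to the record through the index instead of scanning the file."""
    stream.seek(index[key])
    return stream.readline().rstrip(b"\n")

records = io.BytesIO(b"101,alice\n205,bob\n309,carol\n")
index = build_index(records)
found = lookup(records, index, b"205")
```

The dictionary is the extra space the text mentions: it has to be kept in
memory in addition to the file itself.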
Advantages:
This is very flexible in terms of file size. File size can be
increased easily since the system does not have to look for a
contiguous chunk of disk space.
This method does not suffer from external fragmentation. This
makes it relatively better in terms of space utilization.
Disadvantages:
Because the file blocks are distributed randomly on the disk, a
large number of seeks are needed to access every block
individually. This makes linked allocation slower.
It does not support random or direct access. We cannot directly
access the blocks of a file. Block k of a file can be reached only
by traversing k blocks sequentially (sequential access) from the
starting block of the file via the block pointers.
The pointers required in linked allocation incur some extra
overhead.
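The cost of reaching block k in linked allocation can be sketched with a
toy disk, where each block stores its data and a pointer to the next block.
The block numbers and the name read_block are made up for illustration.

```python
# Toy disk: block number -> (data, pointer to next block or None).
# The block numbers are scattered on purpose; all values are made up.
disk = {
    9:  ("A", 16),
    16: ("B", 1),
    1:  ("C", 10),
    10: ("D", 25),
    25: ("E", None),
}

def read_block(disk, start_block, k):
    """Return (data, hops) for the k-th block (0-based) of a file."""
    current = start_block
    hops = 0
    while hops < k:
        _, next_block = disk[current]
        if next_block is None:
            raise IndexError("file has fewer than k + 1 blocks")
        current = next_block   # follow the pointer stored in the block
        hops += 1
    return disk[current][0], hops
```

Calling read_block(disk, 9, 3) has to follow three pointers before it
reaches the block it wants, which is why direct access is not supported.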
3. Indexed Allocation
In this scheme, a special block known as the Index block contains the
pointers to all the blocks occupied by a file. Each file has its own index
block. The ith entry in the index block contains the disk address of the ith
file block. The directory entry contains the address of the index block as
shown in the image:
Advantages:
This supports direct access to the blocks occupied by the file and
therefore provides fast access to the file blocks.
It overcomes the problem of external fragmentation.
Disadvantages:
The pointer overhead for indexed allocation is greater than that of
linked allocation.
For very small files, say files that span only 2-3 blocks, indexed
allocation would keep one entire block (the index block) for
the pointers, which is inefficient in terms of space utilization.
In linked allocation, by contrast, we lose the space of only 1
pointer per block.
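For comparison, a toy sketch of indexed allocation: the directory entry
points at the index block, and the ith entry of the index block gives the
address of the ith file block. The block numbers, the file name report.txt
and the name read_file_block are assumptions.

```python
# Toy disk: block number -> contents. Block 19 is the file's index block.
disk = {
    19: [9, 16, 1, 10, 25],   # index block: i-th entry = address of i-th file block
    9: "A", 16: "B", 1: "C", 10: "D", 25: "E",
}
directory = {"report.txt": 19}  # directory entry stores the index block address

def read_file_block(disk, directory, name, i):
    """Fetch the i-th block (0-based) of a file with one index lookup."""
    index_block = disk[directory[name]]
    return disk[index_block[i]]
```

Any block is reached in two steps, index block then data block, no matter
how far into the file it lies.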
For files that are very large, a single index block may not be able to
hold all the pointers.
The following mechanisms can be used to resolve this:
1. Linked scheme: This scheme links two or more index blocks
together for holding the pointers. Every index block would then
contain a pointer or the address to the next index block.
2. Multilevel index: In this policy, a first level index block is used
to point to the second level index blocks, which in turn point to
the disk blocks occupied by the file. This can be extended to 3 or
more levels depending on the maximum file size.
3. Combined Scheme: In this scheme, a special block called
the inode (index node) contains the information about the
file, such as its size, permissions, etc., and the
remaining space of the inode is used to store the disk block
addresses which contain the actual file as shown in the image
below. The first few of these pointers in the inode point to the
direct blocks, i.e. the pointers contain the addresses of the disk
blocks that contain the data of the file. The next few pointers
point to indirect blocks. Indirect blocks may be single indirect,
double indirect or triple indirect. A single indirect block is a
disk block that does not contain file data but the disk addresses
of the blocks that contain the file data. Similarly, a double
indirect block does not contain file data but the disk addresses
of the blocks that contain the addresses of the blocks containing
the file data.
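To make the combined scheme concrete, the sketch below works out which
kind of inode pointer reaches a given file block. The figures used (12
direct pointers, 4096-byte blocks, 4-byte pointers) are assumptions for
illustration and differ between file systems. The names classify and
max_file_blocks are likewise hypothetical.

```python
DIRECT_POINTERS = 12                              # assumed; varies by file system
BLOCK_SIZE = 4096                                 # bytes, assumed
POINTER_SIZE = 4                                  # bytes, assumed
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # 1024 with the values above

def classify(logical_block):
    """Return which inode pointer level reaches a 0-based file block."""
    n = POINTERS_PER_BLOCK
    if logical_block < DIRECT_POINTERS:
        return "direct"
    logical_block -= DIRECT_POINTERS
    if logical_block < n:
        return "single indirect"
    logical_block -= n
    if logical_block < n ** 2:
        return "double indirect"
    logical_block -= n ** 2
    if logical_block < n ** 3:
        return "triple indirect"
    raise ValueError("block is beyond the maximum file size")

def max_file_blocks():
    """Number of data blocks addressable with one pointer of each indirect kind."""
    n = POINTERS_PER_BLOCK
    return DIRECT_POINTERS + n + n ** 2 + n ** 3
```

Small files are served entirely by the direct pointers, so they pay no
extra disk access, while each level of indirection multiplies the reachable
size by the number of pointers per block.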
as: 0000111000000110.
Advantages –
Simple to understand.
Finding the first free block is efficient. It requires
scanning the words (a group of 8 bits) in a bitmap for a
non-zero word. (A 0-valued word has all bits 0). The
first free block is then found by scanning for the first 1
bit in the non-zero word.
2. Linked List – In this approach, the free disk blocks are linked
together i.e. a free block contains a pointer to the next free
block. The block number of the very first disk block is stored at a
separate location on disk and is also cached in
memory.
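Both free-space techniques can be sketched in a few lines. The bitmap
string is the 16-bit example quoted earlier, read with 1 meaning free and
blocks numbered from 0 (the numbering is an assumption). The free-list
block numbers and the function names are made up for illustration.

```python
WORD_BITS = 8  # the text treats a word as a group of 8 bits

def first_free_block(bitmap, word_bits=WORD_BITS):
    """Return the 0-based number of the first free block, or None.

    A 1 bit marks a free block, a 0 bit an allocated one.
    """
    for word_start in range(0, len(bitmap), word_bits):
        word = bitmap[word_start:word_start + word_bits]
        if "1" not in word:   # 0-valued word: every block in it is allocated
            continue
        return word_start + word.index("1")
    return None

def allocate_from_list(next_free, head):
    """Take the first free block off the linked list.

    next_free maps a free block to the next free block (None ends the list).
    Returns (allocated_block, new_head).
    """
    if head is None:
        raise RuntimeError("no free blocks")
    return head, next_free.pop(head)

def release_to_list(next_free, head, block):
    """Put a block back at the front of the list and return the new head."""
    next_free[block] = head
    return block

next_free = {5: 6, 6: 7, 7: 14, 14: 15, 15: None}  # made-up free blocks
free_head = 5  # stored at a known place on disk and cached in memory
```

The bitmap scan skips whole words at a time, whereas the linked list hands
out its head block without any scan.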
Directory Structure
What is a directory?
A directory can be defined as a listing of the related files on the disk.
The directory may store some or all of the file attributes.
To get the benefit of different file systems on different operating
systems, a hard disk can be divided into a number of partitions of
different sizes. The partitions are also called volumes or minidisks.
Each partition must have at least one directory in which all the files of
the partition can be listed. A directory entry is maintained for each
file in the directory, and it stores all the information related to that
file.
A directory can be viewed as a file which contains the metadata of a
group of files.
Every directory supports a number of common operations on its files:
1. File Creation
2. Search for the file
3. File deletion
4. Renaming the file
5. Traversing Files
6. Listing of files
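The six operations can be sketched with a directory modelled as a mapping
from file name to metadata. The class Directory, its method names and the
sample attributes are hypothetical, and a single-level directory is assumed
for simplicity.

```python
class Directory:
    """A single-level directory: file name -> metadata dictionary."""

    def __init__(self):
        self.entries = {}

    def create(self, name, **attributes):
        """File creation: add a new directory entry."""
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = dict(attributes)

    def search(self, name):
        """Search: return the entry, or None if the file is not listed."""
        return self.entries.get(name)

    def delete(self, name):
        """File deletion: remove the directory entry."""
        del self.entries[name]

    def rename(self, old, new):
        """Renaming: move the entry to a new name, metadata unchanged."""
        if new in self.entries:
            raise FileExistsError(new)
        self.entries[new] = self.entries.pop(old)

    def traverse(self):
        """Traversal: visit every entry together with its metadata."""
        for name in sorted(self.entries):
            yield name, self.entries[name]

    def list_files(self):
        """Listing: names only."""
        return sorted(self.entries)

d = Directory()
d.create("a.txt", size=120)
d.create("b.txt", size=64)
d.rename("b.txt", "c.txt")
```

After these calls the directory lists a.txt and c.txt, and c.txt still
carries the metadata that was stored for b.txt.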
In the above image, we can see that a cycle is formed in the user 2
directory. Although it provides greater flexibility, it is complex to
implement this structure.
Advantages of General-graph directory
Compared to the other structures, the general-graph directory
structure is more flexible.
Cycles are allowed in a general-graph directory.
Disadvantages of General-graph directory
It is more costly to implement and maintain than the other
directory structures.
Garbage collection is needed to reclaim files and directories
that can no longer be reached.
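A rough sketch of why cycles call for garbage collection: a traversal must
remember which nodes it has visited, and anything not reached from the root
can be reclaimed even if it sits on a cycle. The directory names and the
functions reachable and garbage are made up for this example.

```python
def reachable(graph, root):
    """Mark every directory or file reachable from root, tolerating cycles."""
    marked = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in marked:   # already visited: this is what stops a cycle
            continue
        marked.add(node)
        stack.extend(graph.get(node, []))
    return marked

def garbage(graph, root):
    """Everything that exists but can no longer be reached from root."""
    return set(graph) - reachable(graph, root)

graph = {
    "root": ["user1", "user2"],
    "user1": ["a.txt"],
    "user2": ["docs"],
    "docs": ["user2"],   # link back up: forms a cycle
    "a.txt": [],
    "old": ["tmp"],      # unlinked from root but still part of a cycle
    "tmp": ["old"],
}
```

The pair old and tmp still reference each other, so a simple reference
count would never drop to zero. The reachability pass identifies them as
garbage.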