
LEARNING MODULE SURIGAO STATE COLLEGE OF TECHNOLOGY

Module No. 4
CHAPTER 4: DEVICE MANAGEMENT AND FILE SYSTEMS
In this chapter, we will discuss the following:
4.1 Characteristics of I/O Devices
4.2 Overview of File Systems
4.3 Directories
4.4 File System Hierarchy
4.5 File System Implementation
4.6 Directory Implementation
4.7 File System Reliability
Time Frame: 15 hrs
Introduction:
I/O device management is a very important activity of the operating system. I/O is
essential for communication between users and the computer. The following are
services the operating system provides as I/O manager:
• Monitoring all the I/O devices
• Issuing commands to the I/O devices, capturing interrupts, and handling errors
related to I/O
• Providing a communication channel between I/O devices and all other hardware
components
Accessing and storing information is the task of every computer application. A clear
and obvious requirement of an operating system is thus the provision of a convenient,
efficient, and robust information handling system. A process can use its address space
to store some amount of information. However, three main problems are associated
with this method.
• The first is the adequacy of the space: the amount of information an application
can store is limited by the size of the virtual address space.
• The second is data loss: when the process terminates, the information kept in
its address space is also lost, even though the information may need to be
retained for a long period of time.
• The third is concurrent accessibility: information saved in one process's address
space is accessible only to that process, yet there is sometimes a need to make
this information, in whole or in part, available to other processes as well.
Any operating system addresses these problems by managing the information a
process produces separately from the process itself, which is usually done by storing
the information on external media in units called files.
The operating system manages the naming, structure, access, use, protection, and
implementation of these files. Thus, the component of an operating system that
monitors activities related to files is known as the file system, which is addressed in
this section. A file management system consists of system utility programs that run
as privileged applications concerned with secondary storage.

CPE 373- OPERATING SYSTEMS (GRACECHELL M. PASCUA) 94

Objectives
Upon completion of this unit you should be able to:

• State the characteristics of input/output devices


• Address the principles of input/output hardware and software
• Define and express the concept of file systems and how a file can be logically
organized
• Explain the organization of the directory, the FAT and the data area on a single
partition
• Discuss details of file system and directory implementations

Learning Activities
Read Me!
4.1 Characteristics of I/O Devices
➢ I/O devices are so varied that operating systems devote a subsystem
to handling the variety. The range of devices on a modern computer system
includes mice, keyboards, disk drives, display adapters, USB devices,
network connections, audio I/O, printers, special devices for the handicapped,
and many special-purpose peripherals. These devices can be roughly
categorized as storage, communications, and user-interface devices. Peripheral
devices communicate with the computer via signals sent over wires or
through the air, and connect to the computer via ports, e.g. a serial or parallel
port. A common set of wires connecting multiple devices is termed a bus.
➢ Device drivers are modules that can be plugged into an OS to handle a
particular device or category of similar devices.
➢ Principles of I/O hardware
➢ I/O hardware can be viewed in terms of the following:
• I/O devices: concerned with the way data are handled by the device.
There are two types of I/O devices, known as block and character devices.
• Block devices (such as disks): devices that store data in fixed-size blocks,
each with a unique address. Block sizes generally range from 512 to 32,768
bytes. The ability to read and write each block independently of the other
blocks is an important feature of these devices. The disk is the most commonly
known block device; it allows moving the read/write arm at any time to any
required cylinder position and waiting for the needed block to spin under the
head.
• Character devices (such as printers, mice, and NICs): deliver or accept a
stream of characters, without regard to any block structure. A character device
is not addressable and does not have any seek operation.
I/O Unit:

➢ Indicates the hardware components. There are two major components: the
electronic component (the device controller or adapter) and the mechanical
component.
➢ Memory-mapped I/O is a technique for communicating with I/O devices.
➢ In this case a certain portion of the processor’s address space is mapped to the
device, and communications occur by reading and writing directly to/from those
memory areas.
➢ Memory-mapped I/O is suitable for devices which must move large quantities
of data quickly, such as graphics cards.
➢ Memory-mapped I/O can be used either instead of or more often in combination
with traditional registers. For example, graphics cards still use registers for
control information such as setting the video mode.
➢ A potential problem exists with memory-mapped I/O if a process is allowed to
write directly to the address space used by a memory-mapped I/O device.
➢ Direct Memory Access (DMA): a technique for moving data directly
between main memory and I/O devices without the CPU's intervention. In the
absence of DMA, reading from a disk is done with the following steps:
➢ The controller serially fetches every bit from the block, which may span one or
more sectors, and maintains the whole data in its internal buffer
➢ The controller then checks for a read error by computing the checksum
of the data
➢ The controller then sends an interrupt signal to the system
➢ The OS reads the information found in the controller's buffer (the disk
block) byte by byte or word by word and puts it in memory
➢ Problem: since the OS controls the reading loop, it wastes CPU time. When
DMA is used, on the other hand, the device controller does the counting
and address-tracking activities.
➢ The following diagram shows the steps to use DMA:


Principles of I/O Software


➢ A layered technique is used
➢ Goals and issues of I/O software:
➢ Device independence: it should be possible to write programs that can read
files on a floppy disk, a hard disk, or a CD-ROM without having to modify
the program for each different device type. It is up to the operating system to
take care of the problems caused by the fact that these devices really are
different.
➢ Uniform naming: the name of a file or device should be a string or a number
that does not depend on the device in any way.
➢ Error handling: broadly speaking, errors should be dealt with at the device
controller level. Most errors are transient, for instance
read errors caused by specks of dust on the read head, and can often be solved
by simply repeating the operation.
➢ Transfer: there are two transfer modes, synchronous (blocking) and
asynchronous (interrupt-driven). In a synchronous transfer, the
program requesting the I/O transfer is suspended until the transfer is
completed. In an asynchronous transfer, the CPU starts the transfer
and goes off to do something else until the interrupt signaling the completion of
the transfer arrives.
➢ Device types: there are two device types:
1. Sharable devices (such as disks) can be accessed by multiple users
simultaneously. Nothing is wrong with several users opening files on the
same disk concurrently.
2. Dedicated devices (such as tape drives) have to be dedicated to a single
user until that user is finished. Having two or more users writing blocks
intermixed at random on the same tape will definitely not work.
Layers of I/O software
The following are the I/O software layers:
• Interrupt handlers (bottom)
• Device drivers
• Device-independent OS software
• User-level software (top)
Interrupt Handler
➢ Interrupts are an unpleasant but inevitable fact of life, and they can be hidden
away. One method of hiding them is to block every process that starts an I/O
operation until the I/O has finished and the interrupt occurs.
The interrupt handler then performs the necessary actions and unblocks the
process that started the I/O.
Device Driver
➢ All device-dependent code goes in the device driver
➢ Each device driver handles one device type, or at most one class of closely
related devices
➢ Each controller has one or more device registers used to give it commands
➢ The device driver is responsible for issuing these commands and ensuring their
proper execution
➢ Thus, the disk driver is the only part of the OS that knows how many registers
the disk controller has and what they are used for
➢ Generally, we can say that a device driver accepts requests from the
device-independent software above it and issues the commands needed to
execute those requests.
Steps in carrying out I/O requests:
• Translate the request from abstract to concrete terms
• Write the translated request into the device controller's registers
• The device driver blocks itself until an interrupt arrives, which awakens or
unblocks the driver
• The device driver starts its post-interrupt work by first testing the I/O device
for any error
• If no error is identified and everything is functioning properly, the data is
passed from the driver to the requesting software
• If the driver finds no further request in the queue, it moves back to the
blocked state and waits until a new request comes in.
Device-Independent I/O Software
➢ It is a large fraction of the I/O software
➢ Services it provides are:
Uniform interfacing for device drivers: performing the I/O functions common to
all drivers
Device naming: responsible for mapping symbolic device names onto the
proper drivers
Device protection: secures devices by blocking illegitimate or disallowed
access requests from users
Providing a device-independent block size: provides a uniform block size to
the higher layers, even if the underlying devices use different sector sizes
Buffering: if a user process writes half a block, the OS will normally keep the
data in a buffer until the rest of the data is written. Keyboard input that arrives
before it is needed also requires buffering
Storage allocation on block devices: when a file is created and filled with data,
new disk blocks have to be allocated to the file. To perform this allocation, the
OS needs a list of free blocks and uses some allocation algorithm
Allocating and releasing dedicated devices: the OS is solely responsible for
validating any device request and acting accordingly by either granting or
denying the service
Error reporting: error handling, by and large, is done by drivers. Most errors
are device dependent. When a request comes to the driver, it attempts to read
the requested block a certain number of times.
If it is unable to read the block in these attempts, it stops trying, communicates
the status to the requesting software, and finally reports to the caller.
User-Space I/O Software
➢ A small portion of the I/O software lives outside the OS.
In this case, library procedures are used to perform system calls, including I/O
system calls. For instance, if a C program contains the call count = write(fd,
buffer, nbytes); the write procedure from the library will be linked with it and
kept in the binary program, which is loaded into memory during execution
➢ The library procedures are also responsible for formatting the I/O requests
The following is an example of how the I/O system works when an application
reads from a file:
Step 1: a system call to read the file is issued by the user program.


Step 2: the device-independent software checks the block cache; if the
requested block is not found there, the device driver is called.
Step 3: the device driver issues the request to the hardware, and the
requesting user process is moved to the blocked state until the disk
operation is finished.
Step 4: the disk generates an interrupt once it finishes the operation.
Step 5: the interrupt handler immediately takes over and investigates the
interrupt, i.e., it first checks which device currently requires attention. It then
reads the output of the device and unblocks the sleeping process, indicating
that the I/O operation has completed and letting the user process continue.
➢ Table 4.1 shows the I/O system layers along with the major responsibilities
of each layer.

Disk
➢ All real disks are organized into cylinders, each one containing many
tracks. Each track is then divided into sectors (with either an equal number
of sectors per track or a varying number)
➢ In the case of an equal number of sectors per track:
➢ The data density is higher closer to the center (hub)
➢ The speed increases as the read/write head moves to the outer tracks


➢ Modern large hard drives have more sectors per track on outer tracks e.g.
IDE drives
➢ Many controllers, except floppy disk controllers, are capable of doing a
read or write operation on one drive and also seek operation on one or
more other drives simultaneously.
Disk Access Time
➢ Three factors determine the time required to read or write a disk block:
➢ The seek time (the time to move the arm to the proper cylinder)
➢ The rotational delay (the time for the proper sector to rotate under the
head)
➢ The actual data transfer time
➢ For most disks, the seek time dominates the other two, so reducing the
mean seek time can improve system performance substantially.
➢ Disk requests can arrive from other processes while the arm is performing a
seek for one process. Disk drivers therefore keep a table of waiting disk
requests. The table is indexed by cylinder number, and the pending requests
for each cylinder are chained together in a linked list headed by the table entries.
Disk Arm Scheduling Algorithms
➢ The OS maintains a queue of requests for each I/O device and uses various
disk scheduling algorithms. To mention some:
➢ First Come First Served (FCFS): accepts one request at a time and performs
the requests in order of arrival
➢ E.g. Initial arm position: track 11
Track requests: 1, 36, 16, 34, 9, 12
Service order: 1, 36, 16, 34, 9, 12
Arm motion required: 10, 35, 20, 18, 25, 3
Total = 111 tracks
➢ The simplest and fairest of all, but it does not improve performance
➢ Shortest Seek First (SSF): handles the closest request (the least disk arm
movement) next, to minimize seek time.
➢ E.g. Initial arm position: track 11
Track requests: 1, 36, 16, 34, 9, 12
Service order: 12, 9, 16, 1, 34, 36
Arm motion required: 1, 3, 7, 15, 33, 2
Total = 61 tracks
➢ Advantage: Performance (efficiency), provides better performance
➢ Disadvantage: Possibility of starvation (it lacks fairness)


SCAN (Elevator) Algorithm:
➢ The disk arm keeps moving in one direction until no more requests are
pending in that direction, then switches direction
➢ Direction bit: 1 = up, 0 = down
➢ E.g. Initial arm position: track 11
Track requests: 1, 36, 16, 34, 9, 12
Direction bit: 1
Service order: 12, 16, 34, 36, 9, 1
Disk arm motion: 1, 4, 18, 2, 27, 8
Total = 60 tracks
➢ It provides a better service distribution
C-SCAN (Modified Elevator) Algorithm
➢ It restricts scanning to one direction only. This slightly modified version of the
elevator algorithm exhibits a smaller variance in response times because it
always scans in the same direction. When the highest-numbered cylinder with
a pending request has been serviced, the arm goes to the lowest-numbered
cylinder with a pending request and then continues moving in the upward
direction. In effect, the lowest-numbered cylinder is thought of as being just
above the highest-numbered cylinder. It reduces the maximum delay
experienced by new requests.
RAM Disk
➢ A RAM disk has the advantage of instant access (no seek or rotational delay)
➢ UNIX supports mounting a RAM disk as a file system, but DOS and Windows
do not
➢ The RAM disk is split up into n blocks, each with the same size as the blocks
of a real disk
➢ A transfer is then simply a memory copy between the RAM disk area and the
requester's buffer
➢ A RAM disk driver may support several areas of memory used as RAM disks
Disk Cache
➢ Memory cache: narrows the gap between the processor and memory.
➢ Disk cache: narrows the gap between the processor/memory and I/O.
➢ It uses a buffer kept in main memory that functions as a cache of disk blocks
between the disk and the rest of main memory.
➢ It contains a copy of some of the sectors on the disk.
➢ It improves performance by minimizing block transfers between disk and
memory.
➢ Design issues:
➢ Data transfer: memory-to-memory
- using shared memory (pointer)


➢ Replacement algorithm: Least Recently Used
- Least Frequently Used
Conclusion
➢ I/O devices are the interfaces through which users communicate with the
computer system. To manage this communication effectively, the operating
system uses the I/O subsystem, which has a complete layer of hardware and
software.
➢ The I/O function is generally broken up into a number of layers, with lower
layers dealing with details that are closer to the physical functions to be
performed and higher layers dealing with I/O in a logical and generic fashion.
The layering is done in such a way that changes to a specific layer won't affect
the other layers.
➢ The aspect of I/O that has the greatest impact on overall system performance
is disk I/O. Two of the most widely used approaches to improve disk I/O
performance are disk scheduling and the disk cache.
➢ At any time, there may be a queue of requests for I/O on the same disk. It is the
object of disk scheduling to satisfy these requests in a way that minimizes the
mechanical seek time of the disk and hence improves performance. The
physical layout of pending requests plus considerations of locality come into
play.
➢ A disk cache is a buffer, usually kept in main memory, that functions as a cache
of disk blocks between the disk and the rest of main memory. Because of
the principle of locality, the use of a disk cache should substantially reduce the
number of block I/O transfers between main memory and disk.
Assessment:
Instructions: Answer the following. Write your answers in the space
provided
1. Explain the I/O software goals

2. Discuss the two most widely used approaches to improve disk I/O performance

3. What is DMA? What advantages does it provide to the operating system?


4.2 Overview of File Systems

➢ A file is a named collection of related information that is treated as a single
entity, defined by its creator, and kept on secondary storage devices, where it
can be accessed by users or applications through the file management system.
In this activity, we will discuss how files are handled by the operating system,
the naming scheme, access methods, and operations carried out on a file.
➢ A file is a contiguous logical storage unit defined by the operating system by
abstracting the physical properties of storage devices, so that the computer
system is convenient to use, providing a uniform logical view of information
storage. Files are the most visible aspect of an OS and are used to store and
retrieve information from the disk.
➢ Files are mapped by the operating system onto physical devices. These storage
devices are usually non-volatile making the contents persistent through power
failures and system reboots. Files represent both programs and data where
data files may be numeric, alphabetic, alphanumeric, or binary. Files can also
be either in free form or rigidly formatted. Every file has a name associated with
it, which is used to interact with it. Operating systems provide a layer of system-
level software, using system calls, to provide services relating to the provision
of files, which avoids the need for each application program to manage its own
disk allocation and access. The file manager component of the operating system
is responsible for the maintenance of files on secondary storage, just as the
memory manager is responsible for the maintenance of primary memory. Let's
discuss some properties of files from the user's point of view.
File naming
➢ Files are an abstraction mechanism used by a computer system, and naming
is an important aspect of a good abstraction mechanism. A name is assigned
to a file by the creating process at the time of creation and is then used by
other processes to refer to the file, even after the creating process terminates.
A file is named for the convenience of its human users and is referred to by its
name. A string of up to eight characters is a legal file name in many operating
systems, and digits and some special symbols are allowed in some of them.
Some operating systems, like UNIX, distinguish between uppercase and
lowercase file names, whereas others, like MS-DOS, do not. A file name often
has two parts separated by a period.
➢ The first part is the label for the file, while the last part is the extension,
indicating the type of the file, by which the operating system identifies the
program that owns that file type. Thus, opening the file will start the program
assigned to its file extension, using the file as a parameter.
File structure
➢ A file is a sequence of bits, bytes, lines, or records, the meaning of which is
defined by the file’s creator and user. A file can be structured in several ways
among which these three types are common.


Byte sequence
➢ The file is organized as an unstructured sequence of bytes, and the operating
system is unaware of the file's content. The meaning of the bytes is imposed
by user programs, providing maximum flexibility but minimal support. This
structure is advantageous for users who want to define their own semantics on
files. The UNIX and MS-DOS operating systems use this structure.
Record sequence
➢ The file is treated as a sequence of internally structured, fixed-length records.
In such a structure, a read or write operation on the file interacts with one
record. No current systems use this method, though it was very popular in the
early days of mainframe computing.
Tree
➢ The file consists of a tree of records, which may differ in length, each having a
key field in a fixed position in the record that is used to interact with the
records. The operating system can add new records to the file, deciding the
place for each record. The records in the tree are sorted on the key field, which
makes searching faster. Large mainframe computers use this structure.
File types
➢ Several types of files exist that are supported by an operating system.
➢ Regular files: the most common type of file, containing user information.
Regular files may contain ASCII characters (text), binary data (non-text and
not readily readable), executable program binaries, or program input or output.
The contents of such files are structured with no kernel-level support.
➢ Directory: a binary file consisting of the list of files contained in it. Directories
are system files used to manage the file system structure and may contain any
kind of file, in any combination. The entries . and .. refer to the directory itself
and its parent directory. The two commands used to manage directories are
mkdir, to create a directory, and rmdir, to remove a directory.
➢ Character-special files: I/O-related files that allow the device drivers to perform
their own I/O buffering. These files are used for unbuffered data transfer to
and from a device. They generally have names beginning with r (for raw), such
as /dev/rsd0a.
➢ Block-special files: files used for modeling disks and other devices that handle
I/O in large chunks, known as blocks. These files expect the kernel to perform
buffering for them. They generally have names without the r, such as
/dev/sd0a.
File access
➢ Files can be accessed either sequentially or randomly. In a sequential file
access, a system reads the bytes in order starting from the beginning without
skipping any in between. Rewinding or backing up is possible but with limited
accuracy and slow performance. Such files were available during the magnetic


tape era. In a random file access, a system can access the file directly in any
order. Bytes or records can be read out of order, and access can be based on
a key rather than a position. Disks made this random accessibility of files
possible. Random-access files are essential for database systems.
File attributes
➢ Attributes of a file are extra information associated with the file in addition to
its name and data. Some of these attributes are listed in Table 7.1 below.


File operations
➢ Storage and retrieval of files is handled with different types of operations
provided by a system. The operating system provides system calls to create,
write, read, reposition, delete, and truncate files. Some of the operations
defined on files are:
➢ Create: creates a new file of size zero, with no data. The attributes are set by
the environment in which the file is created. The task of this system call is to
announce the file and define the attributes it has.
➢ Open: establishes a logical connection between a process and a file. It fetches
the attributes and the list of disk addresses into main memory for rapid access
during subsequent calls.
➢ Write: transfers data from memory to the file, starting at the current position
in the file. This increases the size of the file if the current position is the end of
the file, or may overwrite and cause content loss if the current position is in
the middle of the file.
➢ Read: transfers the logical record starting at the current position in the file to
a memory area known as the input buffer. It is the task of the caller to indicate
the amount of data to be read and where to put it.
➢ Close: disconnects the file from the current process. The file will not be
accessible to the process after close.
➢ Delete: removes a file when it is no longer needed, so that its disk space can
be reclaimed.
➢ Append: performs a restricted write operation by adding data only at the end
of an existing file.
➢ Seek: specifies where data should be accessed in a randomly accessible file
by repositioning the current-file-position pointer to a given value.
In addition to these basic operations, there are also common operations
defined on files such as rename, copy, set and get attributes, etc. Most of
these operations require searching the directory for a named file, causing
repeated searches.
➢ To avoid this repeated searching, most systems require a call to the open
operation before a file is first used.
Conclusion
➢ When a process has information to be maintained, it keeps it in its address
space, which causes three main problems: storage capacity is restricted to the
size of the available virtual memory, which may not be enough for applications
involving large data; the virtual memory is volatile, which is not suitable for
long-term storage; and information should not be tied to a single process, as
there may be a need for different processes to access that data.
➢ To overcome these limitations, long-term information storage, the file, is
essential. A file is a named collection of related information defined by its
creator. It is an abstraction used by the kernel to represent and organize the
system's non-volatile storage resources, including hard disks, CD-ROMs, and
optical disks. The file system is the part of the operating system that manages
files.
➢ Every file has a name attribute associated with it, which is used to refer to it.
➢ File names end with different extensions according to the file type. A file can
represent a program or data and can be in free form or rigid form.
➢ A file can be structured in byte, record, or tree form and can be accessed
either sequentially or randomly. The operating system uses several types of
operations, through system calls, to interact with files.

Assessment:
Instructions: Answer the following. Write your answers in the space provided
a. What is a file?

b. Discuss in detail three file attributes.

c. What is the difference between random access and sequential access?

d. Discuss the problems encountered by a process when keeping information in
its address space.

e. What are the extensions found in file names used for?


4.3 Directories
Introduction
➢ A directory is a collection of nodes that contains information about files.
Directories should be logically organized to achieve efficiency, facilitate
convenient naming, and allow file grouping. This activity covers an overview of
file directories, the organization of directories, and the operations that can be
carried out on directories by users or applications. A general-purpose computer
maintains thousands or even millions of files on secondary storage.
➢ An entire secondary storage device, or only a portion of it, may be used to
store files. A directory is a means provided by the file system to keep track of
files. In most systems, directories, also known as folders, are themselves files
owned by the operating system, with their own organization, properties, and
operations. They provide a mapping between file names and the files
themselves. A directory contains a number of entries about files, such as
attributes, location, ownership, etc. The directory may keep the attributes of a
file within itself, like a table, or may keep them elsewhere and access them
through a pointer. On opening a file, the OS puts all of its attributes in main
memory for subsequent use.
➢ Files can be organized in a single-level or multi-level directory structure. In a
single-level directory structure, all the files in a system are kept in one directory,
which may also be referred to as the root directory. This organization scheme is
simple and results in fast file search. However, it runs into problems when the
number of files to be maintained grows or when it is used in a multi-user system,
since the name of each file is required to be unique. If two or more users create
files with the same name, this uniqueness requirement is violated: the last file
created overwrites the previous one, so the first file is replaced by another file
of the same name.
➢ In a two-level directory structure, a private user-level directory is assigned to
each user, which eliminates the naming conflicts encountered in a single-level
directory structure. The basic implementation of this organization requires all
users to access only their own directory, but with a little modification it can be
extended to allow users to access other users’ directories as well through some
notification mechanism. In this organization, it is possible to give the same
names to files of different users. The system implicitly knows where to search
when asked to open a file, since each user is associated with a private
directory. This organization also has its own drawbacks. It creates total isolation
between users, which is undesirable in systems where processes share and
exchange data. It may also be unsatisfactory if users have many files. Figures
4.3 and 4.4 below show the single-level and two-level directory
organizations respectively.


➢ Organizing files in a two-level directory system does not suffice for systems with
many files or where a logical organization is required. A general hierarchical
structure of several directories needs to be allowed for each user, where files
are organized in their natural categorical manner, extending the two-level
directory organization. This hierarchical organization of directories is called a
tree structure: the top directory is the root sitting at the top of the tree, and
all directories spring out of the root, allowing logical grouping of files. Every
process can have its own working directory, also referred to as the current
directory, to avoid affecting other processes. A directory or sub-directory has a
set of files or other directories inside.
➢ When a process references a file, the file is searched for in the current directory,
which contains the files that are currently active. If the user needs to
access a file not residing in the current directory, the file name should be specified
through a path name, or the current directory should be changed to the directory
holding the desired file through a system call which takes the name of the
directory as a parameter. Path names are conventions used to specify a file in


the tree hierarchy of directories. A hierarchy starts at the directory /, known as
the root directory.
➢ A path name is then made up of the list of directories crossed to reach the desired
file, followed by the file name itself. Two kinds of path name specifications exist.
An absolute path name is a unique name that always starts from the root
directory and extends to a file. The path separator (/ for UNIX and
\ for Windows) is the first character of any absolute path name, for example
/usr/books/os in UNIX and \usr\books\os in Windows. A relative path name is a
name which does not begin with the root directory name. This path name is
specified relative to the current directory of a process. For instance, if the
working directory of a process is /usr/books, then the file with the absolute
path name /usr/books/os can be referred to simply as os. A relative path name
is more convenient than the absolute form and achieves the same effect.
➢ The advantage of a tree directory structure is its efficient searching and
grouping capability, where users are allowed to define their own subdirectories,
imposing structure on their files.
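The absolute/relative distinction above can be demonstrated with Python's posixpath module (using the /usr/books example from the text; the variable names are illustrative):

```python
import posixpath  # POSIX path rules, regardless of the host OS

cwd = "/usr/books"          # assume this is the process's working directory
relative = "os"

# A relative name is resolved against the current directory:
absolute = posixpath.join(cwd, relative)
print(absolute)                              # /usr/books/os

# An absolute name starts at the root directory /:
print(posixpath.isabs("/usr/books/os"))      # True
print(posixpath.isabs("os"))                 # False
```

Both names reach the same file, which is exactly why the relative form is the more convenient of the two.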

Directory Operations
➢ There are various system calls associated with directory management. Let’s
discuss some of these operations defined on directories.
➢ Create: an operation used to create a new, empty directory, except for the
two special entries that are automatically included by systems supporting a
hierarchical directory structure. These are the dot (.) and dot dot (..) that


refer to the current directory and its parent, respectively. The mkdir command
is used in both UNIX and MS-DOS to create a directory.
➢ Delete: a call used to remove an existing directory from a system. Only a directory
with no entries (except the dot and dot dot) can be deleted. The rmdir command
does this in both UNIX and MS-DOS.
➢ Open directory: used to open a directory for reading through an opendir
system call.
➢ Close directory: calls the closedir operation and is used to close a directory that
was opened for reading, freeing space in internal tables.
➢ Read directory: done by the readdir system call, which accesses an opened
directory by returning its next entry.
➢ Rename: changes an existing directory’s name.
➢ Link: the creation of pointers to other files or directories, so that the same file can
appear in multiple directories, enabling file sharing. The system call generates
a link from an existing file to a given path, taking an existing file name and a
path name. A file may have any number of links, none of which affects the
attributes of the file. A link can be either a hard link, made by the link() system
call, where links are made only to existing files within one file system, not across
file systems, and all links to the file must be removed before the file itself
is removed; or a symbolic link, made by the symlink() system call, which points
to another named file and can span file systems. The original file can be
removed without affecting the links.
➢ Unlink: used to remove a directory entry. If the file being removed is present in
only one directory, it is removed from the file system; if it is present in
multiple directories, only the path name specified is removed and the others
remain. The rm/rmdir user commands and the unlink() system
call can be used to unlink a file.
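The operations above can be exercised from Python's os module, which wraps the corresponding UNIX calls (mkdir, listdir, rename, link, unlink); a small sketch using a temporary directory, with invented file names:

```python
import os
import tempfile

root = tempfile.mkdtemp()

# Create: make a new, empty directory (mkdir, as in UNIX and MS-DOS).
os.mkdir(os.path.join(root, "books"))

# Put one file in it, then Read: list the directory's entries.
open(os.path.join(root, "books", "os.txt"), "w").close()
entries = os.listdir(os.path.join(root, "books"))
print(entries)            # ['os.txt']

# Rename: change the directory's name.
os.rename(os.path.join(root, "books"), os.path.join(root, "texts"))

# Link/Unlink: a hard link gives the same file a second name;
# removing one name leaves the other intact.
os.link(os.path.join(root, "texts", "os.txt"),
        os.path.join(root, "texts", "os2.txt"))
os.unlink(os.path.join(root, "texts", "os.txt"))
print(os.listdir(os.path.join(root, "texts")))  # ['os2.txt']
```

Note how unlinking os.txt does not delete the file's data, since the hard link os2.txt still refers to it.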
Conclusion
➢ A directory is a means through which files are organized and managed by a file
system. A single directory may contain all the files in a system, or a two-
level organization can be used whereby each user is assigned their own
directory. A logical structuring of files in a system can be achieved through a
tree-like hierarchical organization of directories where users can create their
own subdirectories and impose different structures on their files.
➢ A tree structure gives users efficient search and grouping advantages.
Referencing of files in such organizations is done through path names, which
can be specified either absolutely or relatively with respect to the user’s current
directory. A number of operations can be carried out on directories, either
through system calls or user commands.


Assessment:
Instructions: Answer the following. Write your answers in the space
provided.
a. What is a directory?

b. What are the limitations of a two-level directory organization?

c. What does a link operation define on directories?

d. What is a path?

e. What is the difference between absolute path and relative path of a file?


4.4 File System Hierarchy


Introduction
➢ So far, our discussion has focused mainly on file systems as viewed by end
users. The kernel of an operating system also has its own perception of files,
which will be addressed in this activity. Here the issue is file system
implementation: the storage of files and directories, disk space
management, and making everything work effectively and reliably.

➢ File systems make it possible to store, search, and read data on disks
efficiently. A file system faces two quite different design issues: defining the
file system interface to the user (a file with its attributes, the operations
allowed on a file, and the directory structure for organizing files) and defining
the algorithms and data structures to map the logical
file system onto the physical secondary-storage devices.

➢ The file system is generally structured in several levels, a layered design that
abstracts the lower-level details from the higher-
level components, with each higher layer using the functions of the lower
layers to produce new features. Figure 7.4 below shows this hierarchy of a
file system design.


➢ The top level, the application programs, consists of code that makes file requests
and handles the content of files, since it has knowledge of a file’s internal
structure.
➢ The logical file system layer is responsible for managing the metadata of the
file system, not the data in the files. It also manages directory structures and
maintains file structures through a file control block that contains information
about the file.
➢ The file-organization module layer translates logical block addresses to
physical ones, as logical addresses of files do not match physical addresses.
This layer also has a free-space manager that tracks free blocks and informs
the module about them.
➢ The basic file system layer makes generic requests to appropriate device
drivers to read or write the physical blocks on the disk. Memory buffers and
caches where file systems and data blocks reside are also managed by this
layer.
➢ The I/O control layer consists of device drivers and interrupt handlers used to
transfer information between the device and main memory. It generally
interfaces with the hardware devices and handles the different interrupts
associated with the I/O devices.
➢ The last layer is the device layer, which represents the actual devices such as
disks, tapes, etc. A new file is created through a call made by an application
program to the logical file system which, as mentioned, knows about directory
structures and defines a new file control block for the file. The system then loads
the indicated directory into memory, updates its content with the new file name
and file control block, and finally writes it back to the disk.
➢ The file-organization module is then called by the logical file system so that the
directory I/O is mapped to disk block numbers, which are then used by the basic
file system and I/O control components, concluding a file-creation call. I/O can
then be performed on this file once it is opened for such operations. All modern
operating systems have adopted the hierarchical directory model to represent
collections of files, as they support more than one file system, both removable-
media-based and disk-based.
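As a rough illustration of this layering, here is a hypothetical sketch (the class names, block map, and simulated disk contents are all invented) in which a file-organization module translates a file's logical block numbers to physical ones and asks a basic file system to read them:

```python
class BasicFileSystem:
    """Issues generic block reads against a (simulated) device."""
    def __init__(self, disk):
        self.disk = disk                 # physical block number -> bytes

    def read_physical(self, block_no):
        return self.disk[block_no]


class FileOrganizationModule:
    """Translates a file's logical block numbers to physical ones."""
    def __init__(self, basic_fs, block_map):
        self.basic_fs = basic_fs
        self.block_map = block_map       # logical index -> physical block

    def read_logical(self, logical_no):
        return self.basic_fs.read_physical(self.block_map[logical_no])


# Simulated disk: physical blocks 0..3; the file's data lives in 3 and 1.
disk = {0: b"boot", 1: b"world", 2: b"free", 3: b"hello "}
fom = FileOrganizationModule(BasicFileSystem(disk), block_map=[3, 1])

# Logical block 0 of the file maps to physical block 3, and so on.
print(fom.read_logical(0) + fom.read_logical(1))  # b'hello world'
```

The upper layer never sees physical block numbers, which is exactly the abstraction the text describes.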
➢ In a layered file system structure, code reusability is enhanced and duplication
of code greatly reduced, since the I/O control and basic file system
modules can be used by more than one file system, each then defining its
own logical file system and file-organization module. However, layering also
has its own drawback: performance overhead on the operating system.
Thus, decisions about hierarchical file system structure are a serious design
issue that needs to be addressed when new systems are built.
Conclusion
➢ In this activity, files were discussed from the perspective of the operating
system’s kernel. The operating system has a file management system by which
secondary storage devices are used to maintain data permanently through
files. The file system has a hierarchical arrangement so that the lower-level
technical details of how files are handled by the system are abstracted from the


higher-level users or applications. This hierarchical arrangement defines
six different components, each with a distinct role. Though the hierarchical
arrangement of these components reduces code duplication by allowing some
of the components to be used by several file systems, the performance overhead
it introduces should not be neglected, requiring a trade-off to be made
between these two choices while designing a system.

Assessment:
Instructions: Answer the following. Write your answers in the space provided.

1. Discuss what a file system is.

2. What are the different layers found in a file system hierarchy?

3. What advantage has the hierarchical arrangement of a file system?

4. What are the two design issues related with file systems?


4.5 File System Implementation


Introduction
➢ In the previous activities, we looked at file systems from the user’s view
as well as the kernel’s view. Let’s now move our discussion to the structures and
operations used to implement file systems. In this activity, file and directory
storage, disk space management, and efficient and reliable file
management implementations will be examined.
File system layout
➢ File systems are stored permanently on secondary storage devices such as
disks. A disk can be used in its entirety or divided into partitions, each of
which may hold a different file system with its own layout. On disk, the file
system contains information about how an operating system is to be booted,
the total number of blocks, the number and location of free blocks, the
directory structure, and individual files.
➢ The Master Boot Record (MBR), found at sector 0 of a disk, contains
information used to boot an operating system from that disk. At the end of the
MBR is the partition table, which gives the start and end address of each
partition. When the MBR code executes, the first thing it does is locate the
active partition among the available partitions and read in its first block, called
the boot block in UNIX and the partition boot sector in Windows, which contains
a program that loads the operating system when executed. A super-
block, also termed the volume control block, is another component of a file
system; it holds key parameters of the file system, such as the number of
blocks in the partition, the size of each block, a free-block count, etc., and is
loaded into memory either at system boot or on first use of the file system.
Free-space management structures, the nodes describing the files, the root
directory of the system, and all the other directories found in the file system
are further components of a file system. The general file system layout is
shown in figure 4.7 below.
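The partition table mentioned above can be illustrated with a hedged sketch that parses a synthetic sector 0, assuming the classic MBR layout (a four-entry, 16-byte-per-entry table at byte offset 446, and the 0x55AA signature at offset 510); the partition values used here are invented:

```python
import struct

def parse_mbr(sector0: bytes):
    """Parse the classic MBR layout: boot code, then a 4-entry
    partition table at offset 446, then the 0x55AA signature."""
    assert sector0[510:512] == b"\x55\xaa", "missing MBR signature"
    parts = []
    for i in range(4):
        entry = sector0[446 + 16 * i : 446 + 16 * (i + 1)]
        boot_flag = entry[0]                    # 0x80 = active partition
        ptype = entry[4]                        # partition type code
        start_lba, num_sectors = struct.unpack_from("<II", entry, 8)
        if ptype != 0:                          # type 0 = unused slot
            parts.append((boot_flag == 0x80, ptype, start_lba, num_sectors))
    return parts

# Build a synthetic sector 0 with one active partition at LBA 2048.
sector = bytearray(512)
sector[510:512] = b"\x55\xaa"
sector[446] = 0x80                              # mark the partition active
sector[446 + 4] = 0x83                          # an example partition type
struct.pack_into("<II", sector, 446 + 8, 2048, 409600)
print(parse_mbr(bytes(sector)))   # [(True, 131, 2048, 409600)]
```

This is exactly the information the MBR boot code needs in order to locate the active partition and read in its boot block.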


➢ The most important thing to consider during file system implementation is making
the associations between files and the blocks allocated to them. To
achieve this, various allocation methods are used, differing among
operating systems. Three general methods of allocating storage on
hard disks are available.
➢ Contiguous allocation: In contiguous space allocation, each file occupies
a set of contiguous blocks on the disk, laying the entire file down on
contiguous sectors of the disk. Disk addresses (or sector addresses) are
defined linearly on the disk: disk address 0 maps to cylinder
0, head 0, sector 0; disk address 1 maps to cylinder 0, head 0, sector
1; disk address 2 maps to cylinder 0, head 0, sector 2; and so on. Then
logical block 0 is stored at disk address 0, logical block 1 at disk
address 1, and so on. This method has a simple implementation, as keeping
track of a file only requires remembering two numbers: the disk address of the
first block and the number of blocks in the file, from which any other block can
be found by addition. Moreover, only a single seek operation is needed to
read the whole file from the disk, which enhances the system’s read
performance. Thus, when accessing files that have been stored
contiguously, seek time and search time are greatly minimized. The main
disadvantage of this method is the difficulty of obtaining such contiguous
space, especially when the file to be stored is large. Such files
cannot be expanded unless there is empty space available immediately
following them. If there is not enough room, the entire file must be recopied to a
larger section of the disk every time records are added. There is also the
problem of external fragmentation: disks are not compacted the moment
a file is removed, leaving holes between files.
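The "two numbers" bookkeeping described above can be sketched in a few lines; the block numbers are invented for illustration:

```python
# With contiguous allocation a file is described by just two numbers:
# the disk address of its first block and its length in blocks.
def block_address(start_block, length, logical_block):
    """Map a logical block number of the file to its disk address."""
    if not 0 <= logical_block < length:
        raise IndexError("block outside the file")
    return start_block + logical_block   # simple addition, no table lookup

# A file stored at blocks 100..104 (start 100, length 5):
print(block_address(100, 5, 0))   # 100
print(block_address(100, 5, 4))   # 104
```

No per-block table is needed, which is why both sequential and random access are fast under this scheme.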
➢ Linked-list allocation: solves the problems of contiguous allocation by
storing each file as a linked list of disk blocks, which may be stored at any
disk address, using the first word of each block as a pointer to
the next one. The directory entry for a file consists of a pointer to
the disk address of the first block (or record) in the file and of the last block
(or record) in the file. The first block contains a pointer to the second
block, which in turn contains a pointer to the third block, and so on. There is
no external fragmentation, since each request is for one block. This method
can only be used effectively for sequential files. Linked allocation
does not permit direct access, since it is necessary to follow the blocks one
at a time in order to locate a required block, causing extremely slow access.
Pointers also use up space in each block, and reliability is low, because
the loss of any pointer loses the rest of the file. These problems can
be solved by placing the pointer words of each block in a table, known
as the File Allocation Table (FAT), kept in memory. Using a separate disk area to
hold the links frees the entire block for data and also fixes the slow random
access, though this table needs to be available in memory all the
time. A FAT file system is used by MS-DOS.
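A minimal sketch of following such a chain through an in-memory FAT (the table contents and block numbers are invented):

```python
# A toy File Allocation Table: fat[b] holds the number of the block
# that follows block b in its file; EOF marks the last block.
EOF = -1
fat = {4: 7, 7: 2, 2: EOF,      # file A occupies blocks 4 -> 7 -> 2
       6: 3, 3: EOF}            # file B occupies blocks 6 -> 3

def file_blocks(first_block):
    """Follow the chain in the FAT; no disk reads are needed,
    since the whole table is kept in memory."""
    chain = []
    b = first_block
    while b != EOF:
        chain.append(b)
        b = fat[b]
    return chain

print(file_blocks(4))   # [4, 7, 2]
print(file_blocks(6))   # [6, 3]
```

The directory entry only needs to record the first block of each file; the rest of the chain lives in the table.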


➢ I-nodes: in this method a data structure called an index node (i-node), listing
the attributes and disk addresses of the file’s blocks, is associated with each
file. The i-nodes are created at the time of file system creation (disk
partitioning) and always remain in the same position in the file system. The
i-node table size determines the maximum number of files, including directories,
special files, and links, that can be stored in the file system.
➢ Each file uses an index block on disk to hold the addresses of the other disk
blocks used by the file. When the ith block is written, the address of a free
block is placed at the ith position in the index block. This method requires
an i-node to be in memory only while its file is open, which overcomes the
limitation of the FAT system by reducing the space requirement. The i-nodes
in memory occupy an array whose size is proportional to the maximum number
of concurrently open files, which distinguishes them from the FAT of the
linked-list allocation scheme, whose size is proportional to the disk itself,
growing linearly as the disk grows. The i-node also has its limitation: if each
node holds a fixed number of disk addresses, it cannot address files that
exceed this limit.
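A toy sketch of an i-node with direct block addresses only (the field names and values are invented; real i-nodes also extend the fixed direct list with indirect blocks):

```python
# A toy i-node: file attributes plus the disk addresses of its blocks.
inode = {
    "owner": "alice",
    "size": 3 * 512,
    "direct": [9, 14, 3],        # i-th entry = disk address of block i
}

def block_of(inode, i):
    """Return the disk address of the file's i-th block."""
    if i >= len(inode["direct"]):
        raise IndexError("file exceeds the direct-address limit")
    return inode["direct"][i]

print(block_of(inode, 2))   # 3
```

The fixed length of the "direct" list is exactly the limitation noted above: a file with more blocks than entries cannot be addressed by this structure alone.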
Conclusion
➢ From a top-level perspective, a file system is perceived as a set of files,
directories, and operations defined to manipulate them, which is quite
different from its internal arrangement. File system implementers and
designers need to deal with storage allocation of files and maintain the
block-file associations in order to put in place efficient and reliable file
manipulation schemes.


Assessment:
Instructions: Answer the following. Write your answers in the space provided.
1. Discuss in pairs the pros and cons of the three file storage allocation methods
on hard disks. Present your answer through a concept map. Provide a brief
explanation of the concept map created.

2. Discuss the MBR and its contents.


4.6 Directory Implementation


Introduction
➢ In this activity, the focus is on how directories are implemented within a system
and what the concerns are in relation to directories. In order to read a file, it
needs to be opened. When opening a file, the user provides a path name, an
ASCII string that is mapped and used by the operating system to locate the
directory entry, which gives the information used to track the disk blocks
associated with the file. This information may be the whole file’s disk address
for contiguous allocation, the number of the first block for linked-list allocation,
or the number of the i-node. Every file system also needs to maintain file
attributes. File attributes are properties of a file, such as its creation time,
owner, etc., that need to be stored. It is possible to maintain these attributes in
the directory entry, which is how most systems handle the issue.
➢ A directory can be defined with a fixed-size entry per file, holding the file
name, a structure of the file attributes, and the disk addresses specifying the
locations of its disk blocks. It is also possible to store this information in the
i-nodes on systems with i-node space allocation.
➢ A file name can be of fixed length, with a pre-defined number of characters
constituting the name, or of variable length, which does not restrict the number
of characters in a file name. Variable-length file names can be implemented by
setting a limit on file name length, commonly 255 characters. This approach
is simple to implement but wastes directory space, as few files have such
long names.
➢ Another possibility is a variable entry size in directories,
each entry containing its own length, some file attributes, and the file name.
Every file name is padded out to an integral number of words so that the
next directory entry starts on a word boundary. Figure 7.6 depicts this
method of handling variable-length file names. During file removal, a
variable-size gap is introduced in the directory, which may not be large enough
for the next file to arrive. A page fault can also occur while reading a file name,
as directories may cover multiple pages of memory.
➢ Making directory entries of constant size and keeping all the file names together
in a heap at the end of the directory is another solution for
handling long, variable-length file names. This method solves the problem of
fitting a new entry into a freed gap encountered in the previous scheme, but
the heap must be managed and there is still no guarantee that page faults will
be avoided.
➢ All these directory implementations use a linear list of directory entries with
pointers to the data blocks, which makes searching for files linear. This
method is simple to program but time-consuming to execute. Creating a new
file, for instance, requires searching the directory to be sure that no existing file
has the same name, after which the new entry is added at the end of the
directory. File deletion likewise requires searching the directory for the named
file and then releasing the space allocated to it. This linear arrangement of
entries results in a slow search operation, especially when the directory is
long. Caching the results


of a search, or keeping a hash table per directory, can be used to speed up the
search. The cache is checked for the file name before starting the search,
avoiding a long lookup if the name is found in the cache.
➢ In the hash-table method, a file name is entered by mapping it to a value
between 0 and n-1, where n is the size of the hash table, for example by
adding up the words of the file name and taking the remainder after division
by n. The slot given by this hash code is inspected to determine whether it is
in use. If the slot is occupied, a linked list headed at that table entry is built
that threads through all entries with the same hash value. If the slot is
unused, a pointer to the file entry, which follows the hash table, is kept
there. Searching for a file is done by hashing the file name to select a hash-
table entry and checking all the entries on its chain for the file. If the file is
not found there, it is not present in the directory. While searching is faster
with a hash-table implementation, added administrative complexity is
inevitable.
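A small sketch of the hashed directory described above, using a sum-the-characters-mod-n hash and chaining on collisions (the names, table size, and i-node numbers are invented):

```python
N = 8  # hypothetical hash-table size

def name_hash(name: str) -> int:
    # One of the schemes mentioned above: add up the characters, mod n.
    return sum(name.encode()) % N

table = [[] for _ in range(N)]   # each slot chains colliding entries

def add_entry(name, inode_no):
    table[name_hash(name)].append((name, inode_no))

def lookup(name):
    # Hash the name, then check every entry on that slot's chain.
    for entry_name, inode_no in table[name_hash(name)]:
        if entry_name == name:
            return inode_no
    return None                  # not present in the directory

add_entry("os.txt", 12)
add_entry("notes", 37)
print(lookup("os.txt"))   # 12
print(lookup("ghost"))    # None
```

A lookup now inspects only one chain instead of the whole directory, which is the speed-up the text describes; the chains are the added administrative burden.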
File sharing
➢ In a multiuser system, there is a frequent need to share files among users, which
demands that common files be presented within the directories of
multiple users at a time. Two important issues need to be addressed in
connection with file sharing: access rights and simultaneous access
to files. One thing to note is the difference between a shared file (or directory)
and two copies of the file. With two copies, each user accesses their own
copy, not the original, and if one user updates the file, the changes will not
appear in the other’s copy. With a shared file, only one actual file
exists, making any change made by one person immediately visible to the
other. Sharing is particularly important for subdirectories, as a file newly
created by one user automatically appears in all the shared subdirectories.
Several implementation possibilities exist for shared files.

➢ A common way, exemplified by many UNIX systems, is to create a new
directory entry called a link, which is effectively a pointer to another file or
subdirectory. To share a file or directory, the system creates a new
file of type link containing the path name of the file it is linked to. Links are
easily identified by their format in the directory entry (or by having a special
type on systems that support file types) and are effectively indirect pointers,
which the operating system resolves when traversing directory trees, and are
thus referred to as symbolic links.

➢ An advantage of the symbolic-link implementation is that links can refer to files
on other machines, by specifying the network address of the machine holding
the shared file as part of the link. One problem with this implementation is that
the same file can end up with two or more path names. Another problem is the
extra overhead incurred in parsing the path component by component, which
requires extra disk accesses.
➢ Another shared-file implementation approach is the simple duplication of
all information about the shared files in both directories, resulting in identical
and

indistinguishable entries, which differs from the link implementation
discussed earlier. Consistency maintenance is a major problem with this
implementation: when a file is modified, the change is visible only
to the user who made it, which defeats the sharing concept.
Disk space management
➢ In order to maintain files on disks, space must be allocated, and the system
needs to keep track of free space available for allocation. An n-byte file can be
stored on disk in either of two ways: allocating n successive bytes of disk
space, or breaking the file into several fixed-size, non-contiguous blocks.
➢ As contiguous allocation has the limitations we saw earlier, most
systems prefer the non-contiguous fixed-size block partitioning of files. The
question, however, is the size of the block: making it too large wastes
disk space, while making it too small forces files to occupy many blocks.
In general, a block can be allocated in sector, track, or cylinder sizes. Having
blocks of cylinder size for every file gives poor resource utilization, as small
files do not consume all the space.
➢ However, access time, which depends directly on the seek time and
rotational delay of the read/write head, is better with larger block sizes, as the
data rate is directly proportional to block size. If a small block size is chosen,
which forces a file to be sliced up into several blocks, space utilization within a
block is increased, but access time becomes longer.
➢ Overall, larger blocks tend to give poorer space utilization but better
performance, while smaller blocks give better space utilization but poorer
performance. Keeping track of free blocks is the other concern in disk space
management once the block size has been chosen. Since disk space is limited,
we need to reuse the space from deleted files for new files whenever possible.
A free-space list is maintained by the system to keep track of free disk space;
it records all free disk blocks, those not allocated to any file or directory.
➢ During file creation, the free-space list is searched for the required amount of
space, the blocks found are allocated to the new file and removed from the
list; when a file is removed, its disk space is added back to the free-space list.
This free-block management is carried out using a bitmap or a linked list of
blocks.
➢ Bitmap: an n-bit bitmap is defined for an n-block disk, where allocated blocks
are indicated by a bit value of 0 and free blocks by a bit value of 1. A
bitmap requires little space, as each block is represented by a single bit. This
method is relatively simple and is efficient at finding free blocks, since bit-
manipulation instructions can be used effectively for the purpose. One
technique for finding the first free block on a system that uses a bit vector to
allocate disk space is to sequentially check each word in the bitmap to see
whether its value is 0. The first non-0 word is scanned for its first 1 bit,
which gives the location of the first free block.
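The word-at-a-time scan described above can be sketched as follows, using the convention stated here that 1 marks a free block (the byte values are invented):

```python
def first_free_block(bitmap):
    """Scan a free-space bitmap where 1 = free and 0 = allocated,
    packed 8 blocks per byte, highest-order bit = lowest block number."""
    for byte_index, byte in enumerate(bitmap):
        if byte != 0:                      # skip all-allocated bytes fast
            for bit in range(8):
                if byte & (0x80 >> bit):   # first 1 bit = first free block
                    return byte_index * 8 + bit
    return None                            # disk is full

# Blocks 0..15: first byte all allocated, block 10 is the first free one.
bitmap = bytes([0b00000000, 0b00100000])
print(first_free_block(bitmap))   # 10
```

The `byte != 0` test is the "check each word for 0" step: whole runs of allocated blocks are skipped with one comparison before any bit is examined.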
➢ Linked list: all free disk blocks are linked together, with a pointer to the
first free block kept in a special location on the disk and cached in memory.


During file creation, the blocks needed are taken from this chain of pointer
blocks, which are read from disk as required. Deleting a file adds the
freed space to the block of pointers in memory, which is later written to disk.
This method can be inefficient due to unnecessary disk I/O:
traversing the list requires reading each block, which takes substantial I/O
time, though this is done infrequently, as the first free block is simply allocated
to a file that needs space. The FAT method’s allocation data structure includes
this free-block accounting, which avoids the need for a separate free-
block management method.
Disk quotas
➢ In multiuser operating systems, there is a need to limit the maximum
allotment of files and blocks for each user, to prevent any user from
monopolizing disk space; the system ensures that no user exceeds the limit
set. Among the attributes of a file is an owner entry specifying who owns the
file; this is consulted when the file is opened so that the owner can be
charged for any increase in the file's size. The system also maintains a table
holding the quota record of every user with a currently open file; these
records are written back to the quota file once the open files are closed.
Each entry points to the user's record in the quota file, which is checked
against the user's limits every time a block is added to an open file,
resulting in an error if a limit is exceeded.
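The per-block quota check can be sketched as below. This is a hypothetical simplification: real systems track both soft and hard limits per user in an on-disk quota file, while here a single in-memory table with one hard limit per user (class and method names are invented for the example) illustrates the charge-on-allocation idea.

```python
class QuotaError(Exception):
    """Raised when a block allocation would exceed the owner's quota."""

class QuotaTable:
    """Hypothetical per-user quota check run on every block allocation."""

    def __init__(self, limits):
        self.limits = dict(limits)             # user -> max blocks allowed
        self.usage = {u: 0 for u in limits}    # user -> blocks in use

    def charge_block(self, owner):
        """Charge the file's owner for one new block; fail if over limit."""
        if self.usage[owner] + 1 > self.limits[owner]:
            raise QuotaError(f"user {owner!r} exceeded disk quota")
        self.usage[owner] += 1

q = QuotaTable({"alice": 2})
q.charge_block("alice")
q.charge_block("alice")
try:
    q.charge_block("alice")   # third block exceeds the 2-block limit
except QuotaError as e:
    print(e)                  # user 'alice' exceeded disk quota
```

Note that the check is keyed on the file's owner, not on whoever opened the file, matching the owner-entry behavior described above.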
Conclusion
➢ In this activity, we discussed issues related to directory implementation.
File naming conventions, file search efficiency, file sharing among users and
applications, disk space management schemes, and disk quotas were explored,
all of which concern system designers aiming for a well-structured file
arrangement.


Assessment:
Instructions: Answer the following. Write your answers in the space provided.
1. What are the file sharing schemes used by a file management system?

2. What advantages and disadvantages do you observe on disk quota
specification?

3. What is free-block space management? Why is it necessary for the system to
identify free blocks? What schemes are used by the system to manage these
free blocks?


4.7 File system reliability


Introduction
➢ If your computer is damaged in an accident, the damaged component can
usually be replaced at modest cost. However, if a file on the computer is
damaged or lost, recovering from the loss is expensive, and often impossible.
File damage can arise from physical or logical causes, and we need an
efficient file system that can protect information from logical damage. In
this activity, issues involved in protecting the file system and its
reliability will be discussed.
Backups
➢ Taking backups is an important task that is overlooked by most users.
Backing up is the process of saving files on external devices, usually tapes,
so that the system can use them to recover from a disaster when one occurs.
➢ Backing up takes a long time and a large amount of storage space, so it must
be done efficiently. Consideration needs to be given to whether the entire
file system or only part of it should be backed up.
➢ Usually, it is desirable to back up only directories and files that cannot
be found elsewhere. For instance, program files need not be backed up, as they
can be reinstalled from the manufacturer-provided media. Similarly, temporary
files should not be included in the backup. Moreover, backing up files and
directories unchanged since the last backup is also a waste of space and time.
➢ An incremental dump takes a full backup of the whole file system
periodically and a daily backup of only the components changed since the last
backup; even more effective is to back up only those files changed since they
were last dumped. Incremental dumping makes recovery more complex, though it
greatly reduces the time and space needed for each backup.
➢ When a system recovers using these dumps, the most recent full backup is
restored first, followed by the incremental backups in reverse order. If the
backup is too large, compression algorithms can be applied before saving it
to tape; the data is then decompressed during recovery. If files and
directories are active while backups are taken, inconsistent results may be
obtained, so taking the system offline during backup, or using algorithms
that take rapid snapshots of the file system, can avoid such situations. Two
dumping mechanisms are used to back up files. A physical dump writes all disk
blocks, starting at block 0 and continuing to the end of the disk. It does
not skip empty disk blocks, and dumping bad blocks may result in endless disk
read errors during the backup process. This approach is known for its
simplicity and great speed, while its disadvantages are that it cannot back
up selected items, dump incrementally, or restore individual files on request.
➢ A logical dump is the most commonly used backup mechanism. It starts from
selected files and directories and recursively backs up all files that have
changed since the last backup or system installation to a


tape, making recovery of a selected file or directory simpler. During
recovery, an empty file system is created, onto which the most recent full
dump is restored; the system uses this dump to rebuild the directories and
files in it. Then, if incremental dumps were performed after the full one,
the system restores them by performing the same task as with the full dump
recovery.

➢ A logical dump has some critical issues we need to be aware of. First, it
does not save the free-block list, so the system must reconstruct that list
after a restore is performed. Second, for a file with multiple links, the
system should make sure the file is restored only once. Third, for files with
holes in them, the holes should be neither dumped nor restored, so the system
should carefully inspect such files when restoring them from the dump.
File system consistency
➢ A file system constantly deals with blocks: reading, modifying, and writing
them. Inconsistency may arise if the file system updates a block but the
system is interrupted before the change is written out. The inconsistency
becomes even worse when the unwritten block is a free-list, i-node, or
directory block. To overcome this problem, most systems provide a utility
program that checks file system consistency and is run after a system boot,
such as scandisk for Windows and fsck for UNIX. The consistency check can be
performed on blocks or on files. For block consistency checking, the utility
program builds two tables, each containing a counter for every block,
initialized to 0. The counters in the first table count how many times each
block appears in a file, while the counters in the second table record how
many times each block appears in the free list. The program constructs the
list of all block numbers used by each file from its i-node and increments
the first table's counter for every block it reads. The program also scans
the bitmap for blocks not in use and increments the second table's counter
for every block found there. Block consistency is then verified by comparing
the counters in the two tables.
➢ If each block has a value of 1 in either the first or the second table, the
blocks are consistent; a value of 0 in both tables for some block number
indicates a missing block, which is an inconsistency.
➢ The utility program then fixes the inconsistency and informs the user. File
consistency checking is done in a similar fashion on the directory system.
The inspection starts from the root directory and recursively descends the
tree. For every file in each directory, the file's usage counter is
incremented; these counts are then checked against a list, indexed by i-node
number, giving the number of directories in which each file appears. When a
file is created its count starts at 1 and is incremented every time the file
is linked. If the counter values match, the file is consistent; if they do
not match, the program should take measures to correct the values.
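The two-table block check described above can be sketched as follows. This is a simplified, hypothetical model of what fsck-style tools do: files are given as lists of block numbers rather than read from real i-nodes, and the function name is invented for the example.

```python
def check_block_consistency(n_blocks, files, free_list):
    """fsck-style block consistency check (simplified sketch).

    files: dict mapping filename -> list of block numbers it uses.
    free_list: iterable of block numbers marked free.
    A block is consistent iff its two counters sum to exactly 1.
    Returns (missing_blocks, duplicated_blocks).
    """
    in_use = [0] * n_blocks    # table 1: occurrences in files
    in_free = [0] * n_blocks   # table 2: occurrences in the free list
    for blocks in files.values():
        for b in blocks:
            in_use[b] += 1
    for b in free_list:
        in_free[b] += 1
    missing = [b for b in range(n_blocks)
               if in_use[b] + in_free[b] == 0]     # 0 in both tables
    duplicated = [b for b in range(n_blocks)
                  if in_use[b] + in_free[b] > 1]   # counted more than once
    return missing, duplicated

# Block 3 appears nowhere (missing); block 5 is both in a file and free.
missing, dup = check_block_consistency(
    6, {"a": [0, 1, 5]}, free_list=[2, 4, 5])
print(missing, dup)  # [3] [5]
```

A real checker would go on to repair these cases, for example by adding a missing block to the free list.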


File system performance

➢ The time taken to access a disk is far longer than the time taken to access
main memory, but disk access time can be improved in several ways. As a
result, several file systems come with optimization techniques that enhance
the performance of disk access.
➢ Caching is one method through which disk access time is optimized. A cache
is a collection of storage blocks that logically belong to the disk but are
kept in memory to enhance performance. Some systems maintain a separate
section of main memory for a cache, where blocks are kept under the
assumption that they will be used again shortly. Other systems cache file
data using a page cache. The page cache uses virtual-memory techniques to
cache file data as pages rather than as file-system-oriented blocks, which is
more efficient, as accesses go through the virtual memory interface rather
than the file system. All read requests first check the cache for the
presence of the requested block.
➢ If the block is found, the request is satisfied without any disk
communication. If the block is not in the cache, it is first fetched from the
disk into the cache and the request is then answered. Because the cache holds
a large number of blocks, the system needs a fast way to locate a given
block; one way is to hash the device and disk address and look the block up
in a hash table. If a block must be brought in but the cache is already full,
some block must be evicted, and rewritten to disk if it was modified, to make
room for the newcomer. This situation is similar to paging, so the algorithms
discussed for page replacement can be used for block replacement as well.
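The hash-lookup-plus-replacement idea can be sketched in a few lines. This is an illustrative model, not a real buffer cache: Python's OrderedDict conveniently serves as both the hash table (keyed by device and block number) and the LRU list, LRU being one of the page-replacement algorithms the text alludes to; all names are assumptions for the example.

```python
from collections import OrderedDict

def make_cache(capacity):
    """Buffer-cache sketch: hashed lookup plus LRU replacement."""
    cache = OrderedDict()   # (device, block) -> data; order tracks recency

    def read_block(device, block, read_from_disk):
        key = (device, block)
        if key in cache:                      # cache hit: no disk I/O
            cache.move_to_end(key)            # mark as most recently used
            return cache[key]
        data = read_from_disk(device, block)  # miss: fetch from disk
        if len(cache) >= capacity:            # full: evict the LRU block
            cache.popitem(last=False)
        cache[key] = data
        return data

    return read_block

reads = []
def fake_disk(dev, blk):
    reads.append(blk)          # record every actual "disk" access
    return f"data-{blk}"

read = make_cache(capacity=2)
read(0, 1, fake_disk)
read(0, 2, fake_disk)
read(0, 1, fake_disk)          # hit: no disk access
read(0, 3, fake_disk)          # evicts block 2 (least recently used)
print(reads)  # [1, 2, 3] -- block 1's second read came from the cache
```

A real buffer cache would also track dirty blocks and write them back before eviction, which this sketch omits.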
➢ Block read-ahead is another technique used by file systems to optimize
performance. This method brings blocks into the cache before they are
requested, so that the hit rate is increased. With this method, when a block
is requested, several subsequent blocks that are likely to be needed soon are
read and cached along with it. Retrieving these blocks from the disk in one
transfer and caching them saves a considerable amount of time. Read-ahead
works well for sequentially accessed files: while reading block a, the file
system checks whether block a+1 is in the cache and, if not, schedules a read
of block a+1 into the cache.
➢ However, if a file is accessed randomly, read-ahead is a disadvantage: it
ties up disk bandwidth reading unwanted blocks into the cache and may evict
blocks that are still needed.
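The sequential read-ahead policy can be sketched as below. This hypothetical helper fetches block a+1 inline while serving block a; a real file system would instead queue the read-ahead asynchronously so the application does not wait for it (function and variable names are invented for the illustration).

```python
def read_with_readahead(block_a, cache, read_from_disk, window=1):
    """Serve block a and prefetch the next `window` blocks (sketch only)."""
    for b in range(block_a, block_a + 1 + window):
        if b not in cache:            # prefetch b if it is not cached yet
            cache[b] = read_from_disk(b)
    return cache[block_a]

fetched = []
def disk(b):
    fetched.append(b)                 # record each actual "disk" read
    return f"data-{b}"

cache = {}
read_with_readahead(0, cache, disk)   # fetches blocks 0 and 1 together
read_with_readahead(1, cache, disk)   # block 1 already cached; fetches 2
print(fetched)  # [0, 1, 2]
```

For a sequential scan, every request after the first is a cache hit; for random access, the prefetched block is usually wasted, which is exactly the drawback noted above.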
➢ Arranging blocks which more likely are accessed in sequence together also
minimizes the amount of disk arm motion which in turn enhances disk access
performance.
Conclusion
➢ A file has a different organization when seen from the system's perspective.
File system designers are thus required to focus on internal structures such
as storage allocation, disk space management, file sharing, reliability, and
performance.


Three methods are used to allocate disk space to files, among which the
i-node (indexed) approach is often considered the best. Directory-management
routines must consider efficiency, performance, and reliability. A hash table
is a commonly used method, as it is fast and efficient. Unfortunately, damage
to the table or a system crash can result in inconsistency between the
directory information and the disk's contents. A consistency checker can be
used to repair the damage. Operating-system backup tools allow disk data to
be copied to tape, enabling the user to recover from data loss, or even disk
loss, due to hardware failure, operating system bugs, or user error.

Assessment:
Instructions: Answer the following. Write your answers in the space
provided.
1. What are the mechanisms used by a file system to check for consistency?

2. What is backup? Discuss the methods of backup.

3. What are the main issues that need to be addressed in disk space
management in relation to file systems?


Self-Evaluation

In your own words, define the following terms.

1. Input

2. Output

3. File

4. Directory

Elaborate the following:

5. How can disk access time be enhanced?

6. What are the three space allocation methods used in file systems?

7. What is the difference between a file and a block?


Review of Concepts:
A file is a named collection of related information that is treated
as a single entity, defined by its creator and kept on secondary
storage devices. File names can be of fixed or variable length and
are used to interact with the file. A file can be structured as
bytes, records, or trees. Several user programs or system calls
can be used to interact with files. Each device in a file system
keeps a volume table of contents, or device directory, listing the
locations of the files on the device. In addition, it is useful to
create directories to allow files to be organized.
Three directory organization methods are available: single-level,
two-level, and tree-structured organization. A tree-structured
directory allows a user to create subdirectories to organize
files. The file system resides permanently on secondary storage
devices, mostly on disks, which are designed to hold a large
amount of data permanently. Physical disks may be segmented into
partitions to control media use and to allow multiple, possibly
varying, file systems on a single spindle. These file systems are
mounted onto a layered logical file system architecture to make
use of them. The lower levels deal with the physical properties of
storage devices, while the upper levels deal with symbolic file
names and logical properties of files. The intermediate levels map
the logical file concepts onto physical device properties. Disk
space is allocated to files through three different mechanisms.
Directory-management routines must consider efficiency,
performance, and reliability of file systems.

References: William Korir, Introduction to Operating Systems, 2017. Published
under http://en.wikipedia.org/wiki/Creative_Commons
Required readings and other resources: Andrew S. Tanenbaum (2008). Modern
Operating Systems. 3rd Edition. Prentice-Hall; A. Silberschatz, Peter B.
Galvin, G. Gagne (2009). Operating System Concepts. 8th Edition. Wiley.
