
Jimma University

Jimma Institute of Technology


Faculty of Computing and Informatics

Operating System course Module

Jimma, Ethiopia
March, 2023
Chapter One
1. Overview and History of Operating Systems
1.1. Introduction to Operating System
In a computer system we find three main components: the computer hardware, the computer software, and
the users. In a computer system the hardware provides the basic computing resources, whereas the
computer software can be divided into two main categories: application software and system software.
Application software consists of the programs for performing tasks particular to the machine’s
utilization. This software is designed to solve a particular problem for users. Examples of application
software include spreadsheets, database systems, desktop publishing systems, program development
software, and games. On the other hand, system software is more transparent and less noticed by the
typical computer user. This software provides a general programming environment in which
programmers can create specific applications to suit their needs. This environment provides new
functions that are not available at the hardware level and performs tasks related to executing the
application program. System software acts as an interface between the hardware of the computer and
the application software that users need to run on the computer. The most important type of system
software is the operating system.

An Operating System (OS) is a collection of programs that acts as an interface between a user of a
computer and the computer hardware. The purpose of an operating system is to provide an environment
in which a user may execute the programs. Operating Systems are viewed as resource managers. The
main resource is the computer hardware in the form of processors, storage, input/output devices,
communication devices, and data. Some of the operating system functions are: implementing the user
interface, sharing hardware among users, allowing users to share data among themselves, preventing
users from interfering with one another, scheduling resources among users, facilitating input/output,
recovering from errors, accounting for resource usage, facilitating parallel operations, organizing data for
secure and rapid access, and handling network communications.

A computer’s operating system is a group of programs designed to serve two basic purposes:
 To control the allocation and use of the computing system’s resources among the various users
and tasks, and
 To provide an interface between the computer hardware and the programmer that simplifies and
makes feasible the creation, coding, debugging, and maintenance of application programs.
An effective operating system should accomplish the following functions:
 Should act as a command interpreter by providing a user-friendly environment
 Should facilitate communication with other users.

 Facilitate the directory/file creation along with the security option.

 Provide routines that handle the intricate details of I/O programming.


 Provide a loader program to move the compiled program code to the computer’s memory for
execution.
 Provide access to compilers to translate programs from high-level languages to machine
language.
 Take care of storage and device allocation.

 Provide for long term storage of user information in the form of files.

 Permit system resources to be shared among users when appropriate, and be protected from
unauthorized or mischievous intervention as necessary.

 Assure that when there are several active processes in the computer, each will get fair and non-
interfering access to the central processing unit for execution.

Though system programs such as editors and translators and the various utility programs (such as sort and
file transfer programs) are not usually considered part of the operating system, the operating system is
responsible for providing access to these system resources.
The abstract view of the components of a computer system and the positioning of the OS is shown in Figure 1.

1.2. Goals of an operating System


The primary objective of a computer system is to execute instructions efficiently and to increase the
productivity of the processing resources attached to the computer system, such as hardware resources,
software resources, and the users. In other words, we can say that maximum CPU utilization is the main
objective, because the CPU is the main device used for the execution of programs or instructions. We can
summarize the goals as:
 The primary goal of an operating system is to make the computer convenient to use.
 The secondary goal is to use the hardware in an efficient manner.

1.3. History of Operating Systems

Operating systems have been evolving over the years. We will briefly look at this development of the
operating systems with respect to the evolution of the hardware / architecture of the computer systems
in this section. Since operating systems have historically been closely tied with the architecture of the
computers on which they run, we will look at successive generations of computers to see what their
operating systems were like. We may not be able to map operating system generations exactly to the
generations of the computer, but the correspondence roughly conveys the idea.
These generations are characterized by hardware component technology, software development, and the
mode of delivery of computer services. The evolution of operating systems through the years can thus be
mapped onto four generations of operating systems, which can be described as follows:
1.3.1. The First Generation (1945 - 1955): Vacuum Tubes and Plugboards
Digital computers were not constructed until the Second World War. Calculating engines with mechanical
relays were built at that time. However, the mechanical relays were very slow and were later replaced with
vacuum tubes. These machines were enormous but were still very slow.
These early computers were designed, built and maintained by a single group of people. Programming
languages were unknown and there were no operating systems so all the programming was done in machine
language. All the problems were simple numerical calculations.
By the 1950’s punch cards were introduced and this improved the computer system. Instead of using plug
boards, programs were written on cards and read into the system.
1.3.2. The Second Generation (1955 - 1965): Transistors and Batch Systems
Transistors led to the development of the computer systems that could be manufactured and sold to paying
customers. These machines were known as mainframes and were locked in air-conditioned computer rooms
with staff to operate them.
The Batch System was introduced to reduce the wasted time in the computer. A tray full of jobs was collected
in the input room and read into the magnetic tape. After that, the tape was rewound and mounted on a tape
drive. Then the batch operating system was loaded; it read the first job from the tape and ran it. The
output was written on the second tape. After the whole batch was done, the input and output tapes were
removed and the output tape was printed.
1.3.3. The Third Generation (1965 - 1980): ICs and Multiprogramming
Until the 1960s, there were two types of computer systems: the scientific and the commercial computers.
These were combined by IBM in the System/360. This used integrated circuits and provided a major price
and performance advantage over the second generation systems.
The third generation operating systems also introduced multiprogramming. This meant that the processor
was not idle while a job was completing its I/O operation. Another job was scheduled on the processor so
that its time would not be wasted.
1.3.4. The Fourth Generation (1980 - Present): Personal Computers
Personal Computers were easy to create with the development of large-scale integrated circuits. These were
chips containing thousands of transistors on a square centimeter of silicon. Because of these, microcomputers
were much cheaper than minicomputers and that made it possible for a single individual to own one of them.
The advent of personal computers also led to the growth of networks. This created network operating systems
and distributed operating systems. The users were aware of a network while using a network operating
system and could log in to remote machines and copy files from one machine to another.

1.4. Types of Operating System


In this section we will discuss the different types of operating systems. (The types of operating systems are
also covered in Unit 2 of Block 1 of MCS-022, Operating System Concepts and Networking Management.)
Modern computer operating systems may be classified into three groups, which are distinguished by the
nature of interaction that takes place between the computer user and his or her program during its processing.
The three groups are called batch, time-sharing, and real-time operating systems.
1.4.1. Batch Processing Operating System
In a batch processing operating system environment users submit jobs to a central place where these jobs are
collected into a batch, and subsequently placed on an input queue at the computer where they will be run. In
this case, the user has no interaction with the job during its processing, and the computer’s response time is
the turnaround time, the time from submission of the job until execution is complete, and the results are ready
for return to the person who submitted the job.
1.4.2. Time Sharing
Another mode for delivering computing services is provided by time sharing operating systems. In this
environment a computer provides computing services to several or many users concurrently on-line. Here,
the various users are sharing the central processor, the memory, and other resources of the computer system
in a manner facilitated, controlled, and monitored by the operating system. The user, in this environment,
has nearly full interaction with the program during its execution, and the computer’s response time may be
expected to be no more than a few seconds.
1.4.3. Real Time Operating System (RTOS)
The third class is the real time operating systems, which are designed to service those applications where
response time is of the essence in order to prevent error, misrepresentation or even disaster. Examples of real
time operating systems are those which handle airline reservations, machine tool control, and monitoring of
a nuclear power station. The systems, in this case, are designed to be interrupted by external signals that
require the immediate attention of the computer system.
These real time operating systems are used to control machinery, scientific instruments and industrial
systems. An RTOS typically has very little user-interface capability, and no end-user utilities. A very
important part of an RTOS is managing the resources of the computer so that a particular operation
executes in precisely the same amount of time every time it occurs. In a complex machine, having a part
move more quickly just because system resources are available may be just as catastrophic as having it
not move at all because the system is busy.

A number of other definitions are important to gain an understanding of operating systems:

1.4.4. Multiprogramming Operating System


A multiprogramming operating system is a system that allows more than one active user program (or
part of user program) to be stored in main memory simultaneously. Thus, it is evident that a time-
sharing system is a multiprogramming system, but note that a multiprogramming system is not
necessarily a time-sharing system. A batch or real time operating system could, and indeed usually
does, have more than one active user program simultaneously in main storage. Another important, and
all too similar, term is “multiprocessing”.

1.4.5. Multiprocessing System


A multiprocessing system is a computer hardware configuration that includes more than one
independent processing unit. The term multiprocessing is generally used to refer to large computer
hardware complexes found in major scientific or commercial applications. More on multiprocessor
system can be studied in Unit-1 of Block-3 of this course.

1.4.6. Networking Operating System

A networked computing system is a collection of physically interconnected computers. The operating


system of each of the interconnected computers must contain, in addition to its own stand-alone
functionality, provisions for handling communication and the transfer of programs and data among the other
computers with which it is connected.

Network operating systems are not fundamentally different from single processor operating systems.
They obviously need a network interface controller and some low-level software to drive it, as well as
programs to achieve remote login and remote file access, but these additions do not change the essential
structure of the operating systems.

1.4.7. Distributed Operating System

A distributed computing system consists of a number of computers that are connected and managed so
that they automatically share the job processing load among the constituent computers, or distribute the
job load, as appropriate, to particularly configured processors. Such a system requires an operating system
which, in addition to the typical stand-alone functionality, provides coordination of the operations and
information flow among the component computers. The networked and distributed computing
environments and their respective operating systems are designed with more complex functional
capabilities. In a network operating system, the users are aware of the existence of multiple computers,
and can log in to remote machines and copy files from one machine to another. Each machine runs its
own local operating system and has its own user (or users).

A distributed operating system, in contrast, is one that appears to its users as a traditional uni-processor
system, even though it is actually composed of multiple processors. In a true distributed system, users
should not be aware of where their programs are being run or where their files are located; that should all
be handled automatically and efficiently by the operating system.
True distributed operating systems require more than just adding a little code to a uni-processor operating
system, because distributed and centralized systems differ in critical ways. Distributed systems, for
example, often allow programs to run on several processors at the same time, thus requiring more complex
processor scheduling algorithms in order to optimize the amount of parallelism achieved.
1.4.8. Operating Systems for Embedded Devices
As embedded systems (PDAs, cellphones, point-of-sale devices, VCRs, industrial robot controllers, or even
your toaster) become more complex hardware-wise with every generation, and more features are put into
them day by day, the applications they run increasingly require actual operating system code in
order to keep the development time reasonable. Some of the popular embedded OSs are:
Nexus’s Conix - an embedded operating system for ARM processors.
Microsoft’s Windows CE and Windows NT Embedded OS.
Palm Computing’s Palm OS - currently the leading OS for PDAs, has many applications and
supporting companies.
Sun’s Java OS - a standalone virtual machine not running on top of any other OS; mainly
targeted at embedded systems
1.5. Functions of Operating System
1.5.1. Process Management
The CPU executes a large number of programs. While its main concern is the execution of user
programs, the CPU is also needed for other system activities. These activities are called processes. A
process is a program in execution. Typically, a batch job is a process. A time-shared user program is a
process. A system task, such as spooling, is also a process. For now, a process may be considered as a
job or a time- shared program, but the concept is actually more general.

The operating system is responsible for the following activities in connection with process
management:

The creation and deletion of both user and system processes

The suspension and resumption of processes.

The provision of mechanisms for process synchronization

The provision of mechanisms for deadlock handling.


1.5.2. Memory Management
Memory is one of the most critical resources in the computer system. Memory is a large array of words or bytes,
each with its own address. Interaction is achieved through a sequence of reads or writes to specific
memory addresses. The CPU fetches from and stores into memory.

There are various algorithms that depend on the particular situation to manage the memory. Selection of a
memory management scheme for a specific system depends upon many factors, but especially upon the
hardware design of the system. Each algorithm requires its own hardware support. The operating
system is responsible for the following activities in connection with memory management.

Decide which processes are to be loaded into memory when memory space becomes available.

Keep track of which parts of memory are currently being used and by whom.

Allocate and deallocate memory space as needed.


1.5.3. Secondary Storage Management
The main purpose of a computer system is to execute programs. These programs, together with the data
they access, must be in main memory during execution. Since the main memory is too small to
permanently accommodate all data and programs, the computer system must provide secondary storage to
back up main memory. Most modern computer systems use disks as the primary on-line storage of
information, of both programs and data. Most programs, like compilers, assemblers, sort routines, editors,
formatters, and so on, are stored on the disk until loaded into memory, and then use the disk as both the
source and destination of their processing. Hence the proper management of disk storage is of central
importance to a computer system.

There are few alternatives. Magnetic tape systems are generally too slow. In addition, they are limited to
sequential access. Thus tapes are more suited for storing infrequently used files, where speed is not a
primary concern.

The operating system is responsible for the following activities in connection with disk management:

Free space management


Storage allocation
Disk scheduling.
1.5.4. I/O Management
One of the purposes of an operating system is to hide the peculiarities or specific hardware devices from
the user. For example, in UNIX, the peculiarities of I/O devices are hidden from the bulk of the
operating system itself by the I/O system. The operating system is responsible for the following activities
in connection to I/O management:

A buffer caching system


To activate a general device driver code
To run the driver software for specific hardware devices as and when required

1.5.5. File Management


File management is one of the most visible services of an operating system. Computers can store
information in several different physical forms: magnetic tape, disk, and drum are the most common
forms. Each of these devices has its own characteristics and physical organization.
For convenient use of the computer system, the operating system provides a uniform logical view of
information storage. The operating system abstracts from the physical properties of its storage devices to
define a logical storage unit, the file. Files are mapped, by the operating system, onto physical devices.
A file is a collection of related information defined by its creator. Commonly, files represent programs
(both source and object forms) and data. Data files may be numeric, alphabetic or alphanumeric. Files
may be free-form, such as text files, or may be rigidly formatted. In general, a file is a sequence of bits,
bytes, lines or records whose meaning is defined by its creator and user. It is a very general concept.
The operating system implements the abstract concept of the file by managing mass storage devices, such
as tapes and disks. Also, files are normally organized into directories to ease their use. Finally, when
multiple users have access to files, it may be desirable to control by whom and in what ways files may
be accessed.
The operating system is responsible for the following activities in connection to the file management:
The creation and deletion of files.
The creation and deletion of directories.
The support of primitives for manipulating files and directories.
The mapping of files onto disk storage.
Backup of files on stable (nonvolatile) storage.
Protection and security of the files.

1.5.6. Protection
The various processes in an operating system must be protected from each other’s activities. For that
purpose, various mechanisms are used to ensure that the files, memory segments, CPU and other
resources can be operated on only by those processes that have gained proper authorization from the
operating system. For example, memory addressing hardware ensures that a process can only execute
within its own address space. The timer ensures that no process can gain control of the CPU without
relinquishing it. Finally, no process is allowed to do its own I/O, to protect the integrity of the various
peripheral devices. Protection refers to a mechanism for controlling the access of programs, processes,
or users to the resources defined by a computer system; it requires the controls to be specified, together
with some means of enforcement. Protection can improve reliability by detecting latent errors at the interfaces between
component subsystems. Early detection of interface errors can often prevent contamination of a healthy
subsystem by a subsystem that is malfunctioning. An unprotected resource cannot defend against use (or
misuse) by an unauthorized or incompetent user.
1.5.7. Networking

A distributed system is a collection of processors that do not share memory or a clock. Instead, each processor
has its own local memory, and the processors communicate with each other through various communication
lines, such as high speed buses or telephone lines. Distributed systems vary in size and function. They may
involve microprocessors, workstations, minicomputers, and large general purpose computer systems.

The processors in the system are connected through a communication network, which can be configured in
a number of different ways. The network may be fully or partially connected. The communication network
design must consider routing and connection strategies and the problems of connection and security. A
distributed system provides the user with access to the various resources the system maintains. Access to a
shared resource allows computation speed-up, data availability, and reliability.

1.5.8. Command Interpretation


One of the most important components of an operating system is its command interpreter. The command
interpreter is the primary interface between the user and the rest of the system.
Many commands are given to the operating system by control statements. When a new job is started in a
batch system or when a user logs in to a time-shared system, a program which reads and interprets control
statements is automatically executed. This program is variously called (1) the control card interpreter, (2) the
command line interpreter, (3) the shell (in Unix), and so on. Its function is quite simple: get the next
command statement, and execute it.

The command statements themselves deal with process management, I/O handling, secondary storage
management, main memory management, file system access, protection, and networking.
Figure 2 depicts the role of the operating system in coordinating all of these functions.

Figure 2: The operating system coordinating its major functions: process management, memory management,
secondary storage management, I/O management, file management, protection and security, communication
management, networking, and the user interface.
Chapter Two

Processes and Process Management

2. Processes and Threads


We are now about to embark on a detailed study of how operating systems are designed and
constructed. The most central concept in any operating system is the process: an abstraction
of a running program. Processes are one of the oldest and most important abstractions that
operating systems provide. They support the ability to have (pseudo) concurrent operation
even when there is only one CPU available. They turn a single CPU into multiple virtual
CPUs. Without the process abstraction, modern computing could not exist.

2.1. Process
Modern computers often do several things at the same time. People used to working
with personal computers may not be fully aware of this fact, so a few examples may make
the point clearer. First consider a Web server. Requests come in from all over asking for
Web pages. When a request comes in, the server checks to see if the page needed is in the
cache. If it is, it is sent back; if it is not, a disk request is started to fetch it. However, from
the CPU's perspective, disk requests take eternity. While waiting for the disk request to
complete, many more requests may come in. If there are multiple disks present, some or
all of the newer requests may be fired off to other disks long before the first request is satisfied. Clearly
some way is needed to model and control this concurrency. Processes (and especially
threads) can help here.
A process is more than the program code, which is sometimes known as the text section. It also includes
the current activity, as represented by the value of the program counter and the contents of the
processor’s registers. A process generally also includes the process stack, which contains temporary data
(such as function parameters, return addresses, and local variables), and a data section, which contains
global variables.

We emphasize that a program by itself is not a process. A program is a passive entity, such as a file
containing a list of instructions stored on disk (often called an executable file). In contrast,
a process is an active entity, with a program counter specifying the next instruction to
execute and a set of associated resources. A program becomes a process when an
executable file is loaded into memory. Two common techniques for loading executable
files are double-clicking an icon representing the executable file and entering the name of the executable
file on the command line (as in prog.exe or a.out).
2.1.1. Process Creation
Operating systems need some way to create processes. In very simple systems, or in
systems designed for running only a single application, it may be possible to have all the
processes that will ever be needed be present when the system comes up. There are four
principal events that cause processes to be created:

System initialization.

Execution of a process-creation system call by a running process.

A user request to create a new process.

Initiation of a batch job.
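As an illustration (not part of the original module), the following minimal C sketch shows the UNIX flavour of process creation: the parent calls fork() to create a child, the child replaces its image with a new program via exec(), and the parent waits for the child to finish. The program run here ("ls") is an arbitrary example.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid = fork();                   /* create a new (child) process */

    if (pid < 0) {                        /* fork failed */
        perror("fork");
        exit(1);
    } else if (pid == 0) {                /* child: load a new program image */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");                 /* reached only if exec fails */
        exit(1);
    } else {                              /* parent: wait for the child to exit */
        wait(NULL);
        printf("Child %d terminated\n", (int)pid);
    }
    return 0;
}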

2.1.2. Process Termination
After a process has been created, it starts running and does whatever its job is. However,
nothing lasts forever, not even processes. Sooner or later the new process will terminate,
usually due to one of the following conditions:

Normal exit (voluntary).

Error exit (voluntary).


Fatal error (involuntary).

Killed by another process (involuntary).


2.1.3. Process Hierarchies
In some systems, when a process creates another process, the parent process and child
process continue to be associated in certain ways. The child process can itself create
more processes, forming a process hierarchy. Note that unlike plants and animals that
use sexual reproduction, a process has only one parent (but zero, one, two, or more
children).
In UNIX, a process and all of its children and further descendants together form a process
group. When a user sends a signal from the keyboard, the signal is delivered to all
members of the process group currently associated with the keyboard (usually all active
processes that were created in the current window). Individually, each process can catch
the signal, ignore the signal, or take the default action, which is to be killed by the signal.
In contrast, Windows has no concept of a process hierarchy. All processes are equal. The
only hint of a process hierarchy is that when a process is created, the parent is given a
special token (called a handle) that it can use to control the child. However, it is free to
pass this token to some other process, thus invalidating the hierarchy. Processes in UNIX
cannot disinherit their children.
2.1.4. Process state

As a process executes, it changes state. The state of a process is defined in part by the
current activity of that process.

A process may be in one of the following states:


New. The process is being created.
Running. Instructions are being executed.
Waiting. The process is waiting for some event to occur (such as an I/O
completion or reception of a signal).

Ready. The process is waiting to be assigned to a processor.


Terminated. The process has finished execution.

These names are arbitrary, and they vary across operating systems. The states that they
represent are found on all systems, however. Certain operating systems also more
finely delineate process states. It is important to realize that only one process can be
running on any processor at any instant. Many processes may be ready and waiting, however. The state
diagram corresponding to these states is presented below in Figure 2.0.

Fig 2.0 Diagram of process state (a new process is admitted into the ready state; the scheduler dispatches a
ready process to running; an interrupt moves a running process back to ready; an I/O or event wait moves it
to waiting; I/O or event completion moves it back to ready; and exit moves a running process to terminated).

2.1.5. Implementation of Processes

Each process is represented in the operating system by a process control block (PCB), also called a task
control block. It contains many pieces of information associated with a specific process, including these:
Process state. The state may be new, ready, running, waiting, halted, and so on.

Program counter. The counter indicates the address of the next instruction to be
executed for this process.

CPU registers. The registers vary in number and type, depending on the
computer architecture. They include accumulators, index registers, stack pointers,
and general-purpose registers, plus any condition-code information. Along with the
program counter, this state information must be saved when an interrupt occurs.
CPU-scheduling information. This information includes a process priority, pointers to
scheduling queues, and any other scheduling parameters.

Memory-management information. This information may include such items as the values of the base and
limit registers and the page tables or segment tables, depending on the memory system used by the
operating system.
Accounting information. This information includes the amount of CPU and real time
used, time limits, account numbers, job or process numbers, and so on.

I/O status information. This information includes the list of I/O devices allocated to
the process, a list of open files, and so on.
In brief, the PCB simply serves as the repository for any information that may vary
from process to process.
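Purely as an illustration (the exact layout differs from kernel to kernel, and every field name below is hypothetical), a much-simplified PCB could be declared in C along these lines:

typedef enum { NEW, READY, RUNNING, WAITING, TERMINATED } proc_state_t;

struct pcb {                          /* one entry per process, kept by the kernel */
    int            pid;               /* process identifier */
    proc_state_t   state;             /* process state */
    unsigned long  program_counter;   /* address of the next instruction */
    unsigned long  registers[16];     /* saved CPU registers */
    int            priority;          /* CPU-scheduling information */
    void          *page_table;        /* memory-management information */
    unsigned long  cpu_time_used;     /* accounting information */
    int            open_files[16];    /* I/O status information: open file descriptors */
};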
2.1.6. Modeling Multiprogramming

When multiprogramming is used, the CPU utilization can be improved. Crudely put, if the
average process computes only 20% of the time it is sitting in memory, with five processes
in memory at once, the CPU should be busy all the time. This model is unrealistically
optimistic, however, since it tacitly assumes that all five processes will never be waiting for I/O at the
same time.

A better model is to look at CPU usage from a probabilistic viewpoint. Suppose that a
process spends a fraction p of its time waiting for I/O to complete. With n processes in
memory at once, the probability that all n processes are waiting for I/O (in which case the CPU is idle)
is p^n. The CPU utilization is therefore given by the formula:

CPU utilization = 1 - p^n
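As a worked illustration (not in the original text): with p = 0.8, i.e. 80% I/O wait, and five processes in memory, utilization is 1 - 0.8^5, or about 67%, well below the 100% suggested by the crude model. The short C sketch below simply tabulates the formula for several degrees of multiprogramming; on most systems it must be linked with the math library (e.g. cc util.c -lm, the file name being arbitrary).

#include <stdio.h>
#include <math.h>

int main(void)
{
    double p = 0.8;                     /* fraction of time each process waits for I/O */

    for (int n = 1; n <= 10; n++)       /* n = degree of multiprogramming */
        printf("n = %2d  CPU utilization = %.0f%%\n",
               n, 100.0 * (1.0 - pow(p, n)));
    return 0;
}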

2.2. Threads

The process model discussed so far has implied that a process is a program that performs a
single thread of execution. For example, when a process is running a word-processor
program, a single thread of instructions is being executed. This single thread of control
allows the process to perform only one task at a time. The user cannot simultaneously type
in characters and run the spell checker within the same process, for example. Most modern operating
systems have extended the process concept to allow a process to have multiple
threads of execution and thus to perform more than one task at a time. This feature is
especially beneficial on multicore systems, where multiple threads can run in parallel. On
a system that supports threads, the PCB is expanded to include information for each thread.
Other changes throughout the system are also needed to support threads.

A thread is a basic unit of CPU utilization; it comprises a thread ID, a program counter, a
register set, and a stack. It shares with other threads belonging to the same process its
code section, data section, and other operating-system resources, such as open files and signals. A
traditional (or heavyweight) process has a single thread of control. If a process has
multiple threads of control, it can perform more than one task at a time. Figure 2.1
illustrates the difference between a traditional single-threaded process and a multithreaded
process.

Figure 2.1 Single-threaded and multithreaded processes: both share code, data, and files, but a
single-threaded process has one set of registers and one stack, while in a multithreaded process each
thread has its own registers and its own stack.

2.2.1. Thread Usage

Why would anyone want to have a kind of process within a process? It turns out there are
several reasons for having these mini processes, called threads. Let us examine some of them. The main
reason for having threads is that in many applications, multiple activities are
going on at once. Some of these may block from time to time. By decomposing such an
application into multiple sequential threads that run in quasi-parallel, the programming model
becomes simpler.
We have seen this argument before. It is precisely the argument for having processes. Instead
of thinking about interrupts, timers, and context switches, we can think about parallel
processes. Only now with threads we add a new element: the ability for the parallel entities
to share an address space and all of its data among themselves. This ability is essential for
certain applications, which is why having multiple processes (with their separate address
spaces) will not work.
A second argument for having threads is that since they are lighter weight than processes,
they are easier (i.e., faster) to create and destroy than processes. In many systems, creating
a thread goes 10-100 times faster than creating a process. When the number of threads needed
changes dynamically and rapidly, this property is useful to have.
Another argument for threads concerns the responsiveness of interactive programs. Consider a word
processor in which the user deletes a sentence from page 1 of an 800-page book and then jumps to page
600; reformatting the entire book from a single thread would leave the program unresponsive for a long
time. Threads can help here. Suppose that the word processor is written as a two-threaded program.
One thread interacts with the user and the other handles reformatting in the background. As
soon as the sentence is deleted from page 1, the interactive thread tells the reformatting thread
to reformat the whole book. Meanwhile, the interactive thread continues to listen to the
keyboard and mouse and responds to simple commands like scrolling page 1 while the other
thread is computing madly in the background. With a little luck, the reformatting will be
completed before the user asks to see page 600, so it can be displayed instantly.
While we are at it, why not add a third thread? Many word processors have a feature of
automatically saving the entire file to disk every few minutes to protect the user against losing
a day's work in the event of a program crash, system crash, or power failure. The third thread
can handle the disk backups without interfering with the other two.
Finally, threads are useful on systems with multiple CPUs, where real parallelism is possible.
2.2.2. Benefits of Multithreading

The benefits of multithreaded programming can be broken down into four major categories:
 Responsiveness. Multithreading an interactive application may allow a program to
continue running even if part of it is blocked or is performing a lengthy operation,
thereby increasing responsiveness to the user. This quality is especially useful in
designing user interfaces. For instance, consider what happens when a user clicks a
button that results in the performance of a time-consuming operation. A single-
threaded application would be unresponsive to the user until the operation had
completed. In contrast, if the time-consuming operation is performed in a separate
thread, the application remains responsive to the user.
 Resource sharing. Processes can only share resources through techniques such as
shared memory and message passing. Such techniques must be explicitly arranged
by the programmer. However, threads share the memory and the resources of the process to
which they belong by default. The benefit of sharing code and data is that it
allows an application to have several different threads of activity within the same
address space.
 Economy. Allocating memory and resources for process creation is costly. Because
threads share the resources of the process to which they belong, it is more
economical to create and context-switch threads. Empirically gauging the difference
in overhead can be difficult, but in general it is significantly more time consuming to
create and manage processes than threads.
 Scalability. The benefits of multithreading can be even greater in a multiprocessor
architecture, where threads may be running in parallel on different processing cores.
A single-threaded process can run on only one processor, regardless of how many are
available. We explore this issue further in the following section.
2.2.3. Multithreading Models

Our discussion so far has treated threads in a generic sense. However, support for threads may
be provided either at the user level, for user threads, or by the kernel, for kernel threads. User
threads are supported above the kernel and are managed without kernel support, whereas
kernel threads are supported and managed directly by the operating system. Virtually all
contemporary
support kernel threads.
Ultimately, a relationship must exist between user threads and kernel threads. In this section,
we look at three common ways of establishing such a relationship: the many-to-one model,
the one-to-one model, and the many-to-many model.
2.2.4. POSIX Threads
The POSIX thread library is a standards-based C/C++ thread API. It enables the creation
of a new concurrent flow of execution. It works well on multi-processor or multi-core systems,
where the process flow may be scheduled to execute on another processor, increasing speed
through parallel or distributed processing. Because the system does not create a new
virtual memory space and environment for the process, threads need less overhead than
“forking”, i.e. creating a new process.

While multiprocessor systems benefit the most, gains can also be obtained on
uniprocessor systems by exploiting delays in I/O and other system activities that may impede
process execution. To utilize the PThread interfaces, we must include the header pthread.h at
the start of the source file.
#include <pthread.h>
PThreads is the standard multithreading interface on UNIX systems. PThreads is an abbreviation for
POSIX threads, and POSIX is an abbreviation for
Portable Operating System Interface, which is a type of interface that the operating system
must implement. PThreads in POSIX outline the threading APIs that the operating system
must provide.
Why is PThreads used?
The fundamental purpose for adopting PThreads is to improve program
performance.
When compared to the expense of starting and administering a process, a thread
requires far less operating system overhead. Thread management takes fewer system
resources than process management.
A process’s threads all share the same address space. Inter-thread communication is
more efficient and, in many circumstances, more user-friendly than inter-process
communication.
Threaded applications provide possible performance increases and practical
advantages over non-threaded programs in a variety of ways.
Multi-threaded programs will run on a single-processor system but will
automatically make use of a multiprocessor machine without the need for
recompilation.
The most significant reason for employing PThreads in a multiprocessor system is to
take advantage of possible parallelism. This will be the major emphasis of the rest of
this lesson.
In order for a program to use PThreads, it must be divided into discrete,
independent tasks that may run concurrently.
When a thread is created with pthread_create, the new thread is made runnable and begins executing
the given start routine with the arg argument as its argument. The arg parameter is a void pointer that
can point to any type of data. Casting this pointer into a scalar data type (such as int) is not
advised, since the casts may not be portable.
Let us have a look at a simple C example using PThreads:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUMBER_OF_THREADS 10

void *print_hello_world(void *tid)
{
    /* This function prints the thread's identifier and then exits. */
    printf("Hello World. Greetings from thread %ld\n", (long)tid);
    pthread_exit(NULL);
}

int main(int argc, char *argv[])
{
    /* The main program creates 10 threads and then exits. */
    pthread_t threads[NUMBER_OF_THREADS];
    int status;
    long i;

    for (i = 0; i < NUMBER_OF_THREADS; i++) {
        printf("Main here. Creating thread %ld\n", i);
        status = pthread_create(&threads[i], NULL, print_hello_world, (void *)i);
        if (status != 0) {
            printf("Oops. pthread_create returned error code %d\n", status);
            exit(-1);
        }
    }
    exit(0);
}

Figure 2.4. Program using thread
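To build and run this program on a typical UNIX system, the pthread library must be linked in, for example with a command along the lines of cc hello.c -o hello -pthread (the file name is arbitrary and the exact flag may vary between compilers). Successive runs interleave the "Main here" and "Hello World" lines differently, because the order in which the threads are scheduled is not deterministic.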

2.2.5. Implementing Threads in User Space

There are two main ways to implement a threads package: in user space and in the kernel; a hybrid
implementation is also possible. We will now describe these methods, along with their advantages and
disadvantages.
The first method is to put the threads package entirely in user space. The kernel knows nothing about the
threads; as far as the kernel is concerned, it is managing ordinary, single-threaded processes. The first,
and most obvious, advantage is that a user-level threads package can be implemented on an operating
system that does not itself support threads, since the threads are provided entirely by
user-level threads packages.
However, there is one key difference with processes. When a thread is finished running for
the moment, for example, when it calls thread_yield, the code of thread_yield can save the
thread's information in the thread table itself. Furthermore, it can then call the thread scheduler
to pick another thread to run. The procedure that saves the thread's state and the scheduler are
just local procedures, so invoking them is much more efficient than making a kernel call.
Among other issues, no trap is needed, no context switch is needed, the memory cache
need not be flushed, and so on. This makes thread scheduling very fast.
User-level threads also have other advantages. They allow each process to have its own
customized scheduling algorithm. For some applications, for example, those with a garbage
collector thread, not having to worry about a thread being stopped at an inconvenient moment
is a plus. They also scale better, since kernel threads invariably require some table space and
stack space in the kernel, which can be a problem if there are a very large number of threads.
Despite their better performance, user-level threads packages have some major problems.
First among these is the problem of how blocking system calls are implemented.
2.2.6. Implementing Threads in the Kernel
The kernel's thread table holds each thread's registers, state, and other information. The
information is the same as with user-level threads, but now kept in the kernel instead of in
user space (inside the run-time system). This information is a subset of the information that
traditional kernels maintain about their single- threaded processes, that is, the process state.
In addition, the kernel also maintains the traditional process table to keep track of processes.
All calls that might block a thread are implemented as system calls, at considerably greater
cost than a call to a run-time system procedure. When a thread blocks, the kernel, at its option,
can run either another thread from the same process (if one is ready) or a thread from a
different process. With user-level threads, the run-time system keeps running threads from its
own process until the kernel takes the CPU away from it (or there are no ready threads left
to run).
Given the relatively high cost of creating and destroying threads in the kernel, some systems
take an environmentally correct approach and recycle their threads. When a thread is
destroyed, it is marked as not runnable, but its kernel data structures are not otherwise
affected. Later, when a new thread must be created, an old thread is reactivated, saving some
overhead. Thread recycling is also possible for user-level threads, but since the thread
management overhead is much smaller, there is less incentive to do this.
2.2.7. Hybrid Implementations
Various ways have been investigated to try to combine the advantages of user-level threads
with kernel-level threads. One way is to use kernel-level threads and then multiplex user-level
threads onto some or all of them. When this approach is used, the programmer
can determine how many kernel threads to use and how many user-level threads to multiplex
on each one. This model gives the ultimate in flexibility.
2.3. Inter Process Communication
Processes frequently need to communicate with other processes. For example, in a shell
pipeline, the output of the first process must be passed to the second process, and so on
down the line. Thus there is a need for communication between processes, preferably in a
well-structured way not using interrupts. In the following sections we will look at some of
the issues related to this Inter process Communication, or IPC.
There are three issues:-
 The first was alluded to above: how one process can pass information to another.
 The second has to do with making sure two or more processes do not get in each
other's way, for example, two processes in an airline reservation system each trying
to grab the last seat on a plane for a different customer.
 The third concerns proper sequencing when dependencies are present: if process A
produces data and process B prints them, B has to wait until A has produced some data
before starting to print.
It is also important to mention that two of these issues apply equally well to threads.
The first one, passing information, is easy for threads since they share a common
address space (threads in different address spaces that need to communicate fall under
the heading of communicating processes). However, the other two, keeping out of each
other's hair and proper sequencing, apply equally well to threads. The same problems
exist and the same solutions apply. Below we will discuss the problem in the context
of processes, but please keep in mind that the same problems and solutions also apply
to threads.

2.3.1. Race Condition


In some operating systems, processes that are working together may share some common
storage that each one can read and write. The shared storage may be in main memory
(possibly in a kernel data structure) or it may be a shared file; the location of the shared
memory does not change the nature of the communication or the problems that arise. To see
how inter process communication works in practice, let us consider a simple but common
example: a print spooler. When a process wants to print a file, it enters the file name in a
special spooler directory.
Another process, the printer daemon, periodically checks to see if there are any files to be
printed, and if there are, it prints them and then removes their names from the directory.
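A race condition arises here when two processes read the same next-free-slot number at nearly the same moment and both write their file names into that slot, so one job is silently lost. The spooler itself is too long to show, but the same kind of race can be reproduced with the small pthread sketch below (an illustration, not part of the original module), in which two threads update a shared counter with no protection at all:

#include <pthread.h>
#include <stdio.h>

int counter = 0;                      /* shared storage, deliberately unprotected */

void *worker(void *arg)
{
    for (int i = 0; i < 1000000; i++)
        counter++;                    /* read-modify-write: not atomic */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* Expected 2000000, but interleaved updates usually lose increments. */
    printf("counter = %d\n", counter);
    return 0;
}

Run repeatedly, the printed total is usually less than 2000000 and differs from run to run, which is the hallmark of a race condition.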
2.3.2. Critical Regions
How do we avoid race conditions?
The key to preventing trouble here and in many other situations involving shared memory,
shared files, and shared everything else is to find some way to prohibit more than one process
from reading and writing the shared data at the same time. Put in other words, what we need
is mutual exclusion, that is, some way of making sure that if one process is using a shared
variable or file, the other processes will be excluded from doing the same thing.
That part of the program where the shared memory is accessed is called the critical region
or critical section. If we could arrange matters such that no two processes were ever in their
critical regions at the same time, we could avoid races. Although this requirement avoids race
conditions, it is not sufficient for having parallel processes cooperate correctly and efficiently
using shared data. We need four conditions to hold to have a good solution. Although this
requirement avoids race conditions, it is not sufficient for having parallel processes cooperate
correctly and efficiently using shared data. We need four conditions to hold to have a good
solution:
No two processes may be simultaneously inside their critical regions.

No assumptions may be made about speeds or the number of CPUs.

No process running outside its critical region may block other processes.

No process should have to wait forever to enter its critical region.


2.3.3. Mutual Exclusion with Busy Waiting
In this section we will examine various proposals for achieving mutual exclusion, so that
while one process is busy updating shared memory in its critical region, no other process will enter its
critical region and cause trouble.

2.3.3.1. Disabling Interrupts


On a single-processor system, the simplest solution is to have each process disable all
interrupts just after entering its critical region and re-enable them just before leaving it. With
interrupts disabled, no clock interrupts can occur. The CPU is only switched from process to
process as a result of clock or other interrupts, after all, and with interrupts turned off the CPU
will not be switched to another process. Thus, once a process has disabled interrupts, it can
examine and update the shared memory without fear that any other process will intervene.
This approach is generally unattractive because it is unwise to give user processes the
power to turn off interrupts. Suppose that one of them did it, and never turned them
on again? That could be the end of the system. Furthermore, if the system is a
multiprocessor (with two or possibly more CPUs) disabling interrupts affects only the
CPU that executed the disable instruction. The other ones will continue running and
can access the shared memory.

2.3.3.2. Lock Variables


As a second attempt, let us look for a software solution. Consider having a single, shared
(lock) variable, initially 0. When a process wants to enter its critical region, it first tests the
lock. If the lock is 0, the process sets it to 1 and enters the critical region. If the lock is already
1, the process just waits until it becomes 0. Thus, a 0 means that no process is in its critical
region and a 1 means that some process is in its critical region. Unfortunately, this idea
contains exactly the same fatal flaw that we saw in the spooler directory. Suppose that one
process reads the lock and sees that it is 0. Before it can set the lock to 1, another process is
scheduled, runs, and sets the lock to 1. When the first process runs again, it will also set the
lock to 1, and two processes will be in their critical regions at the same time.
Now you might think that we could get around this problem by first reading out the lock
value, then checking it again just before storing into it, but that really does not help. The race
now occurs if the second process modifies the lock just after the first process has finished its
second check.
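A sketch of this flawed idea is shown below (illustrative only); the comment marks the window in which the race just described can occur:

int lock = 0;                     /* shared: 0 means free, 1 means taken */

void enter_region(void)
{
    while (lock != 0)
        ;                         /* busy wait until the lock looks free */
    /* RACE WINDOW: another process may be scheduled here, see lock == 0,
       and set it to 1 before we do. */
    lock = 1;                     /* claim the lock */
}

void leave_region(void)
{
    lock = 0;                     /* release the lock */
}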

2.3.3.3. Strict Alternation


The Turn Variable or Strict Alternation approach is a software mechanism implemented at user
mode. It is a busy-waiting solution which can be implemented only for two processes. In this
approach, a shared turn variable is used, which acts as a lock.
In general, let the two processes be Pi and Pj; they share the variable turn, and each process enters its
critical region only when the turn variable holds its own number, as sketched below.
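The sketch below shows the code for process 0 (process 1 runs the mirror image, testing turn != 1 and setting turn = 0); critical_region() and noncritical_region() are placeholders for the application's own code:

int turn = 0;                         /* shared: whose turn is it to enter? */

/* Code executed by process 0; process 1 swaps the roles of 0 and 1. */
void process_0(void)
{
    while (1) {
        while (turn != 0)
            ;                         /* busy wait until it is our turn */
        critical_region();            /* access the shared data */
        turn = 1;                     /* hand the turn to the other process */
        noncritical_region();         /* do work that does not touch shared data */
    }
}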

2.3.3.4. Peterson's Solution


What is Peterson's Solution in OS?
In operating systems, there may be a need for more than one process to access a shared
resource such as memory or CPU. In shared memory, if more than one process is accessing a
variable, then the value of that variable is determined by the last process to modify it, and the
last modified value overwrites the first modified value. This may result in losing important
information written by the first process. The region of code where these shared accesses occur is
called the critical section. Information loss is prevented by ensuring that no two
processes are simultaneously inside the same critical region or updating the same variable
at the same time. This problem is called the Critical-Section problem, and one of the solutions
to this problem is Peterson's solution.
Peterson's solution is a classic solution to the critical section problem. A solution to the critical section
problem must ensure that no two processes change or modify a resource's value simultaneously.
For example, let int a = 5, and suppose there are two processes p1 and p2 that can modify the value of
a: p1 adds 2 to a (a = a + 2) and p2 multiplies a by 2 (a = a * 2). If both processes modify the value
of a at about the same time, then the final value of a depends on the order of execution of the processes.
If p1 executes first, a will be 14; if p2 executes first, a will be 12. This change of values due to
access by two processes at a time is the cause of the critical section problem.

2.3.3.5. The TSL Instruction


If we are given assistance by the instruction set of the processor we can implement a
solution to the mutual exclusion problem. The instruction we require is called test and set
lock (TSL). This instruction reads the contents of a memory location, stores it in a register
and then stores a non-zero value at the address. This operation is guaranteed to be
indivisible. That is, no other process can access that memory location until the TSL
instruction has finished.
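TSL itself is a processor instruction, but the same idea can be sketched in portable C11 with the standard atomic_flag type, whose test-and-set operation is likewise indivisible (an illustration, not part of the original text):

#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;   /* shared lock, initially clear */

void enter_region(void)
{
    /* atomic_flag_test_and_set reads the old value and sets the flag in one
       indivisible operation, just as TSL does. */
    while (atomic_flag_test_and_set(&lock))
        ;                                     /* busy wait until the lock is free */
}

void leave_region(void)
{
    atomic_flag_clear(&lock);                 /* store "free" back into the lock */
}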

2.3.4. Sleep and Wakeup


Both Peterson's solution and the solutions using TSL or XCHG are correct, but both have
the defect of requiring busy waiting. In essence, what these solutions do is this: when a
process wants to enter its critical region, it checks to see if the entry is allowed; if it is not, the process
simply sits in a tight loop waiting until it is.
Not only does this approach waste CPU time, but it can also have unexpected effects.
Consider a computer with two processes, H, with high priority, and L, with low priority.
The scheduling rules are such that H is run whenever it is in ready state. At a certain moment,
with L in its critical region, H becomes ready to run (e.g., an I/O operation completes). H
now begins busy waiting, but since L is never scheduled while H is running, L never gets
the chance to leave its critical region, so H loops forever. This situation is sometimes
referred to as the priority inversion problem.
Now let us look at some inter process communication primitives that block instead of
wasting CPU time when they are not allowed to enter their critical regions. One of the
simplest is the pair sleep and wakeup. Sleep is a system call that causes the caller to block,
that is, be suspended until another process wakes it up. The wakeup call has one parameter,
the process to be awakened. Alternatively, both sleep and wakeup each have one parameter,
a memory address used to match up sleeps with wakeups.
Producer-Consumer Problem: - this is a classic problem used for multi-process
synchronization, i.e. synchronization between more than one process. In the producer-
consumer problem, there is one Producer that is producing something and there is one
Consumer that is consuming the products produced by the Producer. The producers and
consumers share the same memory buffer that is of fixed-size. The job of the producer is to
generate the data, put it into the buffer, and again start generating data. While the job of the
Consumer is to consume the data from the buffer.
The following are the problems that might occur in the Producer-Consumer problem:
 The producer should produce data only when the buffer is not full. If the buffer is
full, then the producer shouldn't be allowed to put any data into the buffer.
 The consumer should consume data only when the buffer is not empty. If the buffer
is empty, then the consumer shouldn't be allowed to take any data from the buffer.
 The producer and consumer should not access the buffer at the same time.
The above three problems can be solved with the help of semaphores.
Semaphore: - a semaphore is an integer variable that is primarily used to solve the
critical section problem by combining two atomic operations, wait and signal, for
process synchronization.
In the producer-consumer problem, we use three semaphore variables:
 Semaphore S: This semaphore variable is used to achieve mutual exclusion between
processes. By using this variable, either Producer or Consumer will be allowed to use
or access the shared buffer at a particular time. This variable is set to 1 initially.
 Semaphore E: This semaphore variable is used to define the empty space in the
buffer. Initially, it is set to the whole space of the buffer i.e. "n" because the buffer is
initially empty.
 Semaphore F: This semaphore variable is used to define the space that is filled by
the producer. Initially, it is set to "0" because there is no space filled by the producer
initially.
By using the above three semaphore variables and the wait() and signal()
functions, we can solve our problem (the wait() function decreases the semaphore
variable by 1 and the signal() function increases it by 1).
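A minimal sketch of this three-semaphore solution is given below, using the S, E and F variables described above. down() and up() stand for the wait() and signal() operations, and produce_item(), insert_item(), remove_item() and consume_item() are assumed buffer helpers rather than library calls.

#define N    100                  /* number of slots in the shared buffer */
#define TRUE 1
typedef int semaphore;            /* semaphores are a special kind of int */

semaphore S = 1;                  /* mutual exclusion on the buffer */
semaphore E = N;                  /* counts empty slots */
semaphore F = 0;                  /* counts filled slots */

extern void down(semaphore *s);   /* wait(): decrement, block if the count is 0 */
extern void up(semaphore *s);     /* signal(): increment, wake a blocked process if any */
extern int  produce_item(void);   /* assumed helpers for the buffer and the data */
extern void insert_item(int item);
extern int  remove_item(void);
extern void consume_item(int item);

void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();    /* generate the next item */
        down(&E);                 /* wait(E): one empty slot less */
        down(&S);                 /* wait(S): enter critical region */
        insert_item(item);        /* put the item into the buffer */
        up(&S);                   /* signal(S): leave critical region */
        up(&F);                   /* signal(F): one more filled slot */
    }
}

void consumer(void)
{
    int item;
    while (TRUE) {
        down(&F);                 /* wait(F): wait for a filled slot */
        down(&S);                 /* wait(S): enter critical region */
        item = remove_item();     /* take an item out of the buffer */
        up(&S);                   /* signal(S): leave critical region */
        up(&E);                   /* signal(E): one more empty slot */
        consume_item(item);       /* use the item outside the critical region */
    }
}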
2.3.5. Mutexes
When the semaphore's ability to count is not needed, a simplified version of the
semaphore, called a mutex, is sometimes used. Mutexes are good only for managing
mutual exclusion to some shared resource or piece of code. They are easy and efficient
to implement, which makes them especially useful in thread packages that are
implemented entirely in user space.
A mutex is a variable that can be in one of two states: unlocked or locked. Consequently,
only 1 bit is required to represent it, but in practice an integer is often used, with 0
meaning unlocked and all other values meaning locked. Two procedures are used with
mutexes. When a thread (or process) needs access to a critical region, it calls mutex_lock.
If the mutex is currently unlocked (meaning that the critical region is available),
the call succeeds and the calling thread is free to enter the critical region. On the other
hand, if the mutex is already locked, the calling thread is blocked until the thread in the
critical region is finished and calls mutex_unlock. If multiple threads are blocked on the
mutex, one of them is chosen at random and allowed to acquire the lock. Because
mutexes are so simple, they can easily be implemented in user space provided that a TSL
or XCHG instruction is available.
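As an illustration only, a user-space mutex can be sketched on top of the assumed TSL primitive from Section 2.3.3.5; thread_yield() is an assumed call that hands the CPU to another thread, so a caller that finds the mutex busy does not waste its quantum spinning.

int mutex = 0;                          /* 0 = unlocked, non-zero = locked */

extern int  test_and_set_lock(int *m);  /* assumed atomic TSL primitive */
extern void thread_yield(void);         /* assumed: let another thread run */

void mutex_lock(void)
{
    while (test_and_set_lock(&mutex) != 0)
        thread_yield();                 /* mutex is busy: give up the CPU and retry later */
}

void mutex_unlock(void)
{
    mutex = 0;                          /* release; some waiting thread can now acquire it */
}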

2.3.6. Monitors
With semaphores and mutexes inter process communication looks easy, right? Forget it.
Monitors are used for process synchronization. With the help of programming languages, we
can use a monitor to achieve mutual exclusion among processes. An example is Java's
synchronized methods, together with its wait() and notify() constructs. In other words, a
monitor is a programming-language construct that helps control access to shared data.
The Monitor is a module or package which encapsulates shared data structure, procedures,
and the synchronization between the concurrent procedure invocations.
Characteristics of Monitors.
 Inside the monitors, we can only execute one process at a time.
 Monitors are the group of procedures, and condition variables that are merged
together in a special type of module.
 If the process is running outside the monitor, then it cannot access the monitor’s
internal variable. But a process can call the procedures of the monitor.
 Monitors offer high-level of synchronization
 Monitors were derived to simplify the complexity of synchronization problems.
 There is only one process that can be active at a time inside the monitor.
Components of Monitor
There are four main components of the monitor:
1 Initialization
2 Private data
3 Monitor procedure
4 Monitor entry queue
Initialization: - Initialization comprises the code, and when the monitors are
created, we use this code exactly once.
Private Data: - Private data is another component of the monitor. It comprises all
the private data, and the private data contains private procedures that can only be
used within the monitor. So, outside the monitor, private data is not visible.
Monitor Procedure: - Monitors Procedures are those procedures that can be called
from outside the monitor.
Monitor Entry Queue: - The monitor entry queue is another essential component of the
monitor; it holds all the threads that are waiting to enter the monitor, i.e., waiting to call
one of its procedures.
Condition Variables
There are two types of operations that we can perform on the condition variables of
the monitor:
 Wait
 Signal
Suppose there are two condition variables
Condition a, b // Declaring variable
Wait Operation
a.wait(): - a process that performs a wait operation on a condition variable is
suspended and placed in the blocked queue of that condition variable.
Signal Operation
a.signal(): - when a process performs a signal operation on a condition variable,
one of the processes blocked on that variable is given a chance to proceed.
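C has no built-in monitors, but the idea can be approximated with a POSIX mutex playing the role of the monitor lock and condition variables supplying wait and signal. The sketch below shows a bounded-buffer "monitor" in that style; it illustrates the structure only and is not a complete producer-consumer program.

#include <pthread.h>

#define N 100                                        /* buffer capacity */

static pthread_mutex_t monitor_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_full     = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_empty    = PTHREAD_COND_INITIALIZER;
static int count = 0;                                /* items currently in the buffer */

void monitor_insert(int item)                        /* "monitor procedure" used by a producer */
{
    pthread_mutex_lock(&monitor_lock);               /* only one process active inside */
    while (count == N)
        pthread_cond_wait(&not_full, &monitor_lock); /* wait: buffer is full */
    /* ... place item into the shared buffer ... */
    count++;
    pthread_cond_signal(&not_empty);                 /* signal: a blocked consumer may run */
    pthread_mutex_unlock(&monitor_lock);
}

int monitor_remove(void)                             /* "monitor procedure" used by a consumer */
{
    int item = 0;
    pthread_mutex_lock(&monitor_lock);
    while (count == 0)
        pthread_cond_wait(&not_empty, &monitor_lock); /* wait: buffer is empty */
    /* ... take item out of the shared buffer ... */
    count--;
    pthread_cond_signal(&not_full);                  /* signal: a blocked producer may run */
    pthread_mutex_unlock(&monitor_lock);
    return item;
}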
2.3.7. Message Passing
Process communication is the mechanism provided by the operating system that allows
processes to communicate with each other. This communication could involve a process
letting another process know that some event has occurred or transferring of data from one
process to another. One of the models of process communication is the message passing.
Message passing allows multiple processes to read and write data to the message queue
without being connected to each other. Messages are stored on the queue until their recipient
retrieves them. Message queues are quite useful for inter process communication and are used
by most operating systems. Message passing provides two operations:
send(destination, &message) and receive(source, &message).
Figure 2.5 illustrates the message passing model of process communication.
Figure 2.5 Message Passing Systems

The Producer-Consumer Problem with Message Passing: Message Passing allows us to solve
the Producer-Consumer problem on distributed systems. In the problem below, an actual buffer
does not exist. Instead, the producer and consumer pass messages to each other. These messages
can contain the items which, in the previous examples, were placed in a buffer. They can also
contain empty messages, meaning that a slot in the "buffer" is ready to receive a new item. In this
example, a buffer size of four has been chosen. The Consumer begins the process by sending four
empty messages to the producer. The producer creates a new item for each empty message it
receives, then it goes to sleep. The producer will not create a new item unless it first receives an
empty message. The consumer waits for messages from the producer which contain the items.
Once it consumes an item, it sends an empty message to the producer. Again, there is no real
buffer, only a buffer size which dictates the number of items allowed to be produced.
#define BufferSize 4                    /* number of message "slots" */

Producer()
{
    int widget;
    message m;                          // message buffer

    while (TRUE) {
        make_item(widget);              // create a new item
        receive(consumer, &m);          // wait for an empty message to arrive
        build_message(&m, widget);      // make a message to send to the consumer
        send(consumer, &m);             // send widget to consumer
    }
}

Consumer()
{
    int widget, i;
    message m;

    for (i = 0; i < BufferSize; i++)
        send(producer, &m);             // send BufferSize empty messages
    while (TRUE) {
        receive(producer, &m);          // get message containing a widget
        extract_item(&m, widget);       // take item out of message
        send(producer, &m);             // reply with an empty message
        consume_item(widget);           // consume the item
    }
}

When communicating processes are running on various computers connected by a


network, message passing systems face many difficult design problems that do not exist
with semaphores or monitors. For instance, the network may lose messages. The sender
and receiver might agree that as soon as a message is received, the receiver will send back
a certain acknowledgement message in order to prevent lost messages. The message is
sent again if the sender does not get the acknowledgement within a predetermined window
of time. Consider what happens if the message is delivered successfully but the
acknowledgement is lost: the sender will retransmit the message, so the recipient
will receive it twice.

The receiver must be able to distinguish between a new message and a retransmission of
an older message. The typical solution to this issue is to include consecutive sequence
numbers in each original message. If a receiver gets a message bearing the same sequence
number as a prior message, it knows the message is a duplicate and can be ignored.
Message systems must also address the issue of how processes are named, so that the
process mentioned in a send or receive call is unambiguous.
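A receiver-side sketch of that duplicate check is shown below; it assumes sequence numbers increase by one per original message, and deliver() and send_ack() are placeholder routines for handing data to the application and acknowledging the sender.

int expected_seq = 0;                   /* next sequence number not yet delivered */

extern void deliver(const char *data);  /* placeholder: pass data to the application */
extern void send_ack(int seq);          /* placeholder: acknowledge sequence number seq */

void handle_message(int seq, const char *data)
{
    if (seq < expected_seq) {           /* already seen: this is a retransmission */
        send_ack(seq);                  /* re-acknowledge so the sender stops resending */
        return;                         /* and discard the duplicate */
    }
    deliver(data);                      /* first copy of this message */
    send_ack(seq);
    expected_seq = seq + 1;
}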

Authentication is a problem in message-based systems as well: how can the client be


certain that he is speaking with the legitimate file server and not a phony one? When the
transmitter and receiver are on the same machine, there are other design considerations
that are crucial.

A semaphore action or accessing a monitor is always faster than copying messages from
one process to another. Making message passing efficient has required a great deal of
effort. For instance, Cheriton (1984) proposed restricting the size of messages to that
which can fit in the machine's registers and then performing message passing using the
registers. With message passing, numerous variations are available. Let's start by
examining the manner in which communications are addressed. Messages can be
addressed to processes by giving each one a distinct address. Creating a new data structure
called a mailbox is an alternative method. A mailbox is a location where a specific amount
of messages can be buffered, usually one that is defined when the mailbox is formed. The
address parameters in the send and receive calls are mailboxes, not processes, when
mailboxes are used. A process that attempts to send to a mailbox that is already full is
suspended until a message is removed from that mailbox, making room for the new one.
For the producer-consumer problem, both the producer and the consumer would create
mailboxes big enough to hold N messages.

The consumer would send empty messages to the producer's mailbox, and the producer
would respond by sending messages with data to the consumer's inbox. Using mailboxes
makes the buffering mechanism obvious: messages transmitted to the destination process
but not yet accepted are stored in the destination mailbox. The opposite of having
mailboxes is to completely disable buffering. When using this method, if the send occurs
before the receive, the sending process is halted until the receive at which point the
message can be copied directly from the sender to the receiver without any buffering in
between. The receiver is also blocked until a send occurs if the receive is completed first.
This tactic is frequently referred to as a rendezvous. Since the transmitter and receiver
must run in lockstep, it is less flexible than a buffered message method but simpler to
construct.

2.4. SCHEDULING

When a computer is multiprogrammed, it frequently has multiple processes or threads


competing for the CPU at the same time. This situation occurs whenever two or more
of them are simultaneously in the ready state. If only one CPU is available, a choice
has to be made which process to run next. The part of the operating system that makes
the choice is called the scheduler, and the algorithm it uses is called the scheduling
algorithm. These topics form the subject matter of the following sections.
Many of the same issues that apply to process scheduling also apply to thread
scheduling, although some are different. When the kernel manages threads,
scheduling is usually done per thread, with little or no regard to which process the
thread belongs. Initially we will focus on scheduling issues that apply to both
processes and threads.
2.4.1. Introduction to scheduling

Back in the old days of batch systems with input in the form of card images on a magnetic
tape, the scheduling algorithm was simple: just runs the next job on the tape. With
multiprogramming systems, the scheduling algorithm became more complex because there
were generally multiple users waiting for service. Some mainframes still combine batch and
timesharing service, requiring the scheduler to decide whether a batch job or an interactive
user at a terminal should go next.

(As an aside, a batch job may be a request to run multiple programs in succession, but for this
section, we will just assume it is a request to run a single program.)
Because CPU time is a scarce resource on these machines, a good scheduler can make a big
difference in perceived performance and user satisfaction. Consequently, a great deal of work
has gone into devising clever and efficient scheduling algorithms. With the advent of personal
computers, the situation changed in two ways. First, most of the time there is only one active
process.

A user entering a document on a word processor is unlikely to be simultaneously


compiling a program in the background. When the user types a command to the word
processor, the scheduler does not have to do much work to figure out which process to run-
the word processor is the only candidate. Second, computers have gotten so much faster
over the years that the CPU is rarely a scarce resource any more. Most programs for personal
computers are limited by the rate at which the user can present input (by typing or clicking),
not by the rate the CPU can process it. Even compilations, a major sink of CPU cycles in the
past, take just a few seconds in most cases nowadays. Even when two programs are actually
running at once, such as a word processor and a spreadsheet, it hardly matters which goes first
since the user is probably waiting for both of them to finish. As a consequence, scheduling
does not matter much on simple PCs. Of course, there are applications that practically eat the
CPU alive, for instance rendering one hour of high-resolution video while tweaking the colors
in each of the 108,000 frames (in NTSC) or 90,000 frames (in PAL) requires industrial-
strength computing power. However, similar applications are the exception rather than the
rule.

When we turn to networked servers, the situation changes appreciably. Here multiple
processes often do compete for the CPU, so scheduling matters again. For example, when the
CPU has to choose between running a process that gathers the daily statistics and one that
serves user requests, the users will be a lot happier if the latter gets first crack at the CPU.

In addition to picking the right process to run, the scheduler also has to worry about making
efficient use of the CPU because process switching is expensive. To start with, a switch
from user mode to kernel mode must occur. Then the state of the current process must be
saved, including storing its registers in the process table so they can be reloaded later. In many
systems, the memory map (e.g., memory reference bits in the page table) must be saved as
well. Next a new process must be selected by running the scheduling algorithm. After that,
the MMU must be reloaded with the memory map of the new process. Finally, the new process
must be started.

In addition to all that, the process switch usually invalidates the entire memory cache, forcing
it to be dynamically reloaded from the main memory twice (upon entering the kernel and
upon leaving it). All in all, doing too many process switches per second can chew up a
substantial amount of CPU time, so caution is advised.
PROCESS BEHAVIOR

Nearly all processes alternate bursts of computing with (disk) I/O requests, as shown in
Figure 2.6 below. Typically, the CPU runs for a while without stopping, and then a
system call is made to read from a file or write to a file. When the system call completes,
the CPU computes again until it needs more data or has to write more data, and so on.
Note that some I/O activities count as computing. For example, when the CPU copies bits
to a video RAM to update the screen, it is computing, not doing I/O, because the CPU is
in use. I/O in this sense is when a process enters the blocked state waiting for an external
device to complete its work.

Figure 2.6. Bursts of CPU usage

(a) A CPU-bound process. (b) An I/O-bound process.

The important thing to notice about the figure is that some processes, such as the one
in Fig. (a), spend most of their time computing, while others, such as the one in Fig.
(b), spend most of their time waiting for I/O. The former is called compute-bound; the
latter are called I/O-bound. Compute-bound processes typically have long CPU bursts
and thus infrequent I/O waits, whereas I/O-bound processes have short CPU bursts
and thus frequent I/O waits. Note that the key factor is the length of the CPU burst,
not the length of the I/O burst. I/O-bound processes are I/O bound because they do not
compute much between I/O requests, not because they have especially long I/O
requests. It takes the same time to issue the hardware request to read a disk block no
matter how much or how little time it takes to process the data after they arrive.
It is worth noting that as CPUs get faster, processes tend to get more I/O bound. This
effect occurs because CPUs are improving much faster than disks.
As a consequence, the scheduling of I/O-bound processes is likely to become a more
important subject in the future. The basic idea here is that if an I/O-bound process
wants to run, it should get a chance quickly so that it can issue its disk request and
keep the disk busy.
The objective of multiprogramming is to have some process running at all times, to
maximize CPU utilization. The objective of time sharing is to switch the CPU among
processes so frequently that users can interact with each program while it is running.
On a uniprocessor only one process runs at a time; a process migrates between various
scheduling queues throughout its lifetime. The task of selecting processes from
among these queues is carried out by a scheduler. The aim of processor scheduling is
to assign processes to be executed by the processor. Scheduling affects the
performance of the system, because it determines which process will wait and which
will progress.

Types of Scheduling

Long-term Scheduling: Long term scheduling is performed when a new process is


created. It is shown in the figure below. If the number of ready processes in the ready
queue becomes very high, the overhead on the operating system (i.e., the processor) for
maintaining long lists, context switching and dispatching increases. Therefore, only a
limited number of processes should be allowed into the ready queue; the "long-term
scheduler" manages this. The long-term scheduler determines which programs are
admitted into the system for processing. Once a job is admitted, it becomes a process and
is added to the queue for the short-term scheduler. In some systems, a newly created
process begins in a swapped-out condition, in which case it is added to a queue for the
medium-term scheduler. Schedulers manage these queues to minimize queuing delay and
to optimize performance.
Figure 2.7 Scheduler

The long-term scheduler limits the number of processes allowed in for processing by
deciding whether to admit one or more new jobs, on a first-come, first-served (FCFS)
basis or according to priority, execution time, or input/output requirements. The
long-term scheduler executes relatively infrequently.

Medium-term Scheduling: Medium-term scheduling is a part of the swapping


function. When part of the main memory gets freed, the operating system looks at the
list of suspend ready processes, decides which one is to be swapped in (depending on
priority, memory and other resources required, etc.). This scheduler works in close
conjunction with the long-term scheduler. It will perform the swapping-in function
among the swapped-out processes. Medium-term scheduler executes somewhat more
frequently.
Short-term Scheduling: The short-term scheduler is also called the dispatcher. It is
invoked whenever an event occurs that may lead to the interruption of the currently
running process, for example clock interrupts, I/O interrupts, operating system calls,
signals, etc. The short-term scheduler executes most frequently. It selects from among
the processes that are ready to execute and allocates the CPU to one of them. Because it
must select a new process for the CPU frequently, it must be very fast.
Comparison between schedulers

Scheduling Criteria

Scheduling criteria are also called the scheduling methodology. Key to


multiprogramming is scheduling. Different CPU scheduling algorithms have different
properties. The criteria used for comparing these algorithms include the following:
 CPU Utilization:

 Keep the CPU as busy as possible. It ranges from 0 to 100%. In practice, it
ranges from 40 to 90%.
 Throughput:

 Throughput is the rate at which processes are completed per unit of time.

 Turnaround time:

 This is how long a process takes to complete. It is calculated as the
time gap between the submission of a process and its completion.
 Waiting time:

 Waiting time is the sum of the time periods spent in waiting in the ready queue.

 Response time:
 Response time is the time it takes to start responding from submission time. It is
calculated as the amount of time it takes from when a request was submitted until
the first response is produced.
 Fairness:

Each process should have a fair share of CPU.

Non-preemptive Scheduling

In non-preemptive mode, once a process enters the running state, it continues to
execute until it terminates or blocks itself to wait for input/output or to request
some operating system service.
Preemptive Scheduling

In preemptive mode, the currently running process may be interrupted and moved to the
ready state by the operating system when a new process arrives or when an interrupt
occurs. Preemptive policies may incur greater overhead than non-preemptive ones but
may provide better service. It is desirable to maximize CPU utilization and throughput,
and to minimize turnaround time, waiting time and response time.

WHEN TO SCHEDULE

A key issue related to scheduling is when to make scheduling decisions. It turns out
that there are a variety of situations in which scheduling is needed.
First, when a new process is created, a decision needs to be made whether to run the
parent process or the child process. Since both processes are in ready state, it is a
normal scheduling decision and can go either way, that is, the scheduler can
legitimately choose to run either the parent or the child next.
Second, a scheduling decision must be made when a process exits. That process can
no longer run (since it no longer exists), so some other process must be chosen from
the set of ready processes. If no process is ready, a system-supplied idle process is
normally run.
Third, when a process blocks on I/0, or for some other reason, another process has to
be selected to run. Sometimes the reason for blocking may play a role in the choice.
For example, if A is an important process and it is waiting for B to exit its critical
region, letting B run next will allow it to exit its critical region and thus let A continue.
The trouble, however, is that the scheduler generally does not have the necessary
information to take this dependency into account.
Fourth, when an I/O interrupt occurs, a scheduling decision may be made. If the
interrupt came from an I/O device that has now completed its work, some process that
was blocked waiting for the I/O may now be ready to run. It is up to the scheduler to
decide whether to run the newly ready process, the process that was running at the
time of the interrupt, or some third process.
Scheduling algorithms can be divided into two categories with respect to how they
deal with clock interrupts. A non-preemptive scheduling algorithm picks a process
to run and then just lets it run until it blocks (either on I/O or waiting for another
process) or until it voluntarily releases the CPU. Even if it runs for hours, it will not
be forcibly suspended. In effect, no scheduling decisions are made during clock
interrupts. After clock interrupt processing has been completed, the process that was
running before the interrupt is resumed, unless a higher-priority process was waiting
for a now-satisfied timeout.
In contrast, a preemptive scheduling algorithm picks a process and lets it run for a
maximum of some fixed time. If it is still running at the end of the time interval, it is
suspended and the scheduler picks another process to run (if one is available).
CATEGORIES AND GOALS OF SCHEDULING ALGORITHMS

Not surprisingly, in different environments different scheduling algorithms are


needed. This situation arises because different application areas (and different kinds
of operating systems) have different goals. In other words, what the scheduler should
optimize for is not the same in all systems. Three environments worth distinguishing
are
 Batch.

 Interactive.

 Real time.

In order to design a scheduling algorithm, it is necessary to have some idea of what a


good algorithm should do. Some goals depend on the environment (batch, interactive,
or real time), but there are also some that are desirable in all cases. These are:
 Fairness - giving each process a fair share of the CPU
 Policy enforcement - seeing that stated policy is carried out
 Balance - keeping all parts of the system busy

Under all circumstances, fairness is important. Comparable processes should get


comparable service. Giving one process much more CPU time than an equivalent one
is not fair. Of course, different categories of processes may be treated differently.
Think of safety control and doing the payroll at a nuclear reactor's computer center.
Somewhat related to fairness is enforcing the system's policies. If the local policy is
that safety control processes get to run whenever they want to, even if it means the
payroll is 30 sec late, the scheduler has to make sure this policy is enforced.
Another general goal is keeping all parts of the system busy when possible. If the
CPU and all the I/O devices can be kept running all the time, more work gets done per
second than if some of the components are idle. In a batch system, for example, the
scheduler has control of which jobs are brought into memory to run.
Having some CPU-bound processes and some I/O-bound processes in memory
together is a better idea than first loading and running all the CPU-bound jobs and
then, when they are finished, loading and running all the I/O-bound jobs. If the latter
strategy is used, when the CPU-bound processes are running, they will fight for the
CPU and the disk will be idle. Later, whenthe I/O-bound jobs come in, they will fight
for the disk and the CPU will be idle. Better to keep the whole system running at once
by a careful mix of processes.
Scheduling Algorithm Goals

In order to design a scheduling algorithm, it is necessary to have some idea of what a


good algorithm should do. Some goals depend on the environment (batch, interactive,
or real time), but some are desirable in all cases. Some goals are listed here.
 All systems
Fairness - giving each process a fair share of the CPU
Policy enforcement - seeing that stated policy is carried out
Balance - keeping all parts of the system busy

 Batch systems
Throughput - maximize jobs per hour

Turnaround time - minimize time between submission and termination
CPU utilization - keep the CPU busy all the time

 Interactive systems
Response time - respond to requests quickly
Proportionality - meet users' expectations

 Real-time systems
Meeting deadlines - avoid losing data

Predictability - avoid quality degradation in multimedia systems

2.4.2. BATCH SYSTEMS

Batch systems are still in widespread use in the business world for doing payroll, inventory,
accounts receivable, accounts payable, interest calculation (at banks), claims processing (at
insurance companies), and other periodic tasks. In batch systems, there are no users
impatiently waiting at their terminals for a quick response to a short request. Consequently,
non-preemptive algorithms, or preemptive algorithms with long time periods for each
process, are often acceptable. This approach reduces process switches and thus improves
performance. The batch algorithms are actually fairly general and often applicable to other
situations as well, which makes them worth studying, even for people not involved in
corporate mainframe computing. So, the goals can be:
 Throughput - maximize jobs per hour

 Turnaround time - minimize time between submission and termination

 CPU utilization - keep the CPU busy all the time

The managers of large computer centers that run many batch jobs typically look at three
metrics to see how well their systems are performing: throughput, turnaround time, and
CPU utilization. Throughput is the number of jobs per hour that the system completes.
All things considered, finishing 50 jobs per hour is better than finishing 40 jobs per hour.
Turnaround time is the statistically average time from the moment that a batch job is
submitted until the moment it is completed. It measures how long the average user has to
wait for the output. Here the rule is: Small is beautiful.

A scheduling algorithm that maximizes throughput may not necessarily minimize


turnaround time. For example, given a mix of short jobs and long jobs, a scheduler that
always ran short jobs and never ran long jobs might achieve an excellent throughput (many
short jobs per hour) but at the expense of a terrible turnaround time for the long jobs. If
short jobs kept arriving at a fairly steady rate, the long jobs might never run, making the
mean turnaround time infinite while achieving a high throughput.

CPU utilization is often used as a metric on batch systems. Actually though, it is not such
a good metric. What really matters is how many jobs per hour come out of the system
(throughput) and how long it takes to get a job back (turnaround time). Using CPU
utilization as a metric is like rating cars based on how many times per hour the engine
turns over. On the other hand, knowing when the CPU utilization is approaching 100% is
useful for knowing when it is time to get more computing power.
2.4.2.1. First come first serve scheduling algorithm in Batch Systems
In the "First come first serve" scheduling algorithm, as the name suggests, the process
which arrives first, gets executed first, or we can say that the process which requests
the CPU first, gets the CPU allocated first. First Come First Serve, is just like FIFO
(First in First out) Queue data structure, where the data element which is added to the
queue first, is the one who leaves the queue first. It's easy to understand and implement
programmatically, using a Queue data structure, where a new process enters through
the tail of the queue, and the scheduler selects the process at the head of the queue. A
perfect real-life example of FCFS scheduling is buying tickets at a ticket counter.
Problems with FCFS Scheduling
Below we have a few shortcomings or problems with the FCFS scheduling algorithm:
 It is a non-preemptive algorithm, which means process priority does not matter. If a
very low-priority process is being executed (for example, a long daily backup routine)
and suddenly a high-priority process arrives (say, an interrupt needed to avoid a system
crash), the high-priority process will have to wait; in such a case the system may crash
just because of improper process scheduling.
 Not optimal Average Waiting Time.
 Resources utilization in parallel is not possible, which leads to Convoy Effect, and
hence poor resource (CPU, I/O etc.) utilization.
Convoy Effect

Convoy Effect is a situation where many processes, which need to use a resource for
short time, are blocked by one process holding that resource for a long time. This
essentially leads to poor utilization of resources and hence poor performance.
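The waiting-time arithmetic for FCFS is easy to automate. The small sketch below uses made-up burst times and assumes every job arrives at time 0; jobs simply run in arrival order.

#include <stdio.h>

int main(void)
{
    int burst[] = {8, 4, 4, 4};            /* example run times, in minutes */
    int n = 4;
    int finish = 0, total_wait = 0, total_turnaround = 0;

    for (int i = 0; i < n; i++) {          /* FCFS: run strictly in arrival order */
        total_wait += finish;              /* job i waits until all earlier jobs finish */
        finish += burst[i];                /* completion time of job i */
        total_turnaround += finish;        /* arrival at t = 0, so turnaround = finish */
    }
    printf("average waiting time    = %.2f\n", (double)total_wait / n);       /* 9.00  */
    printf("average turnaround time = %.2f\n", (double)total_turnaround / n); /* 14.00 */
    return 0;
}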
2.4.2.2. Shortest Job First scheduling algorithm in batch system
Now let us look at another non preemptive batch algorithm that assumes the run times
are known in advance. In an insurance company, for example, people can predict quite
accurately how long it will take to run a batch of 1000 claims, since similar work is
done every day. When several equally important jobs are sitting in the input queue
waiting to be started, the scheduler picks the shortest job first. Look at Figure 2.8.
Here we find four jobs A, B, C, and D with run times of 8, 4, 4, and 4 minutes,
respectively. By running them in that order, the turnaround time for A is 8 minutes, for
B is 12 minutes, for C is 16 minutes, and for D is 20 minutes, for an average of 14 (56/4)
minutes.
Figure 2.8 An example of shortest-job-first scheduling.

(a) Running four jobs in the original order. (b) Running them in shortest job first
order.
Now let us consider running these four jobs using shortest job first, as shown in Figure
2.8(b). The turnaround times are now 4, 8, 12, and 20 minutes, for an average of
11 (44/4) minutes. Shortest job first is provably optimal. Consider the case of four
jobs, with execution times of a, b, c, and d, respectively. The first job finishes at time
a, the second at time a + b, and so on. The mean turnaround time is (4a + 3b + 2c + d)/4.
It is clear that a contributes more to the average than the other times,
so it should be the shortest job, with b next, then c, and finally d as the longest, since
it affects only its own turnaround time. The same argument applies equally well to
any number of jobs.

The average waiting time for case (b) is computed as (turnaround time – service time) for
each job: (4-4) + (8-4) + (12-4) + (20-8) = 24, so the average is 24/4 = 6 minutes, and the
throughput is 4/20 = 0.2 jobs per minute.

It is worth pointing out that shortest job first is optimal only when all the jobs are
available simultaneously. As a counterexample, consider five jobs, A through E,
with run times of 2, 4, 1, 1, and 1, respectively. Their arrival times are 0, 0, 3, 3, and
3. Initially, only A or B can be chosen, since the other three jobs have not arrived yet.
Using shortest job first, we will run the jobs in the order A, B, C, D, E, for an average
turnaround time of 4.6. However, running them in the order B, C, D, E, A gives an average
turnaround time of 4.4.
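When all jobs are available simultaneously, shortest job first amounts to sorting the jobs by run time and then applying the FCFS arithmetic; the sketch below reproduces the 11-minute average turnaround of Figure 2.8(b).

#include <stdio.h>
#include <stdlib.h>

static int by_burst(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;      /* ascending run time */
}

int main(void)
{
    int burst[] = {8, 4, 4, 4};                    /* jobs A, B, C, D of Figure 2.8 */
    int n = 4, finish = 0, total_turnaround = 0;

    qsort(burst, n, sizeof burst[0], by_burst);    /* shortest job first */
    for (int i = 0; i < n; i++) {
        finish += burst[i];                        /* completion time of this job */
        total_turnaround += finish;                /* arrival at t = 0 */
    }
    printf("average turnaround time = %.2f\n", (double)total_turnaround / n); /* 11.00 */
    return 0;
}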

2.4.2.3. Shortest Remaining Time Next scheduling algorithm in batch


system
A preemptive version of shortest job first is shortest remaining time next. With this
algorithm, the scheduler always chooses the process whose remaining run time is
the shortest. Again here, the run time has to be known in advance. When a new
job arrives, its total time is compared to the current process's remaining time. If the
new job needs less time to finish than the current process, the current process is
suspended and the new job started. This scheme allows new short jobs to get good
service.

2.4.3. INTERACTIVE SYSTEMS


In an environment with interactive users, preemption is essential to keep one process
from hogging the CPU and denying service to the others. Even if no process
intentionally ran forever, one process might shut out all the others indefinitely due to
a program bug. Preemption is needed to prevent this behavior. Servers also fall into
this category, since they normally serve multiple (remote) users, all of whom are in a
big hurry. So, the goals can be:
 Response time - respond to requests quickly

 Proportionality - meet users' expectations

For interactive systems, different goals apply. The most important one is to minimize
response time, that is, the time between issuing a command and getting the result. On
a personal computer where a background process is running (for example, reading and
storing e-mail from the network), a user request to start a program or open a file should
take precedence over the background work. Having all interactive requests go first
will be perceived as good service.
A somewhat related issue is what might be called proportionality. Users have an
inherent (but often incorrect) idea of how long things should take. When a request that
is perceived as complex takes a long time, users accept that, but when a request that
is perceived as simple takes a long time, users get irritated. For example, if clicking
on an icon that starts sending a fax takes 60 seconds to complete, the user will probably
accept that as a fact of life because he does not expect a fax to be sent in 5 seconds.
On the other hand, when a user clicks on the icon that breaks the phone connection
after the fax has been sent, he has different expectations. If it has not completed after
30 seconds, the user will probably be swearing a blue streak, and after 60 seconds he
will be frothing at the mouth. This behavior is due to the common user perception that
placing a phone call and sending a fax is supposed to take a lot longer than just
hanging the phone up. In some cases (such as this one), the scheduler cannot do
anything about the response time, but in other cases it can, especially when the delay
is due to a poor choice of process order.
2.4.3.1. Round-Robin Scheduling algorithms in Interactive Systems

One of the oldest, simplest, fairest, and most widely used algorithms is round robin.
Each process is assigned a time interval, called its quantum, during which it is allowed
to run. If the process is still running at the end of the quantum, the CPU is preempted
and given to another process. If the process has blocked or finished before the
quantum has elapsed, the CPU switching is done when the process blocks, of course.
Round robin is easy to implement. All the scheduler needs to do is maintain a list of
runnable processes, as shown in Figure 2.9(a). When a process uses up its quantum, it is
put on the end of the list, as shown in Figure 2.9(b).
The only interesting issue with round robin is the length of the quantum.

Switching from one process to another requires a certain amount of time for doing the
administration-saving and loading registers and memory maps, updating various
tables and lists, flushing and reloading the memory cache, and so on. Suppose that
this process switch or context switch, as it is sometimes called, takes 1 msec, including
switching memory maps, flushing and reloading the cache, etc.
Also suppose that the quantum is set at 4 msec. With these parameters, after doing 4
msec of useful work, the CPU will have to spend (i.e., waste) 1msec on process
switching. Thus 20% of the CPU time will be thrown away on administrative
overhead. Clearly, this is too much.
Figure 2.9: Round-robin scheduling. (a) The list of runnable processes. (b) The list of
runnable processes after B uses up its quantum.
To improve the CPU efficiency, we could set the quantum to, say, 100 msec.
Now the wasted time is only 1%. But consider what happens on a server system if 50
requests come in within a very short time interval and with widely varying CPU
requirements. Fifty processes will be put on the list of runnable processes. If the CPU
is idle, the first one will start immediately, the second one may not start until 100 msec
later, and so on. The unlucky last one may have to wait 5 sec before getting a
chance, assuming all the others use their full quanta.
Most users will perceive a 5-sec response to a short command as sluggish. This
situation is especially bad if some of the requests near the end of the queue required
only a few milliseconds of CPU time. With a short quantum they would have gotten
better service.
Another factor is that if the quantum is set longer than the mean CPU burst,
preemption will not happen very often. Instead, most processes will perform a
blocking operation before the quantum runs out, causing a process switch. Eliminating
preemption improves performance because process switches then only happen when
they are logically necessary, that is, when a process blocks and cannot continue.
The conclusion can be formulated as follows: setting the quantum too short causes too
many process switches and lowers the CPU efficiency, but setting it too long may
cause poor response to short interactive requests. A quantum around 20-50 msec is
often a reasonable compromise.
Performance of RR Scheduling

 If there are n processes in the ready queue and time quantum is q, then each process
gets 1/n of the CPU time in chunks of at most q time units at once.
 No process waits for more than (n-1)*q time units until the next time quantum.

 The performance of RR depends on the time slice. If it is large then it is the same as
FCFS. If q is small then overhead is too high.
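The quantum trade-off discussed above can be quantified: with a context-switch cost s and quantum q, roughly s/(q + s) of the CPU is lost to switching. The sketch below reproduces the 20% and 1% figures from the text for a 1 msec switch cost.

#include <stdio.h>

int main(void)
{
    double s = 1.0;                              /* context-switch cost, in msec */
    double quanta[] = {4.0, 20.0, 50.0, 100.0};  /* candidate quantum lengths, in msec */

    for (int i = 0; i < 4; i++) {
        double q = quanta[i];
        double overhead = s / (q + s);           /* fraction of CPU time spent switching */
        printf("q = %5.1f msec -> switching overhead = %4.1f%%\n", q, 100.0 * overhead);
    }
    return 0;
}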
2.4.3.2. Priority Scheduling in Interactive Systems

Round-robin scheduling makes the implicit assumption that all processes are equally
important. Frequently, the people who own and operate multiuser computers have
different ideas on that subject. At a university, for example, the pecking order may be
deans first, then professors, secretaries, janitors, and finally students.
The need to take external factors into account leads to priority scheduling. The basic
idea is straightforward: each process is assigned a priority, and the runnable process
with the highest priority is allowed to run. Even on a PC with a single owner, there
may be multiple processes, some of them more important than others. For example, a
daemon process sending electronic mail in the background should be assigned a lower
priority than a process displaying a video film on the screen in real time.
To prevent high-priority processes from running indefinitely, the scheduler may
decrease the priority of the currently running process at each clock tick (i.e., at each
clock interrupt). If this action causes its priority to drop below that of the next highest
process, a process switch occurs. Alternatively, each process may be assigned a
maximum time quantum that it is allowed to run. When this quantum is used up, the
next highest priority process is given a chance to run.
Priorities can be assigned to processes statically or dynamically. On a military
computer, processes started by generals might begin at priority 100, processes started
by colonels at 90, majors at 80, captains at 70, lieutenants at 60, and so on.
Alternatively, at a commercial computer center, high-priority jobs might cost $100 an
hour, medium priority $75 an hour, and low priority $50 an hour. The UNIX system
has a command, nice, which allows a user to voluntarily reduce the priority of his
process, in order to be nice to the other users. Nobody ever uses it.
Priorities can also be assigned dynamically by the system to achieve certain system
goals. For example, some processes are highly I/O bound and spend most of their time
waiting for I/O to complete. Whenever such a process wants the CPU, it should be
given the CPU immediately, to let it start its next I/O request, which can then proceed
in parallel with another process actually computing. Making the I/O-bound process
wait a long time for the CPU will just mean having it around occupying memory for
an unnecessarily long time. A simple algorithm for giving good service to I/O-bound
processes is to set the priority to 1/f, where f is the fraction of the last quantum that a
process used. A process that used only 1 msec of its 50 msec quantum would get
priority 50, while a process that ran 25 msec before blocking would get priority 2,
and a process that used the whole quantum would get priority 1.

Consider the following processes, all arriving at time 0 and scheduled non-preemptively
in decreasing order of priority (a larger number means a higher priority), so the run
order is P3, P2, P4, P1, P5:

Process                 P1     P2     P3     P4     P5
Priority                 2      4      5      3      1
Service Time (Ts)        3      6      4      5      2
Turnaround time (Tr)    18     10      4     15     20
Response time           15      4      0     10     18
Tr/Ts                    6   1.67      1      3     10

Average response time   = (15+4+0+10+18)/5 = 9.4
Average turnaround time = (18+10+4+15+20)/5 = 13.4
Throughput              = 5/20 = 0.25
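A small sketch that reproduces the numbers above, under the stated assumptions (all processes arrive at time 0, a larger priority number wins, scheduling is non-preemptive), is given below.

#include <stdio.h>

int main(void)
{
    /* P1..P5 listed here in decreasing order of priority: P3, P2, P4, P1, P5 */
    const char *name[]  = {"P3", "P2", "P4", "P1", "P5"};
    int         burst[] = { 4,    6,    5,    3,    2 };
    int n = 5, t = 0;
    double total_response = 0.0, total_turnaround = 0.0;

    for (int i = 0; i < n; i++) {
        int response = t;                  /* time until the process first gets the CPU */
        t += burst[i];
        int turnaround = t;                /* arrival at t = 0, so turnaround = finish time */
        total_response   += response;
        total_turnaround += turnaround;
        printf("%s: response = %2d, turnaround = %2d\n", name[i], response, turnaround);
    }
    printf("average response time   = %.1f\n", total_response / n);                  /* 9.4  */
    printf("average turnaround time = %.1f\n", total_turnaround / n);                /* 13.4 */
    printf("throughput              = %.2f jobs per time unit\n", (double)n / t);    /* 0.25 */
    return 0;
}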
It is often convenient to group processes into priority classes and use priority scheduling among
the classes but round-robin scheduling within each class. The Figure below shows a system with
four priority classes. The scheduling algorithm is as follows: as long as there are runnable
processes in priority class 4, just run each one for one quantum, round-robin fashion, and never
bother with lower-priority classes. If priority class 4 is empty, then run the class 3 processes
round robin. If classes 4 and 3 are both empty, then run class 2 round robin, and so on. If priorities
are not adjusted occasionally, lower priority classes may all starve to death.

2.4.4. REAL-TIME SYSTEMS


In systems with real-time constraints, preemption is, oddly enough, sometimes not needed
because the processes know that they may not run for long periods of time and usually do
their work and block quickly. The difference with interactive systems is that real-time
systems run only programs that are intended to further the application at hand. Interactive
systems are general purpose and may run arbitrary programs that are not cooperative or even
malicious. So, the goals can be:
 Real-time systems
 Meeting deadlines - avoid losing data
 Predictability - avoid quality degradation in multimedia systems
Real-time systems have different properties than interactive systems, and thus different
scheduling goals. They are characterized by having deadlines that must or at least should be met.
For example, if a computer is controlling a device that produces data at a regular rate, failure to
run the data-collection process on time may result in lost data. Thus, the foremost need in a real-
time system is meeting all (or most) deadlines.
In some real-time systems, especially those involving multimedia, predictability is important.
Missing an occasional deadline is not fatal, but if the audio process runs too erratically, the sound
quality will deteriorate rapidly. Video is also an issue, but the ear is much more sensitive to jitter
than the eye. To avoid this problem, process scheduling must be highly predictable and regular.
2.4.5. Policy Versus Mechanism

Up until now, we have tacitly assumed that all the processes in the system belong to different users and are
thus competing for the CPU. While this is often true, sometimes it happens that one process has many
children running under its control. For example, a database-management-system process may have many
children running under its control. For example, a database-management-system process may have many
children. Each child might be working on a different request, or each might have some specific function to
perform (query parsing, disk access, etc.). It is entirely possible that the main process has an excellent idea of
which of its children are the most important (or time critical) and which the least. Unfortunately, none of the
schedulers discussed above accept any input from user processes about scheduling decisions. As a result, the
scheduler rarely makes the best choice.
The solution to this problem is to separate the scheduling mechanism from the scheduling policy, a long-
established principle. What this means is that the scheduling algorithm is parameterized in some way, but the
parameters can be filled in by user processes. Let us consider the database example once again. Suppose that
the kernel uses a priority-scheduling algorithm but provides a system call by which a process can set (and
change) the priorities of its children. In this way, the parent can control how its children are scheduled, even
though it itself does not do the scheduling. Here the mechanism is in the kernel but policy is set by a user
process. Policy-mechanism separation is a key idea.

2.5. CLASSICAL IPC PROBLEMS

The operating systems literature is full of interesting problems that have been widely discussed and
analyzed using a variety of synchronization methods. In the following sections we will examine
three of the better-known problems.
2.5.1. The Dining Philosophers Problem
In 1965, Dijkstra posed and then solved a synchronization problem he called the dining philosophers
problem. Since that time, everyone inventing yet another synchronization primitive has felt
obligated to demonstrate how wonderful the new primitive is by showing how elegantly it solves
the dining philosopher’s problem. The problem can be stated quite simply as follows. Five
philosophers are seated around a circular table. Each philosopher has a plate of spaghetti. The
spaghetti is so slippery that a philosopher needs two forks to eat it. Between each pair of plates is
one fork.

Figure 2.10
The life of a philosopher consists of alternating periods of eating and thinking. (This is something
of an abstraction, even for philosophers, but the other activities are irrelevant here.) When a
philosopher gets sufficiently hungry, she tries to acquire her left and right forks, one at a time, in
either order. If successful in acquiring two forks, she eats for a while, then puts down the forks,
and continues to think. The key question is: Can you write a program for each philosopher that does
what it is supposed to do and never gets stuck? (It has been pointed out that the two-fork requirement
is somewhat artificial; perhaps we should switch from Italian food to Chinese food, substituting rice
for spaghetti and chopsticks for forks.) Figure 2.11 shows the obvious solution. The procedure take_fork
waits until the specified fork is available and then seizes it. Unfortunately, the obvious solution
is wrong. Suppose that all five philosophers take their left forks simultaneously. None will be able
to take their right forks, and there will be a deadlock.
We could easily modify the program so that after taking the left fork, the program checks to see if
the right fork is available. If it is not, the philosopher puts down the left one, waits for some time,
and then repeats the whole process. This proposal too fails, although for a different reason. With a
little bit of bad luck, all the philosophers could start the algorithm simultaneously, picking up their
left forks, seeing that their right forks were not available, putting down their left forks, waiting,
picking up their left forks again simultaneously, and so on, forever.

#define N 5                         /* number of philosophers */

void philosopher(int i)             /* i: philosopher number, from 0 to 4 */
{
    while (TRUE) {
        think();                    /* philosopher is thinking */
        take_fork(i);               /* take left fork */
        take_fork((i+1) % N);       /* take right fork; % is modulo operator */
        eat();                      /* yum-yum, spaghetti */
        put_fork(i);                /* put left fork back on the table */
        put_fork((i+1) % N);        /* put right fork back on the table */
    }
}

Figure 2.11. A nonsolution to the dining philosopher’s problem.

The solution presented in Figure 1.12 is deadlock-free and allows the maximum parallelism for an
arbitrary number of philosophers. It uses an array, state, to keep track of whether a philosopher is
eating, thinking, or hungry (trying to acquire forks). A philosopher may move into eating state only
if neither neighbor is eating. Philosopher i's neighbors are defined by the macros LEFT and
RIGHT. In other words, if i is 2, LEFT is 1 and RIGHT is 3.
The program uses an array of semaphores, one per philosopher, so hungry philosophers can block
if the needed forks are busy. Note that each process runs the procedure philosopher as its main
code, but the other procedures, take_forks, put_forks, and test, are ordinary procedures and not
separate processes.

#define N         5          /* number of philosophers */
#define LEFT      (i+N-1)%N  /* number of i's left neighbor */
#define RIGHT     (i+1)%N    /* number of i's right neighbor */
#define THINKING  0          /* philosopher is thinking */
#define HUNGRY    1          /* philosopher is trying to get forks */
#define EATING    2          /* philosopher is eating */

typedef int semaphore;       /* semaphores are a special kind of int */
int state[N];                /* array to keep track of everyone's state */
semaphore mutex = 1;         /* mutual exclusion for critical regions */
semaphore s[N];              /* one semaphore per philosopher */

void philosopher(int i)      /* i: philosopher number, from 0 to N-1 */
{
    while (TRUE) {           /* repeat forever */
        think();             /* philosopher is thinking */
        take_forks(i);       /* acquire two forks or block */
        eat();               /* yum-yum, spaghetti */
        put_forks(i);        /* put both forks back on table */
    }
}

void take_forks(int i)       /* i: philosopher number, from 0 to N-1 */
{
    down(&mutex);            /* enter critical region */
    state[i] = HUNGRY;       /* record fact that philosopher i is hungry */
    test(i);                 /* try to acquire 2 forks */
    up(&mutex);              /* exit critical region */
    down(&s[i]);             /* block if forks were not acquired */
}

void put_forks(int i)        /* i: philosopher number, from 0 to N-1 */
{
    down(&mutex);            /* enter critical region */
    state[i] = THINKING;     /* philosopher has finished eating */
    test(LEFT);              /* see if left neighbor can now eat */
    test(RIGHT);             /* see if right neighbor can now eat */
    up(&mutex);              /* exit critical region */
}

void test(int i)             /* i: philosopher number, from 0 to N-1 */
{
    if (state[i] == HUNGRY && state[LEFT] != EATING && state[RIGHT] != EATING) {
        state[i] = EATING;
        up(&s[i]);
    }
}
Figure 2.12 A solution to the dining philosopher’s problem


2.5.2. The Readers and Writers Problem
The dining philosopher’s problem is useful for modeling processes that are competing for exclusive access
to a limited number of resources, such as I/O devices. Another famous problem is the readers and writers
problem which models access to a database. Imagine, for example, an airline reservation system, with many
competing processes wishing to read and write it. It is acceptable to have multiple processes reading the
database at the same time, but if one process is updating (writing) the database, no other processes may have
access to the database, not even readers.
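A classic semaphore-based sketch of this problem (one that favors readers) is shown below; down() and up() are the semaphore operations used earlier in this chapter, and read_data_base(), write_data_base(), use_data_read() and think_up_data() are placeholder routines.

#define TRUE 1
typedef int semaphore;            /* semaphores are a special kind of int */
semaphore mutex = 1;              /* controls access to rc */
semaphore db = 1;                 /* controls access to the database */
int rc = 0;                       /* number of processes reading or wanting to */

extern void down(semaphore *s);   /* wait() operation */
extern void up(semaphore *s);     /* signal() operation */
extern void read_data_base(void), write_data_base(void);   /* placeholder routines */
extern void use_data_read(void), think_up_data(void);      /* placeholder routines */

void reader(void)
{
    while (TRUE) {
        down(&mutex);             /* get exclusive access to rc */
        rc = rc + 1;              /* one reader more now */
        if (rc == 1) down(&db);   /* if this is the first reader, lock out writers */
        up(&mutex);               /* release exclusive access to rc */
        read_data_base();         /* access the data */
        down(&mutex);
        rc = rc - 1;              /* one reader fewer now */
        if (rc == 0) up(&db);     /* if this is the last reader, let writers in */
        up(&mutex);
        use_data_read();          /* noncritical region */
    }
}

void writer(void)
{
    while (TRUE) {
        think_up_data();          /* noncritical region */
        down(&db);                /* get exclusive access to the database */
        write_data_base();        /* update the data */
        up(&db);                  /* release exclusive access */
    }
}

Note that with this scheme a steady stream of readers can starve the writers; variants that favor writers simply block newly arriving readers once a writer is waiting.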

2.5. Deadlock
Computer systems are full of resources that can be used only by one process at a time. Common examples
include printers, tape drives for backing up company data, and slots in the system’s internal tables. Having
two processes simultaneously writing to the printer leads to gibberish. Having two processes using the
same file-system table slot invariably will lead to a corrupted file system. Consequently, all operating
systems have the ability to (temporarily) grant a process exclusive access to certain resources.
For many applications, a process needs exclusive access to not one resource, but several. Suppose, for
example, two processes each want to record a scanned document on a Blu-ray disc. Process A requests
permission to use the scanner and is granted it. Process B is programmed differently and requests the Blu-
ray recorder first and is also granted it. Now A asks for the Blu-ray recorder, but the request is suspended
until B releases it. Unfortunately, instead of releasing the Blu-ray recorder, B asks for the scanner. At this
point both processes are blocked and will remain so forever. This situation is called a deadlock.
Deadlocks can also occur across machines. For example, many offices have a local area network with
many computers connected to it. Often devices such as scanners, Blu-ray/DVD recorders, printers, and
tape drives are connected to the network as shared resources, available to any user on any machine. If these
devices can be reserved remotely (i.e., from the user's home machine), deadlocks of the same kind can
occur as described above. More complicated situations can cause deadlocks involving three, four, or more
devices and users.

Deadlocks can also occur in a variety of other situations. In a database system, for example, a program
may have to lock several records it is using, to avoid race conditions. If process A locks record R1 and
process B locks record R2, and then each process tries to lock the other one’s record, we also have a
deadlock. Thus, deadlocks can occur on hardware resources or on software resources.
2.5.1. RESOURCES
A major class of deadlocks involves resources to which some process has been granted exclusive access.
These resources include devices, data records, files, and so forth. To make the discussion of deadlocks as
general as possible, we will refer to the objects granted as resources. A resource can be a hardware device
(e.g., a Blu-ray drive) or a piece of information (e.g., a record in a database). A computer will normally
have many different resources that a process can acquire. For some resources, several identical instances
may be available, such as three Blu-ray drives. When several copies of a resource are available, any one
of them can be used to satisfy any request for the resource. In short, a resource is anything that must be
acquired, used, and released over the course of time.
Resources come in two types: preemptable and nonpreemptable. A preemptable resource is one that
can be taken away from the process owning it with no ill effects. Memory is an example of a preemptable
resource. A nonpreemptable resource, in contrast, is one that cannot be taken away from its current
owner without potentially causing failure.
In general, deadlocks involve nonpreemptable resources. Potential deadlocks that involve preemptable
resources can usually be resolved by reallocating resources from one process to another. Thus, our
treatment will focus on nonpreemptable resources.
The abstract sequence of events required to use a resource is given below.

1. Request the resource.

2. Use the resource.

3. Release the resource.

2.5.2. INTRODUCTION TO DEADLOCKS


Deadlock can be defined formally as follows:

A set of processes is deadlocked if each process in the set is waiting for an event that only another
process in the set can cause.

Because all the processes are waiting, none of them will ever cause any event that could wake up any
of the other members of the set, and all the processes continue to wait forever. For this model, we
assume that processes are single threaded and that no interrupts are possible to wake up a blocked
process. The no-interrupts condition is needed to prevent an otherwise deadlocked process from being
awakened by an alarm, and then causing events that release other processes in the set.
In most cases, the event that each process is waiting for is the release of some resource currently
possessed by another member of the set. In other words, each member of the set of deadlocked
processes is waiting for a resource that is owned by a deadlocked process. None of the processes can
run, none of them can release any resources, and none of them can be awakened. The number of
processes and the number and kind of resources possessed and requested are unimportant. This
result holds for any kind of resource, including both hardware and software. This kind of deadlock is
called a resource deadlock. It is probably the most common kind, but it is not the only kind. We first
study resource deadlocks in detail and then at the end of the chapter return briefly to other kinds of
deadlocks.

2.5.2.1. Conditions for Resource Deadlocks

 Mutual exclusion condition. Each resource is either currently assigned to exactly one
process or is available.

 Hold-and-wait condition. Processes currently holding resources that were granted earlier
can request new resources.

 No-preemption condition. Resources previously granted cannot be forcibly taken away
from a process. They must be explicitly released by the process holding them.

 Circular wait condition. There must be a circular list of two or more processes, each of
which is waiting for a resource held by the next member of the chain.

All four of these conditions must be present for a resource deadlock to occur. If one of them is
absent, no resource deadlock is possible.
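The four conditions can be seen together in a tiny program. The sketch below is illustrative only (scanner and recorder are just lock names); it shows two threads that each hold one lock while waiting for the other, which is exactly the scanner/Blu-ray scenario described earlier:

import threading

scanner = threading.Lock()     # exclusive resources (mutual exclusion)
recorder = threading.Lock()

def process_a():
    with scanner:              # holds the scanner ...
        with recorder:         # ... while waiting for the recorder (hold and wait)
            pass

def process_b():
    with recorder:             # holds the recorder ...
        with scanner:          # ... while waiting for the scanner (circular wait)
            pass

# Locks cannot be preempted, so if each thread acquires its first lock before
# the other finishes, both block forever. Starting the two threads below may
# therefore hang the program, which is precisely the point:
# threading.Thread(target=process_a).start()
# threading.Thread(target=process_b).start()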
2.5.2.2. Deadlock Modeling

Holt (1972) showed how these four conditions can be modeled using directed graphs. The graphs have
two kinds of nodes: processes, shown as circles, and resources, shown as squares. A directed arc from
a resource node (square) to a process node (circle) means that the resource has previously been requested
by, granted to, and is currently held by that process. In Figure 2.13(a), resource R is currently assigned
to process A.
A directed arc from a process to a resource means that the process is currently blocked waiting for that
resource. In Figure 2.13(b), process B is waiting for resource S. In Figure 2.13(c) we see a deadlock:
process C is waiting for resource T, which is currently held by process D. Process D is not about to
release resource T because it is waiting for resource U, held by C. Both processes will wait forever.
A cycle in the graph means that there is a deadlock involving the processes and resources in the cycle
(assuming that there is one resource of each kind). In this example, the cycle is C-T-D-U-C.

Figure 2.13. Resource allocation graphs. (a) Holding a resource. (b) Requesting a
resource. (c) Deadlock.

Deadlock Strategies: In general, four strategies are used for dealing with deadlocks.
 Just ignore the problem. Maybe if you ignore it, it will ignore you.

 Detection and recovery. Let them occur, detect them, and take action.

 Dynamic avoidance by careful resource allocation.

 Prevention, by structurally negating one of the four conditions.

2.5.3. THE OSTRICH ALGORITHM


The simplest approach is the ostrich algorithm: stick your head in the sand and pretend there is no problem.
People react to this strategy in different ways. Mathematicians find it unacceptable and say that
deadlocks must be prevented at all costs. Engineers ask how often the problem is expected, how often
the system crashes for other reasons, and how serious a deadlock is. If deadlocks occur on the average once
every five years, but system crashes due to hardware failures and operating system bugs occur once a
week, most engineers would not be willing to pay a large penalty in performance or convenience to
eliminate deadlocks.
2.5.4. Deadlock Detection with One Resource of Each Type

A second technique is detection and recovery. When this technique is used, the system does not attempt
to prevent deadlocks from occurring. Instead, it lets them occur, tries to detect when this happens, and
then takes some action to recover. With only one resource of each type, detection amounts to building the
resource allocation graph described above and checking it for cycles: if a cycle exists, the processes and
resources on it are deadlocked. A small sketch of such a cycle check is given below.
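A minimal sketch of this cycle check in Python (the graph encoding is an assumption made for illustration: an edge R -> P means R is held by P, and P -> R means P is waiting for R):

# Depth-first search for a cycle in the resource-allocation graph.
def has_cycle(graph):
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:                    # back edge: a cycle exists
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))

# The deadlock of Figure 2.13(c): C waits for T, T is held by D,
# D waits for U, and U is held by C.
graph = {"C": ["T"], "T": ["D"], "D": ["U"], "U": ["C"]}
print(has_cycle(graph))   # True: the processes on the cycle are deadlocked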
2.5.5. Deadlock Detection with Multiple Resources of Each Type

When multiple copies of some of the resources exist, a different approach is needed to detect deadlocks.
We will now present a matrix-based algorithm for detecting deadlock among n processes, P1 through Pn.
Let the number of resource classes be m, with E1 resources of class 1, E2 resources of class 2, and
generally, Ei resources of class i (1 <= i <= m). E is the existing resource vector. It gives the total number
of instances of each resource in existence. For example, if class 1 is tape drives, then E1 = 2 means the
system has two tape drives.
At any instant, some of the resources are assigned and are not available. Let A be the available resource
vector, with Ai giving the number of instances of resource i that are currently available (i.e., unassigned).
If both of our two tape drives are assigned, A1 will be 0.
Now we need two arrays, C, the current allocation matrix, and R, the request matrix. The ith row of
C tells how many instances of each resource class Pi currently holds. Thus, Cij is the number of instances
of resource j that are held by process i. Similarly, Rij is the number of instances of resource j that Pi wants.
Figure 2.14. The four data structures needed by the deadlock detection algorithm
The deadlock detection algorithm is based on comparing vectors. Let us define the relation A ≤ B on two
vectors A and B to mean that each element of A is less than or equal to the corresponding element of B.
Mathematically, A ≤ B holds if and only if Ai ≤ Bi for 1 ≤ i ≤ m.
As an example of how the deadlock detection algorithm works, consider a system with three processes
and four resource classes, which we have arbitrarily labeled tape drives, plotters, scanners, and Blu-ray
drives. Process 1 has one scanner. Process 2 has two tape drives and a Blu-ray drive. Process 3 has a
plotter and two scanners. Each process needs additional resources, as shown by the R matrix.
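A sketch of this matrix-based detection in Python follows. The algorithm repeatedly looks for an unfinished process whose entire request row can be satisfied from the available vector, pretends it runs to completion and returns its resources, and finally reports every process that could never be satisfied. The holdings follow the example above; the request matrix and availability vector are assumed values for illustration:

def detect_deadlock(A, C, R):
    n = len(C)                              # number of processes
    m = len(A)                              # number of resource classes
    available = list(A)
    finished = [False] * n
    progress = True
    while progress:
        progress = False
        for i in range(n):
            if not finished[i] and all(R[i][j] <= available[j] for j in range(m)):
                # Process i can get everything it asked for, so it can run to
                # completion and give back what it currently holds.
                for j in range(m):
                    available[j] += C[i][j]
                finished[i] = True
                progress = True
    return [i + 1 for i in range(n) if not finished[i]]   # deadlocked processes

C = [[0, 0, 1, 0],        # P1 holds one scanner
     [2, 0, 0, 1],        # P2 holds two tape drives and a Blu-ray drive
     [0, 1, 2, 0]]        # P3 holds a plotter and two scanners
R = [[2, 0, 0, 1],        # what each process still wants (assumed values)
     [1, 0, 1, 0],
     [2, 1, 0, 0]]
A = [2, 1, 0, 0]          # assumed availability vector
print(detect_deadlock(A, C, R))   # [] means no process is deadlocked here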

2.5.6. Recovery from Deadlock

Suppose that our deadlock detection algorithm has succeeded and detected a deadlock. What next? Some
way is needed to recover and get the system going again. In this section we will discuss various ways of
recovering from deadlock. None of them are especially attractive, however.
 Recovery through Preemption
 Recovery through Rollback

 Recovery through Killing Processes


2.5.7. DEADLOCK AVOIDANCE

In the discussion of deadlock detection, we tacitly assumed that when a process asks for resources, it asks
for them all at once. In most systems, however, resources are requested one at a time. The system must be
able to decide whether granting a resource is safe or not and make the allocation only when it is safe. Thus,
the question arises: Is there an algorithm that can always avoid deadlock by making the right choice all the
time? The answer is a qualified yes: we can avoid deadlocks, but only if certain information is available in
advance. In this section we examine ways to avoid deadlock by careful resource allocation; a sketch of the
Banker’s safety check is given after the list below.

 Resource Trajectories
 Safe and Unsafe States
 The Banker’s Algorithm for a Single Resource
 The Banker’s Algorithm for Multiple Resources

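As an illustration, the safety check at the heart of the Banker’s algorithm for multiple resources can be sketched as follows. The vectors and matrices below are an assumed example, not taken from this module: the state is safe if some ordering exists in which every process can obtain its remaining need, finish, and return what it holds.

def is_safe(available, allocation, need):
    work = list(available)
    finished = [False] * len(allocation)
    while True:
        for i, done in enumerate(finished):
            if not done and all(n <= w for n, w in zip(need[i], work)):
                # Pretend process i runs to completion and returns its resources.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                break
        else:
            return all(finished)   # safe only if every process could finish

# The banker grants a request only if the state after granting it is still safe.
available  = [3, 3, 2]
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
need       = [[7, 4, 3], [1, 2, 2], [6, 0, 0], [0, 1, 1], [4, 3, 1]]
print(is_safe(available, allocation, need))   # True: e.g. P1, P3, P4, P2, P0 can finish in order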
2.5.8. DEADLOCK PREVENTION

Having seen that deadlock avoidance is essentially impossible, because it requires information about
future requests, which is not known, how do real systems avoid deadlock? The answer is to go back to the
four conditions stated by Coffman to see if they can provide a clue. If we can ensure that at least one of
these conditions is never satisfied, then deadlocks will be structurally impossible. A sketch of one such
approach, imposing a global ordering on resources, is given after the list below.

 Attacking the Mutual-Exclusion Condition


 Attacking the Hold-and-Wait Condition
 Attacking the No-Preemption Condition
 Attacking the Circular Wait Condition
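One common way of attacking the circular wait condition is to impose a global numerical ordering on all resources and require every process to acquire them in increasing order. A minimal Python sketch of the idea (the resource names and numbering are illustrative assumptions, not a real API):

import threading

ORDER = {"scanner": 1, "blu_ray_recorder": 2}          # global resource numbering
LOCKS = {name: threading.Lock() for name in ORDER}

def acquire_in_order(*names):
    # Acquiring in increasing order makes a circular chain of waits impossible.
    for name in sorted(names, key=ORDER.get):
        LOCKS[name].acquire()

def release_all(*names):
    for name in names:
        LOCKS[name].release()

# Both processes now take the scanner before the Blu-ray recorder, whichever
# order they name the resources in, so the earlier deadlock cannot arise.
acquire_in_order("blu_ray_recorder", "scanner")
release_all("scanner", "blu_ray_recorder")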
Chapter Three
Memory Management
3. Introduction to Memory Management
Memory management is the functionality of an operating system which handles or manages primary
memory and moves processes back and forth between main memory and disk during execution.
Memory management keeps track of each and every memory location, regardless of whether it is
allocated to some process or free. It determines how much memory is to be allocated to processes and
decides which process will get memory at what time. It tracks whenever some memory gets freed
or unallocated and correspondingly updates the status.
Over the years, people discovered the concept of a memory hierarchy, in which computers have a
few megabytes of very fast, expensive, volatile cache memory, a few gigabytes of medium-speed,
medium-priced, volatile main memory, and a few terabytes of slow, cheap, nonvolatile magnetic or
solid-state disk storage, not to mention removable storage, such as DVDs and USB sticks. It is the
job of the operating system to abstract this hierarchy into a useful model and then manage the
abstraction.

The part of the operating system that manages (part of) the memory hierarchy is called the memory
manager. Its job is to efficiently manage memory: keep track of which parts of memory are in use,
allocate memory to processes when they need it, and de-allocate it when they are done.

3.1. No memory abstraction


What is no memory abstraction?

The simplest memory abstraction is no abstraction at all. Early mainframe computers (before 1960),
early minicomputers (before 1970), and early personal computers (before 1980) had no memory
abstraction. Every program simply saw the physical memory. When a program executed an
instruction like MOV REGISTER1, 1000, the computer just moved the contents of physical memory
location 1000 to REGISTER1. In this way the model of memory presented to the programmer was
just physical memory, a set of addresses from 0 to some maximum, each address
corresponding to a cell containing some number of bits, commonly eight.
Under these conditions, it was not possible to have two running programs in memory at the same
time. If the first program wrote a new value to, say, location 2000, this would erase whatever
value the second program was storing there. Nothing would work and
both programs would crash almost immediately.
Even with the model of memory being just physical memory, many options are possible. Three
variations are demonstrated in Figure 3.1. The operating system may be at the bottom of memory
in RAM (Random Access Memory), as shown in Figure 3.1(a), or it may be in ROM (Read-Only
Memory) at the top of memory, as shown in Figure 3.1(b), or the device drivers may be at the top
of memory in a ROM and the rest of the system in RAM down below, as shown in Figure 3.1(c).
The first model was formerly used on mainframes and minicomputers but is rarely used any more.
The second model is used on some handheld computers and embedded systems. The third model
was used by early personal computers (e.g., running MS- DOS), where the portion of the system
in the ROM is called the BIOS (Basic Input Output System). Models (a) and (c) have the
disadvantage that a bug in the user program can wipe out the operating system, maybe with
disastrous results (such as garbling the disk). When the system is organized in this way, usually
only one process at a time can be running. As soon as the user types a command, the operating
system copies the requested program from disk to memory and executes it. When the process
finishes, the operating system displays a prompt character and waits for a new command. When
it receives the command, it loads a new program into memory, overwriting the first one.

Figure 3.1. Three ways of organizing memory with an operating system and one user process. Other
possibilities also exist.

One way to get some parallelism in a system with no memory abstraction is to program with
multiple threads. Since all threads in a process are supposed to see the same memory image, the
fact that they are forced to is not a problem. While this idea works, it is of limited use since what
people often want is unrelated programs to be running at the same time, something the threads
abstraction does not provide. Furthermore, any system that is so primitive as to provide no
memory abstraction is unlikely to provide a threads abstraction.

Running Multiple Programs without a Memory Abstraction

Nevertheless, even with no memory abstraction, it is possible to run multiple programs at the same
time. What the operating system has to do is save the entire contents of memory to a disk file, then
bring in and run the next program. As long as there is only one program at a time in memory, there
are no conflicts.
3.2. A MEMORY ABSTRACTION: ADDRESS SPACES
Two problems must be solved to allow various applications to be in memory at the same time
without their interfering with each other: protection and relocation. A better solution is to invent a
new abstraction for memory: the address space. Just as the process concept creates a kind of abstract
CPU to run programs, the address space creates a kind of abstract memory for programs to live in.
An address space is the set of addresses that a process can use to address memory. Each process has
its own address space, independent of those belonging to other processes (except in some special
circumstances where processes want to share their address space).

Base and Limit Registers

This simple solution uses a particularly simple version of dynamic relocation. What it does is
map each process address space onto a different part of physical memory in a simple way.

3.3. Overlay, swapping and Partition

Overlays in Memory Management

Memory Management is the operating system function that assigns and manages the computer's
primary memory. The capacity of the computer's memory was one of the key constraints imposed on
programmers in the early days of computers. The application could not be loaded if it was larger than
the available memory, severely limiting program size. The earliest and most fundamental mechanism
for storing many processes in main memory is the Fixed Partitioning Technique. The underlying
problem with fixed partitioning is that the size of a process is restricted by the maximum partition
size, which implies that a process can never span more than one partition. The obvious solution would be
expanding the available RAM, but this would significantly raise the cost of the computer system. So,
earlier designers employed a method known as Overlays to tackle this problem.

Overlays in Memory Management operate on the premise that a running process does not need its
complete program in memory at the same time, but only a subset of it.

Then, the overlaying concept is that you load whatever component you want, and when the section
is completed, you unload it, which means you pull it back and get the new part you require and
execute it.

Overlaying is defined as "the process of inserting a block of computer code or other data into
internal memory, replacing what is already there."

It is a method that permits applications to be larger than the primary memory.

Because of the limitations of physical memory (internal memory for a system-on-chip) and the
absence of virtual memory features, overlaying is generally employed by embedded systems.

Overlays Driver: Overlaying is completely user-dependent. Even what component is necessary for
the first pass should be written by the user.


Swapping

If the physical memory of the computer is large enough to hold all the processes, the schemes
explained so far will more or less do. But in practice, the total amount of RAM required by all the
processes is often much more than can fit in memory. On a typical Windows or Linux system,
something like 40-60 processes or more may be started up when the computer is booted. Two
general approaches to dealing with memory overload have been developed over the years. The
simplest strategy, called swapping, consists of bringing in each process in its entirety, running it for
a while, and then putting it back on the disk. Idle processes are mostly stored on disk, so they do not
take up any memory when they are not running (although some of them wake up periodically to do
their work, then go to sleep again). The other strategy, called virtual memory, allows programs to run
even when they are only partially in main memory. Below we will examine swapping; virtual memory is
discussed in Section 3.6.

The operation of a swapping system is shown in Figure 3.2. In the beginning, only process A is in
memory. Then processes B and C are created or swapped in from disk. In Figure 3.2(d) A is swapped
out to disk. Then D comes in and B goes out. In the end A comes in again. Since A is now at a
different location, addresses contained in it must be relocated, either by software when it is swapped
in or (more likely) by hardware during program execution. For instance, base and limit registers
would work fine here.

Figure 3.2. Memory allocation changes as processes come into memory and leave it.
The shaded regions are unused memory.

When swapping makes multiple holes in memory, it is possible to merge them all into one big one by
moving all the processes downward as far as possible. This technique is known as memory
compaction. It is normally not done because it requires a lot of CPU time. For example, on a 1-GB
machine that can copy 4 bytes in 20 nsec, it would take about 5 sec to compact all of memory. A
point that is worth making concerns how much memory should be allocated for a process when
it is created or swapped in. If processes are created with a fixed size that never changes, then the
allocation is simple: the operating system allocates exactly what is required, no more and no less.
If it is expected that most processes will grow as they run, it is perhaps a good idea to allocate a little
extra memory whenever a process is swapped in or moved, to reduce the overhead associated with
moving or swapping processes that no longer fit in their allocated memory. However, when swapping
processes to disk, only the memory actually in use should be swapped; it is wasteful to swap the extra
memory as well. In Figure 3.3(a) we see a memory configuration in which space for growth has been
allocated to two processes.

Figure 3.3. (a) Allocating space for a growing data segment. (b) Allocating space for a growing
stack and a growing data segment.

A partition is a logical division of a hard disk that is treated as a separate unit by operating systems
(OSes) and file systems. The OSes and file systems can manage information on each partition as if it
were a distinct hard drive. This allows the drive to operate as several smaller sections to improve
efficiency, although it reduces usable space on the hard disk because of additional overhead from
multiple OSes.

A disk partition manager allows system administrators to create, resize, delete and manipulate
partitions, while a partition table logs the location and size of the partition. Each partition appears
to the OS as a distinct logical disk, and the OS reads the partition table before any other part of the
disk.

Once a partition is created, it is formatted with a file system such as:


 NTFS on Windows drives;

 FAT32 and exFAT for removable drives;

 HFS Plus (HFS+) on Mac computers; or

 Ext4 on Linux.

Data and files are then written to the file system on the partition. When users boot the OS in a
computer, a critical part of the process is to give control to the first sector on the hard disk. This
includes the partition table that defines how many partitions will be formatted on the hard disk, the
size of each partition and the address where each disk partition begins. The sector also contains a
program that reads the boot sector for the OS and gives it control so that the rest of the OS can be
loaded into random access memory.

3.4. Managing Free Memory

When memory is allocated dynamically, the operating system must manage it. Generally, there are
two methods to keep track of memory usage: bitmaps and free lists. In the following subsections we
will study these two methods.

Memory Management with Bitmaps

With a bitmap, memory is divided into allocation units as small as a few words and as large as several
kilobytes. Corresponding to each allocation unit is a bit in the bitmap, which is 0 if the unit is free
and 1 if it is occupied (or vice versa). Figure 3.4 shows part of memory and the corresponding
bitmap; a small allocation sketch follows the figure.
Figure 3.4. (a) A part of memory with five processes and three holes. The tick marks show
the memory allocation units. The shaded regions (0 in the bitmap) are free. (b) The
corresponding bitmap. (c) The same information as a list.
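To make the bitmap idea concrete, here is a small Python sketch of how the memory manager might search the bitmap for a process needing k consecutive allocation units (the bitmap contents are made up for illustration):

def find_free_run(bitmap, k):
    # 0 marks a free allocation unit, 1 an occupied one.
    run_start, run_len = None, 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == k:
                return run_start          # index of the first unit of the hole
        else:
            run_len = 0
    return None                           # no hole big enough

bitmap = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1]
start = find_free_run(bitmap, 3)
if start is not None:
    for i in range(start, start + 3):     # mark the chosen units as allocated
        bitmap[i] = 1
print(start, bitmap)                      # 7 [1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1]

Searching the bitmap for a run of a given length is the slow part of this scheme, which is why free lists are often preferred for allocation.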

Memory Management with Linked Lists

A different way of keeping track of memory is to maintain a linked list of allocated and free memory
segments, where a segment either contains a process or is an empty hole between two processes. The
memory of Figure 3.4(a) is represented in Figure 3.4(c) as a linked list of segments. Each entry in
the list specifies a hole (H) or process (P), the address at which it starts, the length, and a pointer
to the next entry.

Figure 3.5. Four neighbor combinations for the terminating process X.

With suitable examples, we now discuss the First fit, Best fit, and Worst fit strategies for memory
allocation; a comparative sketch follows the descriptions below.

First Fit
The first fit approach is to allocate the first free partition or hole large enough to
accommodate the process. The search finishes after finding the first suitable free partition.
Advantage: - Fastest algorithm because it searches as little as possible.
Disadvantage: - The remaining unused memory areas left after allocation are wasted if they are too
small; thus, a later request for a larger amount of memory cannot be satisfied from them.
Best Fit
The best fit approach allocates the smallest free partition which meets the requirement of the
requesting process. This algorithm first searches the entire list of free partitions and considers the
smallest hole that is adequate. It then tries to find a hole which is close to the actual process size needed.
Advantage: - Memory utilization is much better than first fit, as it searches for the smallest
available free partition.
Disadvantage: - It is slower and may even tend to fill up memory with tiny useless holes.
Worst fit
The worst fit approach is to allocate the largest available free portion, so that the portion left over will be
big enough to be useful. It is the reverse of best fit.
Advantage: - Reduces the rate of production of small gaps.
Disadvantage: - If a process requiring a larger amount of memory arrives at a later stage, it cannot
be accommodated, as the largest hole has already been split and occupied.
Next fit: - Next fit is a modified version of first fit. It begins as first fit to find a free partition. When
called the next time, it starts searching from where it left off, not from the beginning.
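A comparative sketch of these placement strategies in Python is given below. Free holes are represented as (start, size) pairs and each function simply returns the hole it would choose; the free list itself is an illustrative assumption:

def first_fit(holes, size):
    return next((h for h in holes if h[1] >= size), None)

def best_fit(holes, size):
    fitting = [h for h in holes if h[1] >= size]
    return min(fitting, key=lambda h: h[1], default=None)   # smallest adequate hole

def worst_fit(holes, size):
    fitting = [h for h in holes if h[1] >= size]
    return max(fitting, key=lambda h: h[1], default=None)   # largest hole

def next_fit(holes, size, last_index):
    # Like first fit, but the search resumes where the previous search stopped.
    n = len(holes)
    for offset in range(n):
        i = (last_index + offset) % n
        if holes[i][1] >= size:
            return holes[i]
    return None

holes = [(0, 5), (14, 12), (30, 20), (60, 8)]
print(first_fit(holes, 10))    # (14, 12): first hole that is large enough
print(best_fit(holes, 10))     # (14, 12): smallest hole that fits
print(worst_fit(holes, 10))    # (30, 20): largest hole
print(next_fit(holes, 10, 2))  # (30, 20): search starts at index 2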
3.5. Contiguous memory allocation
The main memory must accommodate both the operating system and the various user processes.
We therefore need to allocate different parts of the main memory in the most efficient way
possible.

The memory is usually divided into two partitions: one for the resident operating system, and one
for the user processes. We may place the operating system in either low memory or high memory.
With this approach each process is contained in a single contiguous section of memory.

One of the simplest methods for memory allocation is to divide memory into several fixed-sized
partitions. Each partition may contain exactly one process. In this multiple- partition method, when
a partition is free, a process is selected from the input queue and is loaded into the free partition.
When the process terminates, the partition becomes available for another process. The operating
system keeps a table indicating which parts of memory are available and which are occupied.
Finally, when a process arrives and needs memory, a memory section large enough for this process
is provided.
The contiguous memory allocation scheme can be implemented in operating systems with the help
of two registers, known as the base and limit registers. When a process is executing in main
memory, its base register contains the starting address of the memory location where the process
is executing, while the amount of bytes consumed by the process is stored in the limit register. A
process does not directly refer to the actual address for a corresponding memory location. Instead,
it uses a relative address with respect to its base register. All addresses referred by a program are
considered as virtual addresses. The CPU generates the logical or virtual address, which is
converted into an actual address with the help of the memory management unit (MMU). The base
address register is used for address translation by the MMU. Thus, a physical address is calculated
as follows:

Physical Address = Base register address + Logical address/Virtual address

The address of any memory location referenced by a process is checked to ensure that it does not
refer to an address of a neighboring process. This processing security is handled by the underlying
operating system.
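A minimal sketch of this translation and protection check, directly following the formula above (the register values are made-up examples):

def translate(logical_address, base, limit):
    # The limit check is what keeps a process inside its own partition.
    if logical_address < 0 or logical_address >= limit:
        raise MemoryError("addressing error: trap to the operating system")
    return base + logical_address        # physical address

print(translate(100, base=30000, limit=12000))    # 30100
print(translate(500, base=42000, limit=2000))     # 42500
# translate(3000, base=42000, limit=2000) would raise: outside this process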

One disadvantage of contiguous memory allocation is that the degree of multiprogramming is
reduced due to processes waiting for free memory.

3.6. Virtual Memory

What is Virtual Memory? Give its advantages and Disadvantages.

A computer can address more memory than the amount physically installed on the system. This extra
memory is actually called virtual memory and it is a section of a hard disk that's set up to emulate the
computer's RAM.
The main visible advantage of this scheme is that programs can be larger than physical memory.
Virtual memory serves two purposes. First, it allows us to extend the use of physical
memory by using disk. Second, it allows us to have memory protection, because each virtual
address is translated to a physical address. Following are the situations, when entire program is
not required to be loaded fully in main memory.
 User written error handling routines are used only when an error occurred in the data or
computation.
 Certain options and features of a program may be used rarely.
 Many tables are assigned a fixed amount of address space even though only a small
amount of the table is actually used.
 The ability to execute a program that is only partially in memory would confer many
benefits.
 Fewer I/O operations would be needed to load or swap each user program into memory.
 A program would no longer be constrained by the amount of physical memory that is
available.
 Each user program could take less physical memory; more programs could be run at the
same time, with a corresponding increase in CPU utilization and throughput.

In real scenarios, most processes never need all their pages at once, for following reasons :

 Error handling code is not needed unless that specific error occurs, some of which are
quite rare.
 Arrays are often over-sized for worst-case scenarios, and only a small fraction of the
arrays are actually used in practice.
 Certain features of certain programs are rarely used.

Benefits of having Virtual Memory:


1. Large programs can be written, as virtual space available is huge compared to
physical memory.
2. Less I/O required, leads to faster and easy swapping of processes.
3. More physical memory available, as programs are stored on virtual memory, so they
occupy very little space in actual physical memory.

Virtual memory is a feature of an operating system (OS) that allows a computer to compensate
for shortages of physical memory by temporarily transferring pages of data from random access
memory (RAM) to disk storage. Eventually, the OS will need to retrieve the data that was
moved temporarily to disk storage -- but remember, the only reason the OS moved pages of
data from RAM to disk storage to begin with was because it was running out of RAM. To solve
the problem, the operating system will need to move other pages to hard disk so it has room to
bring back the pages it needs right away from temporary disk storage. This process is known
as paging or swapping and the temporary storage space on the hard disk is called a page file or
a swap file.
Swapping, which happens so quickly that the end user doesn't know it's happening, is carried
out with the help of the computer’s memory management unit (MMU). The operating system may use one
of several algorithms to choose which page should be swapped out, including Least Recently
Used (LRU), Least Frequently Used (LFU) or Most Recently Used (MRU).
In a virtualized computing environment, administrators can use virtual memory management
techniques to allocate additional memory to a virtual machine (VM) that has run out of
resources. Such virtualization management tactics can improve VM performance and
management flexibility.

What is the difference between a physical address and a virtual address?


Logical address is the address generated by the CPU (from the perspective of a program that is
running) whereas physical address (or the real address) is the address seen by the memory unit
and it allows the data bus to access a particular memory cell in the main memory. All logical
addresses need to be mapped into physical addresses by the MMU before main memory can be accessed.
Physical and logical addresses are the same when using compile-time or load-time address binding,
but they differ when using execution-time address binding.
3.6.1. Paging
Paging is a memory management technique in which the memory is divided into fixed size
pages. Paging is used for faster access to data. When a program needs a page, it is available in
the main memory as the OS copies a certain number of pages from your storage device to main
memory. Paging allows the physical address space of a process to be noncontiguous.

The OS performs an operation for storing and retrieving data from secondary storage devices for
use in main memory; paging is one such memory management scheme. Data is retrieved from
storage media by the OS in same-sized blocks called pages. Paging allows the physical
address space of the process to be non-contiguous, whereas previously the whole program had to fit
into storage contiguously.

Paging deals with the external fragmentation problem by allowing the logical address space
of a process to be noncontiguous, so that the process can be allocated physical memory wherever it
is available.

Paging is a method of writing data to, and reading it from, secondary storage for use in primary
storage, also known as main memory. Paging plays a role in memory management for a
computer's OS (operating system).

In a memory management system that takes advantage of paging, the OS reads data from
secondary storage in blocks called pages, all of which have identical size. The physical region of
main memory that holds a single page is called a frame. When paging is used, a process does not have
to occupy a single physically contiguous region of memory. This approach offers an
advantage over earlier memory management methods, because it facilitates more efficient and
faster use of storage.

Demand Paging
A demand paging system is quite similar to a paging system with swapping where processes
reside in secondary memory and pages are loaded only on demand, not in advance. When a
context switch occurs, the operating system does not copy any of the old program’s pages out to
the disk or any of the new program’s pages into the main memory. Instead, it just begins executing
the new program after loading the first page and fetches that program’s pages as they are
referenced.
Figure 3.6. Swapping
While executing a program, if the program references a page which is not available in the main
memory because it was swapped out a little while ago, the processor treats this invalid memory
reference as a page fault and transfers control from the program to the operating system to
demand the page back into the memory.

Following are the advantages of Demand Paging −

 Large virtual memory.


 More efficient use of memory.
 There is no limit on degree of multiprogramming.

Disadvantages: - Number of tables and the amount of processor overhead for handling page
interrupts are greater than in the case of the simple paged management techniques.

3.6.2. Page Table:


The virtual page number is used as an index into the page table to find the entry for that virtual
page. From the page table entry, the page frame number (if any) is found. The page frame number
is attached to the high-order end of the offset, replacing the virtual page number, to form a physical
address that can be sent to the memory.
Thus the purpose of the page table is to map virtual pages onto page frames. Mathematically
speaking, the page table is a function, with the virtual page number as argument and the physical
frame number as result. Using the result of this function, the virtual page field in a virtual address
can be replaced by a page frame field, thus forming a physical memory address.
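The mapping can be sketched in a few lines of Python. The page size, page table contents, and the use of a dictionary are assumptions made purely for illustration; real MMUs do this in hardware:

PAGE_SIZE = 4096                      # assume 4-KB pages, i.e. a 12-bit offset

def translate(virtual_address, page_table):
    page_number = virtual_address // PAGE_SIZE
    offset = virtual_address % PAGE_SIZE
    frame = page_table.get(page_number)
    if frame is None:                 # present/absent bit is 0
        raise RuntimeError("page fault: virtual page %d not in memory" % page_number)
    return frame * PAGE_SIZE + offset # physical address

page_table = {0: 2, 1: 5, 2: None}    # virtual page -> page frame (None = absent)
print(translate(4196, page_table))    # page 1, offset 100 -> 5 * 4096 + 100 = 20580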

Structure of a Page Table Entry

The size varies from computer to computer, but 32 bits is a common size. The most important
field is the Page frame number. After all, the goal of the page mapping is to output this value.
Next to it we have the Present/absent bit. If this bit is 1, the entry is valid and can be used. If it
is 0, the virtual page to which the entry belongs is not currently in memory. Accessing a page
table entry with this bit set to 0 causes a page fault.

Figure 3.7. Page table entry

The Protection bits tell what kinds of access are allowed. In the simplest form, this field includes
1 bit, with 0 for read/write and 1 for read only. A more complicated arrangement is having 3
bits, one bit each for enabling reading, writing, and executing the page.

The Modified and Referenced bits keep track of page usage. When a page is written to,the
hardware automatically sets the Modified bit.

The Referenced bit is set whenever a page is referenced, either for reading or writing.

Design issues with paging system.

 The mapping from virtual address to physical address must be fast.


 If the virtual address space is large, the page table will be large.
Advantages and Disadvantages of Paging

 Paging reduces external fragmentation, but still suffers from internal fragmentation.
 Paging is simple to implement and assumed as an efficient memory management
technique.
 Due to equal size of the pages and frames, swapping becomes very easy.
 Page table requires extra memory space, so may not be good for a system having small
RAM.

3.6.3. PAGE REPLACEMENT ALGORITHMS


Page replacement algorithms are the techniques by which an operating system decides
which memory pages to swap out and write to disk when a page of memory needs to be allocated.
Paging happens whenever a page fault occurs and a free page cannot be used for allocation,
either because no pages are available or because the number of free pages is lower
than the number of required pages.

When the page that was selected for replacement and paged out is referenced again, it
has to be read in from disk, and this requires waiting for I/O completion. This determines the
quality of the page replacement algorithm: the less time spent waiting for page-ins, the better
the algorithm.

A page replacement algorithm looks at the limited information about accessing the pages
provided by hardware, and tries to select which pages should be replaced to minimize the total
number of page misses, while balancing it with the costs of primary storage and processor
time of the algorithm itself. There are many different page replacement algorithms. We
evaluate an algorithm by running it on a particular string of memory references and computing
the number of page faults.

Reference String
The string of memory references is called reference string. Reference strings are generated
artificially or by tracing a given system and recording the address of each memory reference.
The latter choice produces a large amount of data.

3.6.3.1. First in First out Page Replacement Algorithm


 Oldest page in main memory is the one which will be selected for replacement.
 Easy to implement: keep a list, replace pages from the tail and add new pages at the
head (a minimal simulation is sketched below).
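A minimal FIFO simulation in Python (the reference string and frame count are illustrative):

from collections import deque

def fifo_faults(reference_string, n_frames):
    frames, faults = deque(), 0
    for page in reference_string:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                frames.popleft()           # evict the page that entered memory first
            frames.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2]
print(fifo_faults(refs, 3))                # 10 page faults with 3 frames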
3.6.3.2. Optimal Page Replacement Algorithm
 An optimal page-replacement algorithm has the lowest page-fault rate of all
algorithms. An optimal page-replacement algorithm exists, and has been called OPT
or MIN.
 Replace the page that will not be used for the longest period of time. Use the
time when a page is to be used.

3.6.3.3. Least Recently Used (LRU) PageReplacement Algorithm


 Page which has not been used for the longest time in main memory is the one
which will be selected for replacement.
 Easy to implement: keep a list and replace pages by looking back in time (a minimal
simulation is sketched below).
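A minimal LRU simulation in Python, using the same illustrative reference string as the FIFO sketch so the two can be compared:

def lru_faults(reference_string, n_frames):
    frames, faults = [], 0                 # ordered from least to most recently used
    for page in reference_string:
        if page in frames:
            frames.remove(page)            # refresh: move to the most-recent end
        else:
            faults += 1
            if len(frames) == n_frames:
                frames.pop(0)              # evict the least recently used page
        frames.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2]
print(lru_faults(refs, 3))                 # 9 page faults with 3 frames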

3.6.3.4. The Second-Chance Page Replacement Algorithm

A simple alteration to FIFO that avoids the problem of throwing out a heavily used page is to
inspect the R bit of the oldest page. If it is 0, the page is both old and unused, so it is replaced
immediately. If the R bit is 1, the bit is cleared, the page is put onto the end of the list of pages,
and its load time is updated as though it had just arrived in memory. Then the search continues.
The operation of this algorithm, called second chance, is shown in Figure 3.8. In Figure 3.8(a)
we see pages A through H kept on a linked list and sorted by the time they arrived in
memory.

Figure 3.8. Second-Chance Page Replacement Algorithm

3.6.3.5. The Clock Page Replacement Algorithm

Although second chance is a reasonable algorithm, it is unnecessarily inefficient because it is
continually moving pages around on its list. A better approach is to keep all the page frames on
a circular list in the form of a clock. The hand points to the oldest page.
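A small Python sketch of the clock idea (a simplification: only page insertion is modeled, and the R bit is set when a page is loaded):

class Clock:
    def __init__(self, n_frames):
        self.frames = [None] * n_frames          # each slot holds a (page, R bit) pair
        self.hand = 0

    def insert(self, page):
        while True:
            slot = self.frames[self.hand]
            if slot is None or slot[1] == 0:     # free frame, or second chance used up
                self.frames[self.hand] = (page, 1)
                self.hand = (self.hand + 1) % len(self.frames)
                return
            # Give the page a second chance: clear its R bit and advance the hand.
            self.frames[self.hand] = (slot[0], 0)
            self.hand = (self.hand + 1) % len(self.frames)

clock = Clock(3)
for p in ["A", "B", "C", "D"]:                   # the fourth insertion evicts page A
    clock.insert(p)
print(clock.frames)                              # [('D', 1), ('B', 0), ('C', 0)]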

3.7. Segmentation
Segmentation is a memory management technique in which each job is divided into several
segments of different sizes, one for each module that contains pieces that perform related
functions. Each segment is actually a different logical address space of the program.

When a process is to be executed, its corresponding segments are loaded into non-contiguous
memory, though every segment is loaded into a contiguous block of available memory.

Segmentation memory management works very similarly to paging, but here segments are of
variable length, whereas in paging the pages are of fixed size.

A program segment contains the program's main function, utility functions, data structures, and
so on. The operating system maintains a segment map table for every process and a list of free
memory blocks along with segment numbers, their size and corresponding memory locations
in main memory. For each segment, the table stores the starting address of the segment and
the length of the segment. A reference to a memory location includes a value that identifies a
segment and an offset.
The virtual memory discussed so far is one-dimensional because the virtual addresses go from
0 to some maximum address, one address after another. For various problems, having two or
more separate virtual address spaces may be much better than having only one. For instance,
a compiler has many tables that are built up as compilation proceeds, possibly including:

 The source text being saved for the printed listing (on batch systems).

 The symbol table, containing the names and attributes of variables.

 The table containing all the integer and floating-point constants used.

 The parse tree, containing the syntactic analysis of the program.

 The stack used for procedure calls within the compiler.

Each of the first four tables grows continuously as compilation proceeds. The last one grows
and shrinks in unpredictable ways during compilation. In a one-dimensional memory, these five
tables would have to be assigned contiguous chunks of virtual address space, as in Figure 3.9.
Examine what happens if a program has a much larger than usual number of variables but a
normal amount of everything else. The chunk of address space assigned for the symbol table
may fill up, but there may be lots of room in the other tables. The compiler could, of course,
simply issue a message saying that the compilation cannot continue due to too many variables,
but doing so does not seem very sporting when unused space is left in the other tables.
Figure 3.9. A one-dimensional address space with growing tables.

Explain any one type of Segmentation with paging.


If the segments are large, it may be inconvenient, or even impossible, to keep them in main
memory in their entirety. This leads to the idea of paging them, so that only those pages that
are really required have to be around. Many significant systems have supported paged
segments. In this section we will explain the first one: MULTICS. In the next one we will describe
a more recent one: the Intel Pentium.

MULTICS ran on the Honeywell 6000 machines and their descendants and provided each
program with a virtual memory of up to 2^18 segments (more than 250,000), each of which could
be up to 65,536 (36-bit) words long. To implement this, the MULTICS designers chose to treat
each segment as a virtual memory and to page it, combining the advantages of paging
(uniform page size and not having to keep the whole segment in memory if only part of it is
being used) with the advantages of segmentation (ease of programming, modularity,
protection, sharing).

3.8. Working sets and thrashing


The “working set” is short hand for “parts of memory that the current algorithm is using” and
is determined by which parts of memory the CPU just happens to access. It is totally automatic to
you. If you are processing an array and storing the results in a table, the array and the table are
your working set.

This is discussed because the CPU will automatically store accessed memory in cache, close to the
processor. The working set is a nice way to describe the memory you want stored. If it is small
enough, it can all fit in the cache and your algorithm will run very fast. On the OS level, the kernel
has to tell the CPU where to find the physical memory your application is using (resolving virtual
addresses) every time you access a new page (typically 4k in size) so also you want to avoid that
hit as much as possible.

Thrashing in Operating System

If page faults and swapping happen very frequently, then the operating
system has to spend more time swapping these pages. This state in the operating system is termed
thrashing. Because of thrashing, the CPU utilization is reduced.

Let’s understand by an example, if any process does not have the number of frames that it needs
to support pages in active use, then it will quickly page fault. And at this point, the process must
replace some pages. As all the pages of the process are actively in use, it must replace a page that
will be needed again right away. Consequently, the process will quickly fault again, and again, and
again, replacing pages that it must bring back in immediately. This high paging activity by a
process is called thrashing. During thrashing, the CPU spends less time on actual productive
work and more time swapping.

Figure 3.10 Thrashing


Causes of Thrashing

Thrashing affects the performance of execution in the Operating system. Also, thrashing results in
severe performance problems in the Operating system. When the utilization of CPU is low, then
the process scheduling mechanism tries to load many processes into memory at the same time,
due to which the degree of multiprogramming increases. Now in this situation, there are more
processes in memory than there are available frames, so only a limited number of frames can be
allocated to each process.

Whenever a process with high priority arrives in memory and no frame is freely
available at that time, another process occupying a frame is moved to secondary
storage, and the freed frame is then allocated to the higher-priority process.

We can also say that as soon as the memory fills up, the process starts spending a lot of time for
the required pages to be swapped in. Again, the utilization of the CPU becomes low because most
of the processes are waiting for pages. Thus, a high degree of multiprogramming and lack of frames
are two main causes of thrashing in the Operating system.

Effect of Thrashing

At the time, when thrashing starts then the operating system tries to apply either the Global page
replacement Algorithm or the Local page replacement algorithm.

Global Page Replacement

Global page replacement is allowed to bring in any page; whenever thrashing is detected, it tries to
bring in more pages. Because of this, no process can get enough frames, and as a result the
thrashing will increase more and more. Thus, the global page replacement algorithm is not suitable
when thrashing happens.

Local Page Replacement

Unlike global page replacement, local page replacement will select only pages which
belong to that process. Because of this, there is a chance of reducing the thrashing. However, local
page replacement has many disadvantages of its own, so it is simply an alternative to global page
replacement rather than a complete solution.

Techniques used to handle the thrashing

As noted above, local page replacement behaves better than global page replacement in the presence of
thrashing, but it has many disadvantages too, so it is not always advisable. Thus, other techniques, such
as the working-set model discussed above, are used.

3.9. Caching

Caching is the process of storing data in a separate place (called the cache) such that they could be
accessed faster if the same data is requested in the future. When some data is requested, the cache
is first checked to see whether it contains that data. If data is already in the cache, it is called a
cache hit. Then the data can be retrieved from the cache, which is much faster than retrieving it
from the original storage location. If the requested data is not in the cache, it is called a cache miss.
Then the data needs to be fetched from the original storage location, which would take a longer
time. Caching is used in different places. In the CPU, caching is used to improve the performance
by reducing the time taken to get data from the main memory. In web browsers, web caching is
used to store responses from previous visits to web sites, in order to make the next visits faster.
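The hit/miss logic described above can be written in a few lines of Python (the cache here is an unbounded dictionary, which ignores eviction; the names are illustrative):

cache = {}

def slow_fetch(key):
    # Stand-in for the original, slow storage location (disk, remote server, ...).
    return "value-for-" + str(key)

def get(key):
    if key in cache:                 # cache hit: serve the fast copy
        return cache[key]
    value = slow_fetch(key)          # cache miss: go to the original storage
    cache[key] = value               # keep a copy for future requests
    return value

get("page.html")                     # miss: fetched from the slow store
get("page.html")                     # hit: returned straight from the cache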

Chapter Four
Device Management

4.1. Introduction to Device Management

Devices are typically physical/hardware devices such as computers, laptops, servers, mobile
phones, etc. They could also be virtual, such as virtual machines or virtual switches. During the
execution of a program, it may require various computer resources (devices) for its complete
execution. The operating system has the onus upon it to provide judiciously the resources required.
It is the sole responsibility of the operating system to check if the resource is available or not. It is
not just concerned with the allocation of the devices but also with the deallocation, i.e., once the
requirement of a device/resource is over by a process, it must be taken away from the process.

In an operating system, device management refers to the control of input/output devices such as
discs, microphones, keyboards, printers, magnetic tape, USB ports, scanners, and various other
devices.

Device management in operating system:


 Keeps track of all devices; the program responsible for this is called the
I/O controller.
 Monitoring the status of each device such as storage drivers, printers and other peripheral
devices.
 Enforcing preset policies and taking a decision which process gets the device when and for
how long.
 Allocates and De-allocates the device in an efficient way. De-allocating them at two levels:
at the process level when I/O command has been executed and the device is temporarily
released, and at the job level, when the job is finished and the device is permanently
released.
 Optimizes the performance of individual devices

Generally it performs the following:

 Installing device and component-level drivers and related software

 Configuring a device so it performs as expected using the bundled operating system,


business/workflow software and/or with other hardware devices.

 Implementing security measures and processes.


4.1.1. I/O devices

The fundamentals of I/O devices may be divided into three categories:

 Boot Device

 Character Device

 Network Device

Boot Device

It stores data in fixed-size blocks, each with its unique address. For example- Disks.

Character Device

It transmits or accepts a stream of characters, none of which can be addressed individually. For
instance, keyboards, printers, etc.

Network Device

It is used for transmitting the data packets.

4.1.2. Storage Devices

There are two types of storage devices:-


 Volatile Storage Device –
It loses its contents when the power of the device is removed.
 Non-Volatile Storage device –
It does not lose its contents when the power is removed; the data is retained even when the
power is off.

Secondary Storage is used as an extension of main memory. Secondary storage devices can hold
the data permanently. Storage devices consist of Registers, Cache, Main-Memory, Electronic-
Disk, Magnetic-Disk, Optical-Disk, Magnetic-Tapes. Each storage system provides the basic
system of storing a datum and of holding the datum until it is retrieved at a later time. All the
storage devices differ in speed, cost, size and volatility. The most common Secondary-storage
device is a Magnetic-disk, which provides storage for both programs and data.
In this hierarchy all the storage devices are arranged according to speed and cost. The higher levels
are expensive, but they are fast. As we move down the hierarchy, the cost per bit generally
decreases, whereas the access time generally increases.
The storage systems above the Electronic disk are volatile, where as those below are Non-Volatile.
An Electronic disk can be either designed to be either volatile or Non-Volatile. During normal
operation, the electronic disk stores data in a large DRAM array, which is Volatile. But many
electronic disk devices contain a hidden magnetic hard disk and a battery for backup power. If
external power is interrupted, the electronic disk controller copies the data from RAM to the
magnetic disk. When external power is restored, the controller copies the data back into the RAM.
The design of a complete memory system must balance all the factors. It must use only as much
expensive memory as necessary while providing as much inexpensive, Non-Volatile memory as
possible. Caches can be installed to improve performance where a large access-time or transfer-
rate disparity exists between two components.
Secondary storage structure
Secondary storage devices are those devices whose memory is non-volatile, meaning, the
stored data will be intact even if the system is turned off. Here are a few things worth noting
about secondary storage.
Magnetic Disk Structure
In modern computers, most of the secondary storage is in the form of magnetic disks. Hence,
knowing the structure of a magnetic disk is necessary to understand how the data in the disk is
accessed by the computer.
Structure of a magnetic disk
A magnetic disk contains several platters. Each platter is divided into circular shaped tracks. The
length of the tracks near the center is less than the length of the tracks farther from the center. Each
track is further divided into sectors, as shown in the figure.
Tracks of the same distance from center form a cylinder. A read-write head is used to read data
from a sector of the magnetic disk.
The speed of the disk is measured as two parts:
 Transfer rate: This is the rate at which the data moves from disk to the computer.
 Random access time: It is the sum of the seek time and rotational latency.
Seek time is the time taken by the arm to move to the required track. Rotational latency is defined
as the time taken by the arm to reach the required sector in the track. Even though the disk is
arranged as sectors and tracks physically, the data is logically arranged and addressed as an array
of blocks of fixed size. The size of a block can be 512 or 1024 bytes. Each logical block is mapped
with a sector on the disk, sequentially. In this way, each sector in the disk will have a logical
address.
Disk Scheduling Algorithms
On a typical multiprogramming system, there will usually be multiple disk access requests at any
point of time. So those requests must be scheduled to achieve good efficiency. Disk scheduling is
similar to process scheduling. Some of the disk scheduling algorithms are described below.
First Come First Serve
This algorithm performs requests in the same order asked by the system. Let’s take an example
where the queue has the following requests with cylinder numbers as follows:
98, 183, 37, 122, 14, 124, 65, 67
Assume the head is initially at cylinder 56. The head moves in the given order in the queue
i.e., 56→98→183→…→67.

Shortest Seek Time First (SSTF)


Here the position which is closest to the current head position is chosen first. Consider the previous
example where disk queue looks like, 98, 183, 37, 122, 14, 124, 65, 67 Assume the head is initially
at cylinder 56. The next closest cylinder to 56 is 65, and then the next nearest one is 67,
then 37, 14, so on.
SCAN algorithm
This algorithm is also called the elevator algorithm because of its behavior. Here, first the head
moves in a direction (say backward) and covers all the requests in the path. Then it moves in the
opposite direction and covers the remaining requests in the path. This behavior is similar to that of
an elevator. Let’s take the previous example,
98, 183, 37, 122, 14, 124, 65, 67 Assume the head is initially at cylinder 56. The head moves in
backward direction and accesses 37 and 14. Then it goes in the opposite direction and accesses the
cylinders as they come in the path.
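The three schedules above can be compared by computing the total head movement for the same request queue (98, 183, 37, 122, 14, 124, 65, 67 with the head at cylinder 56). The sketch below follows the simplified descriptions above and services only the queued requests, ignoring any sweep to the end of the disk:

def total_movement(start, order):
    pos, moved = start, 0
    for cylinder in order:
        moved += abs(cylinder - pos)
        pos = cylinder
    return moved

def fcfs(start, requests):
    return list(requests)

def sstf(start, requests):
    pending, order, pos = list(requests), [], start
    while pending:
        nxt = min(pending, key=lambda c: abs(c - pos))   # closest cylinder first
        pending.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order

def scan_backward(start, requests):
    lower = sorted((c for c in requests if c <= start), reverse=True)
    upper = sorted(c for c in requests if c > start)
    return lower + upper             # first towards the low cylinders, then back up

queue = [98, 183, 37, 122, 14, 124, 65, 67]
for name, schedule in [("FCFS", fcfs), ("SSTF", sstf), ("SCAN", scan_backward)]:
    order = schedule(56, queue)
    print(name, order, total_movement(56, order))
# FCFS moves the head 637 cylinders in total, SSTF 233, and SCAN (requests only) 211.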
4.1.3. Types of devices

There are three types of Operating system peripheral devices: dedicated, shared, and virtual.
These are as follows:

Dedicated Device

In device management, some devices are allocated or assigned to only one task at a time until that
job releases them. Devices such as plotters, printers, tape drivers, and other similar devices
necessitate such an allocation mechanism because it will be inconvenient if multiple people share
them simultaneously. The disadvantage of such devices is the inefficiency caused by allocating
the device to a single user for the whole duration of task execution, even if the device is not used
100% of the time.

Shared Devices

These devices can be assigned to a variety of processes. By interleaving their requests, a disk
(DASD) can be shared by multiple processes simultaneously. The Device Manager carefully
controls the interleaving, and pre-determined policies must resolve all difficulties.
Virtual Devices

Virtual devices are a hybrid of the two devices, and they are dedicated devices that have been
transformed into shared devices. For example, a printer can be transformed into a shareable device
by using a spooling program that redirects all print requests to a disk. A print job is not sent directly
to the printer; however, it is routed to the disk until it is fully prepared with all of the required
sequences and formatting, at which point it is transmitted to the printers. The approach can
transform a single printer into numerous virtual printers, improving performance and ease of use.

4.1.4. Functions of the device management in the operating system

The operating system (OS) handles communication with the devices via their drivers. The OS
component gives a uniform interface for accessing devices with various physical features. There
are various functions of device management in the operating system. Some of them are as
follows:

 It keeps track of data, status, location, uses, etc. The file system is a term used to define a
group of facilities.

 It enforces the pre-determined policies and decides which process receives the device
when and for how long.

 It improves the performance of specific devices.

 It monitors the status of every device, including printers, storage drivers, and other
devices.

 It allocates and deallocates the device efficiently. De-allocation happens at two levels: first,
when an I/O command is issued, the device is temporarily released; second, when
the job is completed, the device is permanently released.
4.2. Characteristics of serial and parallel devices

There are two methods used for transferring data between computers which are given below: Serial
Transmission and Parallel Transmission.

4.2.1. Serial Transmission

When data is sent or received using serial data transmission, the data bits (0 and 1) are organized in
a specific order, since they can only be sent one after another. The order of the data bits is important
as it dictates how the transmission is organized when it is received. It is viewed as a reliable data
transmission method because a data bit is only sent if the previous data bit has already been
received.

In serial transmission, data is sent bit by bit from one computer to another, and the link can operate
in both directions. Typically, eight data bits are framed by a start bit and a stop bit (logic 0 and 1
respectively), and an optional parity bit may be added for error checking. Serial cables are used for
transmitting data over longer distances, and the bits arrive in the order in which they were sent. A
common connector is the D-shaped 9-pin serial cable.

Figure 4.1 Serial transmission


Serial transmission has two classifications: asynchronous and synchronous.
Asynchronous Serial Transmission
Data bits can be sent at any point in time. Stop bits and start bits are used between data bytes to
synchronize the transmitter and receiver and to ensure that the data is transmitted correctly. The
time between sending and receiving data bits is not constant, so gaps are used to provide time
between transmissions.
The advantage of using the asynchronous method is that no synchronization is required between
the transmitter and receiver devices. It is also a more cost effective method. A disadvantage is that
data transmission can be slower, but this is not always the case.
Synchronous Serial Transmission
Data bits are transmitted as a continuous stream in time with a master clock. The data transmitter
and receiver both operate using a synchronized clock frequency; therefore, start bits, stop bits, and
gaps are not used. This means that data moves faster and timing errors are less frequent because
the transmitter and receiver time is synced. However, data accuracy is highly dependent on timing
being synced correctly between devices. In comparison with asynchronous serial transmission, this
method is usually more expensive.
When is serial transmission used to send data?
Serial transmission is normally used for long-distance data transfer. It is also used in cases where
the amount of data being sent is relatively small. It ensures that data integrity is maintained as it
transmits the data bits in a specific order, one after another. In this way, data bits are received in-
sync with one another.
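To make the asynchronous framing described above concrete, the sketch below builds one serial frame for a single byte: a start bit, eight data bits, a parity bit, and a stop bit. The particular conventions chosen (LSB-first bit order, even parity) are common but are assumptions here, not requirements stated in the text.

# Minimal sketch of asynchronous serial framing (LSB-first order, even parity assumed).

def frame_byte(value):
    data = [(value >> i) & 1 for i in range(8)]      # 8 data bits, least significant first
    parity = sum(data) % 2                           # even parity: makes the total count of 1s even
    return [0] + data + [parity] + [1]               # start bit (0), data, parity, stop bit (1)

frame = frame_byte(ord("A"))                         # "A" = 0x41 = 0b01000001
print(frame)     # [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1]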
4.2.2. Parallel Transmission
In parallel transmission, several bits are sent together simultaneously with a single clock pulse.
It is a fast way to transmit data because it uses many input/output lines to transfer the data. It also
matches the underlying hardware well, since electronic devices such as computers and communication
hardware use parallel circuitry internally; this is one reason the
parallel interface complements the internal hardware well.

Figure 4.2 Parallel Transmission


Installation and troubleshooting are easier in a parallel transmission system because all the lines run
in a single physical cable. Parallel transmission uses a 25-pin port with 17 signal lines and 8
ground lines. The 17 signal lines are divided into
4 lines that initiate handshaking,
5 status lines used to communicate and report errors, and
8 lines to transfer data.
Despite its speed, parallel transmission has a limitation called skew, where bits
can travel at slightly different speeds over the different wires.

For transferring data between computers and laptops, two methods are used: serial
transmission and parallel transmission. There are some similarities and differences between
them. One of the primary differences is that in serial transmission data is sent bit by bit,
whereas in parallel transmission a byte (8 bits) or character is sent at a time. The similarity is
that both are used to connect and communicate with peripheral devices. Furthermore, parallel
transmission is time-sensitive, whereas serial transmission is not. Both have their advantages
and disadvantages. Parallel transmission is used over limited distances and provides
higher speed, while serial transmission is reliable for transferring data over longer
distances. Hence, both serial and parallel transmission are individually essential for
transferring data.

4.3. Buffering Strategy

Buffering
A data buffer (or just buffer) is a region of a physical memory storage used to temporarily store
data while it is being moved from one place to another. Typically, the data is stored in a buffer as
it is retrieved from an input device (such as a microphone) or just before it is sent to an output
device (such as speakers). However, a buffer may be used when moving data between processes
within a computer. This is comparable to buffers in telecommunication. Buffers can be
implemented in a fixed memory location in hardware or by using a virtual data buffer in software,
pointing at a location in the physical memory. In all cases, the data stored in a data buffer are stored
on a physical storage medium. A majority of buffers are implemented in software, which typically
use the faster RAM to store temporary data, due to the much faster access time compared with
hard disk drives. Buffers are typically used when there is a difference between the rate at which
data is received and the rate at which it can be processed, or in the case that these rates are variable,
for example in a printer spooler or in online video streaming. In the distributed computing
environment, data buffer is often implemented in the form of burst buffer that provides distributed
buffering service.
For example, with a printer spooler we can submit a large number of pages to print, but the
processing/printing is slow, so buffering is used. I/O buffering is the process of temporarily storing
data that is passing between a processor and a peripheral. The usual purpose is to smooth out the
difference in the rates at which the two devices can handle data.

Uses of I/O Buffering:


 Buffering is done to deal effectively with a speed mismatch between the producer and
consumer of the data stream.
 A buffer is created in main memory to accumulate the bytes received from a modem.
 After the data has been received in the buffer, it is transferred from the buffer to disk in a
single operation.
 This data transfer is not instantaneous, so the modem needs another
buffer in which to store additional incoming data.
 When the first buffer is filled, a request is made to transfer its data to disk.
 The modem then starts filling the second buffer with the additional incoming data while the
data in the first buffer is being transferred to disk.
 When both buffers have completed their tasks, the modem switches back to the first
buffer while the data from the second buffer is transferred to the disk.
 The use of two buffers decouples the producer and the consumer of the data, thus
relaxing the timing requirements between them.
 Buffering also accommodates devices that have different data transfer sizes.
Types of various I/O buffering techniques:
1. Single buffer: The operating system assigns a buffer in the system portion of main
memory to an I/O request.
Block oriented device
The system buffer takes the input.
After taking the input, the block is transferred to user space by the process, and the
process then requests another block.
Two blocks work in parallel: while one block of data is processed by the user
process, the next block is being read in.
The OS can swap the processes.
The OS copies the data from the system buffer to the user process.
Stream oriented device
 Line-at-a-time operation is used for scroll-mode terminals. The user inputs one line at a time,
with a carriage return signaling the end of a line.
 Byte-at-a-time operation is used on forms-mode terminals, where each keystroke is
significant.

Figure 4.3 Single Buffering


2. Double buffer:
Block oriented –
 There are two buffers in the system.
 One buffer is used by the driver or controller to store data while waiting for it to be taken
by a higher level of the hierarchy.
 The other buffer is used to store data from the lower-level module.
 Double buffering is also known as buffer swapping.
 A major disadvantage of double buffering is that it increases the complexity of the process.
 If the process performs rapid bursts of I/O, double buffering may be inadequate.
Stream oriented –
 For line-at-a-time I/O, the user process need not be suspended for input or output, unless the
process runs ahead of the double buffer.
 For byte-at-a-time operation, the double buffer offers no advantage over a single buffer of twice
the length.
Figure 4.4. Double Buffering
3. Circular buffer:
 When more than two buffers are used, the collection of buffers is itself referred to as a
circular buffer.
 Data is not passed directly from the producer to the consumer; each buffer must be
consumed before it can be overwritten with new data.
 The producer can only fill up to buffer i-1 while the data in buffer i is waiting to be
consumed.

Figure 4.5. Circular Buffering
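A circular buffer is usually implemented as a fixed array with head and tail indices. The sketch below (with an arbitrarily chosen slot count) keeps one slot empty so that the producer can fill only up to buffer i-1 while buffer i waits to be consumed, as noted in the list above.

# Minimal circular-buffer sketch: producer fills up to slot i-1 while slot i awaits the consumer.

class CircularBuffer:
    def __init__(self, n):
        self.slots = [None] * n
        self.head = 0          # next slot the consumer will read
        self.tail = 0          # next slot the producer will write
        self.n = n

    def put(self, item):
        nxt = (self.tail + 1) % self.n
        if nxt == self.head:               # buffer full: only n-1 slots may hold data
            raise BufferError("buffer full")
        self.slots[self.tail] = item
        self.tail = nxt

    def get(self):
        if self.head == self.tail:         # buffer empty
            raise BufferError("buffer empty")
        item = self.slots[self.head]
        self.head = (self.head + 1) % self.n
        return item

buf = CircularBuffer(4)
for block in ("b0", "b1", "b2"):
    buf.put(block)
print(buf.get(), buf.get())    # b0 b1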

4.4. Direct Memory Access(DMA)

DMA stands for “Direct Memory Access” and is a method of transferring data from the computer‘s
RAM to another part of the computer without processing it using the CPU. While most data that
is input or output from your computer is processed by the CPU, some data does not require
processing, or can be processed by another device.

In these situations, DMA can save processing time and is a more efficient way to move data from
the computer’s memory to other devices. In order for devices to use direct memory access, they
must be assigned to a DMA channel. Each type of port on a computer has a set of DMA channels
that can be assigned to each connected device. For example, a PCI controller and a hard drive
controller each have their own set of DMA channels
For example, a sound card may need to access data stored in the computer’s RAM, but since it can
process the data itself, it may use DMA to bypass the CPU. Video cards that support DMA can
also access the system memory and process graphics without needing the CPU. Ultra DMA hard
drives use DMA to transfer data faster than previous hard drives that required the data to first be
run through the CPU.
An alternative to DMA is the Programmed Input/Output (PIO) interface, in which all data
transmitted between devices goes through the processor. A newer protocol for the ATA/IDE
interface is Ultra DMA, which provides a burst data transfer rate of up to 33 MB/s. Hard drives that
come with Ultra DMA/33 also support PIO modes 1, 3, and 4, and multiword DMA mode 2 at
16.6 MB/s.
DMA Transfer Types
Memory To Memory Transfer
In this mode a block of data is moved from one memory address to another. The
current address register of channel 0 is used to point to the source address, and the current
address register of channel 1 is used to point to the destination address. In the first transfer cycle,
the data byte from the source address is loaded into the temporary register of the DMA controller,
and in the next transfer cycle the data from the temporary register is stored in the memory location
pointed to by the destination address. After each data transfer the current address registers are
decremented or incremented according to the current settings. The channel 1 current word count
register is also decremented by 1 after each data transfer. When the word count of channel 1 goes
to FFFFH, a TC (terminal count) is generated, which activates the EOP output, terminating the DMA service.
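The register behaviour just described can be mimicked with a short simulation. The sketch below is only a toy model of a memory-to-memory transfer (source and destination address registers plus a word count that ends the transfer), not real DMA-controller programming; the addresses and block contents are made up.

# Toy simulation of a memory-to-memory DMA transfer (not real DMA controller programming).

memory = bytearray(64)
memory[0:8] = b"DMADATA!"          # block to move

src_addr, dst_addr = 0, 32         # current address registers (channel 0 and channel 1)
word_count = 8                     # channel 1 current word count

while word_count > 0:
    temp = memory[src_addr]        # byte latched into the controller's temporary register
    memory[dst_addr] = temp        # stored at the destination on the next cycle
    src_addr += 1
    dst_addr += 1
    word_count -= 1                # wrapping past zero would raise the terminal count (TC/EOP)

print(memory[32:40])               # bytearray(b'DMADATA!')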
Auto initialize
In this mode, during the initialization the base address and word count registers are loaded
simultaneously with the current address and word count registers by the microprocessor. The
address and the count in the base registers remain unchanged throughout the DMA service.
After the first block transfer i.e. after the activation of the EOP signal, the original values of the
current address and current word count registers are automatically restored from the base address
and base word count register of that channel. After auto initialization the channel is ready to
perform another DMA service, without CPU intervention.
DMA Controller
The controller is integrated into the processor board and manages all DMA data transfers.
Transferring data between system memory and an I/O device requires two steps. Data goes from
the sending device to the DMA controller and then to the receiving device. The microprocessor
gives the DMA controller the location, destination, and amount of data that is to be transferred.
Then the DMA controller transfers the data, allowing the microprocessor to continue with other
processing tasks. When a device needs to use the Micro Channel bus to send or receive data, it
competes with all the other devices that are trying to gain control of the bus. This process is known
as arbitration. The DMA controller does not arbitrate for control of the BUS instead; the I/O device
that is sending or receiving data (the DMA slave) participates in arbitration. It is the DMA
controller, however, that takes control of the bus when the central arbitration control point grants
the DMA slave’s request.

4.5. Recovery from failure


“Recovery from Failure” is a phrase used to describe a need in aviation to continue real-time
operations to a safe conclusion despite a critical part of a system (technical, procedural, or human)
failing, sometimes at the most crucial time.
Continuation of operations to a safe conclusion can be guaranteed, or at least facilitated, through
system design, redundancy, back-up systems or procedures, safety nets, and even accurate fault
diagnoses and timely, correct responses by human operators. Many of these features are built-in
as system defences, but, as the subject concerns recovery from failure (or after failure) these
features can be considered as “containment” measures.
Examples

 Basic design. One of the simplest concepts to grasp concerning design is the use of more than
one engine; if one fails, the other is designed to be powerful enough to continue operations
safely. Furthermore, by placing engines under-the-wing instead of inside the wing, then added
protection is provided to other critical aircraft systems (other engines, fuel, and hydraulics) if
an engine fails “explosively”.
 Recovering from a system failure. E.g. recovering from the failure of flaps to lower. First of
all the possibility of this happening on modern aircraft is much reduced by the design of the
hydraulic system allowing for leaks to be isolated, thereby protecting essential services such
as undercarriage and flaps. Furthermore, back-up air-driven or electric hydraulic pumps may
be available allowing for redundancy. Further back-up may be possible through accumulators
that hold “one-shot” applications of services. This example shows a multi-level design that
allows many opportunities to recover from failure; or, actually prevent total failure. If,
however, the flaps still fail to lower, there will be Standard Operating Procedures that guide the
pilots to select a suitable runway (landing length, navigation and visual aids), environmental
conditions (wind, runway contamination), adjusted approach and landing speeds,
recommendations for braking and reverse thrust use for deceleration etc. Performance
calculations to determine the landing distance required can be considered to contain a safety net
in the form of a % safety margin.
 Recovering from specific situations. When people talk about the concept of “recovery from
failure” they are often referring to the several critical situations that pilots are trained to
recover from, such as recovery from: engine failure during take-off, unusual attitudes (loss of
control), uncontained engine failure, rejected take-off and rejected landing. These recovery
techniques are skill-based and procedure driven. The latest aircraft designs (2013) now have
the capability to recover automatically from situations such as unusual attitudes, without input
from pilots. In many cases it is even possible that entry into an unusual attitude and/or a stall is
prevented by the aircraft’s automated systems.
 Recovering from human failure. When humans perform a skill poorly, or omit to perform an
action (see human error), then a Safety Net can assist recovery. At the simplest level a safety
net could be a harness to prevent an engineer from falling whilst attending to an engine. To a
more technical degree, if an air traffic controller clears an aircraft to an unsafe level (where a
conflict exists), or two pilots fail to level-off on time at their cleared Flight Level, then
an Airborne Collision Avoidance System (ACAS) can help the pilots recover from a conflict
caused by the Level Bust.
 Recovering from an accident. The worst kind of failure, of course, is an accident; and even
with serious accidents there is the possibility of containment and partial recovery, i.e. the
saving of lives. Design, procedures and safety nets can all assist in recovery. E.g. passenger
seats and restraints designed to withstand high deceleration forces (typically 16 x gravity);
cabin interiors designed to prevent passengers from being incapacitated by smoke, fumes, and
noxious gases; cabin crew trained in procedures to assist passengers evacuate as fast as
possible; and life jackets and rafts available as containment measures if the aircraft has ditched
on water.

Chapter Five

File System

5.1. Fundamental concepts on file

In computing, a file system (or filesystem) is a method and data structure that the operating system
uses to control how data is stored and retrieved. Without a file system, data placed in a storage
medium would be one large body of data with no way to tell where one piece of data stops and the
next begins. By separating the data into pieces and giving each piece a name, the data is easily
isolated and identified. Taking its name from the way a paper-based data management system is
organized, each group of data is called a file. The structure and logic rules used to manage the groups
of data and their names are called a file system.
5.1.1 Data and Meta Data

Metadata is data about data. This refers not to the data itself, but rather to any information that
describes some aspect of the data. Everything from the title, information on how data fits together
(e.g., which page goes before which other page), when and by whom the data was created, and
lists of web pages visited by people, can be classified as metadata. Metadata can be stored in a
variety of places. Where the metadata relates to databases, the data is often stored in tables and
fields within the database. Sometimes the metadata exists in a specialist document or database
designed to store such data, called a data dictionary or metadata repository.

5.2. File system techniques

5.2.1. Partition

A partition is a logical division of a hard disk that is treated as a separate unit by operating systems
(OSes) and file systems. The OSes and file systems can manage information on each partition as
if it were a distinct hard drive. This allows the drive to operate as several smaller sections to
improve efficiency, although it reduces usable space on the hard disk because of additional
overhead from multiple OSes.

A disk partition manager allows system administrators to create, resize, delete and manipulate
partitions, while a partition table logs the location and size of the partition. Each partition appears
to the OS as a distinct logical disk, and the OS reads the partition table before any other part of the
disk. Once a partition is created, it is formatted with a file system such as:

 NTFS on Windows drives;


 FAT32 and exFAT for removable drives;
 HFS Plus (HFS+) on Mac computers; or
 Ext4 on Linux.

Data and files are then written to the file system on the partition. When users boot the OS in a
computer, a critical part of the process is to give control to the first sector on the hard disk. This
includes the partition table that defines how many partitions will be formatted on the hard disk,
the size of each partition and the address where each disk partition begins. The sector also contains
a program that reads the boot sector for the OS and gives it control so that the rest of the OS can
be loaded into random access memory. A key aspect of partitioning is the active or bootable
partition, which is the designated partition on the hard drive that contains the OS. Only the partition
on each drive that contains the boot loader for the OS can be designated as the active partition. The
active partition also holds the boot sector and must be marked as active. A recovery partition
restores the computer to its original shipping condition. In enterprise storage, partitioning helps
enable short stroking, a practice of formatting a hard drive to speed performance through data
placement.

5.2.2. Mounting and Unmounting File Systems

Before you can access the files on a file system, you need to mount the file system. When you
mount a file system, you attach that file system to a directory (mount point) and make it available
to the system. The root (/) file system is always mounted. Any other file system can be connected
or disconnected from the root (/) file system. When you mount a file system, any files or directories
in the underlying mount point directory are unavailable as long as the file system is mounted.
These files are not permanently affected by the mounting process, and they become available again
when the file system is unmounted. However, mount directories are typically empty, because you
usually do not want to obscure existing files.

5.2.3. Virtual file system

An operating system can have multiple file systems in it. Virtual File Systems are used to integrate
multiple file systems into an orderly structure. The key idea is to abstract out that part of the file
system that is common to all file systems and put that code in a separate layer that calls the
underlying concrete file system to actually manage the data.

5.2.4. memory-mapped file

A memory-mapped file contains the contents of a file in virtual memory. This mapping between a
file and memory space enables an application, including multiple processes, to modify the file by
reading and writing directly to the memory. You can use managed code to access memory-mapped
files in the same way that native Windows functions access memory-mapped files, as described in
Managing Memory-Mapped Files. There are two types of memory-mapped files:

Persisted memory-mapped files: Persisted files are memory-mapped files that are associated with
a source file on a disk. When the last process has finished working with the file, the data is saved
to the source file on the disk. These memory-mapped files are suitable for working with extremely
large source files.

Non-persisted memory-mapped files: Non-persisted files are memory-mapped files that are not
associated with a file on a disk. When the last process has finished working with the file, the data
is lost and the file is reclaimed by garbage collection. These files are suitable for creating shared
memory for inter-process communication (IPC).

Processes, Views, and Managing Memory: Memory-mapped files can be shared across multiple processes.
Processes can map to the same memory-mapped file by using a common name that is assigned by
the process that created the file. To work with a memory-mapped file, you must create a view of
the entire memory-mapped file or a part of it. You can also create multiple views to the same part
of the memory-mapped file, thereby creating concurrent memory. For two views to remain
concurrent, they have to be created from the same memory-mapped file.

Multiple views may also be necessary if the file is greater than the size of the application’s logical
memory space available for memory mapping (2 GB on a 32-bit computer). There are two types
of views: stream access view and random access view. Use stream access views for sequential
access to a file; this is recommended for non-persisted files and IPC. Random access views are
preferred for working with persisted files. Memory-mapped files are accessed through the
operating system’s memory manager, so the file is automatically partitioned into a number of pages
and accessed as needed. You do not have to handle the memory management yourself. The
following illustration shows how multiple processes can have multiple and overlapping views to
the same memory-mapped file at the same time.
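As one concrete illustration, Python's standard mmap module gives the same effect for a persisted mapping: the file is modified by reading and writing memory rather than by explicit read/write calls. The file name below is illustrative.

# Minimal persisted memory-mapped file sketch using Python's standard mmap module.
import mmap

path = "example.dat"                      # hypothetical source file on disk
with open(path, "wb") as f:
    f.write(b"hello, mapped world\n")     # create the backing file first

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)         # map the whole file into the process address space
    print(mm[0:5])                        # reads go straight through memory: b'hello'
    mm[0:5] = b"HELLO"                    # writes modify the file via the mapping
    mm.flush()                            # push dirty pages back to the source file
    mm.close()

with open(path, "rb") as f:
    print(f.read())                       # b'HELLO, mapped world\n'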

5.3. Special purpose file systems

FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many
other portable devices because of their relative simplicity. Performance of FAT compares poorly to most
other file systems as it uses overly simplistic data structures, making file operations time-consuming, and
makes poor use of disk space in situations where many small files are present. ISO 9660 and Universal
Disk Format are two common formats that target Compact Discs and DVDs. Mount Rainier is a newer
extension to UDF supported by Linux 2.6 series and Windows Vista that facilitates rewriting to DVDs in
the same fashion as has been possible with floppy disks.

5.3.1. Naming, searching and backup strategies

A file is named, for the convenience of its human users, and is referred to by its name. A name is
usually a string of characters, such as example.c. Some systems differentiate between uppercase
and lowercase characters in names, whereas other systems do not. When a file is named, it becomes
independent of the process, the user, and even the system that created it. For instance, one user
might create the file example.c, and another user might edit that file by specifying its name. The
file’s owner might write the file to a USB disk, send it as an e-mail attachment, or copy it across a
network, and it could still be called example.c on the destination system.

A file’s attributes vary from one operating system to another but typically consist of these:

• Name: The symbolic file name is the only information kept in human readable form.
• Identifier: This unique tag, usually a number, identifies the file within the file system; it
is the non-human-readable name for the file.

• Type: This information is needed for systems that support different types of files.

• Location: This information is a pointer to a device and to the location of the file on that
device.

• Size: The current size of the file (in bytes, words, or blocks) and possibly the maximum
allowed size are included in this attribute.

• Protection: Access-control information determines who can do reading, writing, executing, and so on.
• Time, date, and user identification: This information may be kept for creation, last
modification, and last use. These data can be useful for protection, security, and usage
monitoring.

File Operations

A file is an abstract data type. To define a file properly, we need to consider the operations that
can be performed on files. The operating system can provide system calls to create, write, read,
reposition, delete, and truncate files. Let’s examine what the operating system must do to perform
each of these six basic file operations. It should then be easy to see how other similar operations,
such as renaming a file, can be implemented.

•Creating a file: Two steps are necessary to create a file. First, space in the file system must be
found for the file. Second, an entry for the new file must be made in the directory.

•Writing a file: To write a file, we make a system call specifying both the name of the file and the
information to be written to the file. Given the name of the file, the system searches the directory
to find the file’s location. The system must keep a write pointer to the location in the file where
the next write is to take place. The write pointer must be updated whenever a write occurs.

•Reading a file: To read from a file, we use a system call that specifies the name of the file and
where (in memory) the next block of the file should be put. Again, the directory is searched for the
associated entry, and the system needs to keep a read pointer to the location in the file where the
next read is to take place. Once the read has taken place, the read pointer is updated. Because a
process is usually either reading from or writing to a file, the current operation location can be kept
as a per-process current file-position pointer. Both the read and write operations use this same
pointer, saving space and reducing system complexity.

•Repositioning within a file: The directory is searched for the appropriate entry, and the current-
file-position pointer is repositioned to a given value. Repositioning within a file need not involve
any actual I/O. This file operation is also known as a file seek.

•Deleting a file: To delete a file, we search the directory for the named file. Having found the
associated directory entry, we release all file space, so that it can be reused by other files, and erase
the directory entry.

•Truncating a file: The user may want to erase the contents of a file but keep its attributes. Rather
than forcing the user to delete the file and then recreate it, this function allows all attributes to
remain unchanged—except for file length—but lets the file be reset to length zero and its file space
released.
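These six operations map directly onto ordinary system calls. The sketch below exercises each of them through Python's os module; the file name is made up, and a real program would of course add error handling.

# The six basic file operations, sketched with Python's os-level calls (file name is illustrative).
import os

fd = os.open("demo.txt", os.O_RDWR | os.O_CREAT, 0o644)   # create
os.write(fd, b"operating systems\n")                       # write (kernel advances the write pointer)
os.lseek(fd, 0, os.SEEK_SET)                               # reposition (seek) back to the start
print(os.read(fd, 9))                                      # read: b'operating'
os.ftruncate(fd, 0)                                        # truncate: keep the file, drop its contents
os.close(fd)
os.unlink("demo.txt")                                      # delete: remove directory entry, free space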

Backup
A system backup is the process of backing up the operating system, files and system specific
useful/essential data. Backup is a process in which the state, files, and data of a computer system
are duplicated to be used as a substitute when the primary system data is corrupted,
deleted, or lost. There are three main types of backup: full, differential,
and incremental. Let’s take a look at each type of backup and its respective pros and cons.
System backup primarily ensures that not only the user data in a system is saved, but also the
system’s state or operational condition. This helps in restoring the system to the last saved state
along with all the selected backup data. Generally, the system backup is performed through backup
software and the end file (system backup) generated through this process is known as the system
snapshot/image. Moreover, in a networked/enterprise environment, the system backup
file/snapshot/image is routinely uploaded and updated on an enterprise local/remote storage server.

Evolution of backup

Backup techniques have evolved over time and become increasingly sophisticated (and perhaps
complex as a result). Considerations such as the time taken for backups, the time taken for restores,
storage costs, and network bandwidth savings have all, over time, driven innovations designed to
make backups better, but these innovations also increase complexity.

Types of Backup

 Full backup
 Differential backup
 Incremental backup
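The difference between the three lies in what gets copied: a full backup copies everything, a differential backup copies everything changed since the last full backup, and an incremental backup copies only what changed since the last backup of any kind. The sketch below illustrates the incremental case by copying only files modified after a recorded timestamp; the directory names and the timestamp file are assumptions.

# Incremental-backup sketch: copy only files modified since the last recorded backup time.
# The source/destination directories and the timestamp file are illustrative assumptions.
import os, shutil, time

SRC, DST, STAMP = "data", "backup", "last_backup.txt"

last = 0.0
if os.path.exists(STAMP):
    with open(STAMP) as f:
        last = float(f.read())

os.makedirs(DST, exist_ok=True)
for name in os.listdir(SRC):
    src_path = os.path.join(SRC, name)
    if os.path.isfile(src_path) and os.path.getmtime(src_path) > last:
        shutil.copy2(src_path, os.path.join(DST, name))    # copy2 preserves timestamps

with open(STAMP, "w") as f:
    f.write(str(time.time()))      # the next run only copies files changed after this point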

SEARCHING

As the number of files in your folders increases, browsing through folders becomes a cumbersome
way of looking for files. However, you can find the file you need from among thousands of photos,
texts and other files by using the search function of your operating system. The search function
allows you to look for files and folders based on file properties such as the file name, save date or
size.
In Windows, you can search for files quickly by clicking the Start button at the bottom left of the
screen. Then, simply type the full or partial name of the file, program or folder. The search begins
as soon as you start typing, and the results will appear above the search field. If the file or program
you are looking for does not appear right away, wait a moment, as the search can take a while.
Also note that the search results are grouped by file type. If you are unable to find the file you are
looking for by using the quick search, you can narrow the search results by file type by clicking
the icons above the search term. You can narrow down your search results to: apps, settings,
documents, folders, photos, videos and music. Let’s say you recently downloaded a few photos
that were attached to an email message, but now you’re not sure where these files are on your
computer. If you’re struggling to find a file, you can always search for it. On a Mac, searching lets
you look for any file on your computer: click the Spotlight icon in the top-right corner of
the screen, then type the file name or keywords in the search box. The search results will appear
as you type. Simply click a file or folder to open it. If you’re using the search option, try using
different terms in your search.

Chapter Six
Security
6.1. Overview of system security

Many companies possess valuable information they want to guard closely. Among many things,
this information can be technical (e.g., a new chip design or software), commercial (e.g., studies
of the competition or marketing plans), financial (e.g., plans for a stock offering) or legal (e.g.,
documents about a potential merger or takeover). Many people keep their financial information,
including tax returns and credit card numbers, on their computer.

Guarding the information against unauthorized usage is therefore a major concern of all operating
systems. Unfortunately, it is also becoming increasingly difficult due to the widespread acceptance
of system bloat (and the accompanying bugs) as a normal phenomenon. In this chapter we will
examine computer security as it applies to operating systems.

Security refers to the technical, administrative, legal, and political issues involved in making sure
that files are not read or modified by unauthorized persons. Protection mechanisms, in contrast,
refer to the specific operating system mechanisms used to safeguard information in the computer.
Security is therefore concerned with avoiding or minimizing the threats that would allow an
attacker to access the system.

6.2. The Security Environment Threats

6.2.1 Threat

A cyber security threat refers to any possible malicious attack that seeks to unlawfully access data,
disrupt digital operations or damage information. Cyber threats can originate from various actors,
including corporate spies, hacktivists, terrorist groups, hostile nation-states, criminal
organizations, lone hackers and disgruntled employees.

There are different kinds of threats, including malware, Emotet (a banking Trojan), denial of service,
man-in-the-middle attacks, phishing, SQL injection, and password attacks.

The security of an information system decomposes into three components: confidentiality, integrity,
and availability.

Confidentiality is concerned with having secret data remain secret. More specifically, if the owner
of some data has decided that these data are to be made available only to certain people and no
others, the system should guarantee that release of the data to unauthorized people never occurs.
As an absolute minimum, the owner should be able to specify who can see what, and the system
should enforce these specifications, which ideally should be per file.

Integrity means that unauthorized users should not be able to modify any data without the owner’s
permission. Data modification in this context includes not only changing the data, but also
removing data and adding false data. If a system cannot guarantee that data deposited in it remain
unchanged until the owner decides to change them, it is not worth much for data storage.

The third property, availability, means that nobody can disturb the system to make it unusable.
Such denial-of-service attacks are increasingly common. For example, if a computer is an Internet
server, sending a flood of requests to it may cripple it by eating up all of its CPU time just
examining and discarding incoming requests. If it takes, say, 100 μsec to process an incoming
request to read a Web page, then anyone who manages to send 10,000 requests/sec can wipe it out.
Reasonable models and technology for dealing with attacks on confidentiality and integrity are
available; foiling denial-of-service attacks is much harder.

6.2.2. Intruders

Intruders are attackers who attempt to breach the security of a network. They use different
techniques to attack personal or organizational accounts in order to gain access to the system,
and they are one of the main security threats.
Common Categories
1. Casual prying by nontechnical users
2. Snooping by insiders
3. Determined attempt to make money
4. Commercial or military espionage
6.2.3 Accidental Data Loss

Another security threat is accidental data loss, which can be caused by various natural and
human factors.

Common Causes
1. Nature
- fires, floods, wars
2. Hardware or software errors
- CPU malfunction, bad disk, program bugs
3. Human errors
- data entry, wrong tape mounted
6.3. System protection, authentication

6.3.1. Formal models of protection

Protection models represent the protected objects in a system, how users or subjects (their proxies
in the computer system) may request access to them, how access decisions are made, and how the
rules governing access decisions may be altered. The access matrix model is the primary example
of a protection model.

The access matrix is a security model of the protection state in a computer system. It is represented
as a matrix and is used to define the rights of each process executing in a domain with respect to
each object. The rows of the matrix represent domains and the columns represent objects. Each
cell of the matrix holds a set of access rights: each entry (i, j) defines the set of operations that a
process executing in domain Di can invoke on object Oj.
In the example matrix there are four domains and four objects: three files (F1, F2, F3)
and one printer. A process executing in D1 can read files F1 and F3. A process executing in domain
D4 has the same rights as D1, but it can also write to those files. The printer can be accessed only by
a process executing in domain D2. The access-matrix mechanism involves a number of policies and
semantic properties. Specifically, we must ensure that a process executing in domain Di can access
only those objects that are specified in row i.

The association between a domain and its processes can be either static or dynamic. The access
matrix provides a mechanism for defining the control of this association between domains and
processes. When we switch a process from one domain to another, we execute a switch operation
on an object (the domain). We can control domain switching by including domains among the
objects of the access matrix. A process should be able to switch from one domain (Di) to another
domain (Dj) if and only if the switch right is present in entry access(i, j).
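A small table is enough to capture the idea. The sketch below encodes the example matrix described above as a dictionary and answers access queries; since the original figure is not reproduced here, the exact right name for D2's printer access and the row for domain D3 are assumptions.

# Access-matrix sketch for the example above (D2's "print" right and D3's row are assumptions,
# since the original figure is not reproduced in the text).

access_matrix = {
    ("D1", "F1"): {"read"},
    ("D1", "F3"): {"read"},
    ("D2", "printer"): {"print"},
    ("D3", "F2"): {"read"},                # assumed
    ("D4", "F1"): {"read", "write"},
    ("D4", "F3"): {"read", "write"},
}

def can(domain, obj, right):
    # True if a process running in `domain` may perform `right` on `obj`.
    return right in access_matrix.get((domain, obj), set())

print(can("D1", "F1", "read"))        # True
print(can("D1", "F1", "write"))       # False
print(can("D2", "printer", "print"))  # True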

6.3.2. Memory protection

Memory protection is a way to manage access rights to specific memory regions. It is used
by the majority of multi-tasking operating systems. The main goal of memory protection
is to prevent a process from accessing memory that has not been allocated to that
process. Such restrictions improve the reliability of programs and operating systems, as an error in
one program cannot directly affect the memory of other applications. It is important to distinguish
the general principle of memory protection from specific mechanisms such as ASLR and the NX bit.

6.3.3. Encryption

Encryption is a method of securing data by scrambling the bits of a computer’s files so that they
become illegible. The only method of reading the encrypted files is by decrypting them with a
key; the key is unlocked with a password. Encryption is discussed further in Section 6.4, Basics of
Cryptography.

6.3.4 Recovery Management


Recovery management is the process of planning, testing, and implementing the recovery
procedures and standards required to restore service in the event of a component failure, either by
returning the component to normal operation or by taking alternative actions to restore service.

6.3.5. Authentication

6.4. Basics of Cryptography

The word cryptography comes from the Greek root crypt, meaning “hidden”.
Cryptography is the study of techniques for keeping information secure. It is the study of secret
writing.

The purpose of cryptography is to take a message or file, called the plaintext, and encrypt it into
ciphertext in such a way that only authorized people know how to convert it back to plaintext. For
all others, the ciphertext is just an incomprehensible pile of bits. Strange as it may sound to
beginners in the area, the encryption and decryption algorithms (functions) should always be
public. Trying to keep them secret almost never works and gives the people trying to keep the
secrets a false sense of security. In the trade, this tactic is called security by obscurity and is
employed only by security amateurs.

6.4.1. Secret-Key Cryptography

In secret-key cryptography, the sender and the receiver of the information use a shared key. The
same key is used for both encryption and decryption, and it must be delivered to the receiver over
some other channel before the encrypted data is sent. The key can define a mono-alphabetic
substitution or simply shift letters.
• Mono-alphabetic substitution
To make encryption clearer, consider an encryption algorithm in which each letter is
replaced by a different letter; for example, all A’s are replaced by Q’s, all B’s are
replaced by W’s, all C’s are replaced by E’s, and so on.
– This kind of encryption replaces each of the 26 letters with another letter, producing the
ciphertext from the plaintext (a sketch of such a cipher follows this list).
 Secret-key crypto called symmetric-key crypto
– Given the encryption key, it is easy to find decryption key
– Single key is used to encrypt and decrypt information
 Public-Key Cryptography
• All users pick a public key/private key pair
– publish the public key
– private key not published
– Public key is the encryption key
– private key is the decryption key
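The sketch below implements a mono-alphabetic substitution of the kind described in this list, using an arbitrarily chosen 26-letter key. Because the same shared key drives both encryption and decryption, it is a (very weak) example of secret-key, i.e. symmetric, cryptography.

# Mono-alphabetic substitution sketch: one shared 26-letter key encrypts and decrypts (symmetric).
import string

PLAIN = string.ascii_uppercase
KEY   = "QWERTYUIOPASDFGHJKLZXCVBNM"        # arbitrary permutation shared by sender and receiver

encrypt_table = str.maketrans(PLAIN, KEY)
decrypt_table = str.maketrans(KEY, PLAIN)

ciphertext = "ATTACK AT DAWN".translate(encrypt_table)
print(ciphertext)                             # QZZQEA QZ RQVF
print(ciphertext.translate(decrypt_table))    # ATTACK AT DAWN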