Operating System Module
Jimma, Ethiopia
March, 2023
Chapter One
1. Overview and History of Operating system
1.1. Introduction to Operating System
In a computer system, we find three main components: the computer hardware, the computer software, and the users. The hardware provides the basic computing resources, while the computer software can be divided into two main categories: application software and system software.
Application software consists of the programs for performing tasks particular to the machine’s
utilization. This software is designed to solve a particular problem for users. Examples of application
software include spreadsheets, database systems, desktop publishing systems, program development
software, and games. On the other hand, system software is more transparent and less noticed by the
typical computer user. This software provides a general programming environment in which
programmers can create specific applications to suit their needs. This environment provides new
functions that are not available at the hardware level and performs tasks related to executing the
application program. System software acts as an interface between the hardware of the computer and
the application software that users need to run on the computer. The most important type of system
software is the operating system.
An Operating System (OS) is a collection of programs that acts as an interface between a user of a
computer and the computer hardware. The purpose of an operating system is to provide an environment
in which a user may execute the programs. Operating Systems are viewed as resource managers. The
main resource is the computer hardware in the form of processors, storage, input/output devices,
communication devices, and data. Some of the operating system functions are: implementing the user
interface, sharing hardware among users, allowing users to share data among themselves, preventing
users from interfering with one another, scheduling resources among users, facilitating input/output,
recovering from errors, accounting for resource usage, facilitating parallel operations, organizing data for
secure and rapid access, and handling network communications.
A computer’s operating system is a group of programs designed to serve two basic purposes:
To control the allocation and use of the computing system’s resources among the various users
and tasks, and
To provide an interface between the computer hardware and the programmer that simplifies and
makes feasible the creation, coding, debugging, and maintenance of application programs.
An effective operating system should accomplish the following functions:
Should act as a command interpreter by providing a user friendly environment
Should facilitate communication with other users.
Provide for long term storage of user information in the form of files.
Permit system resources to be shared among users when appropriate, and be protected from
unauthorized or mischievous intervention as necessary.
Assure that when there are several active processes in the computer, each will get fair and non-
interfering access to the central processing unit for execution.
Though system programs such as editors and translators and the various utility programs (such as sort and file transfer programs) are not usually considered part of the operating system, the operating system is responsible for providing access to these system resources.
The abstract view of the components of a computer system and the positioning of the OS is shown in Figure 1.
1.3. History of Operating Systems
Operating systems have been evolving over the years. In this section we briefly look at this development with respect to the evolution of the hardware and architecture of computer systems. Since operating systems have historically been closely tied to the architecture of the computers on which they run, we will look at successive generations of computers to see what their operating systems were like. The mapping of operating system generations to computer generations is only approximate, but it roughly conveys the idea behind them.
We can divide them into distinct generations characterized by hardware component technology, software development, and mode of delivery of computer services. The evolution of operating systems through the years can thus be mapped using these generations, which can be described as follows:
1.3.1. The First Generation ( 1945 - 1955 ): Vacuum Tubes and Plug boards
Digital computers were not successfully constructed until around the Second World War. Calculating engines with mechanical relays were built at that time; however, the mechanical relays were very slow and were later replaced with vacuum tubes. These machines were enormous but still very slow.
These early computers were designed, built and maintained by a single group of people. Programming
languages were unknown and there were no operating systems so all the programming was done in machine
language. All the problems were simple numerical calculations.
By the 1950’s punch cards were introduced and this improved the computer system. Instead of using plug
boards, programs were written on cards and read into the system.
1.3.2. The Second Generation ( 1955 - 1965 ): Transistors and Batch Systems
Transistors led to the development of the computer systems that could be manufactured and sold to paying
customers. These machines were known as mainframes and were locked in air-conditioned computer rooms
with staff to operate them.
The Batch System was introduced to reduce the wasted time in the computer. A tray full of jobs was collected
in the input room and read into the magnetic tape. After that, the tape was rewound and mounted on a tape
drive. Then the batch operating system was loaded, which read the first job from the tape and ran it. The
output was written on the second tape. After the whole batch was done, the input and output tapes were
removed and the output tape was printed.
1.3.3. The Third Generation ( 1965 - 1980 ): IC and Multiprogramming
Until the 1960’s, there were two types of computer systems i.e. the scientific and the commercial computers.
These were combined by IBM in the System/360. This used integrated circuits and provided a major price
and performance advantage over the second generation systems.
The third generation operating systems also introduced multiprogramming. This meant that the processor
was not idle while a job was completing its I/O operation. Another job was scheduled on the processor so
that its time would not be wasted.
1.3.4. The Fourth Generation ( 1980 - Present ): Personal Computers
Personal Computers were easy to create with the development of large-scale integrated circuits. These were
chips containing thousands of transistors on a square centimeter of silicon. Because of these, microcomputers
were much cheaper than minicomputers and that made it possible for a single individual to own one of them.
The advent of personal computers also led to the growth of networks. This created network operating systems
and distributed operating systems. The users were aware of a network while using a network operating
system and could log in to remote machines and copy files from one machine to another.
Network operating systems are not fundamentally different from single processor operating systems.
They obviously need a network interface controller and some low-level software to drive it, as well as programs to achieve remote login and remote file access, but these additions do not change the essential
structure of the operating systems.
A distributed computing system consists of a number of computers that are connected and managed so
that they automatically share the job processing load among the constituent computers, or separate the job load as appropriate for particularly configured processors. Such a system requires an operating system
which, in addition to the typical stand-alone functionality, provides coordination of the operations and
information flow among the component computers. The networked and distributed computing
environments and their respective operating systems are designed with more complex functional
capabilities. In a network operating system, the users are aware of the existence of multiple computers,
and can log in to remote machines and copy files from one machine to another. Each machine runs its
own local operating system and has its own user (or users).
A distributed operating system, in contrast, is one that appears to its users as a traditional uni-processor
system, even though it is actually composed of multiple processors. In a true distributed system, users
should not be aware of where their programs are being run or where their files are located; that should all
be handled automatically and efficiently by the operating system.
True distributed operating systems require more than just adding a little code to a uni-processor operating
system, because distributed and centralized systems differ in critical ways. Distributed systems, for example, often allow programs to run on several processors at the same time, thus requiring more complex
processor scheduling algorithms in order to optimize the amount of parallelism achieved.
1.4.8. Operating Systems for Embedded Devices
As embedded systems (PDAs, cellphones, point-of-sale devices, VCRs, industrial robot controllers, or even your toaster) become more complex hardware-wise with every generation, and more features are put into them day by day, the applications they run increasingly require actual operating system code in order to keep development time reasonable. Some of the popular embedded operating systems are:
Nexus’s Conix - an embedded operating system for ARM processors.
Microsoft’s Windows CE and Windows NT Embedded OS.
Palm Computing’s Palm OS - currently the leading OS for PDAs, with many applications and supporting companies.
Sun’s Java OS - a standalone virtual machine not running on top of any other OS; mainly
targeted at embedded systems
1.5. Functions of Operating System
1.5.1. Process Management
The CPU executes a large number of programs. While its main concern is the execution of user
programs, the CPU is also needed for other system activities. These activities are called processes. A
process is a program in execution. Typically, a batch job is a process. A time-shared user program is a
process. A system task, such as spooling, is also a process. For now, a process may be considered as a
job or a time- shared program, but the concept is actually more general.
The operating system is responsible for the following activities in connection with process management: creating and deleting processes, suspending and resuming processes, and providing mechanisms for process synchronization and communication.
1.5.2. Memory Management
There are various algorithms for managing memory, and which is appropriate depends on the particular situation. Selection of a memory management scheme for a specific system depends upon many factors, but especially upon the hardware design of the system, since each algorithm requires its own hardware support. The operating system is responsible for the following activities in connection with memory management:
Decide which processes are to be loaded into memory when memory space becomes available.
Keep track of which parts of memory are currently being used and by whom.
There are few alternatives to disks for secondary storage. Magnetic tape systems are generally too slow and, in addition, are limited to sequential access; thus tapes are more suited for storing infrequently used files, where speed is not a primary concern.
The operating system is responsible for the following activities in connection with disk management: free-space management, storage allocation, and disk scheduling.
1.5.6. Protection
The various processes in an operating system must be protected from each other’s activities. For that
purpose, various mechanisms which can be used to ensure that the files, memory segment, CPU and other
resources can be operated on only by those processes that have gained proper authorization from the
operating system. For example, memory addressing hardware ensures that a process can only execute
within its own address space. The timer ensures that no process can gain control of the CPU without
relinquishing it. Finally, no process is allowed to do its own I/O, to protect the integrity of the various
peripheral devices. Protection refers to a mechanism for controlling the access of programs, processes, or users to the resources defined by a computer system. This mechanism must provide a means for specifying the controls to be imposed, together with some means of enforcement. Protection can improve reliability by detecting latent errors at the interfaces between
component subsystems. Early detection of interface errors can often prevent contamination of a healthy
subsystem by a subsystem that is malfunctioning. An unprotected resource cannot defend against use (or
misuse) by an unauthorized or incompetent user.
1.5.7. Networking
A distributed system is a collection of processors that do not share memory or a clock. Instead, each processor
has its own local memory, and the processors communicate with each other through various communication
lines, such as high speed buses or telephone lines. Distributed systems vary in size and function. They may
involve microprocessors, workstations, minicomputers, and large general purpose computer systems.
The processors in the system are connected through a communication network, which can be configured in a number of different ways. The network may be fully or partially connected. The communication network
design must consider routing and connection strategies and the problems of connection and security. A
distributed system provides the user with access to the various resources the system maintains. Access to a
shared resource allows computation speed-up, data availability, and reliability.
The command statements themselves deal with process management, I/O handling, secondary storage
management, main memory management, file system access, protection, and networking.
The Figure 2 depicts the role of the operating system in coordinating all the functions.
Figure 2: The operating system coordinating its major functions: process management, memory management, secondary storage management, I/O management, file management, protection and security, communication management, networking, and the user interface.
Chapter Two
2.1. Process
Modern computers often do several things at the same time. People used to working with personal computers may not be fully aware of this fact, so a few examples may make the point clearer. First consider a Web server. Requests come in from all over asking for Web pages. When a request comes in, the server checks to see if the page needed is in the cache. If it is, it is sent back; if it is not, a disk request is started to fetch it. However, from the CPU's perspective, disk requests take an eternity. While waiting for one disk request to complete, many more requests may come in. If there are multiple disks present, some or all of the newer requests may be fired off to other disks long before the first request is satisfied. Clearly some way is needed to model and control this concurrency. Processes (and especially threads) can help here.
A process is more than the program code, which is sometimes known as the text section. It also includes the current activity, as represented by the value of the program counter and the contents of the processor’s registers. A process generally also includes the process stack, which contains temporary data (such as function parameters, return addresses, and local variables), and a data section, which contains global variables.
Process Creation
There are four principal events that cause processes to be created: system initialization, execution of a process-creation system call by a running process, a user request to create a new process, and initiation of a batch job.
Process Termination
After a process has been created, it starts running and does whatever its job is. However, nothing lasts forever, not even processes. Sooner or later the new process will terminate, usually due to one of the following conditions: normal exit (voluntary), error exit (voluntary), fatal error (involuntary), or being killed by another process (involuntary).
Process States
As a process executes, it changes state. The state of a process is defined in part by the current activity of that process. A process may be in one of several states, such as new, ready, running, waiting, or terminated. These names are arbitrary, and they vary across operating systems. The states that they represent are found on all systems, however. Certain operating systems also more finely delineate process states. It is important to realize that only one process can be running on any processor at any instant, although many processes may be ready and waiting. The state diagram corresponding to these states is presented below in Figure 2.0.
Figure 2.0: Process state diagram showing the Ready, Running, and Waiting states, with transitions for scheduler dispatch, I/O or event wait, and I/O or event completion.
Process Control Block
Each process is represented in the operating system by a process control block (PCB), which contains many pieces of information associated with a specific process, including these:
Process state. The state may be new, ready, running, waiting, halted, and so on.
Program counter. The counter indicates the address of the next instruction to be executed for this process.
CPU registers. The registers vary in number and type, depending on the computer architecture. They include accumulators, index registers, stack pointers, and general-purpose registers, plus any condition-code information. Along with the program counter, this state information must be saved when an interrupt occurs.
CPU-scheduling information. This information includes a process priority, pointers to scheduling queues, and any other scheduling parameters.
I/O status information. This information includes the list of I/O devices allocated to the process, a list of open files, and so on.
In brief, the PCB simply serves as the repository for any information that may vary
from process to process.
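To make the idea concrete, a simplified PCB might be declared as a C structure along the following lines; the field names and sizes are illustrative only, and a real operating system stores considerably more information.
/* A simplified, hypothetical process control block; field names and sizes are
   illustrative only. */
#include <stdint.h>

enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

struct pcb {
    int              pid;               /* process identifier */
    enum proc_state  state;             /* new, ready, running, waiting, ... */
    uint64_t         program_counter;   /* address of the next instruction */
    uint64_t         registers[16];     /* saved CPU registers (count is illustrative) */
    int              priority;          /* CPU-scheduling information */
    struct pcb      *next_in_queue;     /* pointer used by the scheduling queues */
    int              open_files[16];    /* I/O status: open file descriptors */
};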
2.1.6. Modeling Multiprogramming
When multiprogramming is used, the CPU utilization can be improved. Crudely put, if the
average process computes only 20% of the time it is sitting in memory, with five processes
in memory at once, the CPU should be busy all the time. This model is unrealistically optimistic, however, since it tacitly assumes that all five processes will never be waiting for I/O at the same time.
A better model is to look at CPU usage from a probabilistic viewpoint. Suppose that a process spends a fraction p of its time waiting for I/O to complete. With n processes in memory at once, the probability that all n processes are waiting for I/O (in which case the CPU will be idle) is p^n. The CPU utilization is then given by the formula: CPU utilization = 1 - p^n.
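As a quick illustration of this formula, the small C program below (values chosen arbitrarily; compile with -lm) prints the utilization 1 - p^n for increasing degrees of multiprogramming n, assuming each process waits for I/O 80% of the time.
#include <math.h>
#include <stdio.h>

int main(void)
{
    double p = 0.8;                               /* fraction of time spent waiting for I/O */
    for (int n = 1; n <= 10; n++)                 /* n = degree of multiprogramming */
        printf("n = %2d  CPU utilization = %3.0f%%\n", n, 100.0 * (1.0 - pow(p, n)));
    return 0;
}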
2.2. Threads
The process model discussed so far has implied that a process is a program that performs a single thread of execution. For example, when a process is running a word-processor program, a single thread of instructions is being executed. This single thread of control allows the process to perform only one task at a time. The user cannot simultaneously type in characters and run the spell checker within the same process, for example. Most modern operating systems have extended the process concept to allow a process to have multiple
threads of execution and thus to perform more than one task at a time. This feature is
especially beneficial on multicore systems, where multiple threads can run in parallel. On
a system that supports threads, the PCB is expanded to include information for each thread.
Other changes throughout the system are also needed to support threads.
A thread is a basic unit of CPU utilization; it comprises a thread ID, a program counter, a
register set, and a stack. It shares with other threads belonging to the same process its
code section, data section, and other operating-system resources, such as open files and signals. A traditional (or heavyweight) process has a single thread of control. If a process has multiple threads of control, it can perform more than one task at a time. Figure 2.1
illustrates the difference between a traditional single-threaded process and a multithreaded
process.
2.2.1. Thread Usage
Why would anyone want to have a kind of process within a process? It turns out there are several reasons for having these mini processes, called threads. Let us now examine some of them. The main reason for having threads is that in many applications, multiple activities are going on at once. Some of these may block from time to time. By decomposing such an
going on at once. Some of these may block from time to time. By decomposing such an
application into multiple sequential threads that run in quasi-parallel, the programming model
becomes simpler.
We have seen this argument before. It is precisely the argument for having processes. Instead
of thinking about interrupts, timers, and context switches, we can think about parallel
processes. Only now with threads we add a new element: the ability for the parallel entities
to share an address space and all of its data among themselves. This ability is essential for
certain applications, which is why having multiple processes (with their separate address
spaces) will not work.
A second argument for having threads is that since they are lighter weight than processes,
they are easier (i.e., faster) to create and destroy than processes. In many systems, creating
a thread goes 10-100 times faster than creating a process. When the number of threads needed
changes dynamically and rapidly, this property is useful to have.
A third argument for threads is a performance one: when there is substantial computing and also substantial I/O, threads allow these activities to overlap, speeding up the application. Consider, for example, a word processor displaying a book of many hundreds of pages; if the user deletes one sentence from page 1, the whole book may need to be reformatted. Threads can help here. Suppose that the word processor is written as a two-threaded program.
One thread interacts with the user and the other handles reformatting in the background. As
soon as the sentence is deleted from page 1, the interactive thread tells the reformatting thread
to reformat the whole book. Meanwhile, the interactive thread continues to listen to the
keyboard and mouse and responds to simple commands like scrolling page 1 while the other
thread is computing madly in the background. With a little luck, the reformatting will be
completed before the user asks to see page 600, so it can be displayed instantly.
While we are at it, why not add a third thread? Many word processors have a feature of
automatically saving the entire file to disk every few minutes to protect the user against losing
a day's work in the event of a program crash, system crash, or power failure. The third thread
can handle the disk backups without interfering with the other two.
Finally, threads are useful on systems with multiple CPUs, where real parallelism is possible.
2.2.2. Benefits of Multithreading
The benefits of multithreaded programming can be broken down into four major categories:
Responsiveness. Multithreading an interactive application may allow a program to continue running even if part of it is blocked or is performing a lengthy operation,
thereby increasing responsiveness to the user. This quality is especially useful in
designing user interfaces. For instance, consider what happens when a user clicks a
button that results in the performance of a time-consuming operation. A single-
threaded application would be unresponsive to the user until the operation had
completed. In contrast, if the time-consuming operation is performed in a separate
thread, the application remains responsive to the user.
Resource sharing. Processes can share resources only through techniques such as shared memory and message passing. Such techniques must be explicitly arranged by the programmer. However, threads share the memory and the resources of the process to which they belong by default. The benefit of sharing code and data is that it allows an application to have several different threads of activity within the same address space.
Economy. Allocating memory and resources for process creation is costly. Because threads share the resources of the process to which they belong, it is more economical to create and context-switch threads. Empirically gauging the difference in overhead can be difficult, but in general it is significantly more time consuming to create and manage processes than threads.
Scalability. The benefits of multithreading can be even greater in a multiprocessor architecture, where threads may be running in parallel on different processing cores. A single-threaded process can run on only one processor, regardless of how many are available. We explore this issue further in the following section.
2.2.3. Multithreading Models
Our discussion so far has treated threads in a generic sense. However, support for threads may be provided either at the user level, for user threads, or by the kernel, for kernel threads. User threads are supported above the kernel and are managed without kernel support, whereas kernel threads are supported and managed directly by the operating system. Virtually all contemporary operating systems support kernel threads.
Ultimately, a relationship must exist between user threads and kernel threads. In this section,
we look at three common ways of establishing such a relationship: the many-to-one model, the one-to-one model, and the many-to-many model.
2.2.4. POSIX Threads
The POSIX thread libraries are a standards-based C/C++ thread API. They enable the creation of a new concurrent flow of control within a process. They work well on multi-processor or multi-core systems, where the flow may be scheduled to execute on another processor, increasing speed through parallel or distributed processing. Because the system does not create a new virtual memory space and environment for the process, threads need less overhead than "forking" or creating a new process.
While multiprocessor systems benefit the most, gains can also be obtained on uniprocessor systems that exploit latency in I/O and other system functions that may halt process execution. To use the PThread interfaces, we must include the header pthread.h at the start of the C/C++ source file:
#include <pthread.h>
PThreads is a highly concrete multithreading system that is the UNIX system’s default
standard. PThreads is an abbreviation for POSIX threads, and POSIX is an abbreviation for
Portable Operating System Interface, which is a type of interface that the operating system
must implement. PThreads in POSIX outline the threading APIs that the operating system
must provide.
Why is PThreads used?
The fundamental purpose for adopting PThreads is to improve programme
performance.
When compared to the expense of starting and administering a process, a thread
requires far less operating system overhead. Thread management takes fewer system
resources than process management.
A process’s threads all share the same address space. Inter-thread communication is
more efficient and, in many circumstances, more user-friendly than inter-process
communication.
Threaded applications provide possible performance increases and practical
advantages over non-threaded programmes in a variety of ways.
Multi-threaded programmes will run on a single-processor system but will
automatically make use of a multiprocessor machine without the need for
recompilation.
The most significant reason for employing PThreads in a multiprocessor system is to
take advantage of possible parallelism. This will be the major emphasis of the rest of
this lesson.
In order for a programme to use PThreads, it must be divided into discrete,
independent tasks that may run concurrently.
The new thread is made runnable, and it will begin performing the start routine using
the arg argument as the argument. The arg parameter is a void pointer that can point
to any type of data. Casting this pointer into a scalar data type (such as int) is not
advised since the casts may not be portable.
Let’s have a look at a C example:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
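A complete minimal example built on those includes might look like the sketch below; the thread function name and the message passed through arg are made up for illustration, and the program is compiled with the -pthread flag.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static void *start_routine(void *arg)
{
    /* arg is a void pointer; here it points to a string supplied by the creator */
    printf("worker thread says: %s\n", (char *)arg);
    return NULL;
}

int main(void)
{
    pthread_t tid;
    char msg[] = "hello from pthreads";

    /* create the new thread; it becomes runnable and starts in start_routine(arg) */
    if (pthread_create(&tid, NULL, start_routine, msg) != 0) {
        fprintf(stderr, "pthread_create failed\n");
        exit(EXIT_FAILURE);
    }
    pthread_join(tid, NULL);   /* wait for the worker thread to finish */
    return 0;
}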
2.2.5. Implementing Threads in User Space
There are two main ways to implement a threads package: in user space and in the kernel. The choice is moderately controversial, and a hybrid implementation is also possible. We will now describe these methods, along with their advantages and disadvantages.
The first method is to put the threads package entirely in user space. The kernel knows nothing about the threads; as far as the kernel is concerned, it is managing ordinary, single-threaded processes. The first, and most obvious, advantage is that a user-level threads package can be implemented on an operating system that does not support threads. With this approach, threads are implemented by a library.
However, there is one key difference with processes. When a thread is finished running for the moment, for example, when it calls thread_yield, the code of thread_yield can save the thread's information in the thread table itself. Furthermore, it can then call the thread scheduler to pick another thread to run. The procedure that saves the thread's state and the scheduler are just local procedures, so invoking them is much more efficient than making a kernel call.
Among other issues, no trap is needed, no context switch is needed, and the memory cache
need not be flushed, and so on. This makes thread scheduling very fast.
User-level threads also have other advantages. They allow each process to have its own
customized scheduling algorithm. For some applications, for example, those with a garbage
collector thread, not having to worry about a thread being stopped at an inconvenient moment
is a plus. They also scale better, since kernel threads invariably require some table space and stack space in the kernel, which can be a problem if there are a very large number of threads.
Despite their better performance, user-level threads packages have some major problems.
First among these is the problem of how blocking system calls are implemented.
2.2.6. Implementing Threads in the Kernel
The kernel's thread table holds each thread's registers, state, and other information. The
information is the same as with user-level threads, but now kept in the kernel instead of in
user space (inside the run-time system). This information is a subset of the information that
traditional kernels maintain about their single- threaded processes, that is, the process state.
In addition, the kernel also maintains the traditional process table to keep track of processes.
All calls that might block a thread are implemented as system calls, at considerably greater
cost than a call to a run-time system procedure. When a thread blocks, the kernel, at its option,
can run either another thread from the same process (if one is ready) or a thread from a
different process. With user-level threads, the run-time system keeps running threads from its
own process until the kernel takes the CPU away from it (or there are no ready threads left
to run).
Due to the relatively greater cost of creating and destroying threads in the kernel, some systems take an environmentally correct approach and recycle their threads. When a thread is
destroyed, it is marked as not runnable, but its kernel data structures are not otherwise
affected. Later, when a new thread must be created, an old thread is reactivated, saving some
overhead. Thread recycling is also possible for user-level threads, but since the thread
management overhead is much smaller, there is less incentive to do this.
2.2.7. Hybrid Implementations
Various ways have been investigated to try to combine the advantages of user-level threads
with kernel-level threads. One way is to use kernel-level threads and then multiplex user-level threads onto some or all of the kernel threads. When this approach is used, the programmer can determine how many kernel threads to use and how many user-level threads to multiplex on each one. This model gives the ultimate in flexibility.
2.3. Interprocess Communication
Processes frequently need to communicate with other processes. For example, in a shell
pipeline, the output of the first process must be passed to the second process, and so on
down the line. Thus there is a need for communication between processes, preferably in a
well-structured way not using interrupts. In the following sections we will look at some of
the issues related to this Inter process Communication, or IPC.
There are three issues:
The first was alluded to above: how one process can pass information to another.
The second has to do with making sure two or more processes do not get in each
other's way, for example, two processes in an airline reservation system each trying
to grab the last seat on a plane for a different customer.
The third concerns proper sequencing when dependencies are present: if process A
produces data and process B prints them, B has to wait until A has produced some data
before starting to print.
It is also important to mention that two of these issues apply equally well to threads. The first one, passing information, is easy for threads since they share a common address space (threads in different address spaces that need to communicate fall under the heading of communicating processes). However, the other two, keeping out of each other's hair and proper sequencing, apply equally well to threads. The same problems exist and the same solutions apply. Below we will discuss the problem in the context of processes, but please keep in mind that the same problems and solutions also apply to threads.
A good solution to the critical-region problem must satisfy four conditions: no two processes may be simultaneously inside their critical regions; no assumptions may be made about speeds or the number of CPUs; no process running outside its critical region may block other processes; and no process should have to wait forever to enter its critical region.
Solutions based on busy waiting, in which a process continuously tests a lock variable until it may enter its critical region, can meet these conditions, but not only does this approach waste CPU time, it can also have unexpected effects.
Consider a computer with two processes, H, with high priority, and L, with low priority.
The scheduling rules are such that H is run whenever it is in ready state. At a certain moment,
with L in its critical region, H becomes ready to run (e.g., an I/O operation completes). H
now begins busy waiting, but since L is never scheduled while H is running, L never gets
the chance to leave its critical region, so H loops forever. This situation is sometimes
referred to as the priority inversion problem.
Now let us look at some inter process communication primitives that block instead of
wasting CPU time when they are not allowed to enter their critical regions. One of the
simplest is the pair sleep and wakeup. Sleep is a system call that causes the caller to block,
that is, be suspended until another process wakes it up. The wakeup call has one parameter,
the process to be awakened. Alternatively, both sleep and wakeup each have one parameter,
a memory address used to match up sleeps with wakeups.
Producer-Consumer Problem: - This is a classic problem used for multi-process synchronization, i.e., synchronization between more than one process. In the producer-
consumer problem, there is one Producer that is producing something and there is one
Consumer that is consuming the products produced by the Producer. The producers and
consumers share the same memory buffer that is of fixed-size. The job of the producer is to
generate the data, put it into the buffer, and again start generating data. While the job of the
Consumer is to consume the data from the buffer.
The following are the problems that might occur in the Producer-Consumer:
The producer should produce data only when the buffer is not full. If the buffer is
full, then the producer shouldn't be allowed to put any data into the buffer.
The consumer should consume data only when the buffer is not empty. If the buffer
is empty, then the consumer shouldn't be allowed to take any data from the buffer.
The producer and consumer should not access the buffer at the same time.
The above three problems can be solved with the help of semaphores.
Semaphore: - A semaphore is an integer variable that is primarily used to solve the critical-section problem by combining two atomic operations, wait and signal, for process synchronization.
In the producer-consumer problem, we use three semaphore variables:
Semaphore S: This semaphore variable is used to achieve mutual exclusion between
processes. By using this variable, either Producer or Consumer will be allowed to use
or access the shared buffer at a particular time. This variable is set to 1 initially.
Semaphore E: This semaphore variable is used to define the empty space in the
buffer. Initially, it is set to the whole space of the buffer i.e. "n" because the buffer is
initially empty.
Semaphore F: This semaphore variable is used to define the space that is filled by
the producer. Initially, it is set to "0" because there is no space filled by the producer
initially.
By using the above three semaphore variables and the wait() and signal() functions, we can solve our problem (the wait() function decreases a semaphore variable by 1 and the signal() function increases it by 1).
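A hedged sketch of this solution using POSIX semaphores and pthreads is shown below; the buffer size, item count, and variable names follow the description above but are otherwise illustrative (compile with -pthread).
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define N 4                     /* buffer size */

static int buffer[N], in = 0, out = 0;
static sem_t S;                 /* mutual exclusion on the buffer, initially 1 */
static sem_t E;                 /* counts empty slots, initially N */
static sem_t F;                 /* counts filled slots, initially 0 */

static void *producer(void *arg)
{
    for (int item = 0; item < 10; item++) {
        sem_wait(&E);           /* wait for an empty slot */
        sem_wait(&S);           /* enter critical section */
        buffer[in] = item;
        in = (in + 1) % N;
        sem_post(&S);           /* leave critical section */
        sem_post(&F);           /* signal a filled slot */
    }
    return NULL;
}

static void *consumer(void *arg)
{
    for (int i = 0; i < 10; i++) {
        sem_wait(&F);           /* wait for a filled slot */
        sem_wait(&S);
        int item = buffer[out];
        out = (out + 1) % N;
        sem_post(&S);
        sem_post(&E);           /* signal an empty slot */
        printf("consumed %d\n", item);
    }
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    sem_init(&S, 0, 1);
    sem_init(&E, 0, N);
    sem_init(&F, 0, 0);
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}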
2.3.5. Mutexes
When the semaphore's ability to count is not needed, a simplified version of the semaphore, called a mutex, is sometimes used. Mutexes are good only for managing mutual exclusion to some shared resource or piece of code. They are easy and efficient to implement, which makes them especially useful in thread packages that are implemented entirely in user space.
A mutex is a variable that can be in one of two states: unlocked or locked. Consequently, only 1 bit is required to represent it, but in practice an integer is often used, with 0 meaning unlocked and all other values meaning locked.
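As a small illustration, the following sketch uses a pthread mutex (one common implementation of the idea) to protect a shared counter; the names and iteration count are made up.
#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);     /* blocks if another thread holds the lock */
        counter++;                     /* critical region */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   /* always 200000 with the mutex in place */
    return 0;
}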
2.3.6. Monitors
With semaphores and mutexes, interprocess communication looks easy, right? Forget it.
Monitors are used for process synchronization. With the help of programming languages, we can use a monitor to achieve mutual exclusion among processes. An example of a monitor-like facility is Java's synchronized methods, together with its wait() and notify() constructs. In other words, a monitor is a programming-language construct that helps control access to shared data.
The Monitor is a module or package which encapsulates shared data structure, procedures,
and the synchronization between the concurrent procedure invocations.
Characteristics of Monitors.
Inside the monitors, we can only execute one process at a time.
Monitors are the group of procedures, and condition variables that are merged
together in a special type of module.
If the process is running outside the monitor, then it cannot access the monitor’s
internal variable. But a process can call the procedures of the monitor.
Monitors offer a high level of synchronization.
Monitors were derived to simplify the complexity of synchronization problems.
There is only one process that can be active at a time inside the monitor.
Components of Monitor
There are four main components of the monitor:
1 Initialization
2 Private data
3 Monitor procedure
4 Monitor entry queue
Initialization: - Initialization comprises the code, and when the monitors are
created, we use this code exactly once.
Private Data: - Private data is another component of the monitor. It comprises all
the private data, and the private data contains private procedures that can only be
used within the monitor. So, outside the monitor, private data is not visible.
Monitor Procedure: - Monitors Procedures are those procedures that can be called
from outside the monitor.
Monitor Entry Queue: - The monitor entry queue is another essential component of the monitor; it contains all the threads that have called a monitor procedure but are waiting to enter the monitor.
Condition Variables
There are two types of operations that we can perform on the condition variables of
the monitor:
Wait
Signal
Suppose there are two condition variables
Condition a, b // Declaring variable
Wait Operation
a.wait(): - A process that performs a wait operation on a condition variable is suspended and placed in the blocked queue of that condition variable.
Signal Operation
a.signal() : - If a signal operation is performed by the process on the condition
variable, then a chance is provided to one of the blocked processes.
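C has no built-in monitors, but the wait and signal operations described above can be approximated with a pthread mutex acting as the monitor lock and a pthread condition variable; the sketch below is illustrative only, with made-up names.
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t monitor_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  a            = PTHREAD_COND_INITIALIZER;
static bool condition_holds = false;

/* Corresponds to a.wait(): the caller is suspended on a's queue and releases
 * the monitor lock while it waits. */
static void monitor_wait(void)
{
    pthread_mutex_lock(&monitor_lock);
    while (!condition_holds)
        pthread_cond_wait(&a, &monitor_lock);
    pthread_mutex_unlock(&monitor_lock);
}

/* Corresponds to a.signal(): gives one blocked thread a chance to proceed. */
static void monitor_signal(void)
{
    pthread_mutex_lock(&monitor_lock);
    condition_holds = true;
    pthread_cond_signal(&a);
    pthread_mutex_unlock(&monitor_lock);
}

static void *waiter(void *arg)    { monitor_wait(); printf("condition signalled\n"); return NULL; }
static void *signaller(void *arg) { sleep(1); monitor_signal(); return NULL; }

int main(void)
{
    pthread_t w, s;
    pthread_create(&w, NULL, waiter, NULL);
    pthread_create(&s, NULL, signaller, NULL);
    pthread_join(w, NULL);
    pthread_join(s, NULL);
    return 0;
}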
2.3.7. Message Passing
Process communication is the mechanism provided by the operating system that allows
processes to communicate with each other. This communication could involve a process
letting another process know that some event has occurred or transferring of data from one
process to another. One of the models of process communication is the message passing.
Message passing allows multiple processes to read and write data to the message queue
without being connected to each other. Messages are stored on the queue until their recipient
retrieves them. Message queues are quite useful for inter process communication and are used
by most operating systems. Message passing provides two operations: send(destination, &message) and receive(source, &message).
Figure 2.5 demonstrates the message passing model of process communication.
Figure 2.5 Message Passing Systems
The Producer-Consumer Problem with Message Passing: Message Passing allows us to solve
the Producer-Consumer problem on distributed systems. In the problem below, an actual buffer does not exist. Instead, the producer and consumer pass messages to each other. These messages can contain the items which, in the previous examples, were placed in a buffer. They can also contain empty messages, meaning that a slot in the "buffer" is ready to receive a new item. In this example, a buffer size of four has been chosen. The Consumer begins the process by sending four empty messages to the producer. The producer creates a new item for each empty message it receives, then blocks until the next empty message arrives. The producer will not create a new item unless it first receives an
empty message. The consumer waits for messages from the producer which contain the items.
Once it consumes an item, it sends an empty message to the producer. Again, there is no real
buffer, only a buffer size which dictates the number of items allowed to be produced.
BufferSize = 4;
Producer()
{
int widget;
message m; // message buffer
while (TRUE) {
make_item(widget); // create a new item
receive(consumer, &m); // wait for an empty message to arrive
build_message(&m, widget); // make a message to send to the consumer
send(consumer, &m); // send widget to consumer
}
}
Consumer()
{
int widget;
message m;
for (int i = 0; i < BufferSize; i++) send(producer, &m); // send BufferSize empty messages
while (TRUE) {
receive(producer, &m); // get message containing a widget
extract_item(&m, widget); // take item out of message
send(producer, &m); // reply with an empty message
consume_item(widget); // consume the item
}
}
The receiver must be able to distinguish between a new message and a retransmission of
an older message. The typical solution to this issue is to include consecutive sequence
numbers in each original message. Receivers are aware that a message is a duplicate and
might be ignored if it has the same sequence number as a prior message. So that the
process mentioned in a send or receive call is clear, message systems must also address
the issue of how processes are named.
A semaphore action or accessing a monitor is always faster than copying messages from
one process to another. Making message passing efficient has required a great deal of
effort. For instance, Cheriton (1984) proposed restricting the size of messages to that
which can fit in the machine's registers and then performing message passing using the
registers. With message passing, numerous variations are available. Let's start by
examining the manner in which communications are addressed. Messages can be
addressed to processes by giving each one a distinct address. Creating a new data structure
called a mailbox is an alternative method. A mailbox is a location where a specific amount
of messages can be buffered, usually one that is defined when the mailbox is formed. The
address parameters in the send and receive calls are mailboxes, not processes, when
mailboxes are used. In order to make place for a new message, a
Procedure that attempts to transmit to a mailbox that is already full is halted. Both the
producer and the consumer would build mailboxes big enough to house N messages for
the producer-consumer issue.
The consumer would send empty messages to the producer's mailbox, and the producer would respond by sending messages with data to the consumer's mailbox. Using mailboxes
makes the buffering mechanism obvious: messages transmitted to the destination process
but not yet accepted are stored in the destination mailbox. The opposite of having
mailboxes is to completely disable buffering. When using this method, if the send occurs
before the receive, the sending process is halted until the receive at which point the
message can be copied directly from the sender to the receiver without any buffering in
between. The receiver is also blocked until a send occurs if the receive is completed first.
This tactic is frequently referred to as a rendezvous. Since the transmitter and receiver
must run in lockstep, it is less flexible than a buffered message method but simpler to
construct.
2.4. SCHEDULING
Back in the old days of batch systems with input in the form of card images on a magnetic tape, the scheduling algorithm was simple: just run the next job on the tape. With multiprogramming systems, the scheduling algorithm became more complex because there
were generally multiple users waiting for service. Some mainframes still combine batch and
timesharing service, requiring the scheduler to decide whether a batch job or an interactive
user at a terminal should go next.
(As an aside, a batch job may be a request to run multiple programs in succession, but for this
section, we will just assume it is a request to run a single program.)
Because CPU time is a scarce resource on these machines, a good scheduler can make a big
difference in perceived performance and user satisfaction. Consequently, a great deal of work
has gone into devising clever and efficient scheduling algorithms. With the advent of personal
computers, the situation changed in two ways. First, most of the time there is only one active process. Second, computers have become so much faster over the years that the CPU is rarely a scarce resource any more.
When we turn to networked servers, the situation changes appreciably. Here multiple
processes often do compete for the CPU, so scheduling matters again. For example, when the
CPU has to choose between running a process that gathers the daily statistics and one that
serves user requests, the users will be a lot happier if the latter gets first crack at the CPU.
In addition to picking the right process to run, the scheduler also has to worry about making
efficient use of the CPU because process switching is expensive. To start with, a switch
from user mode to kernel mode must occur. Then the state of the current process must be
saved, including storing its registers in the process table so they can be reloaded later. In many
systems, the memory map (e.g., memory reference bits in the page table) must be saved as
well. Next a new process must be selected by running the scheduling algorithm. After that,
the MMU must be reloaded with the memory map of the new process. Finally, the new process
must be started.
In addition to all that, the process switch usually invalidates the entire memory cache, forcing
it to be dynamically reloaded from the main memory twice (upon entering the kernel and
upon leaving it). All in all, doing too many process switches per second can chew up a
substantial amount of CPU time, so caution is advised.
PROCESS BEHAVIOR
Nearly all processes alternate bursts of computing with (disk) I/O requests, as shown in Figure 2.6 below. Typically, the CPU runs for a while without stopping, and then a system call is made to read from a file or write to a file. When the system call completes, the CPU computes again until it needs more data or has to write more data, and so on.
Note that some I/O activities count as computing. For example, when the CPU copies bits to a video RAM to update the screen, it is computing, not doing I/O, because the CPU is in use. I/O in this sense is when a process enters the blocked state waiting for an external device to complete its work.
The important thing to notice about the figure is that some processes, such as the one in Fig. (a), spend most of their time computing, while others, such as the one in Fig. (b), spend most of their time waiting for I/O. The former are called compute-bound; the latter are called I/O-bound. Compute-bound processes typically have long CPU bursts and thus infrequent I/O waits, whereas I/O-bound processes have short CPU bursts and thus frequent I/O waits. Note that the key factor is the length of the CPU burst, not the length of the I/O burst. I/O-bound processes are I/O bound because they do not compute much between I/O requests, not because they have especially long I/O requests. It takes the same time to issue the hardware request to read a disk block no matter how much or how little time it takes to process the data after they arrive.
It is worth noting that as CPUs get faster, processes tend to get more I/O bound. This
effect occurs because CPUs are improving much faster than disks.
As a consequence, the scheduling of I/O-bound processes is likely to become a more
important subject in the future. The basic idea here is that if an I/O-bound process
wants to run, it should get a chance quickly so that it can issue its disk request and
keep the disk busy.
The objective of multiprogramming is to have some process running at all times, to maximize CPU utilization. The objective of time sharing is to switch the CPU among processes so frequently that users can interact with each program while it is running. On a uniprocessor, only one process can be running at a time. A process migrates between various scheduling queues throughout its lifetime. The process of selecting processes from among these queues is carried out by a scheduler. The aim of processor scheduling is to assign processes to be executed by the processor. Scheduling affects the performance of the system, because it determines which process will wait and which will progress.
Types of Scheduling
The long-term scheduler limits the number of processes admitted for processing by deciding which new jobs to add, on a first-come, first-served (FCFS) basis or according to priority, execution time, or I/O requirements. The long-term scheduler executes relatively infrequently.
Scheduling Criteria
Throughput is the rate at which processes are completed per unit of time.
Turnaround time:
This is how long it takes to execute a process. It is calculated as the time gap between the submission of a process and its completion.
Waiting time:
Waiting time is the sum of the time periods spent in waiting in the ready queue.
Response time:
Response time is the time it takes to start responding from submission time. It is
calculated asthe amount of time it takes from when a request was submitted until
the first response is produced.
Fairness:
Each process should get a fair share of the CPU.
Preemptive and Non-preemptive Scheduling
In non-preemptive mode, once a process is in the running state, it continues to execute until it terminates or blocks itself to wait for I/O. In preemptive mode, a currently running process may be interrupted and moved to the ready state by the operating system when a new process arrives or when an interrupt occurs. Preemptive policies may incur greater overhead than non-preemptive ones but may provide better service. In general, it is desirable to maximize CPU utilization and throughput, and to minimize turnaround time, waiting time, and response time.
WHEN TO SCHEDULE
A key issue related to scheduling is when to make scheduling decisions. It turns out that there are a variety of situations in which scheduling is needed.
First, when a new process is created, a decision needs to be made whether to run the
parent process or the child process. Since both processes are in ready state, it is a
normal scheduling decision and can go either way, that is, the scheduler can
legitimately choose to run either the parent or the child next.
Second, a scheduling decision must be made when a process exits. That process can
no longer run (since it no longer exists), so some other process must be chosen from
the set of ready processes. If no process is ready, a system-supplied idle process is
normally run.
Third, when a process blocks on I/O, or for some other reason, another process has to be selected to run. Sometimes the reason for blocking may play a role in the choice.
For example, if A is an important process and it is waiting for B to exit its critical
region, letting B run next will allow it to exit its critical region and thus let A continue.
The trouble, however, is that the scheduler generally does not have the necessary
information to take this dependency into account.
Fourth, when an I/O interrupt occurs, a scheduling decision may be made. If the interrupt came from an I/O device that has now completed its work, some process that was blocked waiting for that I/O may now be ready to run. It is up to the scheduler to decide whether to run the newly ready process, the process that was running at the time of the interrupt, or some third process.
Scheduling algorithms can be divided into two categories with respect to how they deal with clock interrupts. A non-preemptive scheduling algorithm picks a process to run and then just lets it run until it blocks (either on I/O or waiting for another process) or until it voluntarily releases the CPU. Even if it runs for hours, it will not be forcibly suspended. In effect, no scheduling decisions are made during clock interrupts. After clock interrupt processing has been completed, the process that was running before the interrupt is resumed, unless a higher-priority process was waiting for a now-satisfied timeout.
In contrast, a preemptive scheduling algorithm picks a process and lets it run for a
maximum of some fixed time. If it is still running at the end of the time interval, it is
suspended and the scheduler picks another process to run (if one is available).
CATEGORIES AND GOALS OF SCHEDULING ALGORITHMS
Scheduling environments, and the goals of the scheduling algorithm in each, can be grouped as follows:
Batch systems: Throughput - maximize jobs per hour; Turnaround time - minimize time between submission and termination; CPU utilization - keep the CPU busy all the time.
Interactive systems: Response time - respond to requests quickly; Proportionality - meet users' expectations.
Real-time systems: Meeting deadlines - avoid losing data.
2.4.2. Scheduling in Batch Systems
Batch systems are still in widespread use in the business world for doing payroll, inventory,
accounts receivable, accounts payable, interest calculation (at banks), claims processing (at
insurance companies), and other periodic tasks. In batch systems, there are no users
impatiently waiting at their terminals for a quick response to a short request. Consequently,
non-preemptive algorithms, or preemptive algorithms with long time periods for each
process, are often acceptable. This approach reduces process switches and thus improves
performance. The batch algorithms are actually fairly general and often applicable to other
situations as well, which makes them worth studying, even for people not involved in
corporate mainframe computing. So, the goals can be throughput, turnaround time, and CPU utilization.
The managers of large computer centers that run many batch jobs typically look at three
metrics to see how well their systems are performing: throughput, turnaround time, and
CPU utilization. Throughput is the number of jobs per hour that the system completes.
All things considered, finishing 50 jobs per hour is better than finishing 40 jobs per hour.
Turnaround time is the statistically average time from the moment that a batch job is
submitted until the moment it is completed. It measures how long the average user has to
wait for the output. Here the rule is: Small is beautiful.
CPU utilization is often used as a metric on batch systems. Actually though, it is not such
a good metric. What really matters is how many jobs per hour come out of the system
(throughput) and how long it takes to get a job back (turnaround time). Using CPU
utilization as a metric is like rating cars based on how many times per hour the engine
turns over. On the other hand, knowing when the CPU utilization is approaching 100% is
useful for knowing when it is time to get more computing power.
2.4.2.1. First come first serve scheduling algorithm in Batch Systems
In the "First come first serve" scheduling algorithm, as the name suggests, the process which arrives first gets executed first; in other words, the process which requests the CPU first gets the CPU allocated first. First Come First Serve is just like a FIFO (First In First Out) queue data structure, where the data element which is added to the queue first is the one that leaves the queue first. It is easy to understand and implement programmatically, using a queue data structure, where a new process enters through the tail of the queue and the scheduler selects a process from the head of the queue. A perfect real-life example of FCFS scheduling is buying tickets at a ticket counter.
Problems with FCFS Scheduling
Below we have a few shortcomings or problems with the FCFS scheduling algorithm:
It is a non-preemptive algorithm, which means the process priority doesn't matter. If a process with very low priority is being executed, such as a daily routine backup process which takes a long time, and all of a sudden some other high-priority process arrives, such as an interrupt needed to avoid a system crash, the high-priority process will have to wait, and hence in this case the system may crash, just because of improper process scheduling.
The average waiting time is not optimal.
Resources cannot be utilized in parallel, which leads to the Convoy Effect and hence poor resource (CPU, I/O, etc.) utilization.
Convoy Effect
Convoy Effect is a situation where many processes, which need to use a resource for
short time, are blocked by one process holding that resource for a long time. This
essentially leads to poor utilization of resources and hence poor performance.
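The convoy effect can be seen with a small calculation. The sketch below computes the average waiting time under FCFS for three jobs whose burst times are made up: one ordering puts the long job first, the other puts the short jobs first.
#include <stdio.h>

/* Average waiting time under FCFS when all jobs arrive at time 0. */
static double avg_waiting(const int burst[], int n)
{
    int time = 0;
    double total = 0;
    for (int i = 0; i < n; i++) {
        total += time;      /* job i waits until all earlier arrivals finish */
        time  += burst[i];
    }
    return total / n;
}

int main(void)
{
    int long_first[]  = { 24, 3, 3 };   /* long job arrives first */
    int short_first[] = { 3, 3, 24 };   /* same jobs, short ones first */
    printf("FCFS, long job first  : average waiting = %.1f\n", avg_waiting(long_first, 3));
    printf("FCFS, short jobs first: average waiting = %.1f\n", avg_waiting(short_first, 3));
    return 0;
}
With the long job first the average waiting time is 17 time units; with the short jobs first it drops to 3, even though exactly the same work is done.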
2.4.2.2. Shortest Job First scheduling algorithm in batch system
Now let us look at another non preemptive batch algorithm that assumes the run times
are known in advance. In an insurance company, for example, people can predict quite
accurately how long it will take to run a batch of 1000 claims, since similar work is
done every day. When several equally important jobs are sitting in the input queue
waiting to be started, the scheduler picks the shortest job first. Look at Figure 2.8.
Here we find four jobs A, B, C, and D with run times of 8, 4, 4, and 4 minutes, respectively. By running them in that order, the turnaround time for A is 8 minutes, for B is 12 minutes, for C is 16 minutes, and for D is 20 minutes, for an average of 14 (56/4) minutes.
Figure 2.8 An example of shortest-job-first scheduling.
(a) Running four jobs in the original order. (b) Running them in shortest job first
order.
Now let us consider running these four jobs using shortest job first, as shown in
Figure 2.8(b). The turnaround times are now 4, 8, 12, and 20 minutes for an average of
11 (44/4) minutes. Shortest job first is provably optimal. Consider the case of four
jobs, with execution times of a, b, c, and d, respectively. The first job finishes at time
a, the second at time a + b, and so on. The mean turnaround time is (4a + 3b
+ 2c + d)/4. It is clear that a contributes more to the average than the other times,
so it should be the shortest job, with b next, then c, and finally d as the longest since
it affects only its own turnaround time. The same argument applies equally well to
any number of jobs.
The average waiting time for (b) is obtained by subtracting each job's run time from its
turnaround time: (4-4) + (8-4) + (12-4) + (20-8) = 24, so the average waiting time is
24/4 = 6 minutes, and the throughput is 4 jobs in 20 minutes, i.e., one job every 5 minutes.
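The same arithmetic can be checked with a short C sketch; the run times 8, 4, 4, 4 are the ones used in the example above, every job is assumed to arrive at time 0, and the only step shortest job first adds is sorting the jobs by run time:

#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;   /* ascending run time */
}

static double avg_turnaround(const int run[], int n)
{
    int clock = 0, sum = 0;
    for (int i = 0; i < n; i++) {
        clock += run[i];        /* job i finishes at this time         */
        sum   += clock;         /* all jobs are assumed to arrive at 0 */
    }
    return (double)sum / n;
}

int main(void)
{
    int original[] = { 8, 4, 4, 4 };            /* jobs A, B, C, D    */
    int sjf[]      = { 8, 4, 4, 4 };

    qsort(sjf, 4, sizeof(int), cmp);            /* shortest job first */
    printf("original order: %.2f minutes\n", avg_turnaround(original, 4));  /* 14.00 */
    printf("SJF order:      %.2f minutes\n", avg_turnaround(sjf, 4));       /* 11.00 */
    return 0;
}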
It is worth pointing out that shortest job first is optimal only when all the jobs are
available simultaneously. As a counterexample, consider five jobs, A through E,
with run times of 2, 4, 1, 1, and 1, respectively. Their arrival times are 0, 0, 3, 3, and
3. Initially, only A or B can be chosen, since the other three jobs have not arrived yet.
Using shortest job first, we will run the jobs in the order A, B, C, D, E, for an average
turnaround time of 4.6. However, running them in the order B, C, D, E, A has an average
turnaround time of 4.4.
For interactive systems, different goals apply. The most important one is to minimize
response time, that is, the time between issuing a command and getting the result. On
a personal computer where a background process is running (for example, reading and
storing e-mail from the network), a user request to start a program or open a file should
take precedence over the background work. Having all interactive requests go first
will be perceived as good service.
A somewhat related issue is what might be called proportionality. Users have an
inherent (but often incorrect) idea of how long things should take. When a request that
is perceived as complex takes a long time, users accept that, but when a request that
is perceived as simple takes a long time, users get irritated. For example, if clicking
on an icon that starts sending a fax takes 60 seconds to complete, the user will probably
accept that as a fact of life because he does not expect a fax to be sent in 5 seconds.
On the other hand, when a user clicks on the icon that breaks the phone connection
after the fax has been sent, he has different expectations. If it has not completed after
30 seconds, the user will probably be swearing a blue streak, and after 60 seconds he
will be frothing at the mouth. This behavior is due to the common user perception that
placing a phone call and sending a fax is supposed to take a lot longer than just
hanging the phone up. In some cases (such as this one), the scheduler cannot do
anything about the response time, but in other cases it can, especially when the delay
is due to a poor choice of process order.
2.4.3.1. Round-Robin Scheduling algorithms in Interactive Systems
One of the oldest, simplest, fairest, and most widely used algorithms is round robin.
Each process is assigned a time interval, called its quantum, during which it is allowed
to run. If the process is still running at the end of the quantum, the CPU is preempted
and given to another process. If the process has blocked or finished before the
quantum has elapsed, the CPU switching is done when the process blocks, of course.
Round robin is easy to implement. All the scheduler needs to do is maintain a list of
runnable processes, as shown in Figure 2.9(a). When a process uses up its quantum, it is
put on the end of the list, as shown in Figure 2.9(b).
The only interesting issue with round robin is the length of the quantum.
Switching from one process to another requires a certain amount of time for doing the
administration-saving and loading registers and memory maps, updating various
tables and lists, flushing and reloading the memory cache, and so on. Suppose that
this process switch or context switch, as it is sometimes called, takes 1 msec, including
switching memory maps, flushing and reloading the cache, etc.
Also suppose that the quantum is set at 4 msec. With these parameters, after doing 4
msec of useful work, the CPU will have to spend (i.e., waste) 1 msec on process
switching. Thus 20% of the CPU time will be thrown away on administrative
overhead. Clearly, this is too much.
Figure 2.9: Round-robin scheduling. (a) The list of runnable processes. (b) The list of
runnable processes after B uses up its quantum.
To improve the CPU efficiency, we could set the quantum to, say, 100 msec.
Now the wasted time is only 1%. But consider what happens on a server system if 50
requests come in within a very short time interval and with widely varying CPU
requirements. Fifty processes will be put on the list of runnable processes. If the CPU
is idle, the first one will start immediately, the second one may not start until 100 msec
later, and so on. The unlucky last one may have to wait 5 sec before getting a
chance, assuming all the others use their full quanta.
Most users will perceive a 5-sec response to a short command as sluggish. This
situation is especially bad if some of the requests near the end of the queue required
only a few milliseconds of CPU time. With a short quantum they would have gotten
better service.
Another factor is that if the quantum is set longer than the mean CPU burst,
preemption will not happen very often. Instead, most processes will perform a
blocking operation before the quantum runs out, causing a process switch. Eliminating
preemption improves performance because process switches then only happen when
they are logically necessary, that is, when a process blocks and cannot continue.
The conclusion can be formulated as follows: setting the quantum too short causes too
many process switches and lowers the CPU efficiency, but setting it too long may
cause poor response to short interactive requests. A quantum around 20-50 msec is
often a reasonable compromise.
Performance of RR Scheduling
If there are n processes in the ready queue and time quantum is q, then each process
gets 1/n of the CPU time in chunks of at most q time units at once.
No process waits for more than (n-1)*q time units until the next time quantum.
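A minimal round-robin simulation in C is sketched below; the quantum, the context-switch cost, and the CPU bursts are assumptions chosen only to show how the quantum and the switching overhead interact:

#include <stdio.h>

#define QUANTUM 4   /* msec of CPU time per turn (assumed)       */
#define SWITCH  1   /* msec of context-switch overhead (assumed) */

int main(void)
{
    int remaining[] = { 10, 5, 3 };   /* hypothetical CPU bursts, in msec */
    int n = 3, unfinished = 3, clock = 0, overhead = 0;

    while (unfinished > 0) {
        for (int i = 0; i < n; i++) {
            if (remaining[i] == 0)
                continue;                       /* this process already finished */
            int slice = remaining[i] < QUANTUM ? remaining[i] : QUANTUM;
            clock        += slice;              /* useful work                   */
            remaining[i] -= slice;
            if (remaining[i] == 0) {
                unfinished--;
                printf("process %d finishes at t = %d msec\n", i, clock);
            }
            clock    += SWITCH;                 /* administrative overhead       */
            overhead += SWITCH;
        }
    }
    printf("elapsed %d msec, of which %d msec was switching overhead\n", clock, overhead);
    return 0;
}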
Round-robin scheduling makes the implicit assumption that all processes are equally
important. Frequently, the people who own and operate multiuser computers have
different ideas on that subject. At a university, for example, the pecking order may be
deans first, then professors, secretaries, janitors, and finally students.
The need to take external factors into account leads to priority scheduling. The basic
idea is straightforward: each process is assigned a priority, and the runnable process
with the highest priority is allowed to run. Even on a PC with a single owner, there
may be multiple processes, some of them more important than others. For example, a
daemon process sending electronic mail in the background should be assigned a lower
priority than a process displaying a video film on the screen in real time.
To prevent high-priority processes from running indefinitely, the scheduler may
decrease the priority of the currently running process at each clock tick (i.e., at each
clock interrupt). If this action causes its priority to drop below that of the next highest
process, a process switch occurs. Alternatively, each process may be assigned a
maximum time quantum that it is allowed to run. When this quantum is used up, the
next highest priority process is given a chance to run.
Priorities can be assigned to processes statically or dynamically. On a military
computer, processes started by generals might begin at priority 100, processes started
by colonels at 90, majors at 80, captains at 70, lieutenants at 60, and so on.
Alternatively, at a commercial computer center, high-priority jobs might cost $100 an
hour, medium priority $75 an hour, and low priority $50 an hour. The UNIX system
has a command, nice, which allows a user to voluntarily reduce the priority of his
process, in order to be nice to the other users. Nobody ever uses it.
Priorities can also be assigned dynamically by the system to achieve certain system
goals. For example, some processes are highly I/O bound and spend most of their time
waiting for I/O to complete. Whenever such a process wants the CPU, it should be
given the CPU immediately, to let it start its next I/O request, which can then proceed
in parallel with another process actually computing. Making the I/O-bound process
wait a long time for the CPU will just mean having it around occupying memory for
an unnecessarily long time. A simple algorithm for giving good service to I/O-bound
processes is to set the priority to 1/f, where f is the fraction of the last quantum that a
process used. A process that used only 1 msec of its 50 msec quantum would get
priority 50, while a process that ran 25 msec before blocking would get priority 2,
and a process that used the whole quantum would get priority 1.
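That rule is easy to express in code; the sketch below assumes the 50-msec quantum used in the example and computes 1/f with integer arithmetic:

#include <stdio.h>

#define QUANTUM_MS 50    /* quantum length from the example above */

/* Dynamic priority = 1/f, where f = fraction of the last quantum used. */
static int dynamic_priority(int used_ms)
{
    if (used_ms <= 0)
        used_ms = 1;                 /* guard against division by zero */
    return QUANTUM_MS / used_ms;     /* integer form of 1/f            */
}

int main(void)
{
    printf("used  1 msec -> priority %d\n", dynamic_priority(1));   /* 50 */
    printf("used 25 msec -> priority %d\n", dynamic_priority(25));  /*  2 */
    printf("used 50 msec -> priority %d\n", dynamic_priority(50));  /*  1 */
    return 0;
}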
Up until now, we have tacitly assumed that all the processes in the system belong to different users and are
thus competing for the CPU. While this is often true, sometimes it happens that one process has many
children running under its control. For example, a database-management-system process may have many
children. Each child might be working on a different request, or each might have some specific function to
perform (query parsing, disk access, etc.). It is entirely possible that the main process has an excellent idea of
which of its children are the most important (or time critical) and which the least. Unfortunately, none of the
schedulers discussed above accept any input from user processes about scheduling decisions. As a result, the
scheduler rarely makes the best choice.
The solution to this problem is to separate the scheduling mechanism from the scheduling policy, a long-
established principle. What this means is that the scheduling algorithm is parameterized in some way, but the
parameters can be filled in by user processes. Let us consider the database example once again. Suppose that
the kernel uses a priority-scheduling algorithm but provides a system call by which a process can set (and
change) the priorities of its children. In this way, the parent can control how its children are scheduled, even
though it itself does not do the scheduling. Here the mechanism is in the kernel but policy is set by a user
process. Policy-mechanism separation is a key idea.
The operating systems literature is full of interesting problems that have been widely discussed and
analyzed using a variety of synchronization methods. In the following sections we will examine
three of the better-known problems.
2.5.1. The Dining Philosophers Problem
In 1965, Dijkstra posed and then solved a synchronization problem he called the dining philosophers
problem. Since that time, everyone inventing yet another synchronization primitive has felt
obligated to demonstrate how wonderful the new primitive is by showing how elegantly it solves
the dining philosopher’s problem. The problem can be stated quite simply as follows. Five
philosophers are seated around a circular table. Each philosopher has a plate of spaghetti. The
spaghetti is so slippery that a philosopher needs two forks to eat it. Between each pair of plates is
one fork.
Figure 2.10
The life of a philosopher consists of alternating periods of eating and thinking. (This is something
of an abstraction, even for philosophers, but the other activities are irrelevant here.) When a
philosopher gets sufficiently hungry, she tries to acquire her left and right forks, one at a time, in
either order. If successful in acquiring two forks, she eats for a while, then puts down the forks,
and continues to think. The key question is: Can you write a program for each philosopher that does
what it is supposed to do and never gets stuck? (It has been pointed out that the two-fork requirement
is somewhat artificial; perhaps we should switch from Italian food to Chinese food, substituting rice
for spaghetti and chopsticks for forks.) Figure 2.11 shows the obvious solution. The procedure take_fork
waits until the specified fork is available and then seizes it. Unfortunately, the obvious solution
is wrong. Suppose that all five philosophers take their left forks simultaneously. None will be able
to take their right forks, and there will be a deadlock.
We could easily modify the program so that after taking the left fork, the program checks to see if
the right fork is available. If it is not, the philosopher puts down the left one, waits for some time,
and then repeats the whole process. This proposal too, fails, although for a different reason. With a
little bit of bad luck, all the philosophers could start the algorithm simultaneously, picking up their
left forks, seeing that their right forks were not available, putting down their left forks, waiting,
picking up their left forks again simultaneously, and so on, forever. The obvious solution of Figure 2.11
has the following structure for each philosopher i (N is the number of philosophers):
while (TRUE) {
    think();                               /* philosopher is thinking     */
    take_fork(i); take_fork((i + 1) % N);  /* take left fork, then right  */
    eat();                                 /* yum-yum, spaghetti          */
    put_fork(i); put_fork((i + 1) % N);    /* put both forks back down    */
}
The solution presented in Figure 2.12 is deadlock-free and allows the maximum parallelism for an
arbitrary number of philosophers. It uses an array, state, to keep track of whether a philosopher is
eating, thinking, or hungry (trying to acquire forks). A philosopher may move into eating state only
if neither neighbor is eating. Philosopher i’s neighbors are defined by the macros LEFT and
RIGHT. In other words, if i is 2, LEFT is 1 and RIGHT is 3.
The program uses an array of semaphores, one per philosopher, so hungry philosophers can block
if the needed forks are busy. Note that each process runs the procedure philosopher as its main
code, but the other procedures, take_forks, put_forks, and test, are ordinary procedures and not
separate processes.
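A runnable approximation of that solution is sketched below in C, using POSIX threads and semaphores in place of the abstract down and up primitives of the figure; the sleep lengths and the number of eating rounds are arbitrary choices made only for the demonstration:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

#define N        5
#define LEFT(i)  (((i) + N - 1) % N)
#define RIGHT(i) (((i) + 1) % N)
#define THINKING 0
#define HUNGRY   1
#define EATING   2

static int   state[N];      /* what each philosopher is doing        */
static sem_t mutex;         /* protects the state array              */
static sem_t s[N];          /* one semaphore per philosopher         */

static void test(int i)
{
    /* i may start eating only if neither neighbour is eating. */
    if (state[i] == HUNGRY && state[LEFT(i)] != EATING && state[RIGHT(i)] != EATING) {
        state[i] = EATING;
        sem_post(&s[i]);
    }
}

static void take_forks(int i)
{
    sem_wait(&mutex);
    state[i] = HUNGRY;
    test(i);                /* try to acquire both forks             */
    sem_post(&mutex);
    sem_wait(&s[i]);        /* block here if the forks were not free */
}

static void put_forks(int i)
{
    sem_wait(&mutex);
    state[i] = THINKING;
    test(LEFT(i));          /* a blocked neighbour may now proceed   */
    test(RIGHT(i));
    sem_post(&mutex);
}

static void *philosopher(void *arg)
{
    int i = *(int *)arg;
    for (int round = 0; round < 3; round++) {
        usleep(1000);                            /* think */
        take_forks(i);
        printf("philosopher %d is eating\n", i); /* eat   */
        usleep(1000);
        put_forks(i);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[N];
    int id[N];

    sem_init(&mutex, 0, 1);
    for (int i = 0; i < N; i++)
        sem_init(&s[i], 0, 0);
    for (int i = 0; i < N; i++) {
        id[i] = i;
        pthread_create(&t[i], NULL, philosopher, &id[i]);
    }
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);
    return 0;
}

The essential point is that take_forks and put_forks touch the shared state array only while holding mutex, and a hungry philosopher blocks on her own semaphore until a neighbour's put_forks wakes her up.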
2.5. Deadlock
Computer systems are full of resources that can be used only by one process at a time. Common examples
include printers, tape drives for backing up company data, and slots in the system’s internal tables. Having
two processes simultaneously writing to the printer leads to gibberish. Having two processes using the
same file-system table slot invariably will lead to a corrupted file system. Consequently, all operating
systems have the ability to (temporarily) grant a process exclusive access to certain resources.
For many applications, a process needs exclusive access to not one resource, but several. Suppose, for
example, two processes each want to record a scanned document on a Blu-ray disc. Process A requests
permission to use the scanner and is granted it. Process B is programmed differently and requests the Blu-
ray recorder first and is also granted it. Now A asks for the Blu-ray recorder, but the request is suspended
until B releases it. Unfortunately, instead of releasing the Blu-ray recorder, B asks for the scanner. At this
point both processes are blocked and will remain so forever. This situation is called a deadlock.
Deadlocks can also occur across machines. For example, many offices have a local area network with
many computers connected to it. Often devices such as scanners, Blu-ray/DVD recorders, printers, and
tape drives are connected to the network as shared resources, available to any user on any machine. If these
devices can be reserved remotely (i.e., from the user’s home machine), deadlocks of the same kind can
occur as described above. More complicated situations can cause deadlocks involving three, four, or more
devices and users.
Deadlocks can also occur in a variety of other situations. In a database system, for example, a program
may have to lock several records it is using, to avoid race conditions. If process A locks record R1 and
process B locks record R2, and then each process tries to lock the other one’s record, we also have a
deadlock. Thus, deadlocks can occur on hardware resources or on software resources.
2.5.1. RESOURCES
A major class of deadlocks involves resources to which some process has been granted exclusive access.
These resources include devices, data records, files, and so forth. To make the discussion of deadlocks as
general as possible, we will refer to the objects granted as resources. A resource can be a hardware device
(e.g., a Blu-ray drive) or a piece of information (e.g., a record in a database). A computer will normally
have many different resources that a process can acquire. For some resources, several identical instances
may be available, such as three Blu-ray drives. When several copies of a resource are available, any one
of them can be used to satisfy any request for the resource. In short, a resource is anything that must be
acquired, used, and released over the course of time.
Resources come in two types: preemptable and nonpreemptable. A preemptable resource is one that
can be taken away from the process owning it with no ill effects. Memory is an example of a preemptable
resource. A nonpreemptable resource, in contrast, is one that cannot be taken away from its current
owner without potentially causing failure.
In general, deadlocks involve nonpreemptable resources. Potential deadlocks that involve preemptable
resources can usually be resolved by reallocating resources from one process to another. Thus, our
treatment will focus on nonpreemptable resources.
The abstract sequence of events required to use a resource is: (1) request the resource, (2) use the
resource, and (3) release the resource. If the resource is not available when it is requested, the
requesting process is forced to wait.
A set of processes is deadlocked if each process in the set is waiting for an event that only another
process in the set can cause.
Because all the processes are waiting, none of them will ever cause any event that could wake up any
of the other members of the set, and all the processes continue to wait forever. For this model, we
assume that processes are single threaded and that no interrupts are possible to wake up a blocked
process. The no-interrupts condition is needed to prevent an otherwise deadlocked process from being
awakened by an alarm, and then causing events that release other processes in the set.
In most cases, the event that each process is waiting for is the release of some resource currently
possessed by another member of the set. In other words, each member of the set of deadlocked
processes is waiting for a resource that is owned by a deadlocked process. None of the processes can
run, none of them can release any resources, and none of them can be awakened. The number of
processes and the number and kind of resources possessed and requested are unimportant. This
result holds for any kind of resource, including both hardware and software. This kind of deadlock is
called a resource deadlock. It is probably the most common kind, but it is not the only kind. We first
study resource deadlocks in detail and then at the end of the chapter return briefly to other kinds of
deadlocks.
Coffman et al. (1971) showed that four conditions must hold for a resource deadlock to occur:
Mutual exclusion condition. Each resource is either currently assigned to exactly one
process or is available.
Hold-and-wait condition. Processes currently holding resources that were granted earlier
can request new resources.
No-preemption condition. Resources previously granted cannot be forcibly taken away from
a process; they must be explicitly released by the process holding them.
Circular wait condition. There must be a circular list of two or more processes, each of
which is waiting for a resource held by the next member of the chain.
All four of these conditions must be present for a resource deadlock to occur. If one of them is
absent, no resource deadlock is possible.
2.5.2.2. Deadlock Modeling
Holt (1972) showed how these four conditions can be modeled using directed graphs. The graphs have
two kinds of nodes: processes, shown as circles, and resources, shown as squares. A directed arc from
a resource node (square) to a process node (circle) means that the resource has previously been requested
by, granted to, and is currently held by that process. In Figure 2.13(a), resource R is currently assigned
to process A.
A directed arc from a process to a resource means that the process is currently blocked waiting for that
resource. In Figure 2.13 (b), process B is waiting for resource
S. In Figure 2.13(c) we see a deadlock: process C is waiting for resource T, which is currently held by
process D. Process D is not about to release resource T because it is waiting for resource U, held by C.
Both processes will wait forever. A cycle in the graph means that there is a deadlock involving the
processes and resources in the cycle (assuming that there is one resource of each kind). In this example,
the cycle is C-T-D-U- C.
Figure 2.13. Resource allocation graphs. (a) Holding a resource. (b) Requesting a
resource. (c) Deadlock.
Deadlock Strategies: In general, four strategies are used for dealing with deadlocks.
Just ignore the problem. Maybe if you ignore it, it will ignore you.
Detection and recovery. Let deadlocks occur, detect them, and take action.
Dynamic avoidance by careful resource allocation.
Prevention, by structurally negating one of the four conditions necessary to cause a deadlock.
The second technique is detection and recovery. When this technique is used, the system does not attempt
to prevent deadlocks from occurring. Instead, it lets them occur, tries to detect when this happens, and
then takes some action to recover.
2.5.5. Deadlock Detection with Multiple Resources of Each Type
When multiple copies of some of the resources exist, a different approach is needed to detect deadlocks.
We will now present a matrix-based algorithm for detecting deadlock among n processes, P1 through Pn.
Let the number of resource classes be m, with E1 resources of class 1, E2 resources of class 2, and
generally, Ei resources of class i (1 <= i <= m). E is the existing resource vector. It gives the total number
of instances of each resource in existence. For example, if class 1 is tape drives, then E1 = 2 means the
system has two tape drives.
At any instant, some of the resources are assigned and are not available. Let A be the available resource
vector, with Ai giving the number of instances of resource i that are currently available (i.e., unassigned).
If both of our two tape drives are assigned, A1 will be 0.
Now we need two arrays, C, the current allocation matrix, and R, the request matrix. The ith row of
C tells how many instances of each resource class Pi currently holds. Thus, Cij is the number of instances
of resource j that are held by process i. Similarly, Rij is the number of instances of resource j that Pi wants.
Figure 2.14. The four data structures needed by the deadlock detection algorithm
The deadlock detection algorithm is based on comparing vectors. Let us define the relation A <= B on two
vectors A and B to mean that each element of A is less than or equal to the corresponding element of B.
Mathematically, A <= B holds if and only if Ai <= Bi for 1 <= i <= m.
As an example of how the deadlock detection algorithm works, consider a system with three processes
and four resource classes, which we have arbitrarily labeled tape drives, plotters, scanners, and Blu-ray
drives. Process 1 has one scanner. Process 2 has two tape drives and a Blu-ray drive. Process 3 has a
plotter and two scanners. Each process needs additional resources, as shown by the R matrix.
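The detection loop itself is short. In the C sketch below, the allocation matrix C follows the holdings described in the text, while the available vector A and the request matrix R are invented for the illustration; the algorithm repeatedly looks for an unmarked process whose request row is less than or equal to A, adds that process's allocation row to A, marks it, and reports any process still unmarked at the end as deadlocked:

#include <stdio.h>

#define NPROC 3
#define NRES  4   /* tape drives, plotters, scanners, Blu-ray drives */

int main(void)
{
    int A[NRES]        = { 2, 1, 0, 0 };      /* available (assumed values)   */
    int C[NPROC][NRES] = { { 0, 0, 1, 0 },    /* P1 holds one scanner         */
                           { 2, 0, 0, 1 },    /* P2: two tapes, one Blu-ray   */
                           { 0, 1, 2, 0 } };  /* P3: a plotter, two scanners  */
    int R[NPROC][NRES] = { { 2, 0, 0, 1 },    /* outstanding requests         */
                           { 1, 0, 1, 0 },    /* (assumed values)             */
                           { 2, 1, 0, 0 } };
    int marked[NPROC] = { 0 };
    int progress = 1;

    while (progress) {
        progress = 0;
        for (int p = 0; p < NPROC; p++) {
            if (marked[p])
                continue;
            int can_run = 1;                      /* is R[p] <= A ?            */
            for (int j = 0; j < NRES; j++)
                if (R[p][j] > A[j]) { can_run = 0; break; }
            if (can_run) {                        /* p could run to completion */
                for (int j = 0; j < NRES; j++)
                    A[j] += C[p][j];              /* and then release its hold */
                marked[p] = 1;
                progress  = 1;
            }
        }
    }
    for (int p = 0; p < NPROC; p++)
        if (!marked[p])
            printf("process %d is deadlocked\n", p);
    return 0;
}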
Suppose that our deadlock detection algorithm has succeeded and detected a deadlock. What next? Some
way is needed to recover and get the system going again. In this section we will discuss various ways of
recovering from deadlock. None of them are especially attractive, however.
Recovery through Preemption
Recovery through Rollback
In the discussion of deadlock detection, we tacitly assumed that when a process asks for resources, it asks
for them all at once. In most systems, however, resources are requested one at a time. The system must be
able to decide whether granting a resource is safe or not and make the allocation only when it is safe. Thus,
the question arises: Is there an algorithm that can always avoid deadlock by making the right choice all the
time? The answer is a qualified yes: we can avoid deadlocks, but only if certain information is available in
advance. In this section we
examine ways to avoid deadlock by careful resource allocation.
Resource Trajectories
Safe and Unsafe States
The Banker’s Algorithm for a Single Resource
The Banker’s Algorithm for Multiple Resources
Having seen that deadlock avoidance is essentially impossible, because it requires information about
future requests, which is not known, how do real systems avoid deadlock? The answer is to go back to the
four conditions stated by Coffman to see if they can provide a clue. If we can ensure that at least one of
these conditions is never satisfied, then deadlocks will be structurally impossible.
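One common way of attacking the circular-wait condition is to impose a single global order in which resources must be acquired; the C sketch below applies that idea to the scanner/Blu-ray example, with two pthread mutexes standing in for the real devices:

#include <pthread.h>

static pthread_mutex_t scanner  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t recorder = PTHREAD_MUTEX_INITIALIZER;

/* Rule: every process that needs both devices acquires the scanner first
   and the Blu-ray recorder second.  Because both process A and process B
   follow the same order, the circular wait of the earlier example can
   never form. */
static void acquire_both(void)
{
    pthread_mutex_lock(&scanner);
    pthread_mutex_lock(&recorder);
}

static void release_both(void)
{
    pthread_mutex_unlock(&recorder);
    pthread_mutex_unlock(&scanner);
}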
The part of the operating system that manages (part of) the memory hierarchy is called the memory
manager. Its job is to efficiently manage memory: keep track of which parts of memory are in use,
allocate memory to processes when they need it, and de-allocate it when they are done.
The simplest memory abstraction is no abstraction at all. Early mainframe computers (before 1960),
early minicomputers (before 1970), and early personal computers (before 1980) had no memory
abstraction. Every program simply saw the physical memory. When a program executed an
instruction like MOV REGISTER1, 1000 the computer just moved the contents of physical memory
location 1000 to REGISTER1. In this way the model of memory presented to the programmer was
just physical memory, a set of addresses from 0 to some maximum, each address
corresponding to a cell containing some number of bits, commonly eight.
Under these conditions, it was not possible to have two running programs in memory at the same
time. If the first program wrote a new value to, say, location 2000, this would erase whatever
value the second program was storing there. Nothing would work and
both programs would crash almost immediately.
Even with the model of memory being just physical memory, many options are possible. Three
variations are demonstrated in Figure 3.1. The operating system may be at the bottom of memory
in RAM (Random Access Memory), as shown in Figure 3.1(a), or it may be in ROM (Read-Only
Memory) at the top of memory, as shown in Figure 3.1(b), or the device drivers may be at the top
of memory in a ROM and the rest of the system in RAM down below, as shown in Figure 3.1(c).
The first model was formerly used on mainframes and minicomputers but is rarely used any more.
The second model is used on some handheld computers and embedded systems. The third model
was used by early personal computers (e.g., running MS-DOS), where the portion of the system
in the ROM is called the BIOS (Basic Input Output System). Models (a) and (c) have the
disadvantage that a bug in the user program can wipe out the operating system, maybe with
disastrous results (such as garbling the disk). When the system is organized in this way, usually
only one process at a time can be running. As soon as the user types a command, the operating
system copies the requested program from disk to memory and executes it. When the process
finishes, the operating system displays a prompt character and waits for a new command. When
it receives the command, it loads a new program into memory, overwriting the first one.
Figure 3.1. Three ways of organizing memory with an operating system and one user process; other
possibilities also exist.
One way to get some parallelism in a system with no memory abstraction is to program with
multiple threads. Since all threads in a process are supposed to see the same memory image, the
fact that they are forced to is not a problem. While this idea works, it is of limited use since what
people often want is unrelated programs to be running at the same time, something the threads
abstraction does not provide. Furthermore, any system that is so primitive as to provide no
memory abstraction is unlikely to provide threads abstraction.
Nevertheless, even with no memory abstraction, it is possible to run multiple programs at the same
time. What the operating system has to do is save the entire contents of memory to a disk file, then
bring in and run the next program. As long as there is only one program at a time in memory, there
are no conflicts.
3.2. A MEMORY ABSTRACTION: ADDRESS SPACES
Two problems must be solved to allow various applications to be in memory at the same time
without their interfering with each other: protection and relocation. A better solution is to invent a
new abstraction for memory: the address space. Just as the process concept creates a kind of abstract
CPU to run programs, the address space creates a kind of abstract memory for programs to live in.
An address space is the set of addresses that a process can use to address memory. Each process has
its own address space, independent of those belonging to other processes (except in some special
circumstances where processes want to share their address space).
This simple solution uses a particularly simple version of dynamic relocation. What it does is
map each process address space onto a different part of physical memory in a simple way.
Memory Management is the operating system function that assigns and manages the computer's
primary memory. The capacity of the computer's memory was one of the key constraints imposed on
programmers in the early days of computers. The application could not be loaded if it was larger than
the available memory, severely limiting program size. The earliest and most fundamental mechanism
for storing many processes in main memory is the Fixed Partitioning Technique. The underlying
problem with fixed partitioning is that the size of a process is restricted by the maximum partition
size, which implies that a process can never span more than one partition. The obvious solution would be
expanding the available RAM, but this would significantly raise the cost of the computer system. So,
earlier individuals have employed a method known as Overlays to tackle this problem.
Overlays in Memory Management operate on the premise that when a process runs, it does not
need the complete program in memory at the same time but rather only a subset of it.
Then, the overlaying concept is that you load whatever component you want, and when the section
is completed, you unload it, which means you pull it back and get the new part you require and
execute it.
Overlaying is defined as "the process of inserting a block of computer code or other data into
internal memory, replacing what is already there."
Because of the limitations of physical memory (internal memory for a system-on-chip) and the
absence of virtual memory features, overlaying is generally employed by embedded systems.
Overlays Driver: Overlaying is completely user-dependent. Even what component is necessary for
the first pass should be written by the user.
Swapping
If the physical memory of the computer is large enough to hold all the processes, the schemes
explained so far will more or less do. But in practice, the total amount of RAM required by all the
processes is often much more than can fit in memory. On a typical Windows or Linux system,
something like 40-60 processes or more may be started up when the computer is booted. Two
general approaches to dealing with memory overload have been developed over the years. The
simplest strategy, called swapping, consists of bringing in each process in its entirety, running it for
a while, and then putting it back on the disk. Idle processes are mostly stored on disk, so they do not
take up any memory when they are not running (although some of them wake up periodically to do
their work, then go to sleep again). The other strategy, called virtual memory, allows programs to run
even when they are only partially in main memory. Below we will examine swapping; virtual
memory is examined later in this chapter.
The operation of a swapping system is shown in Figure 3.2. In the beginning, only process A is in
memory. Then processes B and C are created or swapped in from disk. In Figure 3.2(d) A is swapped
out to disk. Then D comes in and B goes out. In the end A comes in again. Since A is now at a
different location, addresses contained in it must be relocated, either by software when it is swapped
in or (more likely) by hardware during program execution. For instance, base and limit registers
would work fine here.
Figure 3.2. Memory allocation changes as processes come into memory and leave it.
The shaded regions are unused memory.
When swapping makes multiple holes in memory, it is possible to merge them all into one big one by
moving all the processes downward as far as possible. This technique is known as memory
compaction. It is normally not done because it requires a lot of CPU time. For example, on a 1-GB
machine that can copy 4 bytes in 20 nsec, it would take about 5 sec to compact all of memory. A
point that is worth making concerns how much memory should be allocated for a process when
it is created or swapped in. If processes are created with a fixed size that never changes, then the
allocation is simple: the operating system allocates exactly what is required, no more and no less.
If it is expected that most processes will grow as they run, it is perhaps a good idea to allocate a little
extra memory whenever a process is swapped in or moved, to reduce the overhead associated with
moving or swapping processes that no longer fit in their allocated memory. However, when swapping
processes to disk, only the memory actually in use should be swapped; it is wasteful to swap the extra
memory as well. In Figure 3.3 (a) we see a memory configuration in which space for growth has been
allocated to two processes.
Figure 3.3. (a) Allocating space for a growing data segment. (b) Allocating space for a growing
stack and a growing data segment.
A partition is a logical division of a hard disk that is treated as a separate unit by operating systems
(OSes) and file systems. The OSes and file systems can manage information on each partition as if it
were a distinct hard drive. This allows the drive to operate as several smaller sections to improve
efficiency, although it reduces usable space on the hard disk because of additional overhead from
multiple OSes.
A disk partition manager allows system administrators to create, resize, delete and manipulate
partitions, while a partition table logs the location and size of the partition. Each partition appears
to the OS as a distinct logical disk, and the OS reads the partition table before any other part of the
disk.
Each partition is typically formatted with a file system, for example Ext4 on Linux.
Data and files are then written to the file system on the partition. When users boot the OS in a
computer, a critical part of the process is to give control to the first sector on the hard disk. This
includes the partition table that defines how many partitions will be formatted on the hard disk, the
size of each partition and the address where each disk partition begins. The sector also contains a
program that reads the boot sector for the OS and gives it control so that the rest of the OS can be
loaded into random access memory.
When memory is allocated dynamically, the operating system must manage it. Generally, there are
two methods to keep track of memory usage: bitmaps and free lists. In this section and the next one
we will study these two methods.
With a bitmap, memory is divided into allocation units as small as a few words and as large as several
kilobytes. Corresponding to each allocation unit is a bit in the bitmap, which is 0 if the unit is free
and 1 if it is occupied (or vice versa). Figure 3.4 shows part of memory and the corresponding
bitmap.
Figure 3.4. (a) A part of memory with five processes and three holes. The tick marks show
the memory allocation units. The shaded regions (0 in the bitmap) are free. (b) The
corresponding bitmap. (c) The same information as a list.
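Manipulating such a bitmap requires only a few bit operations; in the sketch below, the number of allocation units is an arbitrary choice and one bit per unit is kept in an array of bytes:

#include <stdint.h>

#define UNITS 1024                      /* number of allocation units (assumed)     */
static uint8_t bitmap[UNITS / 8];       /* one bit per unit: 1 = occupied, 0 = free */

static int unit_is_free(int u)
{
    return (bitmap[u / 8] & (1u << (u % 8))) == 0;
}

static void set_unit(int u, int occupied)
{
    if (occupied)
        bitmap[u / 8] |=  (uint8_t)(1u << (u % 8));
    else
        bitmap[u / 8] &= (uint8_t)~(1u << (u % 8));
}

/* Mark the run of units [start, start + len) as belonging to a process. */
static void allocate_run(int start, int len)
{
    for (int u = start; u < start + len; u++)
        set_unit(u, 1);
}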
A different way of keeping track of memory is to maintain a linked list of assigned and free memory
segments, where a segment either includes a process or is an empty hole between two processes. The
memory of Figure 3.5.(a) is represented in Figure 3.5.(c) as a linked list of segments. Each entry in
the list specifies a hole (H) or process (P), the address at which it starts, the length, and a pointer
to the next entry.
The first fit, best fit, and worst fit strategies for searching this list when memory must be
allocated are discussed below, followed by a short code sketch of all three.
First Fit
The first fit approach allocates the first free partition or hole that is large enough to
accommodate the process. The search finishes after finding the first suitable free partition.
Advantage: - Fastest algorithm because it searches as little as possible.
Disadvantage: - The remaining unused memory areas left after allocation become wasted
if they are too small; thus a later request for a larger amount of memory may not be satisfiable.
Best Fit
The best fit deals with allocating the smallest free partition which meets the requirement of the
requesting process. This algorithm first searches the entire list of free partitions and considers the
smallest hole that is adequate, that is, a hole close to the actual size needed by the process.
Advantage: - Memory utilization is much better than first fit, as the smallest adequate free
partition available is chosen.
Disadvantage: - It is slower and may even tend to fill up memory with tiny useless holes.
Worst fit
The worst fit approach locates the largest available free portion so that the portion left over will be big
enough to be useful. It is the reverse of best fit.
Advantage: - Reduces the rate of production of small gaps.
Disadvantage: - If a process requiring larger memory arrives at a later stage then it cannot
be accommodated, as the largest hole is already split and occupied.
Next fit: - Next fit is a modified version of first fit. It begins as first fit to find a free partition. When
called next time it starts searching from where it left off, not from the beginning.
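The three strategies differ only in how the list of holes is searched; they can be sketched in C over a singly linked free list as follows (the hole structure and its fields are assumptions made for the illustration):

#include <stddef.h>

struct hole {
    size_t       start;     /* starting address of the free segment */
    size_t       size;      /* length of the free segment           */
    struct hole *next;
};

/* First fit: the first hole big enough for the request. */
struct hole *first_fit(struct hole *list, size_t request)
{
    for (struct hole *h = list; h != NULL; h = h->next)
        if (h->size >= request)
            return h;
    return NULL;            /* no hole is large enough */
}

/* Best fit: the smallest hole that is still big enough. */
struct hole *best_fit(struct hole *list, size_t request)
{
    struct hole *best = NULL;
    for (struct hole *h = list; h != NULL; h = h->next)
        if (h->size >= request && (best == NULL || h->size < best->size))
            best = h;
    return best;
}

/* Worst fit: the largest adequate hole, hoping the leftover stays useful. */
struct hole *worst_fit(struct hole *list, size_t request)
{
    struct hole *worst = NULL;
    for (struct hole *h = list; h != NULL; h = h->next)
        if (h->size >= request && (worst == NULL || h->size > worst->size))
            worst = h;
    return worst;
}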
3.5. Contiguous memory allocation
The main memory must accommodate both the operating system and the various user processes.
We therefore need to allocate different parts of the main memory in the most efficient way
possible.
The memory is usually divided into two partitions: one for the resident operating system, and one
for the user processes. We may place the operating system in either low memory or high memory.
With this approach each process is contained in a single contiguous section of memory.
One of the simplest methods for memory allocation is to divide memory into several fixed-sized
partitions. Each partition may contain exactly one process. In this multiple- partition method, when
a partition is free, a process is selected from the input queue and is loaded into the free partition.
When the process terminates, the partition becomes available for another process. The operating
system keeps a table indicating which parts of memory are available and which are occupied.
Finally, when a process arrives and needs memory, a memory section large enough for this process
is provided.
The contiguous memory allocation scheme can be implemented in operating systems with the help
of two registers, known as the base and limit registers. When a process is executing in main
memory, its base register contains the starting address of the memory location where the process
is executing, while the number of bytes used by the process is stored in the limit register. A
process does not directly refer to the actual address for a corresponding memory location. Instead,
it uses a relative address with respect to its base register. All addresses referred to by a program are
considered as virtual addresses. The CPU generates the logical or virtual address, which is
converted into an actual address with the help of the memory management unit (MMU). The base
address register is used for address translation by the MMU. Thus, a physical address is calculated
as follows: Physical Address = Base Register + Virtual (Logical) Address, provided the virtual
address is less than the value in the limit register.
The address of any memory location referenced by a process is checked to ensure that it does not
refer to an address of a neighboring process. This processing security is handled by the underlying
operating system.
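A minimal sketch of that translation and protection check, with invented register values, looks as follows:

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical register contents for one process. */
static unsigned long base_register  = 0x40000;   /* where the process was loaded  */
static unsigned long limit_register = 0x10000;   /* how many bytes it may address */

/* What the MMU does for every memory reference the process makes. */
static unsigned long translate(unsigned long virtual_address)
{
    if (virtual_address >= limit_register) {
        fprintf(stderr, "protection fault: address out of range\n");
        exit(1);                                 /* the OS would trap here    */
    }
    return base_register + virtual_address;      /* physical = base + virtual */
}

int main(void)
{
    printf("virtual 0x1234 -> physical 0x%lx\n", translate(0x1234));
    return 0;
}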
A computer can address more memory than the amount physically installed on the system. This extra
memory is actually called virtual memory and it is a section of a hard disk that's set up to emulate the
computer's RAM.
The main visible advantage of this scheme is that programs can be larger than physical memory.
Virtual memory serves two purposes. First, it allows us to extend the use of physical
memory by using disk. Second, it allows us to have memory protection, because each virtual
address is translated to a physical address. Following are the situations, when entire program is
not required to be loaded fully in main memory.
User-written error handling routines are used only when an error occurs in the data or
computation.
Certain options and features of a program may be used rarely.
Many tables are assigned a fixed amount of address space even though only a small
amount of the table is actually used.
The ability to execute a program that is only partially in memory would confer many
benefits:
Fewer I/O operations would be needed to load or swap each user program into memory.
A program would no longer be constrained by the amount of physical memory that is
available.
Each user program could take less physical memory; more programs could be run at the
same time, with a corresponding increase in CPU utilization and throughput.
In real scenarios, most processes never need all their pages at once, for the following reasons:
Error handling code is not needed unless that specific error occurs, some of which are
quite rare.
Arrays are often over-sized for worst-case scenarios, and only a small fraction of the
arrays are actually used in practice.
Certain features of certain programs are rarely used.
Virtual memory is a feature of an operating system (OS) that allows a computer to compensate
for shortages of physical memory by temporarily transferring pages of data from random access
memory (RAM) to disk storage. Eventually, the OS will need to retrieve the data that was
moved temporarily to disk storage -- but remember, the only reason the OS moved pages of
data from RAM to disk storage to begin with was because it was running out of RAM. To solve
the problem, the operating system will need to move other pages to hard disk so it has room to
bring back the pages it needs right away from temporary disk storage. This process is known
as paging or swapping and the temporary storage space on the hard disk is called a page file or
a swap file.
Swapping, which happens so quickly that the end user doesn't know it's happening, is carried
out by the operating system’s memory manager with support from the memory management unit (MMU). The memory manager may use one
of several algorithms to choose which page should be swapped out, including Least Recently
Used (LRU), Least Frequently Used (LFU) or Most Recently Used (MRU).
In a virtualized computing environment, administrators can use virtual memory management
techniques to allocate additional memory to a virtual machine (VM) that has run out of
resources. Such virtualization management tactics can improve VM performance and
management flexibility.
The OS performs operations for storing and retrieving data from secondary storage devices for
use in main memory. Paging is one such memory management scheme. Data is retrieved
from storage media by the OS in equal-sized blocks called pages. Paging allows the physical
address space of the process to be non-contiguous, whereas earlier schemes required the whole
program to fit into storage contiguously.
Paging deals with the external fragmentation problem by allowing the logical address space
of a process to be noncontiguous, so that the process can be allocated physical memory wherever it is free.
Paging is a method of writing data to, and reading it from, secondary storage for use in primary
storage, also known as main memory. Paging plays a role in memory management for a
computer's OS (operating system).
In a memory management system that takes advantage of paging, the OS reads data from
secondary storage in blocks called pages, all of which have identical size. The physical region of
memory containing a single page is called a frame. When paging is used, a process's pages do not
have to occupy physically contiguous frames in main memory. This approach offers an
advantage over earlier memory management methods, because it facilitates more efficient and
faster use of storage.
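Concretely, the hardware splits every virtual address into a page number and an offset and looks the page number up in a page table; the C sketch below assumes 4-KB pages and a tiny, hypothetical page table:

#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096u      /* assumed page size: 4 KB               */
#define NPAGES    8          /* tiny address space, for the demo only */

/* Hypothetical page table: page_table[p] is the frame holding page p,
   or -1 if the page is not currently in memory. */
static int page_table[NPAGES] = { 2, 5, -1, 7, -1, 0, 1, -1 };

static uint32_t translate(uint32_t vaddr)
{
    uint32_t page   = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;

    if (page >= NPAGES || page_table[page] < 0) {
        printf("page fault on virtual address 0x%x\n", vaddr);
        return 0;            /* the OS would now bring the page in from disk */
    }
    return (uint32_t)page_table[page] * PAGE_SIZE + offset;
}

int main(void)
{
    printf("virtual 0x1ABC -> physical 0x%x\n", translate(0x1ABC)); /* page 1 -> frame 5 */
    translate(0x2ABC);                                              /* page 2 is absent  */
    return 0;
}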
Demand Paging
A demand paging system is quite similar to a paging system with swapping where processes
reside in secondary memory and pages are loaded only on demand, not in advance. When a
context switch occurs, the operating system does not copy any of the old program’s pages out to
the disk or any of the new program’s pages into the main memory. Instead, it just begins executing
the new program after loading the first page and fetches that program’s pages as they are
referenced.
Figure 3.6. Swapping
While executing a program, if the program references a page which is not available in the main
memory because it was swapped out a little earlier, the processor treats this invalid memory
reference as a page fault and transfers control from the program to the operating system to
demand the page back into the memory.
Disadvantages: - The number of tables and the amount of processor overhead for handling page
faults are greater than in the case of the simple paged management techniques.
The exact layout of a page table entry varies from computer to computer, but 32 bits is a common size. The most important
field is the Page frame number. After all, the goal of the page mapping is to output this value.
Next to it we have the Present/absent bit. If this bit is 1, the entry is valid and can be used. If it
is 0, the virtual page to which the entry belongs is not currently in memory. Accessing a page
table entry with this bit set to 0 causes a page fault.
The Protection bits tell what kinds of access are allowed. In the simplest form, this field includes
1 bit, with 0 for read/write and 1 for read only. A more complicated arrangement is having 3
bits, one bit each for enabling reading, writing, and executing the page.
The Modified and Referenced bits keep track of page usage. When a page is written to, the
hardware automatically sets the Modified bit.
The Referenced bit is set whenever a page is referenced, either for reading or writing.
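These fields can be pictured as a small bit-field structure; the ordering and field widths below are purely illustrative, since real hardware defines its own exact format:

/* An illustrative 32-bit page table entry (widths are an assumption). */
struct pte {
    unsigned int frame      : 20;  /* page frame number                      */
    unsigned int present    : 1;   /* 1 = page is in memory, 0 = page fault  */
    unsigned int protection : 3;   /* read / write / execute permissions     */
    unsigned int modified   : 1;   /* set by hardware on a write (dirty bit) */
    unsigned int referenced : 1;   /* set on any read or write of the page   */
    unsigned int unused     : 6;
};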
Paging reduces external fragmentation, but still suffers from internal fragmentation.
Paging is simple to implement and assumed as an efficient memory management
technique.
Due to equal size of the pages and frames, swapping becomes very easy.
Page table requires extra memory space, so may not be good for a system having small
RAM.
When the page that was selected for replacement and paged out is referenced again, it
has to be read in from disk, and this requires waiting for I/O completion. This determines the
quality of the page replacement algorithm: the less time spent waiting for page-ins, the better
the algorithm.
A page replacement algorithm looks at the limited information about accessing the pages
provided by hardware, and tries to select which pages should be replaced to minimize the total
number of page misses, while balancing it with the costs of primary storage and processor
time of the algorithm itself. There are many different page replacement algorithms. We
evaluate an algorithm by running it on a particular string of memory references and computing
the number of page faults.
Reference String
The string of memory references is called reference string. Reference strings are generated
artificially or by tracing a given system and recording the address of each memory reference.
The latter choice produces a large amount of data.
A simple alteration to FIFO that avoids the problem of throwing out a heavily used page is to
inspect the R bit of the oldest page. If it is 0, the page is both old and unused, so it is replaced
immediately. If the R bit is 1, the bit is cleared, the page is put onto the end of the list of pages,
and its load time is updated as though it had just arrived in memory. Then the search continues.
The operation of this algorithm, called second chance, is shown in Figure 3.8. In Figure 3.8(a)
we see pages A through H kept on a linked list and sorted by the time they arrived in
memory.
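The algorithm can be sketched over a simple array ordered by load time; the page names and R bits below are not those of any particular figure and are chosen only for illustration:

#include <stdio.h>

#define NPAGES 8

struct page { char name; int r; };      /* r is the referenced (R) bit */

/* Pages ordered by load time: index 0 is the oldest page. */
static struct page fifo[NPAGES] = {
    {'A',1},{'B',0},{'C',1},{'D',0},{'E',0},{'F',1},{'G',0},{'H',0}
};

/* Inspect the oldest page; if its R bit is set, clear it and move the
   page to the end of the list, then look again.  If every page had its
   R bit set, the algorithm degenerates into pure FIFO. */
static char second_chance(void)
{
    for (;;) {
        if (fifo[0].r == 0)
            return fifo[0].name;          /* old and unreferenced: evict it */
        struct page moved = fifo[0];
        moved.r = 0;                      /* second chance: clear the R bit */
        for (int i = 0; i < NPAGES - 1; i++)
            fifo[i] = fifo[i + 1];        /* everyone moves up one place    */
        fifo[NPAGES - 1] = moved;         /* treated as if newly loaded     */
    }
}

int main(void)
{
    printf("victim: %c\n", second_chance());   /* prints B with the data above */
    return 0;
}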
3.7. Segmentation
Segmentation is a memory management technique in which each job is divided into several
segments of different sizes, one for each module that contains pieces that perform related
functions. Each segment is actually a different logical address space of the program.
When a process is to be executed, its corresponding segments are loaded into non-
contiguous memory, though every segment is loaded into a contiguous block of available
memory.
Segmentation memory management works in a way very similar to paging, but here segments are of
variable length whereas in paging the pages are of fixed size.
A program segment contains the program's main function, utility functions, data structures, and
so on. The operating system maintains a segment map table for every process and a list of free
memory blocks along with segment numbers, their size and corresponding memory locations
in main memory. For each segment, the table stores the starting address of the segment and
the length of the segment. A reference to a memory location includes a value that identifies a
segment and an offset.
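Translating a (segment, offset) reference therefore amounts to indexing the segment table, checking the offset against the stored length, and adding the offset to the stored base; a small C sketch with an assumed table follows:

#include <stdio.h>
#include <stdlib.h>

#define NSEGS 3

struct segment { unsigned base; unsigned length; };

/* Hypothetical segment map table for one process. */
static struct segment seg_table[NSEGS] = {
    { 0x10000, 0x4000 },    /* segment 0: code            */
    { 0x30000, 0x1000 },    /* segment 1: data structures */
    { 0x50000, 0x2000 },    /* segment 2: stack           */
};

static unsigned translate(unsigned seg, unsigned offset)
{
    if (seg >= NSEGS || offset >= seg_table[seg].length) {
        fprintf(stderr, "segmentation fault\n");
        exit(1);
    }
    return seg_table[seg].base + offset;
}

int main(void)
{
    printf("(1, 0x0200) -> 0x%x\n", translate(1, 0x0200));   /* 0x30200 */
    return 0;
}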
The virtual memory discussed so far is one-dimensional because the virtual addresses go from
0 to some maximum address, one address after another. For various problems, having two or
more separate virtual address spaces may be much better than having only one. For instance,
a compiler has many tables that are built up as compilation proceeds, possibly including:
The source text being saved for the printed listing (on batch systems).
The symbol table, containing the names and attributes of the variables.
The table containing all the integer and floating-point constants used.
The parse tree, containing the syntactic analysis of the program.
The stack used for procedure calls within the compiler.
Each of the first four tables grows continuously as compilation proceeds. The last one grows
and shrinks in unpredictable ways during compilation. In a one-dimensional memory, these five
tables would have to be assigned contiguous chunks of virtual address space, as in Figure 3.9.
Examine what happens if a program has a much larger than usual number of variables but a
normal amount of everything else. The chunk of address space assigned for the symbol table
may fill up, but there may be lots of room in the other tables. The compiler could, of course,
simply issue a message saying that the compilation cannot continue due to too many variables,
but doing so does not seem very sporting when unused space is left in the other tables.
Figure 3.9 One dimensional address spaces with growing tables
MULTICS ran on the Honeywell 6000 machines and their descendants and provided each
program with a virtual memory of up to 2^18 segments (more than 250,000), each of which could
be up to 65,536 (36-bit) words long. To implement this, the MULTICS designers chose to treat
each segment as a virtual memory and to page it, combining the advantages of paging
(uniform page size and not having to keep the whole segment in memory if only part of it is
being used) with the advantages of segmentation (ease of programming, modularity,
protection, sharing).
This matters because the CPU automatically stores recently accessed memory in the cache, close to the
processor. The working set is a convenient way to describe the memory a process needs to have close at
hand. If it is small enough, it can all fit in the cache and the program will run very fast. At the OS level,
the kernel has to tell the CPU where to find the physical memory an application is using (resolving virtual
addresses) every time a new page (typically 4 KB in size) is accessed, so that cost should also be avoided
as much as possible.
If page faults and swapping happen very frequently, the operating
system has to spend more time swapping pages than doing useful work. This state of the operating system
is termed thrashing. Because of thrashing, CPU utilization is reduced.
Let us understand this with an example: if a process does not have the number of frames that it needs
to support pages in active use, then it will quickly page fault. And at this point, the process must
replace some pages. As all the pages of the process are actively in use, it must replace a page that
will be needed again right away. Consequently, the process will quickly fault again, and again, and
again, replacing pages that it must bring back in immediately. This high paging activity by a
process is called thrashing. During thrashing, the CPU spends less time on actual productive
work and more time swapping pages.
Thrashing results in severe performance problems in the operating system. When CPU utilization is low,
the process scheduling mechanism tries to load more processes into memory at the same time, thereby
increasing the degree of multiprogramming. In this situation there are more processes in memory than
there are available frames, so only a limited number of frames can be allocated to each process.
Whenever a process with high priority arrives in memory and no frame is free at that time, a page of
another process that currently occupies a frame is moved to secondary storage, and the freed frame is
then allocated to the higher-priority process.
We can also say that as soon as memory fills up, processes start spending a lot of time waiting for
the required pages to be swapped in. Again, the utilization of the CPU becomes low because most
of the processes are waiting for pages. Thus, a high degree of multiprogramming and a lack of frames
are the two main causes of thrashing in the operating system.
Effect of Thrashing
When thrashing starts, the operating system tries to apply either the global page
replacement algorithm or the local page replacement algorithm.
Global page replacement is allowed to bring in any page, so whenever thrashing is detected it tries to
bring in more pages. Because of this, no process can get enough frames and, as a result, the
thrashing increases more and more. Thus, the global page replacement algorithm is not suitable
when thrashing happens.
Unlike global page replacement, local page replacement selects only pages that belong to the
faulting process. Because of this, there is a chance of reducing the thrashing. However, local page
replacement has many disadvantages of its own, so it is simply an alternative to global page
replacement rather than a complete cure; other techniques, such as controlling the degree of
multiprogramming based on the working sets of the processes, are therefore used in practice.
3.9. Caching
Caching is the process of storing data in a separate place (called the cache) so that it can be
accessed faster if the same data is requested in the future. When some data is requested, the cache
is first checked to see whether it contains that data. If data is already in the cache, it is called a
cache hit. Then the data can be retrieved from the cache, which is much faster than retrieving it
from the original storage location. If the requested data is not in the cache, it is called a cache miss.
Then the data needs to be fetched from the original storage location, which would take a longer
time. Caching is used in different places. In the CPU, caching is used to improve the performance
by reducing the time taken to get data from the main memory. In web browsers, web caching is
used to store responses from previous visits to web sites, in order to make the next visits faster.
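The benefit of a cache is often summarized by the effective access time; the short C calculation below uses an assumed hit ratio and assumed access times purely to illustrate the formula:

#include <stdio.h>

int main(void)
{
    double cache_ns  = 10.0;    /* assumed cache access time       */
    double memory_ns = 100.0;   /* assumed main-memory access time */
    double hit_ratio = 0.95;    /* fraction of requests that hit   */

    /* On a miss we pay for the failed cache lookup plus the memory access. */
    double eat = hit_ratio * cache_ns + (1.0 - hit_ratio) * (cache_ns + memory_ns);
    printf("effective access time = %.1f ns\n", eat);   /* 15.0 ns here */
    return 0;
}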
Chapter Four
Device Management
Devices are typically physical/hardware devices such as computers, laptops, servers, mobile
phones, etc. They could also be virtual, such as virtual machines or virtual switches. During the
execution of a program, it may require various computer resources (devices) for its complete
execution. The onus is on the operating system to provide the required resources judiciously.
It is the sole responsibility of the operating system to check whether the resource is available or not. It is
not just concerned with the allocation of devices but also with their deallocation, i.e., once a
process no longer requires a device/resource, it must be taken away from the process.
In an operating system, device management refers to the control of input/output devices such as
discs, microphones, keyboards, printers, magnetic tape, USB ports, scanners, and various other
devices.
Block Device
Character Device
Network Device
Block Device
It stores data in fixed-size blocks, each with its own address. For example, disks.
Character Device
It transmits or accepts a stream of characters, none of which can be addressed individually. For
instance, keyboards, printers, etc.
Network Device
It is used for sending and receiving data over a network; for example, a network interface card.
Secondary storage is used as an extension of main memory, and secondary storage devices can hold
data permanently. Storage devices include registers, cache, main memory, electronic disk,
magnetic disk, optical disk, and magnetic tape. Each storage system provides the basic facility of
storing a datum and of holding it until it is retrieved at a later time. The storage devices differ in
speed, cost, size, and volatility. The most common secondary storage device is the magnetic disk,
which provides storage for both programs and data.
In this hierarchy all the storage devices are arranged according to speed and cost. The higher levels
are expensive, but they are fast. As we move down the hierarchy, the cost per bit generally
decreases, whereas the access time generally increases.
The storage systems above the electronic disk are volatile, whereas those below it are non-volatile.
An electronic disk can be designed to be either volatile or non-volatile. During normal operation,
the electronic disk stores data in a large DRAM array, which is volatile, but many electronic disk
devices contain a hidden magnetic hard disk and a battery for backup power. If external power is
interrupted, the electronic disk controller copies the data from RAM to the magnetic disk; when
external power is restored, the controller copies the data back into RAM.
The design of a complete memory system must balance all the factors. It must use only as much
expensive memory as necessary while providing as much inexpensive, Non-Volatile memory as
possible. Caches can be installed to improve performance where a large access-time or transfer-
rate disparity exists between two components.
Secondary storage structure
Secondary storage devices are those devices whose memory is non-volatile, meaning, the
stored data will be intact even if the system is turned off. Here are a few things worth noting
about secondary storage.
Magnetic Disk Structure
In modern computers, most of the secondary storage is in the form of magnetic disks. Hence,
knowing the structure of a magnetic disk is necessary to understand how the data in the disk is
accessed by the computer.
Structure of a magnetic disk
A magnetic disk contains several platters. Each platter is divided into circular shaped tracks. The
length of the tracks near the center is less than the length of the tracks farther from the center. Each
track is further divided into sectors, as shown in the figure.
Tracks of the same distance from center form a cylinder. A read-write head is used to read data
from a sector of the magnetic disk.
The speed of the disk is measured in two parts:
Transfer rate: the rate at which data moves from the disk to the computer.
Random access time: the sum of the seek time and the rotational latency.
Seek time is the time taken by the arm to move the read-write head to the required track. Rotational
latency is the time taken for the required sector to rotate under the head. Even though the disk is
physically arranged as tracks and sectors, the data is logically arranged and addressed as an array
of fixed-size blocks. The size of a block is typically 512 or 1024 bytes. Each logical block is mapped
to a sector on the disk sequentially, so that each sector on the disk has a logical address.
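As a simple sketch of how a logical block address might be translated into a physical location, assuming an idealized geometry with a fixed number of sectors per track and no zoning (real drives are more complicated and hide this mapping in the controller), the mapping can be computed as follows:

def block_to_chs(block, sectors_per_track, heads):
    """Translate a logical block number into (cylinder, head, sector)
    for an idealized disk geometry."""
    blocks_per_cylinder = sectors_per_track * heads
    cylinder = block // blocks_per_cylinder
    remainder = block % blocks_per_cylinder
    head = remainder // sectors_per_track
    sector = remainder % sectors_per_track
    return cylinder, head, sector

print(block_to_chs(5000, sectors_per_track=63, heads=16))   # (4, 15, 23)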
Disk Scheduling Algorithms
On a typical multiprogramming system, there will usually be multiple disk access requests at any
point of time. So those requests must be scheduled to achieve good efficiency. Disk scheduling is
similar to process scheduling. Some of the disk scheduling algorithms are described below.
First Come First Serve
This algorithm serves requests in the order in which they arrive. As an example, suppose the queue
holds requests for the following cylinder numbers:
98, 183, 37, 122, 14, 124, 65, 67
Assume the head is initially at cylinder 56. The head moves in the given order in the queue
i.e., 56→98→183→…→67.
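The total head movement for this example can be checked with a short Python sketch (a minimal illustration of the bookkeeping, not an actual disk scheduler):

def fcfs_total_movement(requests, head):
    """Total head movement when requests are served strictly in arrival order."""
    total = 0
    for cylinder in requests:
        total += abs(cylinder - head)   # distance travelled for this request
        head = cylinder                 # head now rests on that cylinder
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_total_movement(queue, 56))   # 637 cylinders of head movement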
There are three types of Operating system peripheral devices: dedicated, shared, and virtual.
These are as follows:
Dedicated Device
In device management, some devices are allocated or assigned to only one job at a time, until that
job releases them. Devices such as plotters, printers, and tape drives require such an allocation
mechanism because it would be impractical for multiple users to share them simultaneously. The
disadvantage of such devices is the inefficiency caused by allocating the device to a single user for
the whole duration of task execution, even if the device is not used 100% of the time.
Shared Devices
These devices can be assigned to several processes. For example, a disk (DASD, direct-access
storage device) can be shared by multiple processes simultaneously by interleaving their requests.
The device manager carefully controls the interleaving, and pre-determined policies resolve any
conflicts.
Virtual Devices
Virtual devices are a hybrid of the two devices, and they are dedicated devices that have been
transformed into shared devices. For example, a printer can be transformed into a shareable device
by using a spooling program that redirects all print requests to a disk. A print job is not sent directly
to the printer; instead, it is routed to the disk until it is fully prepared with all of the required
sequences and formatting, at which point it is transmitted to the printer. This approach can turn a
single printer into numerous virtual printers, improving performance and ease of use.
The operating system (OS) handles communication with the devices via their drivers. The OS
component gives a uniform interface for accessing devices with various physical features. There
are various functions of device management in the operating system. Some of them are as
follows:
It keeps track of data, status, location, usage, and so on. The file system is the term used to
describe this group of facilities.
It enforces the pre-determined policies and decides which process receives the device
when and for how long.
It monitors the status of every device, including printers, storage drivers, and other
devices.
It allocates and deallocates devices efficiently. Deallocation is done at two levels: first, when
an I/O command has been executed, the device is temporarily released; second, when the job
is completed, the device is permanently released.
4.2. Characteristics of serial and parallel devices
There are two methods used for transferring data between computers which are given below: Serial
Transmission and Parallel Transmission.
When data is sent or received using serial data transmission, the data bits (0 and 1) are organized in
a specific order, since they can only be sent one after another. The order of the data bits is important
as it dictates how the transmission is organized when it is received. It is viewed as a reliable data
transmission method because a data bit is only sent if the previous data bit has already been
received.
In serial transmission, data is sent bit by bit from one computer to another over a single line, paced
by a clock. Eight data bits are transferred in a frame that begins with a start bit (0) and ends with a
stop bit (1); a parity bit may also be included for simple error checking. Serial data cables are used
for transmitting data over longer distances, and the bits arrive in the order in which they were sent.
A typical serial cable uses a D-shaped 9-pin connector that carries the data in series. A minimal
framing sketch is given below.
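The following Python sketch frames one byte for asynchronous serial transmission under the assumptions stated above: one start bit, eight data bits sent least-significant bit first, an even parity bit, and one stop bit. The exact frame format varies between real devices, so this is only an illustration.

def frame_byte(value):
    """Frame one data byte: start bit (0), 8 data bits LSB first,
    even parity bit, stop bit (1)."""
    data_bits = [(value >> i) & 1 for i in range(8)]
    parity = sum(data_bits) % 2          # even parity over the data bits
    return [0] + data_bits + [parity, 1]

print(frame_byte(ord("A")))   # [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1]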
For transferring data between computers and laptops, two methods are used, namely serial
transmission and parallel transmission. There are similarities and differences between them. One
of the primary differences is that in serial transmission data is sent bit by bit, whereas in parallel
transmission a byte (8 bits) or character is sent at a time. The similarity is that both are used to
connect and communicate with peripheral devices. Furthermore, parallel transmission is
time-sensitive, whereas serial transmission is not. Both serial and parallel transmission have their
own advantages and disadvantages: parallel transmission is used over limited distances and
provides higher speed, while serial transmission is reliable for transferring data over longer
distances. Hence, both serial and parallel transmission are individually essential for transferring
data.
Buffering
A data buffer (or just buffer) is a region of a physical memory storage used to temporarily store
data while it is being moved from one place to another. Typically, the data is stored in a buffer as
it is retrieved from an input device (such as a microphone) or just before it is sent to an output
device (such as speakers). However, a buffer may be used when moving data between processes
within a computer. This is comparable to buffers in telecommunication. Buffers can be
implemented in a fixed memory location in hardware or by using a virtual data buffer in software,
pointing at a location in the physical memory. In all cases, the data stored in a data buffer are stored
on a physical storage medium. A majority of buffers are implemented in software, which typically
use the faster RAM to store temporary data, due to the much faster access time compared with
hard disk drives. Buffers are typically used when there is a difference between the rate at which
data is received and the rate at which it can be processed, or in the case that these rates are variable,
for example in a printer spooler or in online video streaming. In the distributed computing
environment, data buffer is often implemented in the form of burst buffer that provides distributed
buffering service.
For example, in a printer spooler we can pass a large number of pages to print as input, but the
processing/printing is slow, so buffering is used. I/O buffering is the process of temporarily storing
data that is passing between a processor and a peripheral. Its usual purpose is to smooth out the
difference in the rates at which the two devices can handle data.
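The printer-spooler situation can be mimicked with a small bounded buffer between a fast producer and a slow consumer, as in the Python sketch below; the buffer size, page count, and delay are made-up values for illustration.

import threading, queue, time

buf = queue.Queue(maxsize=8)        # the bounded buffer

def producer():                     # fast side, e.g. the application handing over pages
    for page in range(20):
        buf.put(page)               # blocks only when the buffer is full

def consumer():                     # slow side, e.g. the printer
    while True:
        page = buf.get()
        time.sleep(0.01)            # simulate the slower device
        buf.task_done()

threading.Thread(target=producer).start()
threading.Thread(target=consumer, daemon=True).start()
buf.join()                          # wait until every page has been "printed"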
DMA stands for “Direct Memory Access” and is a method of transferring data from the computer‘s
RAM to another part of the computer without processing it using the CPU. While most data that
is input or output from your computer is processed by the CPU, some data does not require
processing, or can be processed by another device.
In these situations, DMA can save processing time and is a more efficient way to move data from
the computer’s memory to other devices. In order for devices to use direct memory access, they
must be assigned to a DMA channel. Each type of port on a computer has a set of DMA channels
that can be assigned to each connected device. For example, a PCI controller and a hard drive
controller each have their own set of DMA channels
For example, a sound card may need to access data stored in the computer’s RAM, but since it can
process the data itself, it may use DMA to bypass the CPU. Video cards that support DMA can
also access the system memory and process graphics without needing the CPU. Ultra DMA hard
drives use DMA to transfer data faster than previous hard drives that required the data to first be
run through the CPU.
An alternative to DMA is the Programmed Input/Output (PIO) interface, in which all data
transmitted between devices goes through the processor. A newer protocol for the ATA/IDE
interface is Ultra DMA, which provides a burst data transfer rate of up to 33 MB/s. Hard drives that
come with Ultra DMA/33 also support PIO modes 1, 3, and 4, and multiword DMA mode 2 at
16.6 MB/s.
DMA Transfer Types
Memory To Memory Transfer
In this mode, a block of data is moved from one memory address to another. The current address
register of channel 0 points to the source address and the current address register of channel 1
points to the destination address. In the first transfer cycle, a data byte from the source address is
loaded into the temporary register of the DMA controller, and in the next transfer cycle the data in
the temporary register is stored in the memory location pointed to by the destination address. After
each data transfer, the current address registers are incremented or decremented according to the
current settings. The channel 1 current word count register is also decremented by 1 after each data
transfer. When the word count of channel 1 reaches FFFFH, a terminal count (TC) is generated,
which activates the EOP output and terminates the DMA service.
Auto initialize
In this mode, during the initialization the base address and word count registers are loaded
simultaneously with the current address and word count registers by the microprocessor. The
address and the count in the base registers remain unchanged throughout the DMA service.
After the first block transfer i.e. after the activation of the EOP signal, the original values of the
current address and current word count registers are automatically restored from the base address
and base word count register of that channel. After auto initialization the channel is ready to
perform another DMA service, without CPU intervention.
DMA Controller
The controller is integrated into the processor board and manages all DMA data transfers.
Transferring data between system memory and an I/O device requires two steps: data goes from
the sending device to the DMA controller and then to the receiving device. The microprocessor
gives the DMA controller the location, destination, and amount of data that is to be transferred.
Then the DMA controller transfers the data, allowing the microprocessor to continue with other
processing tasks. When a device needs to use the Micro Channel bus to send or receive data, it
competes with all the other devices that are trying to gain control of the bus. This process is known
as arbitration. The DMA controller does not arbitrate for control of the bus; instead, the I/O device
that is sending or receiving data (the DMA slave) participates in arbitration. It is the DMA
controller, however, that takes control of the bus when the central arbitration control point grants
the DMA slave’s request.
Basic design. One of the simplest concepts to grasp concerning design is the use of more than
one engine; if one fails, the other is designed to be powerful enough to continue operations
safely. Furthermore, by placing engines under-the-wing instead of inside the wing, then added
protection is provided to other critical aircraft systems (other engines, fuel, and hydraulics) if
an engine fails “explosively”.
Recovering from a system failure. E.g. recovering from the failure of flaps to lower. First of
all the possibility of this happening on modern aircraft is much reduced by the design of the
hydraulic system allowing for leaks to be isolated, thereby protecting essential services such
as undercarriage and flaps. Furthermore, back-up air-driven or electric hydraulic pumps may
be available allowing for redundancy. Further back-up may be possible through accumulators
that hold “one-shot” applications of services. This example shows a multi-level design that
allows many opportunities to recover from failure; or, actually prevent total failure. If,
however, the flaps still fail to lower, there will be Standard Operating Procedures that guide the
pilots to select a suitable runway (landing length, navigation and visual aids), environmental
conditions (wind, runway contamination), adjusted approach and landing speeds,
recommendations for braking and reverse thrust use for deceleration etc. Performance
calculations to determine the landing distance required can be considered to contain a safety net
in the form of a % safety margin.
Recovering from specific situations. When people talk about the concept of “recovery from
failure” they are often referring to the several critical situations that pilots are trained to
recover from, such as recovery from: engine failure during take-off, unusual attitudes (loss of
control), uncontained engine failure, rejected take-off and rejected landing. These recovery
techniques are skill-based and procedure driven. The latest aircraft designs (2013) now have
the capability to recover automatically from situations such as unusual attitudes, without input
from pilots. In many cases it is even possible that entry into an unusual attitude and/or a stall is
prevented by the aircraft’s automated systems.
Recovering from human failure. When humans perform a skill poorly, or omit to perform an
action (see human error), then a Safety Net can assist recovery. At the simplest level a safety
net could be a harness to prevent an engineer from falling whilst attending to an engine. To a
more technical degree, if an air traffic controller clears an aircraft to an unsafe level (where a
conflict exists), or two pilots fail to level-off on time at their cleared Flight Level, then
an Airborne Collision Avoidance System (ACAS) can help the pilots recover from a conflict
caused by the Level Bust.
Recovering from an accident. The worst kind of failure, of course, is an accident; and even
with serious accidents there is the possibility of containment and partial recovery, i.e. the
saving of lives. Design, procedures and safety nets can all assist in recovery. E.g. passenger
seats and restraints designed to withstand high deceleration forces (typically 16 x gravity);
cabin interiors designed to prevent passengers from being incapacitated by smoke, fumes, and
noxious gases; cabin crew trained in procedures to assist passengers evacuate as fast as
possible; and life jackets and rafts available as containment measures if the aircraft has ditched
on water.
Chapter Five
File System
In computing, file system or filesystem is a method and data structure that the operating system
uses to control how data is stored and retrieved. Without a file system, data placed in a storage
medium would be one large body of data with no way to tell where one piece of data stops and the
next begins. By separating the data into pieces and giving each piece a name, the data is easily
isolated and identified. Taking its name from paper-based data management systems, each group
of data is called a file. The structure and logic rules used to manage the groups of data and their
names are called a file system.
5.1.1 Data and Meta Data
Metadata is data about data. This refers to not the data itself, but rather to any information that
describes some aspect of the data. Everything from the title, information on how data fits together
(e.g., which page goes before which other page), when and by whom the data was created, and
lists of web pages visited by people, can be classified as metadata. Metadata can be stored in a
variety of places. Where the metadata relates to databases, the data is often stored in tables and
fields within the database. Sometimes the metadata exists in a specialist document or database
designed to store such data, called a data dictionary or metadata repository.
5.2.1. Partition
A partition is a logical division of a hard disk that is treated as a separate unit by operating systems
(OSes) and file systems. The OSes and file systems can manage information on each partition as
if it were a distinct hard drive. This allows the drive to operate as several smaller sections to
improve efficiency, although it reduces usable space on the hard disk because of additional
overhead from multiple OSes.
A disk partition manager allows system administrators to create, resize, delete and manipulate
partitions, while a partition table logs the location and size of the partition. Each partition appears
to the OS as a distinct logical disk, and the OS reads the partition table before any other part of the
disk. Once a partition is created, it is formatted with a file system, such as FAT, NTFS, or ext4.
Data and files are then written to the file system on the partition. When users boot the OS in a
computer, a critical part of the process is to give control to the first sector on the hard disk. This
includes the partition table that defines how many partitions will be formatted on the hard disk,
the size of each partition and the address where each disk partition begins. The sector also contains
a program that reads the boot sector for the OS and gives it control so that the rest of the OS can
be loaded into random access memory. A key aspect of partitioning is the active or bootable
partition, which is the designated partition on the hard drive that contains the OS. Only the partition
on each drive that contains the boot loader for the OS can be designated as the active partition. The
active partition also holds the boot sector and must be marked as active. A recovery partition
restores the computer to its original shipping condition. In enterprise storage, partitioning helps
enable short stroking, a practice of formatting a hard drive to speed performance through data
placement.
5.2.2. Mounting and Unmounting File Systems
Before you can access the files on a file system, you need to mount the file system. When you
mount a file system, you attach that file system to a directory (mount point) and make it available
to the system. The root (/) file system is always mounted. Any other file system can be connected
or disconnected from the root (/) file system. When you mount a file system, any files or directories
in the underlying mount point directory are unavailable as long as the file system is mounted.
These files are not permanently affected by the mounting process, and they become available again
when the file system is unmounted. However, mount directories are typically empty, because you
usually do not want to obscure existing files.
An operating system can have multiple file systems in it. Virtual File Systems are used to integrate
multiple file systems into an orderly structure. The key idea is to abstract out that part of the file
system that is common to all file systems and put that code in a separate layer that calls the
underlying concrete file system to actually manage the data.
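The key idea can be sketched in a few lines of Python: a common interface that every concrete file system implements, plus a mount table that routes each path to the file system mounted there. All class and function names here are illustrative, not an actual operating system interface.

from abc import ABC, abstractmethod

class FileSystem(ABC):
    """The part that is common to all file systems (the virtual layer)."""
    @abstractmethod
    def read(self, path): ...
    @abstractmethod
    def write(self, path, data): ...

class InMemoryFS(FileSystem):
    """A toy concrete file system that keeps its data in a dictionary."""
    def __init__(self):
        self.files = {}
    def read(self, path):
        return self.files.get(path, b"")
    def write(self, path, data):
        self.files[path] = data

mounts = {"/": InMemoryFS(), "/usb": InMemoryFS()}   # mount point -> concrete FS

def fs_for(path):
    # route the request to the file system with the longest matching mount point
    best = max((m for m in mounts if path.startswith(m)), key=len)
    return mounts[best]

fs_for("/usb/report.txt").write("/usb/report.txt", b"hello")
print(fs_for("/usb/report.txt").read("/usb/report.txt"))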
A memory-mapped file contains the contents of a file in virtual memory. This mapping between a
file and memory space enables an application, including multiple processes, to modify the file by
reading and writing directly to the memory. You can use managed code to access memory-mapped
files in the same way that native Windows functions access memory-mapped files, as described in
Managing Memory-Mapped Files. There are two types of memory-mapped files:
Persisted memory-mapped files: Persisted files are memory-mapped files that are associated with
a source file on a disk. When the last process has finished working with the file, the data is saved
to the source file on the disk. These memory-mapped files are suitable for working with extremely
large source files.
Non-persisted memory-mapped files: Non-persisted files are memory-mapped files that are not
associated with a file on a disk. When the last process has finished working with the file, the data
is lost and the file is reclaimed by garbage collection. These files are suitable for creating shared
memory for inter-process communications (IPC).
Processes, Views, and Managing Memory: Memory-mapped files can be shared across multiple processes.
Processes can map to the same memory-mapped file by using a common name that is assigned by
the process that created the file. To work with a memory-mapped file, you must create a view of
the entire memory-mapped file or a part of it. You can also create multiple views to the same part
of the memory-mapped file, thereby creating concurrent memory. For two views to remain
concurrent, they have to be created from the same memory-mapped file.
Multiple views may also be necessary if the file is greater than the size of the application’s logical
memory space available for memory mapping (2 GB on a 32-bit computer). There are two types
of views: stream access view and random access view. Use stream access views for sequential
access to a file; this is recommended for non-persisted files and IPC. Random access views are
preferred for working with persisted files. Memory-mapped files are accessed through the
operating system’s memory manager, so the file is automatically partitioned into a number of pages
and accessed as needed. You do not have to handle the memory management yourself. The
following illustration shows how multiple processes can have multiple and overlapping views to
the same memory-mapped file at the same time.
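In Python, the standard mmap module offers a similar facility for persisted memory-mapped files; the file name below is made up for illustration.

import mmap

# create a small source file on disk
with open("example.dat", "wb") as f:
    f.write(b"hello memory-mapped world")

# map the whole file into memory and modify it through the mapping
with open("example.dat", "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)        # length 0 means "map the entire file"
    print(mm[:5])                        # read directly from memory: b'hello'
    mm[:5] = b"HELLO"                    # write through the mapping
    mm.flush()                           # push the change back to the file on disk
    mm.close()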
FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many
other portable devices because of their relative simplicity. Performance of FAT compares poorly to most
other file systems as it uses overly simplistic data structures, making file operations time-consuming, and
makes poor use of disk space in situations where many small files are present. ISO 9660 and Universal
Disk Format are two common formats that target Compact Discs and DVDs. Mount Rainier is a newer
extension to UDF supported by Linux 2.6 series and Windows Vista that facilitates rewriting to DVDs in
the same fashion as has been possible with floppy disks.
A file is named, for the convenience of its human users, and is referred to by its name. A name is
usually a string of characters, such as example.c. Some systems differentiate between uppercase
and lowercase characters in names, whereas other systems do not. When a file is named, it becomes
independent of the process, the user, and even the system that created it. For instance, one user
might create the file example.c, and another user might edit that file by specifying its name. The
file’s owner might write the file to a USB disk, send it as an e-mail attachment, or copy it across a
network, and it could still be called example.c on the destination system.
A file’s attributes vary from one operating system to another but typically consist of these:
• Name: The symbolic file name is the only information kept in human readable form.
• Identifier: This unique tag, usually a number, identifies the file within the file system; it
is the non-human-readable name for the file.
• Type: This information is needed for systems that support different types of files.
• Location: This information is a pointer to a device and to the location of the file on that
device.
• Size: The current size of the file (in bytes, words, or blocks) and possibly the maximum
allowed size are included in this attribute.
• Time, date, and user identification: This information may be kept for creation, last
modification, and last use. These data can be useful for protection, security, and usage
monitoring.
File Operations
A file is an abstract data type. To define a file properly, we need to consider the operations that
can be performed on files. The operating system can provide system calls to create, write, read,
reposition, delete, and truncate files. Let’s examine what the operating system must do to perform
each of these six basic file operations. It should then be easy to see how other similar operations,
such as renaming a file, can be implemented.
•Creating a file: Two steps are necessary to create a file. First, space in the file system must be
found for the file. Second, an entry for the new file must be made in the directory.
•Writing a file: To write a file, we make a system call specifying both the name of the file and the
information to be written to the file. Given the name of the file, the system searches the directory
to find the file’s location. The system must keep a write pointer to the location in the file where
the next write is to take place. The write pointer must be updated whenever a write occurs.
•Reading a file: To read from a file, we use a system call that specifies the name of the file and
where (in memory) the next block of the file should be put. Again, the directory is searched for the
associated entry, and the system needs to keep a read pointer to the location in the file where the
next read is to take place. Once the read has taken place, the read pointer is updated. Because a
process is usually either reading from or writing to a file, the current operation location can be kept
as a per-process current file-position pointer. Both the read and write operations use this same
pointer, saving space and reducing system complexity.
•Repositioning within a file: The directory is searched for the appropriate entry, and the current-
file-position pointer is repositioned to a given value. Repositioning within a file need not involve
any actual I/O. This file operation is also known as a file seek.
•Deleting a file: To delete a file, we search the directory for the named file. Having found the
associated directory entry, we release all file space, so that it can be reused by other files, and erase
the directory entry.
•Truncating a file: The user may want to erase the contents of a file but keep its attributes. Rather
than forcing the user to delete the file and then recreate it, this function allows all attributes to
remain unchanged—except for file length—but lets the file be reset to length zero and its file space
released.
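The six basic operations map naturally onto system calls. The short Python sketch below uses the os module to exercise each of them on a scratch file; the file name and contents are arbitrary.

import os

fd = os.open("scratch.txt", os.O_CREAT | os.O_RDWR)   # create
os.write(fd, b"first line\n")                          # write
os.lseek(fd, 0, os.SEEK_SET)                           # reposition (file seek)
print(os.read(fd, 100))                                # read
os.ftruncate(fd, 0)                                    # truncate: keep attributes, drop contents
os.close(fd)
os.remove("scratch.txt")                               # delete: free the space and the directory entry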
Backup
A system backup is the process of backing up the operating system, files and system specific
useful/essential data. Backup is a process in which the state, files and data of a computer system
are duplicated to be used as a backup or data substitute when the primary system data is corrupted,
deleted or lost. There are mainly three types of backup: full backup, differential backup, and
incremental backup; each is outlined below.
System backup primarily ensures that not only the user data in a system is saved, but also the
system’s state or operational condition. This helps in restoring the system to the last saved state
along with all the selected backup data. Generally, the system backup is performed through backup
software and the end file (system backup) generated through this process is known as the system
snapshot/image. Moreover, in a networked/enterprise environment, the system backup
file/snapshot/image is routinely uploaded and updated on an enterprise local/remote storage server.
Evolution of backup
Backup techniques have evolved over time and become increasingly sophisticated (and perhaps
complex as a result). Considerations such as time taken for backup, time taken for restores, storage
costs, network bandwidth savings, etc. –have all, over time, driven innovations that have been
designed to make backups better – but also increase complexity as a result.
Types of Backup
Full backup: copies all of the selected data every time it runs.
Differential backup: copies everything that has changed since the last full backup.
Incremental backup: copies only what has changed since the most recent backup of any type.
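As an illustration of the incremental idea in the list above, the Python sketch below copies only the files modified since the previous backup run; the directory names and the timestamp handling are simplified assumptions, not a production backup tool.

import os, shutil, time

def incremental_backup(source_dir, backup_dir, last_backup_time):
    """Copy only files modified since the previous backup (incremental)."""
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            if os.path.getmtime(src) > last_backup_time:
                rel = os.path.relpath(src, source_dir)
                dst = os.path.join(backup_dir, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)   # copy data and timestamps
    return time.time()                   # new "last backup" timestamp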
SEARCHING
As the number of files in your folders increases, browsing through folders becomes a cumbersome
way of looking for files. However, you can find the file you need from among thousands of photos,
texts and other files by using the search function of your operating system. The search function
allows you to look for files and folders based on file properties such as the file name, save date or
size.
In Windows, you can search for files quickly by clicking the Start button at the bottom left of the
screen. Then, simply type the full or partial name of the file, program or folder. The search begins
as soon as you start typing, and the results will appear above the search field. If the file or program
you are looking for does not appear right away, wait a moment, as the search can take a while.
Also note that the search results are grouped by file type. If you are unable to find the file you are
looking for by using the quick search, you can narrow the search results by file type by clicking
the icons above the search term. You can narrow down your search results to: apps, settings,
documents, folders, photos, videos and music. Let’s say you recently downloaded a few photos
that were attached to an email message, but now you’re not sure where these files are on your
computer. If you're struggling to find a file, you can always search for it. On a Mac, Spotlight
allows you to look for any file on your computer. To do this, click the Spotlight icon in the top-right
corner of the screen, then type the file name or keywords in the search box. The search results will appear
as you type. Simply click a file or folder to open it. If you’re using the search option, try using
different terms in your search.
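Under the hood, a search of this kind boils down to walking the directory tree and testing file properties. A minimal Python sketch, with made-up criteria, is shown below.

import os

def search_files(top, name_part=None, min_size=0):
    """Yield files under `top` whose name contains `name_part`
    and whose size is at least `min_size` bytes."""
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            if name_part and name_part.lower() not in name.lower():
                continue
            if os.path.getsize(path) < min_size:
                continue
            yield path

for hit in search_files(".", name_part=".txt"):
    print(hit)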
Chapter Six
Security
6.1. Overview of system security
Many companies possess valuable information they want to guard closely. Among many things,
this information can be technical (e.g., a new chip design or software), commercial (e.g., studies
of the competition or marketing plans), financial (e.g., plans for a stock offering) or legal (e.g.,
documents about a potential merger or takeover). Many people keep their financial information,
including tax returns and credit card numbers, on their computer.
Guarding the information against unauthorized usage is therefore a major concern of all operating
systems. Unfortunately, it is also becoming increasingly difficult due to the widespread acceptance
of system bloat (and the accompanying bugs) as a normal phenomenon. In this chapter we will
examine computer security as it applies to operating systems.
Security covers the technical, administrative, legal, and political issues involved in making sure
that files are not read or modified by unauthorized persons, while protection mechanisms refer to
the specific operating system mechanisms used to safeguard information in the computer. Dealing
with security consists of avoiding or minimizing the threats that would allow an attacker to gain
access to the system.
6.2.1 Threat
A cyber security threat refers to any possible malicious attack that seeks to unlawfully access data,
disrupt digital operations or damage information. Cyber threats can originate from various actors,
including corporate spies, hacktivists, terrorist groups, hostile nation-states, criminal
organizations, lone hackers and disgruntled employees.
Common threats include malware, Emotet (a banking Trojan), denial of service, man-in-the-middle
attacks, phishing, SQL injection, and password attacks.
Confidentiality, is concerned with having secret data remain secret. More specifically, if the owner
of some data has decided that these data are to be made available only to certain people and no
others, the system should guarantee that release of the data to unauthorized people never occurs.
As an absolute minimum, the owner should be able to specify who can see what, and the system
should enforce these specifications, which ideally should be per file.
Integrity means that unauthorized users should not be able to modify any data without the owner’s
permission. Data modification in this context includes not only changing the data, but also
removing data and adding false data. If a system cannot guarantee that data deposited in it remain
unchanged until the owner decides to change them, it is not worth much for data storage.
The third property, availability, means that nobody can disturb the system to make it unusable.
Such denial-of-service attacks are increasingly common. For example, if a computer is an Internet
server, sending a flood of requests to it may cripple it by eating up all of its CPU time just
examining and discarding incoming requests. If it takes, say, 100 μsec to process an incoming
request to read a Web page, then anyone who manages to send 10,000 requests/sec can wipe it out.
Reasonable models and technology for dealing with attacks on confidentiality and integrity are
available; foiling denial-of-service attacks is much harder
6.2.2. Intruders
Intruders are attackers who attempt to breach the security of a network or system. They use a
variety of techniques against personal or organizational accounts to gain system access, and they
are one of the main security threats.
Common Categories
1. Casual prying by nontechnical users
2. Snooping by insiders
3. Determined attempt to make money
4. Commercial or military espionage
6.2.3 Accidental Data Loss
Another security threat is accidental data loss, which can be caused by various natural and
human factors.
Common Causes
1. Nature
- fires, floods, wars
2. Hardware or software errors
- CPU malfunction, bad disk, program bugs
3. Human errors
- data entry, wrong tape mounted
6.3. System protection, authentication
Protection models represent the protected objects in a system, how users or subjects (their proxies
in the computer system) may request access to them, how access decisions are made, and how the
rules governing access decisions may be altered. The access matrix model is the primary example
of a protection model.
The association between a process and a domain can be either static or dynamic. The access matrix
provides a mechanism for defining the control over this association. When we switch a process
from one domain to another, we execute a switch operation on an object (the domain). We can
control domain switching by including domains among the objects of the access matrix: a process
may switch from domain Di to domain Dj if and only if the switch right is present in the entry
access(i, j). A minimal sketch of this idea is given below.
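The Python sketch below represents an access matrix as a dictionary of dictionaries, with domains included among the objects so that the switch right controls domain switching. The domain, object, and right names are illustrative only.

# rows are domains, columns are objects, each cell is a set of rights
access = {
    "D1": {"F1": {"read"}, "D2": {"switch"}},
    "D2": {"F1": {"read", "write"}, "Printer": {"print"}},
}

def allowed(domain, obj, right):
    return right in access.get(domain, {}).get(obj, set())

def switch_domain(current, target):
    # a process may move from `current` to `target` only if the
    # entry access(current, target) contains the switch right
    if allowed(current, target, "switch"):
        return target
    raise PermissionError(f"{current} may not switch to {target}")

print(allowed("D2", "F1", "write"))   # True
print(switch_domain("D1", "D2"))      # D2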
Memory protection is a way to manage access rights to specific memory regions, and it is used by
the majority of multi-tasking operating systems. Its main goal is to prevent a process from
accessing memory that has not been allocated to it. Such restrictions improve the reliability of
programs and operating systems, because an error in one program cannot directly corrupt the
memory of other applications. It is important to distinguish the general principle of memory
protection from specific mechanisms such as ASLR and the NX bit.
6.3.3. Encryption
Encryption is a method of securing data by scrambling the bits of a computer's files so that they
become illegible. The only way to read the encrypted files is to decrypt them with a key; the key in
turn is unlocked with a password. Encryption is explained further under the basics of cryptography
in Section 6.4.
6.3.5. Authentication
The word cryptography comes from the Greek root crypt, meaning "hidden". Cryptography is the
study of techniques for keeping information secure; it is the study of secret writing.
The purpose of cryptography is to take a message or file, called the plaintext, and encrypt it into
ciphertext in such a way that only authorized people know how to convert it back to plaintext. For
all others, the ciphertext is just an incomprehensible pile of bits. Strange as it may sound to
beginners in the area, the encryption and decryption algorithms (functions) should always be
public. Trying to keep them secret almost never works and gives the people trying to keep the
secrets a false sense of security. In the trade, this tactic is called security by obscurity and is
employed only by security amateurs.
In secret-key cryptography, the sender and the receiver of the information share the same key,
which is used for both encryption and decryption. This key must be exchanged between them over
some other channel before (or after) the data is sent. The key may define a mono-alphabetic
substitution or simply a shift of the letters.
• Mono-alphabetic substitution
To make encryption clearer, consider an encryption algorithm in which each letter is
replaced by a different letter; for example, all A's are replaced by Q's, all B's by W's,
all C's by E's, and so on.
– This kind of encryption replaces each of the 26 letters with another letter, turning the
plaintext into the ciphertext, as in the sketch after this list.
Secret-key cryptography is also called symmetric-key cryptography:
– Given the encryption key, it is easy to find the decryption key.
– A single key is used to encrypt and decrypt information.
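A mono-alphabetic substitution with a shared (symmetric) key can be sketched in a few lines of Python; the shuffled alphabet plays the role of the secret key used by both sides, and the sample message is arbitrary.

import string, random

plain = string.ascii_uppercase
key = list(plain)
random.shuffle(key)                  # the shuffled alphabet is the shared secret key
key = "".join(key)

encrypt_table = str.maketrans(plain, key)
decrypt_table = str.maketrans(key, plain)

ciphertext = "ATTACK AT DAWN".translate(encrypt_table)
print(ciphertext)
print(ciphertext.translate(decrypt_table))   # back to the plaintext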
Public-Key Cryptography
• All users pick a public key/private key pair
– the public key is published
– the private key is kept secret
– the public key is the encryption key
– the private key is the decryption key
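To make the roles of the two keys concrete, here is a toy RSA-style example with tiny primes; real systems use primes hundreds of digits long, and the modular inverse via pow() requires Python 3.8 or later.

p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public (encryption) exponent
d = pow(e, -1, phi)            # private (decryption) exponent

message = 65                   # any number smaller than n
cipher = pow(message, e, n)    # encrypt with the public key
print(pow(cipher, d, n))       # decrypt with the private key -> 65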