Unit-2 Os
PROCESSES
1.3.1 Process concepts
Process : A process is a program in execution. A process is more than the program code, which is
sometimes known as the text section. It also includes the current activity, as represented by the value of
the program counter and the contents of the processor's registers. A process generally also includes the
process stack, which contains temporary data (such as function parameters, return addresses, and local
variables), and a data section, which contains global variables. A process may also include a heap, which
is memory that is dynamically allocated during process run time.
Structure of a process
We emphasize that a program by itself is not a process; a program is a passive entity, such as a file
containing a list of instructions stored on disk (often called an executable file), whereas a process is an
active entity, with a program counter specifying the next instruction to execute and a set of associated
resources. A program becomes a process when an executable file is loaded into memory.
Two common techniques for loading executable files are double-clicking an icon representing the
executable file and entering the name of the executable file on the command line (as in prog.exe or
a.out).
Process State
As a process executes, it changes state. The state of a process is defined in part by the current activity of that process. Each process may be in one of the following states:
• New. The process is being created.
• Running. Instructions are being executed.
• Waiting. The process is waiting for some event to occur (such as an I/O completion or reception of a signal).
• Ready. The process is waiting to be assigned to a processor.
• Terminated. The process has finished execution.
These names are arbitrary, and they vary across operating systems. The states that they represent are found on all systems, however. Certain operating systems also more finely delineate process states. It is important to realize that only one process can be running on any processor at any instant.
Process Control Block (PCB)
Each process is represented in the operating system by a process control block (PCB), which contains information associated with that process, including:
• Process state- The state may be new, ready, running, waiting, halted, and so on.
• Program counter- The counter indicates the address of the next instruction to be executed for this process.
• CPU registers- The registers vary in number and type, depending on the computer architecture. They include accumulators, index registers, stack pointers, and general-purpose registers, plus any condition-code information.
• Accounting information- This information includes the amount of CPU and real time used, time limits, account numbers, job or process numbers, and so on.
• I/O status information- This information includes the list of I/O devices allocated to the process, a list of open files, and so on.
The process scheduler selects an available process (possibly from a set of several available
processes) for program execution on the CPU. As processes enter the system, they are put into a job
queue, which consists of all processes in the system. The processes that are residing in main memory
and are ready and waiting to execute are kept on a list called the ready queue.
This queue is generally stored as a linked list. A ready-queue header contains pointers to the first and
final PCBs in the list. Each PCB includes a pointer field that points to the next PCB in the ready queue.
Each rectangular box represents a queue. Two types of queues are present: the ready queue and a set of
device queues. The circles represent the resources that serve the queues, and the arrows indicate the flow
of processes in the system.
A new process is initially put in the ready queue. It waits there till it is selected for execution, or
is dispatched. Once the process is allocated the CPU and is executing, one of several events could
occur:
• The process could issue an I/O request and then be placed in an I/O queue.
• The process could create a new sub process and wait for the sub process’s termination.
• The process could be removed forcibly from the CPU, as a result of an interrupt, and be put back in the
ready queue.
Schedulers
A process migrates among the various scheduling queues throughout its lifetime. The operating system
must select, for scheduling purposes, processes from these queues in some fashion.
The selection process is carried out by the appropriate scheduler. The long-term scheduler,
or job scheduler, selects processes from this pool and loads them into memory for execution. The
short-term scheduler, or CPU scheduler, selects from among the processes that are ready
to execute and allocates the CPU to one of them.
Process Creation
Most operating systems identify processes according to a unique process identifier (or pid), which is typically an integer number. In Solaris, for example, the sched process (with pid 0) creates several system processes such as pageout and fsflush; these processes are responsible for managing memory and file systems. The sched process also creates the init process, which serves as the root parent process for all user processes.
When a process creates a new process, two possibilities exist in terms of execution:
1. The parent continues to execute concurrently with its children.
2. The parent waits until some or all of its children have terminated.
There are also two possibilities in terms of the address space of the new process:
1. The child process is a duplicate of the parent process (it has the same program and data as the parent).
2. The child process has a new program loaded into it.
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main()
{
    pid_t pid;
    /* fork a child process */
    pid = fork();
    if (pid < 0) { /* error occurred */
        fprintf(stderr, "Fork Failed");
        exit(-1);
    }
    else if (pid == 0) { /* child process */
        execlp("/bin/ls", "ls", NULL);
    }
    else { /* parent process */
        /* parent will wait for the child to complete */
        wait(NULL);
        printf("Child Complete");
        exit(0);
    }
}
In UNIX, as we've seen, each process is identified by its process identifier, which is a unique integer. A
new process is created by the fork() system call. The new process consists of a copy of the address space
of the original process.
This mechanism allows the parent process to communicate easily with its child process. Both processes
(the parent and the child) continue execution at the instruction after the fork(), with one difference: The
return code for the fork() is zero for the new (child) process, whereas the (nonzero) process identifier of
the child is returned to the parent. The exec() system call is used after a fork() system call by one of the
two processes to replace the process's memory space with a new program.
The exec() system call loads a binary file into memory (destroying the memory image of the program
containing the exec() system call) and starts its execution.
Process Termination
A process terminates when it finishes executing its final statement and asks the operating system to delete
it by using the exit () system call. At that point, the process may return a status value (typically an integer)
to its parent process (via the wait() system call). All the resources of the process—including physical and
virtual memory, open files, and I/O buffers—are deallocated by the operating system.
Termination can occur in other circumstances as well. A process can cause the termination of
another process via an appropriate system call (for example, TerminateProcess() in Win32). Usually, such
a system call can be invoked only by the parent of the process that is to be terminated.
A parent may terminate the execution of one of its children for a variety of reasons, such as these:
• The child has exceeded its usage of some of the resources that it has been allocated.
• The task assigned to the child is no longer required.
• The parent is exiting, and the operating system does not allow a child to
continue if its parent terminates.
Consider that, in UNIX, we can terminate a process by using the exit() system call; its parent
process may wait for the termination of a child process by using the wait() system call. The wait () system
call returns the process identifier of a terminated child so that the parent can tell which of its possibly
many children has terminated.
If the parent terminates, however, all its children have assigned as their new parent the init
process.
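The following minimal sketch (assuming a standard POSIX system; it is not part of the original text) shows a parent using wait() to obtain both the pid of the terminated child and the status value the child passed to exit():

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();                  /* create a child process */

    if (pid < 0) {                       /* fork failed */
        perror("fork");
        exit(1);
    }
    else if (pid == 0) {                 /* child: terminate with status 7 */
        exit(7);
    }
    else {                               /* parent: wait and inspect the status */
        int status;
        pid_t done = wait(&status);      /* returns pid of the terminated child */
        if (WIFEXITED(status))
            printf("Child %d exited with status %d\n",
                   (int)done, WEXITSTATUS(status));
    }
    return 0;
}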
Interprocess Communication
Processes executing concurrently in the operating system may be either independent processes or cooperating processes. A process is independent if it cannot affect or be affected by the other processes executing in the system. Any process that does not share data with any other process is independent. A process is cooperating if it can affect or be affected by the other processes executing in the system.
There are several reasons for providing an environment that allows process cooperation:
• Information sharing. Since several users may be interested in the same piece of information (for
instance, a shared file), we must provide an environment to allow concurrent access to such information.
• Computation speedup. If we want a particular task to run faster, we must break it into subtasks, each of which will be executing in parallel with the others. Notice that such a speedup can be achieved only if the computer has multiple processing elements (such as CPUs or I/O channels).
• Modularity. We may want to construct the system in a modular fashion, dividing the system functions into separate processes or threads.
• Convenience. Even an individual user may work on many tasks at the same time. For instance, a user may be editing, printing, and compiling in parallel.
Cooperating processes require an interprocess communication (IPC) mechanism that will
allow them to exchange data and information. There are two fundamental models of interprocess
communication:
(1) shared memory and (2) message passing. In the shared-memory model, a region of
memory
that is shared by cooperating processes is established. Processes can then exchange information by
reading and writing data to the shared region. In the message passing model, communication takes place
by means of messages exchanged between the cooperating processes.
Message passing is useful for exchanging smaller amounts of data, because no conflicts need be
avoided. Message passing is also easier to implement than is shared memory for intercomputer
communication. Shared memory allows maximum speed and convenience of communication, as it can be
done at memory speeds when within a computer.
Shared memory is faster than message passing, as message-passing systems are typically
implemented using system calls and thus require the more time consuming task of kernel intervention.
The form of the data and the location are determined by these processes and are not under the operating
system's control. The processes are also responsible for ensuring that they are not writing to the same
location simultaneously.
Message-Passing Systems
The shared-memory scheme requires that the communicating processes share a region of memory and that the code for accessing and manipulating the shared memory be written explicitly by the application programmer. Another way to achieve the same effect is for the operating system to provide the means for cooperating processes to communicate with each other via a message-passing facility.
Message passing provides a mechanism to allow processes to communicate and to synchronize their
actions without sharing the same address space and is particularly useful in a distributed environment,
where the communicating processes may reside on different computers connected by a network.
A message-passing facility provides at least two operations: send(message) and receive(message).
Messages sent by a process can be of either fixed or variable size. If only fixed-sized messages can be
sent, the system-level implementation is straightforward. This restriction, however, makes the task of
programming more difficult. Conversely, variable-sized messages require a more complex system-level
implementation, but the programming task becomes simpler. This is a common kind of tradeoff seen
throughout operating system design.
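As an illustration of the send(message)/receive(message) primitives, here is a minimal sketch using POSIX message queues (mq_send()/mq_receive()). The queue name /demo_q and the 64-byte message size are arbitrary choices for this example, and on Linux the program is linked with -lrt:

/* Build with: gcc mq_demo.c -lrt */
#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 64 };
    mqd_t q = mq_open("/demo_q", O_CREAT | O_RDWR, 0600, &attr);
    if (q == (mqd_t)-1) { perror("mq_open"); exit(1); }

    if (fork() == 0) {                     /* child: send(message) */
        const char *msg = "hello from child";
        mq_send(q, msg, strlen(msg) + 1, 0);
        exit(0);
    }

    char buf[64];                          /* parent: receive(message) */
    if (mq_receive(q, buf, sizeof(buf), NULL) >= 0)
        printf("parent received: %s\n", buf);

    wait(NULL);
    mq_close(q);
    mq_unlink("/demo_q");
    return 0;
}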
Naming
Processes that want to communicate must have a way to refer to each other. They can use either direct or
indirect communication.
Under direct communication, each process that wants to communicate must explicitly name the recipient or sender of the communication. In this scheme, the send() and receive() primitives are defined as:
• send(P, message)—Send a message to process P.
• receive(Q, message)—Receive a message from process Q.
A communication link in this scheme has the following properties:
• A link is established automatically between every pair of processes that want to communicate. The
processes need to know only each other's identity to communicate.
• A link is associated with exactly two processes.
• Between each pair of processes, there exists exactly one link.
The disadvantage in both of these schemes (symmetric and asymmetric) is the limited modularity of the
resulting process definitions. Changing the identifier of a process may necessitate examining all other
process definitions.
Synchronization
Communication between processes takes place through calls to send() and receive () primitives. There are
different design options for implementing each primitive. Message passing may be either blocking or
nonblocking— also known as synchronous and asynchronous.
• Blocking send- The sending process is blocked until the message is received by the receiving
process or by the mailbox.
• Nonblocking send- The sending process sends the message and resumes operation.
• Blocking receive- The receiver blocks until a message is available.
• Nonblocking receive- The receiver retrieves either a valid message or a null.
Buffering
Messages exchanged by communicating processes reside in a temporary queue. Such a queue can be implemented in three ways:
• Zero capacity- The queue has a maximum length of zero; thus, the link cannot have any messages waiting in it. In this case, the sender must block until the recipient receives the message.
• Bounded capacity- The queue has finite length n; thus, at most n messages can reside in it. If the queue is not full when a new message is sent, the message is placed in the queue (either the message is copied or a pointer to the message is kept), and the sender can continue execution without waiting. The link's capacity is finite, however. If the link is full, the sender must block until space is available in the queue.
• Unbounded capacity- The queue's length is potentially infinite; thus, any number of messages can wait in it. The sender never blocks.
Processes that wish to access a shared-memory segment must attach it to their address space
using the shmat () (SHared Memory ATtach) system call.
The call to shmat() expects three parameters. The first is the integer identifier of the shared-memory segment being attached, and the second is a pointer to a location in memory indicating where the shared memory will be attached. If we pass a value of NULL, the operating system selects the location on the user's behalf. The third parameter is a mode flag: it can request that the shared-memory region be attached in read-only mode, and by passing a parameter of 0 we allow both reads and writes to the shared region.
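A minimal sketch of the System V shared-memory calls described above (shmget(), shmat(), shmdt(), shmctl()); the segment size of 4096 bytes and the string written into it are arbitrary choices for the example:

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    const size_t size = 4096;                 /* one page of shared memory */

    /* create a new shared-memory segment with read/write permission */
    int shmid = shmget(IPC_PRIVATE, size, 0600);

    /* attach it; NULL lets the OS pick the address, 0 means read-write */
    char *shared = (char *)shmat(shmid, NULL, 0);

    strcpy(shared, "written through the shared segment");
    printf("%s\n", shared);

    shmdt(shared);                            /* detach the segment */
    shmctl(shmid, IPC_RMID, NULL);            /* remove the segment */
    return 0;
}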
An Example: Windows XP
The Windows XP operating system is an example of modern design that employs modularity to increase
functionality and decrease the time needed to implement new features. Windows XP provides support for
multiple operating environments, or subsystems, with which application programs communicate via a
message-passing mechanism. The application programs can be considered clients of the Windows XP
subsystem server.
The message-passing facility in Windows XP is called the local procedure call (LPC) facility.
The LPC in Windows XP communicates between two processes on the same machine. It is similar to the
standard RPC mechanism that is widely used, but it is optimized for and specific to Windows XP.
Windows XP uses a port object to establish and maintain a connection between two processes. Every
client that calls a subsystem needs a communication channel, which is provided by a port object and is
never inherited.
Windows XP uses two types of ports: connection ports and communication ports. They are really the same but are given different names according to how they are used. Connection ports are named objects and are visible to all processes.
When an LPC channel is created, one of three message-passing techniques must be specified. The simplest, which is used for small messages, uses the port's message queue as intermediate storage and copies the message from one process to the other. Under this method, messages of up to 256 bytes can be sent.
sent. If a client needs to send a larger message, it passes the message through a section object, which sets
up a region of shared memory. The client has to decide when it sets up the channel whether or not it will
need to send a large message. If the client determines that it does want to send large messages, it asks for
a section object to be created. Similarly, if the server decides that replies will be large, it creates a section
object. So that the section object can be used, a small message is sent that contains a pointer and size
information about the section object. This method is more complicated than the first method, but it avoids
data copying. In both cases, a callback mechanism can be used when either the client or the server cannot
respond immediately to a request.
Scheduling Algorithms
There are various algorithms which are used by the Operating System to schedule the processes on the
processor in an efficient way.
FCFS Scheduling
First come first serve (FCFS) scheduling algorithm simply schedules the jobs according to their arrival time. The job
which comes first in the ready queue will get the CPU first. The lesser the arrival time of the job, the sooner will the job
get the CPU. FCFS scheduling may cause the problem of starvation if the burst time of the first process is the longest
among all the jobs.
Advantages of FCFS
o Simple
o Easy
Disadvantages of FCFS
1. The scheduling method is non-preemptive, so a process will run to completion.
2. Due to the non-preemptive nature of the algorithm, the problem of starvation may occur.
3. Although it is easy to implement, it is poor in performance, since the average waiting time is higher as compared to other scheduling algorithms.
Example
Let's take an example of the FCFS scheduling algorithm. In the following schedule, there are 5 processes with process IDs P0, P1, P2, P3 and P4. P0 arrives at time 0, P1 at time 1, P2 at time 2, P3 at time 3 and P4 at time 4 in the ready queue. The processes and their respective arrival and burst times are given in the following table.
The Turnaround time and the waiting time are calculated by using the following formula.
1. Turn Around Time = Completion Time - Arrival Time
2. Waiting Time = Turnaround time - Burst Time
The average waiting time is determined by summing the respective waiting times of all the processes and dividing the sum by the total number of processes.

Process ID | Arrival Time | Burst Time | Completion Time | Turnaround Time | Waiting Time
P0 | 0 | 2 | 2 | 2 | 0
P1 | 1 | 6 | 8 | 7 | 1
P2 | 2 | 4 | 12 | 10 | 6
P3 | 3 | 9 | 21 | 18 | 9
P4 | 4 | 12 | 33 | 29 | 17

Avg Waiting Time = (0 + 1 + 6 + 9 + 17) / 5 = 33 / 5 = 6.6 units
Gantt chart: | P0 (0-2) | P1 (2-8) | P2 (8-12) | P3 (12-21) | P4 (21-33) |
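The same calculation can be sketched in C. The arrival and burst times below are taken from the example table; everything else (processes already sorted by arrival time, no I/O) is an illustrative assumption:

#include <stdio.h>

/* Minimal FCFS sketch: processes are assumed to be listed in arrival order. */
int main(void)
{
    int arrival[] = {0, 1, 2, 3, 4};      /* arrival times from the example   */
    int burst[]   = {2, 6, 4, 9, 12};     /* burst times from the example     */
    int n = 5, time = 0, total_wait = 0;

    for (int i = 0; i < n; i++) {
        if (time < arrival[i])            /* CPU idle until the process arrives */
            time = arrival[i];
        time += burst[i];                 /* completion time of process i       */
        int tat  = time - arrival[i];     /* turnaround = completion - arrival  */
        int wait = tat - burst[i];        /* waiting = turnaround - burst       */
        total_wait += wait;
        printf("P%d: completion=%d turnaround=%d waiting=%d\n", i, time, tat, wait);
    }
    printf("average waiting time = %.2f\n", (double)total_wait / n);
    return 0;
}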
SJF Scheduling
Shortest Job First (SJF) scheduling selects, from the ready queue, the process with the smallest burst time to execute next.
Advantages of SJF
1. Maximum throughput
Disadvantages of SJF
1. May suffer with the problem of starvation
2. It is not implementable because the exact Burst time for a process can't be known in advance.
There are different techniques available by which, the CPU burst time of the process can be determined. We will discuss
them later in detail.
Example
In the following example, there are five jobs named as P1, P2, P3, P4 and P5. Their arrival time and burst time are given
in the table below.
PID | Arrival Time | Burst Time | Completion Time | Turnaround Time | Waiting Time
P1 | 1 | 7 | 8 | 7 | 0
P2 | 3 | 3 | 13 | 10 | 7
P3 | 6 | 2 | 10 | 4 | 2
P4 | 7 | 10 | 31 | 24 | 14
P5 | 9 | 8 | 21 | 12 | 4
Since no process arrives at time 0, there will be an empty slot in the Gantt chart from time 0 to 1 (the time at which the first process arrives).
According to the algorithm, the OS schedules the process having the lowest burst time among the available processes in the ready queue.
Up to this point we have only one process, P1, in the ready queue, so the scheduler will schedule it on the processor regardless of its burst time.
It will be executed until 8 units of time. By then, three more processes have arrived in the ready queue, so the scheduler will choose the process with the lowest burst time.
Among the processes given in the table, P3 will be executed next, since it has the lowest burst time among all the available processes.
So that's how the procedure will go on in shortest job first (SJF) scheduling algorithm.
Avg Waiting Time = (0 + 7 + 2 + 14 + 4) / 5 = 27/5 = 5.4 units
Example
In this example, there are six jobs P1, P2, P3, P4, P5 and P6, scheduled with the preemptive version of SJF (shortest remaining time first). Their arrival time and burst time are given below in the table.

Process ID | Arrival Time | Burst Time | Completion Time | Turnaround Time | Waiting Time | Response Time
P1 | 0 | 8 | 20 | 20 | 12 | 0
P2 | 1 | 4 | 10 | 9 | 5 | 1
P3 | 2 | 2 | 4 | 2 | 0 | 2
P4 | 3 | 1 | 5 | 2 | 1 | 4
P5 | 4 | 3 | 13 | 9 | 6 | 10
P6 | 5 | 2 | 7 | 2 | 0 | 5
Round Robin (RR) Scheduling
In Round Robin scheduling, each process is given the CPU for a fixed time quantum; if it does not finish within the quantum, it is preempted and added to the end of the ready queue.
Advantages
1. It can actually be implemented in the system because it does not depend on the burst time.
Disadvantages
1. The higher the time quantum, the higher the response time in the system.
2. The lower the time quantum, the higher the context switching overhead in the system.
3. Deciding a perfect time quantum is really a very difficult task in the system.
RR Scheduling Example
In the following example, there are six processes named P1, P2, P3, P4, P5 and P6. Their arrival time and burst time are given below in the table. The time quantum of the system is 4 units.

Process ID | Arrival Time | Burst Time
P1 | 0 | 5
P2 | 1 | 6
P3 | 2 | 3
P4 | 3 | 1
P5 | 4 | 5
P6 | 6 | 4
According to the algorithm, we have to maintain the ready queue and the Gantt chart. The structure of both the data
structures will be changed after every scheduling.
Ready Queue:
Initially, at time 0, process P1 arrives which will be scheduled for the time slice 4 units. Hence in the ready queue, there
will be only one process P1 at starting with CPU burst time 5 units.
P1
GANTT chart
The P1 will be executed for 4 units first.
Ready Queue
During the execution of P1, four more processes P2, P3, P4 and P5 arrive in the ready queue. P1 has not completed yet; it needs another 1 unit of time, hence it will also be added back to the ready queue.
P2 P3 P4 P5 P1
6 3 1 5 1
GANTT chart
After P1, P2 will be executed for 4 units of time which is shown in the Gantt chart.
Ready Queue
P3 P4 P5 P1 P6 P2
3 1 5 1 4 2
During the execution of P2, one more process, P6, arrives in the ready queue. Since P2 has not completed yet, it will also be added back to the ready queue with the remaining burst time of 2 units.
GANTT chart
After P1 and P2, P3 will get executed for 3 units of time, since its CPU burst time is only 3 units.
Ready Queue
Since P3 has completed, it will be terminated and not added back to the ready queue. The next process to be executed is P4.
P4 P5 P1 P6 P2
1 5 1 4 2
GANTT chart
After P1, P2 and P3, P4 will get executed. Its burst time is only 1 unit, which is less than the time quantum, hence it will be completed.
Ready Queue
The next process in the ready queue is P5, with 5 units of burst time. Since P4 has completed, it will not be added back to the queue.
P5 P1 P6 P2
5 1 4 2
GANTT chart
P5 will be executed for the whole time slice because it requires 5 units of burst time which is higher than the time slice.
Ready Queue
P1 P6 P2 P5
1 4 2 1
P5 has not been completed yet; it will be added back to the queue with the remaining burst time of 1 unit.
GANTT Chart
The process P1 will be given the next turn to complete its execution. Since it only requires 1 unit of burst time hence it
will be completed.
Ready Queue
P1 is completed and will not be added back to the ready queue. The next process P6 requires only 4 units of burst time
and it will be executed next.
P6 P2 P5
4 2 1
GANTT chart
P6 will be executed for 4 units of time till completion.
Ready Queue
P2 P5
2 1
Since P6 is completed, hence it will not be added again to the queue. There are only two processes present in the ready
queue. The Next process P2 requires only 2 units of time.
GANTT Chart
P2 will get executed again; since it requires only 2 units of time, it will be completed.
Ready Queue
P5
Now, the only available process in the queue is P5, which requires 1 unit of burst time. Since the time slice is 4 units, it will be completed in the next burst.
GANTT chart
P5 will get executed till completion.
The completion time, Turnaround time and waiting time will be calculated as shown in the table below.
As we know,
1. Turnaround Time = Completion Time - Arrival Time
2. Waiting Time = Turnaround Time - Burst Time

Process ID | Arrival Time | Burst Time | Completion Time | Turnaround Time | Waiting Time
P1 | 0 | 5 | 17 | 17 | 12
P2 | 1 | 6 | 23 | 22 | 16
P3 | 2 | 3 | 11 | 9 | 6
P4 | 3 | 1 | 12 | 9 | 8
P5 | 4 | 5 | 24 | 20 | 15
P6 | 6 | 4 | 21 | 15 | 11

Avg Waiting Time = (12 + 16 + 6 + 8 + 15 + 11) / 6 = 68 / 6 units
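The RR schedule above can be reproduced with a short simulation. The following sketch assumes the CPU is never idle (true for this example) and that processes arriving during a time slice enter the ready queue before the preempted process is re-queued, which is the convention used in the walkthrough:

#include <stdio.h>

#define N 6
#define QUANTUM 4

int main(void)
{
    int arrival[N] = {0, 1, 2, 3, 4, 6};   /* from the example table */
    int burst[N]   = {5, 6, 3, 1, 5, 4};
    int remaining[N], completion[N], in_queue[N] = {0};
    int queue[64], head = 0, tail = 0;     /* simple FIFO ready queue */
    int time = 0, done = 0;

    for (int i = 0; i < N; i++) remaining[i] = burst[i];
    queue[tail++] = 0; in_queue[0] = 1;    /* P1 arrives at time 0 */

    while (done < N) {                     /* assumes the queue never empties early */
        int p = queue[head++];
        int slice = remaining[p] < QUANTUM ? remaining[p] : QUANTUM;
        time += slice;
        remaining[p] -= slice;

        for (int i = 0; i < N; i++)        /* admit everything that has arrived */
            if (!in_queue[i] && arrival[i] <= time) {
                queue[tail++] = i; in_queue[i] = 1;
            }

        if (remaining[p] > 0)
            queue[tail++] = p;             /* preempted: back of the queue */
        else {
            completion[p] = time;          /* finished within this slice   */
            done++;
        }
    }

    for (int i = 0; i < N; i++)
        printf("P%d: completion=%d turnaround=%d waiting=%d\n", i + 1,
               completion[i], completion[i] - arrival[i],
               completion[i] - arrival[i] - burst[i]);
    return 0;
}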
Priority Scheduling
In priority scheduling, a priority number is assigned to each process, and the CPU is allocated to the process with the highest priority among the available processes. In the following examples, a lower priority number indicates a higher priority.

Example (Non-Preemptive Priority Scheduling)
Process ID | Priority | Arrival Time | Burst Time
P1 | 2 | 0 | 3
P2 | 6 | 2 | 5
P3 | 3 | 1 | 4
P4 | 5 | 4 | 2
P5 | 7 | 6 | 9
P6 | 4 | 5 | 4
P7 | 10 | 7 | 10

We can prepare the Gantt chart according to non-preemptive priority scheduling.
The process P1 arrives at time 0 with a burst time of 3 units and priority number 2. Since no other process has arrived yet, the OS will schedule it immediately.
During the execution of P1, two more processes, P2 and P3, arrive. Since P3 has a lower priority number (3) than P2 (6), the CPU will execute P3 before P2.
During the execution of P3, all the remaining processes become available in the ready queue. The process with the lowest priority number will be given the CPU next. Since P6 has priority number 4, it will be executed just after P3.
Process Id | Priority | Arrival Time | Burst Time | Completion Time | Turnaround Time | Waiting Time | Response Time
P1 | 2 | 0 | 3 | 3 | 3 | 0 | 0
P2 | 6 | 2 | 5 | 18 | 16 | 11 | 13
P3 | 3 | 1 | 4 | 7 | 6 | 2 | 3
P4 | 5 | 4 | 2 | 13 | 9 | 7 | 11
P5 | 7 | 6 | 9 | 27 | 21 | 12 | 18
P6 | 4 | 5 | 4 | 11 | 6 | 2 | 7
P7 | 10 | 7 | 10 | 37 | 30 | 18 | 27

After P6, P4 has the least priority number among the available processes; it will get executed for the whole burst time.
Since all the jobs are available in the ready queue, all the jobs will get executed according to their priorities. If two jobs have the same priority number assigned to them, the one with the earlier arrival time will be executed first.
Example (Preemptive Priority Scheduling)
There are 7 processes P1, P2, P3, P4, P5, P6 and P7. Their respective priorities, arrival times and burst times are given in the table below (again, a lower priority number means a higher priority).

Process Id | Priority | Arrival Time | Burst Time
P1 | 2 | 0 | 1
P2 | 6 | 1 | 7
P3 | 3 | 2 | 3
P4 | 5 | 3 | 6
P5 | 4 | 4 | 5
P6 | 10 | 5 | 15
P7 | 9 | 6 | 8
GANTT chart preparation: At time 0, P1 arrives with a burst time of 1 unit and priority 2. Since no other process is available, it will be scheduled until the next job arrives or until its completion (whichever is earlier).
At time 1, P2 arrives. P1 has completed its execution and no other process is available at this time, hence the operating system has to schedule P2 regardless of the priority assigned to it.
The next process, P3, arrives at time unit 2; the priority of P3 is higher than that of P2. Hence the execution of P2 will be stopped and P3 will be scheduled on the CPU.
During the execution of P3, three more processes, P4, P5 and P6, become available. Since all three have a priority lower than that of the process in execution, none of them can preempt it. P3 will complete its execution and then P5 will be scheduled, since it has the highest priority among the available processes.
During the execution of P5, all the processes become available in the ready queue. At this point, the algorithm starts behaving as non-preemptive priority scheduling: once all the processes are available in the ready queue, the OS simply takes the process with the highest priority and executes it till completion. In this case, P4 will be scheduled and executed till completion.
Since P4 is completed, the other process with the highest priority available in the ready queue is P2. Hence P2 will be
scheduled next.
P2 is given the CPU till completion; since its remaining burst time is 6 units, it runs until time 22, and P7 will be scheduled after it.
The only remaining process is P6, with the lowest priority; the operating system has no choice but to execute it. It will be executed last.
The Completion Time of each process is determined with the help of GANTT chart. The turnaround time and the waiting
time can be calculated by the following formula.
Process Id | Priority | Arrival Time | Burst Time | Completion Time | Turnaround Time | Waiting Time
P1 | 2 | 0 | 1 | 1 | 1 | 0
P2 | 6 | 1 | 7 | 22 | 21 | 14
P3 | 3 | 2 | 3 | 5 | 3 | 0
P4 | 5 | 3 | 6 | 16 | 13 | 7
P5 | 4 | 4 | 5 | 10 | 6 | 1
P6 | 10 | 5 | 15 | 45 | 40 | 25
P7 | 9 | 6 | 8 | 30 | 24 | 16

Avg Waiting Time = (0 + 14 + 0 + 7 + 1 + 25 + 16) / 7 = 63 / 7 = 9 units
Enhanced throughput of the system: If a process is divided into multiple threads, and each thread is treated as one job, then the number of jobs completed per unit of time increases, thus increasing the throughput of the system.
Types of Threads
There are two types of threads.
User Level Thread
Kernel Level Thread
Threads and its types in Operating System
A thread is a single sequence stream within a process. Because threads have some of the properties of processes, they are sometimes called lightweight processes. Threads are executed one after another but give the illusion that they are executing in parallel. Each thread has different states. Each thread has
1. A program counter
2. A register set
3. A stack space
Threads are not independent of each other, as they share the code, data, OS resources, etc.
Similarities between Threads and Processes –
Only one thread or process is active at a time
Within a process, both execute sequentially
Both can create children
Differences between Threads and Processes –
Threads are not independent, processes are.
Threads are designed to assist each other, processes may or may not do it
Types of Threads:
A thread is a path that is followed during a program's execution. The majority of programs written nowadays run as a single thread. For example, a program that cannot read keystrokes while making drawings cannot perform both tasks at the same time. This problem can be solved through multitasking, so that two or more tasks can be executed simultaneously.
Multitasking is of two types: process-based and thread-based. Process-based multitasking is totally managed by the OS, whereas multitasking through multithreading can be controlled by the programmer to some extent.
The concept of multithreading needs a proper understanding of two terms – a process and a thread. A process is a program being executed. A process can be further divided into independent units known as threads.
A thread is like a small lightweight process within a process. Or we can say a collection of threads is what is known as a process.
Applications –
Threading is used widely in almost every field. It is most widely seen over the internet nowadays, where transaction processing of every type – recharges, online transfers, banking and so on – is handled concurrently. Threading divides a program into small parts that are lightweight and place less burden on the CPU and memory, so work can be carried out easily and goals can be achieved in the desired field. The concept of threading was developed to keep up with fast and regular changes in technology and to enhance the capability of programming.
Many-to-One Model
In this model, multiple user threads are mapped to one kernel thread. When a user thread makes a blocking system call, the entire process blocks. As there is only one kernel thread and only one user thread can access the kernel at a time, multiple threads cannot run in parallel on a multiprocessor.
Threading Issues in OS
There are several threading issues when we are in a multithreading environment. In this
section, we will discuss the threading issues with system calls, cancellation of thread, signal
handling, thread pool and thread-specific data.
Along with the threading issues, we will also discuss how these issues can be dealt with or resolved to retain the benefits of the multithreaded programming environment.
Threading Issues in OS
1. System Calls
2. Thread Cancellation
3. Signal Handling
4. Thread Pool
5. Thread Specific Data
1. The fork() and exec() System Calls
The fork() and exec() are the system calls. The fork() call creates a duplicate process of the
process that invokes fork(). The new duplicate process is called child process and process
invoking the fork() is called the parent process. Both the parent process and the child process
continue their execution from the instruction that is just after the fork().
Let us now discuss the issue with the fork() system call. Consider that a thread of the
multithreaded program has invoked the fork(). So, the fork() would create a new duplicate
process. Here the issue is whether the new duplicate process created by fork() will duplicate
all the threads of the parent process or the duplicate process would be single-threaded.
Well, there are two versions of fork() in some of the UNIX systems. Either the fork() can
duplicate all the threads of the parent process in the child process or the fork() would only
duplicate that thread from parent process that has invoked it.
Which version of fork() must be used totally depends upon the application.
The next system call, exec(), when invoked replaces the program, along with all its threads, with the program specified in the parameter to exec(). Typically the exec() system call is lined up after the fork() system call.
Here the issue is: if the exec() system call is lined up just after the fork() system call, then duplicating all the threads of the parent process in the child process by fork() is useless, as exec() will replace the entire process with the program provided to it in the parameter.
In such a case, the version of fork() that duplicates only the thread that invoked fork() would be appropriate.
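The following sketch (POSIX/pthreads, not from the original text) shows the common fork()-then-exec() pattern inside a multithreaded program; on POSIX systems the child contains only a copy of the thread that called fork():

/* Build with: gcc fork_exec_thread.c -pthread */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

void *worker(void *arg)
{
    (void)arg;
    sleep(1);                     /* pretend to do background work */
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);

    pid_t pid = fork();           /* the child gets only the calling thread */
    if (pid == 0) {
        /* exec() replaces the child's entire address space, so any
           duplicated threads would have been discarded anyway */
        execlp("/bin/ls", "ls", (char *)NULL);
        perror("execlp");         /* reached only if exec fails */
        exit(1);
    }

    wait(NULL);                   /* parent waits for the child */
    pthread_join(tid, NULL);      /* and then for its worker thread */
    return 0;
}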
2. Thread Cancellation
Termination of a thread in the middle of its execution is termed 'thread cancellation'.
Let us understand this with the help of an example. Consider a multithreaded program that has let its multiple threads search through a database for some information. If one of the threads returns with the desired result, the remaining threads will be cancelled.
The thread that we want to cancel is termed the target thread. Thread cancellation can be performed in two ways:
Asynchronous cancellation: one thread immediately terminates the target thread.
Deferred cancellation: the target thread periodically checks a flag to determine whether it should terminate.
Two problems can arise:
What if resources had been allotted to the cancelled target thread?
What if the target thread is terminated while it was updating data it shares with some other thread?
Asynchronous cancellation, where a thread immediately cancels the target thread without checking whether it is holding any resources, is troublesome in these situations. In deferred cancellation, by contrast, one thread indicates to the target thread that it should be cancelled, and the target thread crosschecks its cancellation flag to confirm whether it should be cancelled immediately or not. The points at which a thread can be cancelled safely are termed cancellation points by Pthreads.
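A minimal Pthreads sketch of deferred cancellation: pthread_cancel() requests cancellation of the target thread, which is actually cancelled at a cancellation point such as sleep(). The two-second delay and the "database search" loop are illustrative assumptions:

/* Build with: gcc cancel_demo.c -pthread */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *searcher(void *arg)
{
    (void)arg;
    int old;
    /* Deferred cancellation is the Pthreads default: the thread is
       cancelled only at cancellation points such as sleep(). */
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old);
    while (1) {
        /* ... search part of the database ... */
        sleep(1);                      /* cancellation point */
    }
    return NULL;
}

int main(void)
{
    pthread_t target;
    pthread_create(&target, NULL, searcher, NULL);

    sleep(2);                          /* suppose another thread found the result */
    pthread_cancel(target);            /* request cancellation of the target thread */
    pthread_join(target, NULL);        /* wait until it actually terminates */
    printf("target thread cancelled\n");
    return 0;
}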
3. Signal Handling
Signal handling is more convenient in a single-threaded program, as the signal is delivered directly to the process. But in a multithreaded program, the issue arises as to which thread of the program the signal should be delivered.
Synchronous signals are forwarded to the same process that caused the generation of the signal. Asynchronous signals are generated by an event external to the running process; thus the running process receives them asynchronously.
So if the signal is synchronous, it is delivered to the specific thread that caused the generation of the signal. If the signal is asynchronous, it cannot in general be specified to which thread of the multithreaded program it should be delivered. If an asynchronous signal is notifying the process to terminate, the signal is delivered to all the threads of the process.
UNIX allows a thread to specify which signals it will accept and which it will block, whereas Windows delivers an asynchronous signal to a specific thread as an asynchronous procedure call (APC).
4. Thread Pool
When a user requests a webpage from the server, the server creates a separate thread to service the request. However, this approach has some potential issues. If we do not have a bound on the number of active threads in the system and create a new thread for every new request, it would finally result in exhaustion of system resources.
We are also concerned about the time it takes to create a new thread. It must not be the case that the time required to create a new thread is more than the time required by the thread to service the request and then be discarded, as this would result in wastage of CPU time.
The solution to this issue is the thread pool. The idea is to create a finite amount of threads
when the process starts. This collection of threads is referred to as the thread pool. The
threads stay in the thread pool and wait till they are assigned any request to be serviced.
Whenever the request arrives at the server, it invokes a thread from the pool and assigns it the
request to be serviced. The thread completes its service and return back to the pool and wait
for the next request.
If the server receives a request and does not find any thread in the thread pool, it waits for some thread to become free and return to the pool. This is much better than creating a new thread each time a request arrives, and it is convenient for systems that cannot handle a large number of concurrent threads.
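A compact, illustrative thread-pool sketch with Pthreads: POOL_SIZE worker threads wait on a condition variable and take request numbers from a small queue. The pool size, queue length and simulated service time are arbitrary assumptions, and the queue is assumed never to overflow:

/* Build with: gcc pool_demo.c -pthread */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define POOL_SIZE 3
#define MAX_QUEUE 16

static int queue[MAX_QUEUE], head = 0, tail = 0, count = 0, shutting_down = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

static void *worker(void *arg)
{
    long id = (long)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0 && !shutting_down)
            pthread_cond_wait(&not_empty, &lock);    /* wait for a request */
        if (count == 0 && shutting_down) {
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        int request = queue[head];
        head = (head + 1) % MAX_QUEUE;
        count--;
        pthread_mutex_unlock(&lock);

        printf("worker %ld servicing request %d\n", id, request);
        usleep(100000);                              /* simulate the service */
    }
}

static void submit(int request)
{
    pthread_mutex_lock(&lock);
    queue[tail] = request;                           /* assumes the queue is not full */
    tail = (tail + 1) % MAX_QUEUE;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

int main(void)
{
    pthread_t pool[POOL_SIZE];
    for (long i = 0; i < POOL_SIZE; i++)
        pthread_create(&pool[i], NULL, worker, (void *)i);

    for (int r = 1; r <= 8; r++)                     /* eight incoming requests */
        submit(r);

    sleep(1);                                        /* let the pool drain */
    pthread_mutex_lock(&lock);
    shutting_down = 1;
    pthread_cond_broadcast(&not_empty);
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < POOL_SIZE; i++)
        pthread_join(pool[i], NULL);
    return 0;
}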
5. Thread-Specific Data
Consider a transaction-processing system, in which we can process each transaction in a different thread. To determine each transaction uniquely, we associate a unique identifier with it, which helps the system identify each transaction uniquely.
As we are servicing each transaction in a separate thread, we can use thread-specific data to associate each thread with a specific transaction and its unique id. Thread libraries such as Win32, Pthreads and Java support thread-specific data.
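A minimal Pthreads sketch of thread-specific data using pthread_key_create()/pthread_setspecific(); the transaction ids 101 and 102 are hypothetical values for the example:

/* Build with: gcc tsd_demo.c -pthread */
#include <pthread.h>
#include <stdio.h>

static pthread_key_t txn_key;

static void *service_transaction(void *arg)
{
    int txn_id = *(int *)arg;
    pthread_setspecific(txn_key, &txn_id);        /* associate the id with this thread */

    int *mine = (int *)pthread_getspecific(txn_key);
    printf("thread servicing transaction %d\n", *mine);
    return NULL;
}

int main(void)
{
    pthread_key_create(&txn_key, NULL);           /* one key shared by all threads */

    pthread_t t1, t2;
    int id1 = 101, id2 = 102;                     /* hypothetical transaction ids */
    pthread_create(&t1, NULL, service_transaction, &id1);
    pthread_create(&t2, NULL, service_transaction, &id2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_key_delete(txn_key);
    return 0;
}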
So these are threading issues that occur in the multithreaded programming environment. We
have also seen how these issues can be resolved.
Multiple-Processor Scheduling
One approach is when all the scheduling decisions and I/O processing are handled by a single processor, called the master server, and the other processors execute only user code. This is simple and reduces the need for data sharing. This entire scenario is called asymmetric multiprocessing.
A second approach uses Symmetric Multiprocessing where each processor is self scheduling.
All processes may be in a common ready queue or each processor may have its own private
queue for ready processes. The scheduling proceeds further by having the scheduler for each
processor examine the ready queue and select a process to execute.
Processor Affinity –
Processor affinity means a process has an affinity for the processor on which it is currently running.
When a process runs on a specific processor there are certain effects on the cache memory. The
data most recently accessed by the process populate the cache for the processor and as a result
successive memory access by the process are often satisfied in the cache memory. Now if the
process migrates to another processor, the contents of the cache memory must be invalidated for
the first processor and the cache for the second processor must be repopulated. Because of the
high cost of invalidating and repopulating caches, most of the SMP(symmetric multiprocessing)
systems try to avoid migration of processes from one processor to another and try to keep a
process running on the same processor. This is known as PROCESSOR AFFINITY.
There are two types of processor affinity:
1. Soft Affinity – When an operating system has a policy of attempting to keep a process
running on the same processor but not guaranteeing it will do so, this situation is called soft
affinity.
2. Hard Affinity – Hard affinity allows a process to specify a subset of processors on which it may run. Some systems, such as Linux, implement soft affinity but also provide system calls like sched_setaffinity() that support hard affinity.
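A Linux-specific sketch of hard affinity using sched_setaffinity(); restricting the process to CPU 0 is just an illustrative choice:

/* Build with: gcc affinity_demo.c */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                              /* allow only CPU 0 */

    if (sched_setaffinity(getpid(), sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("now restricted to CPU %d\n", sched_getcpu());
    return 0;
}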
Load Balancing –
Load balancing is the phenomenon of keeping the workload evenly distributed across all processors in an SMP system. Load balancing is necessary only on systems where each processor has its own private queue of processes eligible to execute. On systems with a common run queue, load balancing is unnecessary, because once a processor becomes idle it immediately extracts a runnable process from the common run queue. On SMP (symmetric multiprocessing) systems, it is important to keep the workload
balanced among all processors to fully utilize the benefits of having more than one processor else
one or more processor will sit idle while other processors have high workloads along with lists of
processors awaiting the CPU.
There are two general approaches to load balancing :
1. Push Migration – In push migration a task routinely checks the load on each processor and if
it finds an imbalance then it evenly distributes load on each processors by moving the
processes from overloaded to idle or less busy processors.
2. Pull Migration – Pull Migration occurs when an idle processor pulls a waiting task from a
busy processor for its execution.
Multicore Processors –
In multicore processors, multiple processor cores are placed on the same physical chip. Each core
has a register set to maintain its architectural state and thus appears to the operating system as a
separate physical processor. SMP systems that use multicore processors are faster and
consume less power than systems in which each processor has its own physical chip.
However, multicore processors may complicate the scheduling problem. When a processor accesses memory, it spends a significant amount of time waiting for the data to become available. This situation is called MEMORY STALL. It occurs for various reasons, such as a cache
miss, which is accessing the data that is not in the cache memory. In such cases the processor can
spend upto fifty percent of its time waiting for data to become available from the memory. To
solve this problem recent hardware designs have implemented multithreaded processor cores in
which two or more hardware threads are assigned to each core. Therefore if one thread stalls
while waiting for the memory, core can switch to another thread.
There are two ways to multithread a processor :
1. Coarse-Grained Multithreading – In coarse grained multithreading a thread executes on a
processor until a long latency event such as a memory stall occurs, because of the delay
caused by the long latency event, the processor must switch to another thread to begin
execution. The cost of switching between threads is high as the instruction pipeline must be
terminated before the other thread can begin execution on the processor core. Once this new
thread begins execution it begins filling the pipeline with its instructions.
2. Fine-Grained Multithreading – This multithreading switches between threads at a much
finer level mainly at the boundary of an instruction cycle. The architectural design of fine
grained systems include logic for thread switching and as a result the cost of switching
between threads is small.
Virtualization and Scheduling
In this type of multiple-processor scheduling, even a single-CPU system acts like a multiple-processor system. In a system with virtualization, the virtualization layer presents one or more virtual
CPU to each of virtual machines running on the system and then schedules the use of physical
CPU among the virtual machines. Most virtualized environments have one host operating system
and many guest operating systems. The host operating system creates and manages the virtual
machines. Each virtual machine has a guest operating system installed and applications run
within that guest.Each guest operating system may be assigned for specific use cases,applications
or users including time sharing or even real-time operation. Any guest operating-system
scheduling algorithm that assumes a certain amount of progress in a given amount of time will be
negatively impacted by the virtualization. A time sharing operating system tries to allot 100
milliseconds to each time slice to give users a reasonable response time. A given 100 millisecond
time slice may take much more than 100 milliseconds of virtual CPU time. Depending on how
busy the system is, the time slice may take a second or more which results in a very poor
response time for users logged into that virtual machine. The net effect of such scheduling
layering is that individual virtualized operating systems receive only a portion of the available
CPU cycles, even though they believe they are receiving all cycles and that they are scheduling
all of those cycles. Commonly, the time-of-day clocks in virtual machines are incorrect because timers take longer to trigger than they would on dedicated CPUs.
Virtualizations can thus undo the good scheduling-algorithm efforts of the operating systems
within virtual machines.
Thread Scheduling
Scheduling of threads involves two levels of boundary scheduling:
Scheduling of user-level threads (ULT) to kernel-level threads (KLT) via lightweight processes (LWP), by the application developer.
Scheduling of kernel-level threads by the system scheduler to perform different unique OS functions.
Lightweight Process (LWP):
Light-weight processes are threads in the user space that act as an interface for the ULT to access
the physical CPU resources. Thread library schedules which thread of a process to run on which
LWP and how long. The number of LWP created by the thread library depends on the type of
application. In the case of an I/O bound application, the number of LWP depends on the number
of user-level threads. This is because when an LWP is blocked on an I/O operation, then to
invoke the other ULT the thread library needs to create and schedule another LWP. Thus, in an
I/O bound application, the number of LWP is equal to the number of the ULT. In the case of a
CPU bound application, it depends only on the application. Each LWP is attached to a separate
kernel-level thread.
In real-time, the first boundary of thread scheduling is beyond specifying the scheduling policy
and the priority. It requires two controls to be specified for the User level threads: Contention
scope, and Allocation domain. These are explained as following below.
1. Contention Scope :
The word contention here refers to the competition or fight among the User level threads to
access the kernel resources. Thus, this control defines the extent to which contention takes place.
It is defined by the application developer using the thread library. Depending upon the extent of
contention it is classified as Process Contention Scope and System Contention Scope.
Figure 2.11: A proposed solution to the CR problem. (a) Process 0, (b) Process 1
The Producer-Consumer Problem
The producer produces an item and inserts it into the buffer. The value of the global variable count is increased at each insertion. If the buffer is filled completely and no slot is available, the producer will sleep; otherwise it keeps inserting.
On the consumer's end, the value of count is decreased by 1 at each consumption. If the buffer is empty at any point of time, the consumer will sleep; otherwise it keeps consuming items and decreasing the value of count by 1.
The consumer will be woken up by the producer if there is at least 1 item available in the buffer to be consumed. The producer will be woken up by the consumer if there is at least one slot available in the buffer so that the producer can write to it.
The problem arises when the consumer gets preempted just before it was about to sleep. Now the consumer is neither sleeping nor consuming. Since the producer is not aware of the fact that the consumer is not actually sleeping, it keeps sending wakeups to the consumer while the consumer is not responding, since it is not sleeping.
This leads to the wastage of wakeup calls. When the consumer gets scheduled again, it will sleep, because it was about to sleep when it was preempted.
The producer keeps writing into the buffer, and the buffer gets filled after some time. The producer will then also sleep, keeping in mind that the consumer will wake him up when there is a slot available in the buffer.
The consumer is also sleeping and is not aware that the producer will wake him up.
This is a kind of deadlock where neither the producer nor the consumer is active, each waiting for the other to wake it up. This is a serious problem which needs to be addressed.
Introduction to Semaphores
In 1965, Dijkstra proposed a new and very significant technique for managing concurrent processes by
using the value of a simple integer variable to synchronize the progress of interacting processes. This
integer variable is called a semaphore. So it is basically a synchronizing tool and is accessed only through two standard atomic operations, wait and signal, designated by P(S) and V(S) respectively.
In very simple words, the semaphore is a variable that can hold only a non-negative Integer value, shared
between all the threads, with operations wait and signal, which work as follow:
Wait: Decrements the value of its argument S once S becomes positive (greater than or equal to 1); while S <= 0 the calling process waits. This operation is mainly used to control the entry of a task into the critical section. The wait() operation was originally termed P, so it is also known as P(S). The definition of the wait operation is as follows:
wait(S)
{
    while (S <= 0)
        ;    // busy wait
    S--;
}
Note:
When one process modifies the value of a semaphore, no other process can simultaneously modify that same semaphore's value. In the above case, the test of the integer value of S (S <= 0) as well as its possible modification (S--) must be executed without any interruption.
Signal: Increments the value of its argument S, as there is no more process blocked on the queue.
This Operation is mainly used to control the exit of a task from the critical
section.signal() operation was originally termed as V; so it is also known as V(S) operation. The
definition of signal operation is as follows:
signal(S)
{
    S++;
}
Also, note that all the modifications to the integer value of semaphore in the wait() and signal() operations
must be executed indivisibly.
Properties of Semaphores
A semaphore always holds a non-negative integer value, is shared by the cooperating processes, and is accessed only through the atomic wait (P) and signal (V) operations.
Types of Semaphores
There are two types of semaphores: binary semaphores, which can take only the values 0 and 1, and counting semaphores, which can take any non-negative integer value. A binary semaphore initialized to 1 (often called mutex) can be used to enforce mutual exclusion, as in the following sketch:
Process i
begin
    P(mutex);
    execute CS;
    V(mutex);
End;
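A minimal runnable sketch of the P(mutex)/V(mutex) pattern above using POSIX semaphores and two threads; the shared counter and the iteration count are illustrative assumptions:

/* Build with: gcc sem_cs.c -pthread */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t mutex;                 /* binary semaphore, initialised to 1 */
static int shared_counter = 0;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        sem_wait(&mutex);           /* P(mutex): enter critical section */
        shared_counter++;           /* critical section                 */
        sem_post(&mutex);           /* V(mutex): exit critical section  */
    }
    return NULL;
}

int main(void)
{
    sem_init(&mutex, 0, 1);         /* 0 = shared between threads, value 1 */

    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("counter = %d (expected 200000)\n", shared_counter);
    sem_destroy(&mutex);
    return 0;
}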
Advantages of Semaphores
Disadvantages of Semaphores
One of the biggest limitations is that semaphores may lead to priority inversion; where low
priority processes may access the critical section first and high priority processes may access the
critical section later.
To avoid deadlocks in the semaphore, the Wait and Signal operations are required to be executed
in the correct order.
Using semaphores at a large scale is impractical; as their use leads to loss of modularity and this
happens because the wait() and signal() operations prevent the creation of the structured layout for
the system.
Their use is not enforced but is by convention only.
With improper use, a process may block indefinitely. Such a situation is called Deadlock. We will
be studying deadlocks in detail in coming lessons.
Mutex:-
A mutex is a binary variable whose purpose is to provide a locking mechanism. It is used to provide mutual exclusion for a section of code, meaning only one process can work on a particular code section at a time.
There is a misconception that a binary semaphore is the same as a mutex variable, but the two are different: a binary semaphore, apart from providing a locking mechanism, also provides two atomic operations, signal and wait, meaning that after releasing the resource the semaphore provides a signalling mechanism for the processes that are waiting for the resource.
A mutex is a locking mechanism in the OS.
Differences between mutex and semaphore:
1. A mutex is typically used between threads, whereas a semaphore is used between processes.
2. A mutex works in user space, but a semaphore works in kernel space.
3. A mutex is a locking mechanism with ownership (only the locker can unlock it), whereas a semaphore is a signalling mechanism with no ownership.
4. For thread-to-thread locking a mutex is used, while for process-to-process locking a semaphore is used.
The full form of Mutex is Mutual Exclusion Object. It is a special type of binary semaphore which is used for controlling access to the shared resource. It includes a priority inheritance mechanism to avoid extended priority inversion problems. It allows current higher-priority tasks to be kept in the blocked state for the shortest time possible. However, priority inheritance does not correct priority inversion but only minimizes its effect.
KEY DIFFERENCE
Mutex is a locking mechanism whereas Semaphore is a signaling mechanism
Mutex is just an object while Semaphore is an integer
Mutex has no subtype whereas Semaphore has two types, which are counting
semaphore and binary semaphore.
A semaphore's value can be modified, via wait and signal operations, by any process, whereas a mutex is modified only by the process that may request or release the resource.
A semaphore's value is modified using the wait() and signal() operations; on the other hand, mutex operations are lock and unlock.
Use of Semaphore
In the case of a single buffer, we can separate the 4 KB buffer into four 1 KB buffers. Semaphore can be
associated with these four buffers. This allows users and producers to work on different buffers at the
same time.
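A sketch of the producer-consumer arrangement described above using POSIX semaphores: 'empty' counts free slots, 'full' counts filled slots, and a binary semaphore protects the buffer itself. The four-slot buffer mirrors the four buffers mentioned, the ten items are an arbitrary choice, and this arrangement also avoids the lost-wakeup problem discussed earlier:

/* Build with: gcc prodcons.c -pthread */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define SLOTS 4

static int buffer[SLOTS];
static int in = 0, out = 0;
static sem_t empty, full, mutex;

static void *producer(void *arg)
{
    (void)arg;
    for (int item = 1; item <= 10; item++) {
        sem_wait(&empty);               /* wait for a free slot   */
        sem_wait(&mutex);
        buffer[in] = item;
        in = (in + 1) % SLOTS;
        sem_post(&mutex);
        sem_post(&full);                /* one more filled slot   */
    }
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10; i++) {
        sem_wait(&full);                /* wait for a filled slot */
        sem_wait(&mutex);
        int item = buffer[out];
        out = (out + 1) % SLOTS;
        sem_post(&mutex);
        sem_post(&empty);               /* one more free slot     */
        printf("consumed %d\n", item);
    }
    return NULL;
}

int main(void)
{
    sem_init(&empty, 0, SLOTS);
    sem_init(&full, 0, 0);
    sem_init(&mutex, 0, 1);

    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}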
Use of Mutex
A mutex provides mutual exclusion: either the producer or the consumer can hold the key (mutex) and proceed with its work. As long as the producer is filling the buffer, the consumer needs to wait, and vice versa. With a mutex lock, at any time, only a single thread can work with the entire buffer.
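A minimal pthread mutex sketch: whichever thread holds the lock works on the whole shared buffer, and only that thread releases the lock. The buffer contents are illustrative:

/* Build with: gcc mutex_demo.c -pthread */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static char shared_buffer[64];

static void *writer(void *arg)
{
    pthread_mutex_lock(&lock);                    /* acquire the key */
    strcpy(shared_buffer, (const char *)arg);     /* the whole buffer is ours */
    pthread_mutex_unlock(&lock);                  /* release the key */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, writer, "filled by thread 1");
    pthread_create(&t2, NULL, writer, "filled by thread 2");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("%s\n", shared_buffer);
    return 0;
}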
Difference between Semaphore vs. Mutex
Aspect | Semaphore | Mutex
Resource management | If no resource is free, then the process that requires a resource should execute the wait operation. It should wait until the count of the semaphore is greater than 0. | If it is locked, the process has to wait. The process should be kept in a queue. The resource is accessed only when the mutex is unlocked.
Ownership | The value can be changed by any process releasing or obtaining the resource. | The object lock is released only by the process that obtained the lock on it.
Resource occupancy | It is occupied if all resources are being used; the process requesting a resource performs the wait() operation and blocks itself until the semaphore count becomes greater than 0. | In case the object is already locked, the process requesting the resource waits and is queued by the system until the lock is released.
Advantages of Mutex
Here, are important pros/benefits of Mutex
Mutexes are simple locks obtained before entering a critical section and released when leaving it.
Since only one thread is in its critical section at any given time, there are no race conditions, and data always remains consistent.
Disadvantage of Semaphores
Here, are cons/drawback of semaphore
One of the biggest limitations of a semaphore is priority inversion.
The operating system has to keep track of all calls to wait and signal semaphore.
Their use is never enforced, but it is by convention only.
In order to avoid deadlocks in semaphore, the Wait and Signal operations require to be executed
in the correct order.
Semaphore programming is a complex method, so there are chances of not achieving mutual
exclusion.
It is also not a practical method for large scale use as their use leads to loss of modularity.
Semaphore is more prone to programmer error.
It may cause deadlock or violation of mutual exclusion due to programmer error.
Disadvantages of Mutex
Here, are cons/drawback of Mutex
If a thread obtains a lock and goes to sleep or is preempted, then the other threads may not be able to move forward. This may lead to starvation.
It can't be locked or unlocked from a different context than the one that acquired it.
Only one thread should be allowed in the critical section at a time.
The normal implementation may lead to busy waiting state, which wastes CPU time.
Syntax of monitor
A monitor groups shared data together with the procedures that operate on it; its general structure (in pseudocode) is:
monitor monitor_name
{
    // shared variable declarations
    procedure P1 (...) { ... }
    procedure P2 (...) { ... }
    ...
    initialization code (...) { ... }
}
Condition Variables
There are two types of operations that we can perform on the condition variables of the monitor:
1. Wait
2. Signal
Suppose there are two condition variables:
condition a, b; // Declaring variables
Wait Operation
a.wait(): – A process that performs a wait operation on a condition variable is suspended and placed in the block queue of that condition variable.
Signal Operation
a.signal(): – If a signal operation is performed by a process on a condition variable, then a chance is provided to one of the blocked processes.
Advantages of Monitors
Monitors make parallel programming easier, and programs that use monitors are less error-prone than those using semaphores.
Monitors | Semaphore
In monitors, wait always blocks the caller. | In semaphores, wait does not always block the caller.
Condition variables are present in monitors. | Condition variables are not present in semaphores.
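A monitor-style sketch using a pthread mutex as the monitor lock and a condition variable a, mirroring the a.wait()/a.signal() operations above; the 'ready' flag is an illustrative assumption:

/* Build with: gcc condvar_demo.c -pthread */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t monitor_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  a = PTHREAD_COND_INITIALIZER;   /* condition variable */
static int ready = 0;

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&monitor_lock);        /* enter the "monitor" */
    while (!ready)
        pthread_cond_wait(&a, &monitor_lock); /* a.wait(): block and release the lock */
    printf("condition signalled, proceeding\n");
    pthread_mutex_unlock(&monitor_lock);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, waiter, NULL);

    pthread_mutex_lock(&monitor_lock);
    ready = 1;
    pthread_cond_signal(&a);                  /* a.signal(): wake one waiting thread */
    pthread_mutex_unlock(&monitor_lock);

    pthread_join(t, NULL);
    return 0;
}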
Barrier
In parallel computing, a barrier is a type of synchronization method. A barrier for a group of threads or processes in
the source code means any thread/process must stop at this point and cannot proceed until all other
threads/processes reach this barrier.
Many collective routines and directive-based parallel languages impose implicit barriers. For example, a parallel do
loop in Fortran with OpenMP will not be allowed to continue on any thread until the last iteration is completed.
This is in case the program relies on the result of the loop immediately after its completion. In message passing,
any global communication (such as reduction or scatter) may imply a barrier.
Dynamic barriers
Classic barrier constructs define the set of participating processes/threads statically. This is usually done either at
program startup or when a barrier like the Pthreads barrier is instantiated. This restricts the possible applications for
which barriers can be used.
To support more dynamic programming paradigms like fork/join parallelism, the sets of participants have to be
dynamic. Thus, the set of processes/threads participating in a barrier operation needs to be able to change over
time. X10 introduced the concept of clocks for that purpose, which provide a dynamic barrier semantic. Building
on clocks, phasers have been proposed to add even more flexibility to barrier synchronization. With phasers it is
possible to express data dependencies between the participating processes explicitly to avoid unnecessary over-
synchronization.
A barrier can also be a high-level programming language statement which prevents the compiler from reordering
other operations over the barrier statement during optimization passes. Such statements can potentially generate
processor barrier instructions. Different classes of barrier exist and may apply to a specific set of operations only.
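A minimal sketch of a static (thread-synchronization) barrier with pthread_barrier_t; the number of threads and the two "phases" are illustrative assumptions:

/* Build with: gcc barrier_demo.c -pthread */
#include <pthread.h>
#include <stdio.h>

#define THREADS 4

static pthread_barrier_t barrier;

static void *phase_worker(void *arg)
{
    long id = (long)arg;
    printf("thread %ld finished phase 1\n", id);
    pthread_barrier_wait(&barrier);          /* no thread proceeds until all arrive */
    printf("thread %ld starting phase 2\n", id);
    return NULL;
}

int main(void)
{
    pthread_barrier_init(&barrier, NULL, THREADS);

    pthread_t t[THREADS];
    for (long i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, phase_worker, (void *)i);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);

    pthread_barrier_destroy(&barrier);
    return 0;
}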
Message passing:
Process communication is the mechanism provided by the operating system that allows processes to
communicate with each other. This communication could involve a process letting another process know
that some event has occurred or transferring of data from one process to another. One of the models of
process communication is the message passing model.
Message passing model allows multiple processes to read and write data to the message queue without
being connected to each other. Messages are stored on the queue until their recipient retrieves them.
Message queues are quite useful for interprocess communication and are used by most operating systems.
In a diagram of the message-passing model, both processes P1 and P2 would access the message queue to store and retrieve data.
The Dining Philosophers Problem
Five philosophers sit around a circular table, with a single fork placed between each pair of adjacent philosophers; each philosopher alternately thinks and eats, and needs both the left and the right fork in order to eat. The problem was designed to illustrate the problem of avoiding deadlock, a system state in which no progress is possible.
One idea is to instruct each philosopher to behave as follows:
• think until the left fork is available; when it is, pick it up
• think until the right fork is available; when it is, pick it up
• eat
• put the left fork down
• put the right fork down
• repeat from the start
This solution is incorrect: it allows the system to reach deadlock. Suppose that all five philosophers
take their left forks simultaneously. None will be able to take their right forks, and there will be a
deadlock.
We could modify the program so that after taking the left fork, the program checks to see if the right
fork is available. If it is not, the philosopher puts down the left one, waits for some time, and then
repeats the whole process. This proposal too, fails, although for a different reason. With a little bit of
bad luck, all the philosophers could start the algorithm simultaneously, picking up their left forks,
seeing that their right forks were not available, putting down their left forks, waiting, picking up their
left forks again simultaneously, and so on, forever. A situation like this, in which all the programs continue to run indefinitely but fail to make any progress, is called starvation.
The solution presented below is deadlock-free and allows the maximum parallelism for an arbitrary
number of philosophers. It uses an array, state, to keep track of whether a philosopher is eating,
thinking, or hungry (trying to acquire forks). A philosopher may move into eating state only if neither
neighbor is eating. Philosopher i's neighbors are defined by the macros LEFT and RIGHT. In other
words, if i is 2, LEFT is 1 and RIGHT is 3.
Solution:
#define N 5 /* number of philosophers */
#define LEFT (i+N-1)%N /* number of i's left neighbor */
#define RIGHT (i+1)%N /* number of i's right neighbor */
#define THINKING 0 /* philosopher is thinking */
#define HUNGRY 1 /* philosopher is trying to get forks */
#define EATING 2 /* philosopher is eating */
typedef int semaphore; /* semaphores are a special kind of int */
int state[N]; /* array to keep track of everyone's state */
semaphore mutex = 1; /* mutual exclusion for critical regions */
semaphore s[N]; /* one semaphore per philosopher */
void philosopher(int i) /* i: philosopher number, from 0 to N-1 */
{
while (TRUE){ /* repeat forever */
think(); /* philosopher is thinking */
take_forks(i); /* acquire two forks or block */
eat(); /* yum-yum, spaghetti */
put_forks(i); /* put both forks back on table */
}
}
void take_forks(int i) /* i: philosopher number, from 0 to N-1 */
{
down(&mutex); /* enter critical region */
state[i] = HUNGRY; /* record fact that philosopher i is hungry */
test(i); /* try to acquire 2 forks */
up(&mutex); /* exit critical region */
down(&s[i]); /* block if forks were not acquired */
}
void put_forks(int i) /* i: philosopher number, from 0 to N-1 */
{
down(&mutex); /* enter critical region */
state[i] = THINKING; /* philosopher has finished eating */
test(LEFT); /* see if left neighbor can now eat */
test(RIGHT); /* see if right neighbor can now eat */
up(&mutex); /* exit critical region */
}
void test(int i) /* i: philosopher number, from 0 to N-1 */
{
if (state[i] == HUNGRY && state[LEFT] != EATING && state[RIGHT] != EATING) {
state[i] = EATING;
up(&s[i]);
}
}