Slides 03
(3rd Edition)
Introduction to threads
Basic idea
We build virtual processors in software, on top of physical
processors:
Processor: Provides a set of instructions along with the capability
of automatically executing a series of those
instructions.
Thread: A minimal software processor in whose context a series
of instructions can be executed. Saving a thread
context implies stopping the current execution and
saving all the data needed to continue the execution at
a later stage.
Process: A software processor in whose context one or more
threads may be executed. Executing a thread means
executing a series of instructions in the context of
that thread.
2 / 47
Processes: Threads Introduction to threads
Context switching
Contexts
► Processor context: The minimal collection of values stored in
the registers of a processor used for the execution of a series
of instructions (e.g., stack pointer, addressing registers,
program counter).
► Thread context: The minimal collection of values stored in
registers and memory, used for the execution of a series
of instructions (i.e., processor context, state).
► Process context: The minimal collection of values stored in
registers and memory, used for the execution of a thread
(i.e., thread context, but now also at least MMU register
values).
3 / 47
Processes: Threads Introduction to threads
Context switching
Observations
1. Threads share the same address space. Thread context
switching can be done entirely independently of the
operating system.
2. Process switching is generally (somewhat) more expensive as
it involves getting the OS in the loop, i.e., trapping to the
kernel.
3. Creating and destroying threads is much cheaper than doing
so for processes.
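Observation 1 can be made concrete with a small sketch — a minimal example, assuming Python's threading module as a stand-in for any thread package; all names are illustrative:

```python
import threading

counter = 0              # lives in the single address space shared by all threads
lock = threading.Lock()  # needed precisely because that memory is shared

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:       # unsynchronized access to shared memory would race
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# All four threads updated the very same variable: counter == 40_000
```

The lock also illustrates the error-proneness noted below: nothing prevents a thread from touching another thread's data.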
4 / 47
Processes: Threads Introduction to threads
Trade-offs
► Threads use the same address space: more prone to errors
► There is no OS/HW support to protect threads from using each
other's memory
► Thread context switching may be faster than process context
switching
6 / 47
Processes: Threads Introduction to threads
Main issue
Should an OS kernel provide threads, or should they be
implemented as user-level packages?
User-space solution
► All operations can be completely handled within a single
process ⇒ implementations can be extremely efficient.
► All services provided by the kernel are done on behalf of the
process in which a thread resides ⇒ if the kernel decides to
block a thread, the entire process will be blocked.
► Threads are used when there are many external events: threads
block on a per-event basis ⇒ if the kernel cannot distinguish
threads, how can it support signaling events to them?
Thread implementation 8 / 47
Processes: Threads Introduction to threads
Kernel solution
The whole idea is to have the kernel contain the implementation of a
thread package. This means that all thread operations are carried
out as system calls:
► Operations that block a thread are no longer a problem: the
kernel schedules another available thread within the same
process.
► Handling external events is simple: the kernel (which catches
all events) schedules the thread associated with the event.
► The problem is (or used to be) the loss of efficiency due to
the fact that each thread operation requires a trap to the
kernel.
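The first point — a blocking operation stalls only the calling thread — can be sketched as follows; a minimal illustration with Python threads, where time.sleep stands in for a blocking system call (names are illustrative):

```python
import threading
import time

events = []

def blocker():
    time.sleep(0.2)                   # stands in for a blocking system call
    events.append("blocker done")

def runner():
    events.append("runner done")      # runs while the blocker is blocked

b = threading.Thread(target=blocker)
r = threading.Thread(target=runner)
b.start()
r.start()
b.join()
r.join()
# The runner finished first: the blocked thread did not stall its sibling.
```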
Conclusion
Try to mix user-level and kernel-level threads into a single
concept.
Thread implementation 9 / 47
Processes: Threads Introduction to threads
Lightweight processes
Basic idea
Introduce a two-level threading approach: lightweight processes
that can execute user-level threads.
(Figure: lightweight processes — threads and their state live in
user space, on top of LWPs that extend into kernel space.)
Thread implementation 10 / 47
Processes: Threads Introduction to threads
Lightweight processes
Principle operation
► User-level thread does system call ⇒ the LWP that is
executing that thread blocks. The thread remains bound to
the LWP.
► The kernel can schedule another LWP having a runnable
thread bound to it. Note: this thread can switch to any other
runnable thread currently in user space.
► A thread calls a blocking user-level operation ⇒ do context
switch to a runnable thread (then bound to the same LWP).
► When there are no threads to schedule, an LWP may remain
idle, and may even be removed (destroyed) by the kernel.
Note
This concept has been virtually abandoned – it’s just either
user-level or kernel-level threads.
Thread implementation 11 / 47
Processes: Threads Threads in distributed systems
TLP = (∑_{i=1}^{N} i · c_i) / (1 − c_0)

where c_i is the fraction of time that exactly i threads are
executing simultaneously (c_0: no thread is running).
Practical measurements
A typical Web browser has a TLP value between 1.5 and 2.5 ⇒
threads are primarily used for logically organizing browsers.
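The TLP metric can be computed directly from the measured fractions c_i; a minimal sketch (function name and example fractions are illustrative):

```python
def tlp(c):
    """Thread-level parallelism from fractions c[i] of time during which
    exactly i threads execute simultaneously (c[0] = fraction idle)."""
    N = len(c) - 1
    return sum(i * c[i] for i in range(1, N + 1)) / (1.0 - c[0])

# Example: idle half the time, one thread 25% of the time, two threads 25%:
# numerator = 1*0.25 + 2*0.25 = 0.75, denominator = 1 - 0.5 = 0.5
value = tlp([0.50, 0.25, 0.25])   # 1.5, in the 1.5–2.5 range cited above
```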
Multithreaded clients 13 / 47
Processes: Threads Threads in distributed systems
Improve performance
► Starting a thread is cheaper than starting a new process.
► Having a single-threaded server prohibits simple scale-up to
a multiprocessor system.
► As with clients: hide network latency by reacting to the next
request while the previous one is being replied to.
Better structure
► Most servers have high I/O demands. Using simple,
well-understood blocking calls simplifies the overall structure.
► Multithreaded programs tend to be smaller and easier
to understand due to simplified flow of control.
Multithreaded servers 14 / 47
Processes: Threads Threads in distributed systems
(Figure: dispatcher/worker model — a dispatcher thread passes
requests coming in from the network to worker threads, on top of
the operating system.)
Overview
Model Characteristics
Multithreading Parallelism, blocking system calls
Single-threaded process No parallelism, blocking system calls
Finite-state machine Parallelism, nonblocking system calls
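The multithreading row of this table is commonly realized as a dispatcher/worker organization; a minimal sketch, assuming a shared queue as the hand-off point (all names are illustrative):

```python
import queue
import threading

requests = queue.Queue()          # hand-off point between dispatcher and workers
results = []
results_lock = threading.Lock()

def worker():
    while True:
        req = requests.get()      # blocks until the dispatcher hands over work
        if req is None:           # sentinel: shut this worker down
            break
        with results_lock:
            results.append(f"handled {req}")
        requests.task_done()

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for i in range(9):                # the dispatcher: push incoming requests
    requests.put(i)
requests.join()                   # wait until every request has been handled
for _ in workers:
    requests.put(None)
for w in workers:
    w.join()
```

Each worker may issue blocking calls without stalling the others, which is exactly what the table's "parallelism, blocking system calls" entry describes.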
Multithreaded servers 15 / 47
Processes: Virtualization Principle of virtualization
Virtualization
Observation
Virtualization is important:
► Hardware changes faster than software
► Ease of portability and code migration
► Isolation of failing or attacked components
(Figure: the principle of virtualization — a program written for
interface A runs on an implementation mimicking A on top of
interface B.)
16 / 47
Processes: Virtualization Principle of virtualization
Mimicking interfaces
Types of virtualization 17 / 47
Processes: Virtualization Principle of virtualization
Ways of virtualization
(a) Process VM, (b) Native VMM, (c) Hosted
VMM
Differences
(a) Process VM: a separate set of instructions, with an
interpreter/emulator running atop an operating system.
(b) Native VMM: low-level instructions, along with a bare-bones
minimal operating system.
(c) Hosted VMM: low-level instructions, but relying on an existing
host operating system.
Processes: Virtualization Principle of virtualization
Special instructions
► Control-sensitive instruction: may affect configuration of a
machine (e.g., one affecting relocation register or interrupt
table).
► Behavior-sensitive instruction: effect is partially determined by
context (e.g., POPF sets an interrupt-enabled flag, but only in
system mode).
Types of virtualization 19 / 47
Processes: Virtualization Principle of virtualization
Solutions
► Emulate all instructions
► Wrap nonprivileged sensitive instructions to divert control to
VMM
► Paravirtualization: modify guest OS, either by preventing
nonprivileged sensitive instructions, or making them
nonsensitive (i.e., changing the context).
Types of virtualization 20 / 47
Processes: Virtualization Application of virtual machines to distributed systems
IaaS
Instead of renting out a physical machine, a cloud provider will rent
out a VM (or VMM) that may possibly be sharing a physical machine
with other customers ⇒ almost complete isolation between
customers (although performance isolation may not be reached).
21 / 47
Processes: Clients Networked user interfaces
Client-server interaction
22 / 47
Processes: Clients Networked user interfaces
(Figure: the X Window System — applications linked against Xlib
talk to the X kernel, which accesses the display through device
drivers.)
Improving X
Practical observations
► There is often no clear separation between application logic
and user-interface commands
► Applications tend to operate in a tightly synchronous manner
with the X kernel
Alternative approaches
► Let applications control the display completely, up to the
pixel level (e.g., VNC)
► Provide only a few high-level display operations (dependent
on local video drivers), allowing more efficient display
operations.
Client-side software
Generally tailored for distribution transparency
► Access transparency: client-side stubs for RPCs
► Location/migration transparency: let client-side software
keep track of actual location
► Replication transparency: multiple invocations handled by client
stub:
(Figure: replication transparency — the client-side stub on the
client machine forwards a single invocation to the replicated
application servers 1, 2, and 3.)
25 / 47
Processes: Servers General design issues
Basic model
A process implementing a specific service on behalf of a collection of
clients. It waits for an incoming request from a client and
subsequently ensures that the request is taken care of, after which it
waits for the next incoming request.
26 / 47
Processes: Servers General design issues
Concurrent servers
Observation
Concurrent servers are the norm: they can easily handle multiple
requests, notably in the presence of blocking operations (to disks
or other servers).
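A minimal thread-per-request server illustrates this: each blocking recv stalls only the worker thread serving that client. This is a toy sketch — the uppercasing "service" and all names are illustrative:

```python
import socket
import threading

def handle(conn):
    with conn:
        data = conn.recv(1024)       # a blocking read is fine: only this thread waits
        conn.sendall(data.upper())   # toy "service": echo the request in upper case

def serve(server_sock):
    while True:
        conn, _ = server_sock.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))           # ephemeral port on the loopback interface
srv.listen()
threading.Thread(target=serve, args=(srv,), daemon=True).start()

with socket.create_connection(srv.getsockname()) as c:
    c.sendall(b"ping")
    reply = c.recv(1024)             # the worker thread replies with b"PING"
```

The simple, well-understood blocking calls (accept, recv) keep the per-client logic sequential, matching the "better structure" argument above.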
Contacting a server
Observation: most services are tied to a specific port
ftp-data 20 File Transfer [Default Data]
ftp 21 File Transfer [Control]
telnet 23 Telnet
smtp 25 Simple Mail Transfer
www 80 Web (HTTP)
Out-of-band communication
Issue
Is it possible to interrupt a server once it has accepted (or is in
the process of accepting) a service request?
Interrupting a server 29 / 47
Processes: Servers General design issues
Consequences
► Clients and servers are completely independent
► State inconsistencies due to client or server crashes are reduced
► Possible loss of performance because, e.g., a server
cannot anticipate client behavior (think of prefetching file
blocks)
Stateful servers
Keeps track of the status of its clients:
► Record that a file has been opened, so that prefetching can
be done
► Knows which data a client has cached, and allows clients to
keep local copies of shared data
Observation
The performance of stateful servers can be extremely high,
provided clients are allowed to keep local copies. As it turns out,
reliability is often not a major problem.
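A toy sketch of such a stateful server — the class and its bookkeeping are purely illustrative — recording per-client open-file positions so that prefetching the next block would be possible:

```python
class StatefulFileServer:
    """Toy stateful server: remembers which files each client has open,
    and where each client is in the file (all names are illustrative)."""

    def __init__(self, blocks_per_file):
        self.blocks = blocks_per_file        # file name -> list of blocks
        self.open_files = {}                 # (client, file) -> next block index

    def open(self, client, name):
        self.open_files[(client, name)] = 0  # per-client state lives on the server

    def read(self, client, name):
        i = self.open_files[(client, name)]
        self.open_files[(client, name)] = i + 1   # position remembered: the server
        return self.blocks[name][i]               # could prefetch block i + 1 now

srv = StatefulFileServer({"log": ["b0", "b1", "b2"]})
srv.open("alice", "log")
first, second = srv.read("alice", "log"), srv.read("alice", "log")
```

If the server crashes, this table is lost — the reliability concern that stateless designs avoid.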
(Figure: client requests arrive at the first tier of the cluster
and are dispatched to back-end servers.)
Crucial element
The first tier is generally responsible for passing requests to
an appropriate server: request dispatching
Local-area clusters 32 / 47
Processes: Servers Server clusters
Request Handling
Observation
Having the first tier handle all communication from/to the cluster
may lead to a bottleneck.
(Figure: TCP handoff — the switch hands the client's request off to
a server, which replies to the client directly.)
Local-area clusters 33 / 47
Processes: Servers Server clusters
Server clusters
The front end may easily get overloaded: special
measures may be needed
► Transport-layer switching: Front end simply passes the TCP
request to one of the servers, taking some performance
metric into account.
► Content-aware distribution: Front end reads the content of
the request and then selects the best server.
(Figure: content-aware request distribution — the switch passes the
client's setup request to a distributor, the dispatcher selects a
server, and the switch is informed so that it can hand subsequent
messages off directly to the selected application server.)
Local-area clusters 34 / 47
Processes: Servers Server clusters
Client transparency
To keep the client unaware of distribution, let the DNS resolver
act on behalf of the client. The problem is that the resolver may
actually be far from the actual client.
Wide-area clusters 35 / 47
Processes: Servers Server clusters
Wide-area clusters 36 / 47
Processes: Servers Server clusters
Wide-area clusters 37 / 47
Processes: Servers Server clusters
Example: PlanetLab
Essence
Different organizations contribute machines, which they
subsequently share for various experiments.
Problem
We need to ensure that different distributed applications do not
get in each other's way ⇒ virtualization
(Figure: PlanetLab node organization — several vservers run on
shared hardware; each vserver hosts its own processes and a private
file-system namespace with its own /usr, /dev, /home, and /proc.)
Vserver
An independent and protected environment with its own libraries,
server versions, and so on. Distributed applications are assigned a
collection of vservers distributed across multiple machines.
Case study: PlanetLab 39 / 47
Processes: Servers Server clusters
Models for code mobility (code, exec = execution state, resource):
► CS (client-server): the client invokes code that stays and
executes at the server, using the server's resources.
► REV (remote evaluation): the client ships code to the server,
where it executes using the server's resources.
► CoD (code-on-demand): the client fetches code from the server
and executes it locally, using its own resources.
► MA (mobile agent): code and execution state migrate between
machines, possibly using resources on both sides.
43 / 47
Processes: Code migration Reasons for migrating code
Main problem
► The target machine may not be suitable to execute the
migrated code
► The definition of process/thread/processor context is highly
dependent on local hardware, operating system and
runtime system
45 / 47
Processes: Code migration Migration in heterogeneous systems
46 / 47
Processes: Code migration Migration in heterogeneous systems
(Figure: live migration of a virtual machine — response time
degrades while migration is in progress, with a brief downtime when
the VM is finally switched over.)
47 / 47