01-da24-Introduction
Prof R. Guerraoui
© R. Guerraoui
The history of algorithms
What is an algorithm?
An ordered set of elementary instructions
Really?
In short
We study algorithms for distributed systems
Distributed algorithms
E. Dijkstra (concurrent operating systems), ~1960s
Overview
A distributed system
Clients-server
Client A
Client B Server
Multiple servers
(genuine distribution, P2P, decentralization)
Server B
Server A Server C
The optimistic view
The pessimistic view
Distributed algorithms
(Today: Google)
Satoshi Nakamoto
(2008) Nick Szabo
2009: 0.005 $
2016: 600 $
2020: 10000 $
Overview
Distributed systems
Content of this course
Reliable broadcast
Causal order broadcast
Shared memory
Consensus
Total order broadcast
Atomic commit
Terminating reliable broadcast
……
Reliable distributed services
Example 1: reliable broadcast
Ensure that a message sent to a
group of processes is received
(delivered) by all or none
Example 2: atomic commit
Ensure that the processes reach a
common decision on whether to
commit or abort a transaction
Underlying services
(1): processes (abstracting computers)
Processes
• The distributed system is made of a finite
set of processes: each process models a
sequential program
• Processes are denoted p1, ..., pN or p, q, r
• Processes have unique identities and
know each other
• Every pair of processes is connected by a
link through which the processes
exchange messages
Processes
A process executes a step at every tick of its
local clock. A step consists of
a local computation (local event) and
message exchanges with other processes
(global event)
Processes
The program of a process is made of a finite
set of modules (or components) organized as
a software stack
Modules within the same process interact by
exchanging events
upon event < Event1, att1, att2, ... > do
    // something
    trigger < Event2, att1, att2, ... >
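The upon/trigger notation above can be mimicked with a small event queue. The following is only a sketch under stated assumptions: the class name `Process`, its methods `upon`, `trigger` and `run`, and the FIFO handling of pending events are all hypothetical choices for illustration, not part of the course material.

```python
from collections import defaultdict, deque


class Process:
    """Hypothetical single process: modules register handlers, events are handled one at a time."""

    def __init__(self):
        self.handlers = defaultdict(list)  # event name -> registered handlers
        self.queue = deque()               # pending events, handled in FIFO order (an assumption)

    def upon(self, event, handler):
        # Corresponds to: upon event <event, ...> do handler
        self.handlers[event].append(handler)

    def trigger(self, event, *attrs):
        # Corresponds to: trigger <event, attrs...>; the event is queued, not run immediately
        self.queue.append((event, attrs))

    def run(self):
        # Handle queued events until none are left
        while self.queue:
            event, attrs = self.queue.popleft()
            for handler in self.handlers[event]:
                handler(*attrs)


log = []
p = Process()
# upon event <Event1, att1, att2> do trigger <Event2, att1, att2>
p.upon("Event1", lambda a, b: p.trigger("Event2", a, b))
# A second module reacts to Event2 by recording it
p.upon("Event2", lambda a, b: log.append(("Event2", a, b)))

p.trigger("Event1", 1, 2)
p.run()
```

After `run()` returns, `log` holds the single `Event2` occurrence that was triggered by the handler of `Event1`.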
Modules of a process
[Figure: software stack of modules; adjacent layers exchange request and indication (deliver) events]
Overview
(1) Why? Motivation
Approach
Specifications: What is the service?
i.e., the problem ~ liveness + safety
Assumptions: What is the model, i.e.,
the power of the adversary?
Algorithms: How do we implement the
service? Where are the bugs (proof)?
What cost (complexity)?
Liveness and safety
Safety is a property which states that
nothing bad should happen
Liveness is a property which states
that something good should happen
Any specification can be expressed in
terms of liveness and safety
properties (Lamport and Schneider)
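One way to make the distinction concrete is to look at finite prefixes of an execution: a safety violation shows up in some finite prefix, whereas a liveness property can still be satisfied by extending any prefix. The sketch below is illustrative only; the trace format (a list of `("send", m)` and `("deliver", m)` pairs) and both function names are assumptions made for this example.

```python
def violates_no_duplication(trace):
    """Safety check: True if some message is delivered twice within this prefix."""
    seen = set()
    for kind, m in trace:
        if kind == "deliver":
            if m in seen:
                return True  # the bad thing has happened; no extension can undo it
            seen.add(m)
    return False


def undelivered(trace):
    """Liveness-related check: messages sent but not (yet) delivered in this prefix."""
    sent = {m for kind, m in trace if kind == "send"}
    got = {m for kind, m in trace if kind == "deliver"}
    return sent - got


prefix = [("send", "m1"), ("send", "m2"), ("deliver", "m1")]

# m2 is still pending, but a longer run may deliver it: liveness is not refuted
pending = undelivered(prefix)
# Delivering m1 a second time is a safety violation, visible in a finite prefix
bad = violates_no_duplication(prefix + [("deliver", "m1")])
```

Here `pending` being non-empty does not contradict liveness, since the run can still be extended with a delivery of `m2`, while `bad` flags a violation that no extension can repair.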
Specifications
Example 1: reliable broadcast
Ensure that a message sent to a
group of processes is received by all
or none
Example 2: atomic commit
Ensure that the processes reach a
common decision on whether to
commit or abort a transaction
Overview
Processes
Processes and channels
Fair-loss links
Stubborn links
Algorithm (sl)
Implements: StubbornLinks (sp2p).
Uses: FairLossLinks (flp2p).
upon event < sp2pSend, dest, m> do
    while (true) do
        trigger < flp2pSend, dest, m>;
upon event < flp2pDeliver, src, m> do
    trigger < sp2pDeliver, src, m>;
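The retransmission idea of algorithm (sl) can be simulated in a few lines. This is a rough sketch, not the course's implementation: the classes `FairLossLink` and `StubbornLink`, the random loss model, the `seed` parameter and the `tick` method (which stands in for one iteration of the `while (true)` loop) are all assumptions made for this example.

```python
import random


class FairLossLink:
    """Hypothetical lossy link: each send is dropped independently with a fixed probability."""

    def __init__(self, loss_probability, seed=0):
        self.loss_probability = loss_probability
        self.rng = random.Random(seed)  # seeded so that a run is reproducible
        self.on_deliver = None          # callback installed by the layer above

    def send(self, src, dest, m):
        # flp2pSend: the message either gets through or is silently lost
        if self.rng.random() >= self.loss_probability:
            self.on_deliver(src, m)     # flp2pDeliver at the destination


class StubbornLink:
    """Retransmits every message ever sent, once per tick."""

    def __init__(self, fll):
        self.fll = fll
        self.sent = []        # all (src, dest, m) handed to sp2pSend so far
        self.delivered = []   # sp2pDeliver indications, duplicates included
        fll.on_deliver = self.on_fl_deliver

    def send(self, src, dest, m):
        self.sent.append((src, dest, m))

    def tick(self):
        # One iteration of the retransmission loop for every message
        for src, dest, m in self.sent:
            self.fll.send(src, dest, m)

    def on_fl_deliver(self, src, m):
        self.delivered.append((src, m))


fll = FairLossLink(loss_probability=0.5, seed=42)
sl = StubbornLink(fll)
sl.send("p1", "p2", "hello")
for _ in range(50):
    sl.tick()
```

Because the stubborn link never stops retransmitting, the receiver typically sees the same message many times; removing those duplicates is left to the layer above.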
Reliable (Perfect) links
Properties
PL1. Validity: If pi and pj are correct,
then every message sent by pi to pj is
eventually delivered by pj
PL2. No duplication: No message is
delivered (to a process) more than once
PL3. No creation: No message is
delivered unless it was sent
Algorithm (pl)
Implements: PerfectLinks (pp2p).
Uses: StubbornLinks (sp2p).
upon event < Init> do delivered := ∅;
upon event < pp2pSend, dest, m> do
    trigger < sp2pSend, dest, m>;
upon event < sp2pDeliver, src, m> do
    if m ∉ delivered then
        trigger < pp2pDeliver, src, m>;
        add m to delivered;
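The duplicate filtering of algorithm (pl) might look as follows in Python. This is a minimal sketch under assumptions: the class name `PerfectLink` and its `output` list are hypothetical, and the `delivered` set stores `(src, m)` pairs rather than `m` alone, which assumes that a message is identified by its sender together with its content.

```python
class PerfectLink:
    """Hypothetical receiver side of a perfect link: filters out duplicate deliveries."""

    def __init__(self):
        self.delivered = set()  # (src, m) pairs already delivered; grows without bound
        self.output = []        # pp2pDeliver indications, in delivery order

    def on_sp2p_deliver(self, src, m):
        # Deliver only the first copy received from the stubborn link
        if (src, m) not in self.delivered:
            self.delivered.add((src, m))
            self.output.append((src, m))


pl = PerfectLink()
# A stubborn link may hand over the same message any number of times
incoming = [("p1", "a"), ("p1", "a"), ("p2", "a"), ("p1", "b"), ("p1", "a")]
for src, m in incoming:
    pl.on_sp2p_deliver(src, m)
```

In this run `pl.output` contains each distinct `(src, m)` pair exactly once, in the order of first arrival. The `delivered` set is never pruned here, so memory use grows with the number of distinct messages.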
Reliable links
We shall assume reliable links (also called
perfect) throughout this course (unless
specified otherwise)
Overview
(1) Why? Motivation
(2) Where? Between the network and the
application
(3) How? (3.1) Specifications, (3.2)
assumptions, and (3.3) algorithms
3.2.1 Processes and links
3.2.2 Failure Detection
Failure detection
Perfect:
Strong Completeness: Eventually, every process that
crashes is permanently suspected by every correct process
Strong Accuracy: No process is suspected before it crashes
Eventually Perfect:
Strong Completeness
Eventual Strong Accuracy: Eventually, no correct process is
ever suspected
Failure detection
Algorithm:
(1) Processes periodically send heartbeat messages
(2) A process sets a timeout based on the worst-case
round-trip time of a message exchange
(3) A process suspects another process if its timeout
for that process expires
(4) A process that delivers a message from a suspected
process revises its suspicion and doubles its timeout
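Steps (1) to (4) can be sketched with logical time. The code below is an illustration under assumptions, not the course's algorithm text: the class `EventuallyPerfectFD`, the integer `now` parameter standing for a local clock reading, and the initial timeout of 2 ticks are all hypothetical choices.

```python
class EventuallyPerfectFD:
    """Hypothetical heartbeat-based failure detector with per-process adaptive timeouts."""

    def __init__(self, peers, initial_timeout=2):
        self.timeout = {q: initial_timeout for q in peers}
        self.last_heard = {q: 0 for q in peers}  # time of the last heartbeat from q
        self.suspected = set()

    def on_heartbeat(self, q, now):
        # Step (4): a message from a suspected process cancels the suspicion
        # and doubles the timeout used for that process
        self.last_heard[q] = now
        if q in self.suspected:
            self.suspected.discard(q)
            self.timeout[q] *= 2

    def on_tick(self, now):
        # Step (3): suspect every process whose timeout has expired
        for q, t in self.timeout.items():
            if q not in self.suspected and now - self.last_heard[q] > t:
                self.suspected.add(q)


fd = EventuallyPerfectFD(["q", "r"], initial_timeout=2)
fd.on_heartbeat("q", now=1)
fd.on_heartbeat("r", now=1)
fd.on_tick(now=4)                # both have been silent for 3 > 2 ticks
first_suspects = set(fd.suspected)
fd.on_heartbeat("q", now=5)      # q was only slow: unsuspect it, timeout becomes 4
fd.on_tick(now=8)                # q silent for 3 <= 4 ticks, so it is not suspected again
```

In this run `q` is wrongly suspected once and then trusted again with a larger timeout, while `r`, which sends nothing further, stays suspected. Doubling the timeout is what lets false suspicions eventually stop once the real message delay falls under the current timeout.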
Timing assumptions
Synchronous:
Processing: the time it takes for a process to execute
a step is bounded and known
Delays: there is a known upper bound on the
time it takes for a message to be received
Clocks: the drift between a local clock and the global
real time clock is bounded and known
Eventually Synchronous: the timing
assumptions hold eventually
Asynchronous: no assumption
Overview
Algorithmic modules of a process
[Figure: software stack of modules; adjacent layers exchange request and indication (deliver) events]
Algorithms (representation)
[Figure: message diagram with processes p1, p2, p3 and messages m1, m2, m3]
Algorithms (representation)
[Figure: message diagram with processes p1, p2, p3 and messages m1, m2; p3 crashes]
For every abstraction
Content of the course
Reliable broadcast
Causal order broadcast
Shared memory
Consensus
Total order broadcast
Atomic commit
Leader election
Terminating reliable broadcast
View synchronous broadcast