Synchronization Algorithms and Concurrent Programming
That you mention their source; after all, I would like people to use my book!
That you note that they are adapted from (or perhaps identical to) my slides, and note my copyright of this material.
Thanks and enjoy!
Gadi Taubenfeld
All material copyright 2014 Gadi Taubenfeld, All Rights Reserved
5.1 Barriers
5.2 Atomic Counter
5.3 Test-and-set Bits
5.4 Combining Tree Barrier*
5.5 A Tree-based Barrier
5.6 The Dissemination Barrier*
5.7 The See-Saw Barrier
5.8 Semaphores
5.9 Bibliographic Notes*
5.10 Problems*
[Figure: processes P1 to P4 advancing along a time axis, with their execution separated into phases by three barriers]
Synchronization Algorithms and Concurrent Programming 7
Chapter 5
Gadi Taubenfeld © 2014
Example: Parallel Prefix Sum
begin:  a  b  c  d  e  f
        a  a+b  c  d  e  f
        a  a+b  a+b+c  d  e  f
        (and so on over time)
end:    a  a+b  a+b+c  a+b+c+d  a+b+c+d+e  a+b+c+d+e+f
In parallel, with a barrier between begin and end:
begin:  a  b  c  d  e  f
barrier
end:    a  a+b  a+b+c  a+b+c+d  a+b+c+d+e  a+b+c+d+e+f
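The slides do not say which parallel algorithm computes the prefix sums. As a minimal sketch, the code below assumes one thread per element and a doubling scheme in which, in each round, thread i adds the value held d positions to its left. The function name parallel_prefix_sum is hypothetical, and Python's threading.Barrier stands in for the barrier on the slide.

```python
import threading


def parallel_prefix_sum(values):
    """Inclusive prefix sums, one thread per element (doubling scheme)."""
    n = len(values)
    x = list(values)
    if n == 0:
        return x
    barrier = threading.Barrier(n)

    def worker(i):
        d = 1
        while d < n:
            # Read phase: take the value d positions to the left, if any.
            addend = x[i - d] if i >= d else 0
            barrier.wait()   # nobody writes until everybody has read
            x[i] += addend
            barrier.wait()   # nobody reads until everybody has written
            d *= 2

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x
```

Without the two barrier calls, a thread could read a neighbour's entry after that neighbour had already updated it for the current round, which would give wrong sums.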
Example: Video
Single thread
while (true)
{
    frame = prepare_next_frame();
    frame.display();
}
Multiple threads
Barrier globalBarrier;
i = getThreadID();
while (true)
{
    frame[ i ].prepare();       // each thread prepares its own part
    globalBarrier.await();      // wait until all parts are prepared
    frame[ i ].display();
}
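The multi-threaded loop can be tried out with a small sketch. It assumes Python's threading.Barrier in place of globalBarrier and replaces prepare() and display() with entries in an event log; the function name render and the log format are hypothetical.

```python
import threading


def render(num_threads, num_frames):
    """Run the prepare / await / display loop and return the event log."""
    barrier = threading.Barrier(num_threads)
    events = []                      # (kind, frame number, thread id)
    events_lock = threading.Lock()

    def record(kind, frame, thread_id):
        with events_lock:
            events.append((kind, frame, thread_id))

    def worker(i):
        for f in range(num_frames):
            record("prepare", f, i)  # stands in for frame[i].prepare()
            barrier.wait()           # stands in for globalBarrier.await()
            record("display", f, i)  # stands in for frame[i].display()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

In the returned log, every "prepare" entry of a frame appears before any "display" entry of the same frame, which is the guarantee the barrier provides.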
Computer graphics
Garbage collection
Atomic Bit
Atomic Register
Fetch-and-increment register / atomic counter
local.go := go
local.counter := fetch-and-increment (counter)
if local.counter + 1 = n then
    counter := 0
    go := 1 - go
else await(local.go ≠ go) fi
Shared memory (SM): counter, go
Local to each process: local.go, local.counter
Counter is a “fetch-and-increment” register
Example run with n = 2, where P1 arrives first:
P1: local.counter = 0; 0+1 ≠ 2, so P1 busy-waits until go changes
P2: local.counter = 1; 1+1 = 2, so P2 resets counter to 0 and flips go, which releases P1
During the run, counter takes the values 1 and 2 and is then reset to 0
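The counter-based barrier above can be sketched in Python as follows. Python has no fetch-and-increment register, so the sketch assumes a lock that is used only to make that one operation atomic. The names CounterBarrier and run_episodes are hypothetical.

```python
import threading
import time


class CounterBarrier:
    """Reusable barrier for n threads, following the counter/go pseudocode."""

    def __init__(self, n):
        self.n = n
        self.counter = 0   # shared fetch-and-increment register
        self.go = 0        # shared bit, flipped once per episode
        self._lock = threading.Lock()  # assumption: only emulates atomicity

    def _fetch_and_increment(self):
        # Atomically return the old value and add 1.
        with self._lock:
            old = self.counter
            self.counter = old + 1
            return old

    def wait(self):
        local_go = self.go
        local_counter = self._fetch_and_increment()
        if local_counter + 1 == self.n:
            self.counter = 0          # the last arrival resets the counter
            self.go = 1 - self.go     # and then releases everybody
        else:
            while local_go == self.go:
                time.sleep(0)         # busy-wait, yielding to other threads


def run_episodes(barrier, n, episodes):
    """Return True if in every episode all n threads arrived before any left."""
    arrived = [0] * episodes
    count_lock = threading.Lock()
    violations = []

    def worker():
        for e in range(episodes):
            with count_lock:
                arrived[e] += 1
            barrier.wait()
            if arrived[e] != n:
                violations.append(e)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return not violations
```

The counter is reset before go is flipped, so no released thread can start the next episode while the counter still holds the old value.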
Is this implementation incorrect?
local.counter := fetch-and-increment(counter)
if local.counter + 1 = n then
    counter := 0
else await(counter = 0) fi
local.go := go[i]
local.counter := fetch-and-increment (counter)
if local.counter + 1 = n then
    counter := 0
    for j=1 to n do go[j] := 1 - go[j] od
else await(local.go ≠ go[i]) fi
Example run with n = 3, where P3 arrives last:
P1: local.counter = 0; 0+1 ≠ 3, so P1 busy-waits on go[1]
P2: local.counter = 1; 1+1 ≠ 3, so P2 busy-waits on go[2]
P3: local.counter = 2; 2+1 = 3, so P3 resets counter to 0 and flips every go[j], which releases P1 and P2
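A sketch of the go[i] variant, under the same assumption as before that a lock emulates the atomic fetch-and-increment. Processes are numbered from 0 here rather than from 1, and the names LocalSpinBarrier and run_with_ids are hypothetical. Python threads share one memory, so the sketch shows only the logic of spinning on a private bit, not the locality benefit.

```python
import threading
import time


class LocalSpinBarrier:
    """Counter barrier in which process i spins only on its own bit go[i]."""

    def __init__(self, n):
        self.n = n
        self.counter = 0
        self.go = [0] * n
        self._lock = threading.Lock()  # assumption: emulates fetch-and-increment

    def wait(self, i):
        local_go = self.go[i]
        with self._lock:               # fetch-and-increment(counter)
            local_counter = self.counter
            self.counter += 1
        if local_counter + 1 == self.n:
            self.counter = 0
            for j in range(self.n):    # flip every go bit, including go[i]
                self.go[j] = 1 - self.go[j]
        else:
            while local_go == self.go[i]:
                time.sleep(0)          # busy-wait on the process's own bit


def run_with_ids(barrier, n, episodes):
    """Return True if no thread ever left an episode before all n arrived."""
    arrived = [0] * episodes
    count_lock = threading.Lock()
    violations = []

    def worker(i):
        for e in range(episodes):
            with count_lock:
                arrived[e] += 1
            barrier.wait(i)
            if arrived[e] != n:
                violations.append((i, e))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return not violations
```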
Input: bit b
Test-and-set is an atomic operation:
b is set to 1
the old value of b (i.e., 0 or 1) is returned
Operations supported:
Test-and-set (like a test-and-set bit)
Reset
local.go := go
if test-and-set(leader) = 0 then              // the leader
    local.counter := 0
    repeat
        await(countflag = 1)                  // a test operation
        local.counter := local.counter + 1
        reset(countflag)
    until (local.counter = n - 1)
    reset(leader)
    go := 1 - go
else                                          // the other processes
    await(test-and-set(countflag) = 0)
    await(local.go ≠ go)
fi
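The leader-based barrier above can be sketched in Python. The sketch assumes a small class, here called TasBit (a hypothetical name), that emulates a bit with test, test-and-set, and reset by means of a lock. The repeat/until loop is written as a while loop, which behaves the same for n ≥ 2.

```python
import threading
import time


class TasBit:
    """Bit supporting test, test-and-set and reset (emulated with a lock)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def test_and_set(self):
        with self._lock:
            old = self._value
            self._value = 1
            return old

    def test(self):
        return self._value

    def reset(self):
        with self._lock:
            self._value = 0


class TestAndSetBarrier:
    def __init__(self, n):
        self.n = n
        self.leader = TasBit()
        self.countflag = TasBit()
        self.go = 0

    def wait(self):
        local_go = self.go
        if self.leader.test_and_set() == 0:          # the leader
            local_counter = 0
            while local_counter < self.n - 1:
                while self.countflag.test() != 1:    # a test operation
                    time.sleep(0)
                local_counter += 1
                self.countflag.reset()
            self.leader.reset()
            self.go = 1 - self.go
        else:                                        # the other processes
            while self.countflag.test_and_set() != 0:
                time.sleep(0)
            while local_go == self.go:
                time.sleep(0)


def run_episodes(barrier, n, episodes):
    """Return True if in every episode all n threads arrived before any left."""
    arrived = [0] * episodes
    count_lock = threading.Lock()
    violations = []

    def worker():
        for e in range(episodes):
            with count_lock:
                arrived[e] += 1
            barrier.wait()
            if arrived[e] != n:
                violations.append(e)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return not violations
```

Each non-leader sets countflag once per episode, and the leader counts and clears it n - 1 times, so countflag works as a one-slot mailbox between the arriving processes and the leader.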
2. Asymmetry
Process has a role according to its index i
Pros: saves bits and operations
Cons: different processes differ in their tasks
Pros:
Shared memory: only bits, O(1) space (as opposed to the counter-based barrier, which requires O(log n) bits)
Does not require memory initialization (in the second version)
Cons:
Asymmetric (in the second version)
Still high contention on the countflag and go bits
A Tree-based Barrier
[Figure: the processes are arranged as a binary tree with nodes numbered 1 to 15; node 1 is the root, and the nodes below node i are 2i and 2i+1. Two arrays of shared bits, arrive and go, have one entry for each of the nodes 2 to 15. Annotations in the figure: "Waiting for p4 to arrive", "Waiting for go[3]", "go[2]".]
Pros:
Low shared memory contention
No bit is shared by more than 2 processes
Good for larger n
Fast (in comparison to local spinning)
– information from the root propagates after log(n) steps
Uses only atomic bits (no special objects)
On some models:
each process spins on a locally accessible bit
the number of remote memory references is O(1) per process
Cons:
Shared memory space complexity: O(n)
Asymmetric: not all the processes do the same amount of work (*)
(*) There is a similar barrier which is symmetric, but at the cost of more shared memory consumption: O(n log n) as opposed to O(n). See the Dissemination Barrier from Section 5.6, page 213.
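The slides show the tree barrier only as a figure, so the following is a sketch of one way to realise it, not necessarily the exact algorithm of Section 5.5. It assumes n = 2^k - 1 processes numbered 1 to n, with process i waiting for processes 2i and 2i+1, and one arrive bit and one go bit for every process except the root. The names TreeBarrier and run_tree are hypothetical.

```python
import threading
import time


class TreeBarrier:
    """Tree barrier built from plain bits; processes are numbered 1..n."""

    def __init__(self, n):
        assert n >= 1 and (n + 1) & n == 0, "sketch assumes n = 2^k - 1"
        self.n = n
        self.arrive = [0] * (n + 1)   # entries 2..n are used
        self.go = [0] * (n + 1)       # entries 2..n are used

    def wait(self, i):
        left, right = 2 * i, 2 * i + 1
        has_children = left <= self.n
        if has_children:              # wait for both subtrees to arrive
            for child in (left, right):
                while self.arrive[child] == 0:
                    time.sleep(0)
                self.arrive[child] = 0
        if i != 1:                    # report upwards, then wait for release
            self.arrive[i] = 1
            while self.go[i] == 0:
                time.sleep(0)
            self.go[i] = 0
        if has_children:              # pass the release down the tree
            self.go[left] = 1
            self.go[right] = 1


def run_tree(n, episodes):
    """Return True if no process ever left an episode before all n arrived."""
    barrier = TreeBarrier(n)
    arrived = [0] * episodes
    count_lock = threading.Lock()
    violations = []

    def worker(i):
        for e in range(episodes):
            with count_lock:
                arrived[e] += 1
            barrier.wait(i)
            if arrived[e] != n:
                violations.append((i, e))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(1, n + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return not violations
```

In this sketch every bit is written and read by exactly two processes, a node and its parent, which matches the "no bit is shared by more than 2 processes" property listed above.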
[Figure: the see-saw. Each process is in one of four states: never been on, on the left side, on the right side, or got off. T is the number of tokens the process currently holds.]
Rule #1
Applicable if:
scheduled process is “never-been-on”
Operation:
Saves the go bit locally
Gets on the up side, and swings the see-saw
Shared read-modify-write (RMW) register with two fields:
Token-state: no-token-present or token-present
See-saw-state: left-side-down or right-side-down
Rule #2 – Emitter
Applicable if:
scheduled process is “down-side”, has tokens, and token-state = no-token-present
Operation:
Deposits one token in the shared token-state
If it remains without tokens, it gets off the see-saw and swings it
The process that got off now awaits the go flag
Rule #3 – Absorber
Applicable if:
scheduled process is “up-side”, and token-state = token-present
Operation:
Takes the token from token-state
Rule #4 – Leader
Applicable if:
scheduled process is on the see-saw, and sees at least 2n tokens
Operation:
Gets off the see-saw, and flips the shared go bit
Rule #5 – End of the Algorithm
Applicable if:
scheduled process notices that the go bit has been flipped (relative to its local.go)
Operation:
Everybody has arrived; continue past the barrier
Important Invariants
Token Invariant
During a single episode of the see-saw barrier, the number of tokens in the system
is either 2n or 2n+1 (like in the test-and-set barrier)
never changes
Balance Invariant
During a single episode of the see-saw barrier, the number of processes on the left and on the right side of the see-saw is
either perfectly balanced
or favors the down side by 1
Pros:
O(1) shared memory space complexity
No need to initialize shared memory
Symmetric
Cons:
Uses custom Read-Modify-Write register
High memory contention on the RMW bits
Worst case O(n^2) total shared memory references
Complex
down(arrival)
counter := counter + 1        // atomic register
if counter < n then up(arrival) else up(departure) fi
down(departure)
counter := counter - 1
if counter > 0 then up(departure) else up(arrival) fi
Question:
Would this barrier be correct if the shared counter were not an atomic register?
Barrier using Semaphores
Properties
Pros:
Very Simple
Space complexity O(1)
Symmetric
Cons:
Requires a strong object
Requires some central manager
High contention on the semaphores if there is no central manager
Propagation delay O(n)
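The semaphore-based barrier can be sketched with Python's threading.Semaphore. The slides do not show the initial values; the sketch assumes arrival starts at 1 and departure at 0, which is what makes the first down(arrival) succeed. The names SemaphoreBarrier and run_episodes are hypothetical.

```python
import threading


class SemaphoreBarrier:
    """Reusable barrier built from two semaphores and a shared counter."""

    def __init__(self, n):
        self.n = n
        self.counter = 0
        self.arrival = threading.Semaphore(1)    # assumed initial value: 1
        self.departure = threading.Semaphore(0)  # assumed initial value: 0

    def wait(self):
        self.arrival.acquire()                   # down(arrival)
        self.counter += 1
        if self.counter < self.n:
            self.arrival.release()               # let the next one arrive
        else:
            self.departure.release()             # last arrival opens the exit
        self.departure.acquire()                 # down(departure)
        self.counter -= 1
        if self.counter > 0:
            self.departure.release()             # let the next one depart
        else:
            self.arrival.release()               # last departure reopens entry


def run_episodes(barrier, n, episodes):
    """Return True if in every episode all n threads arrived before any left."""
    arrived = [0] * episodes
    count_lock = threading.Lock()
    violations = []

    def worker():
        for e in range(episodes):
            with count_lock:
                arrived[e] += 1
            barrier.wait()
            if arrived[e] != n:
                violations.append(e)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return not violations
```

At any moment only the thread that has just passed a down operation touches the counter, and the release is handed from one thread to the next, which is where the O(n) propagation delay listed above comes from.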
Simple barrier
Based on atomic fetch-and-increment counter
Tree-based barrier
See-Saw barrier
Semaphore-based barrier