06 Consensus
Systems
CPSC 5520 Consensus
Kevin Lundeen
Consensus Protocols
• Consensus
• “Where everyone agrees”
• Used to describe distributed systems behavior on replication, especially in
the face of failures
• Especially state machine replication, i.e., log replication
• Log Replication
• We used a logical clock last week to implement an algorithm that successfully
replicated logs
• Intolerant of failures
• Intolerant of dynamic addition/removal of nodes
• Requires reliable ordered messaging
• O(n²) messages per log write
• We’ll study Raft next to overcome most of these issues (textbook has Paxos)
• Even more robust (handling Byzantine failures) is PBFT
Raft
- Diego Ongaro
• Raft cluster
• small number of servers (five is typical)
• can tolerate a minority of servers failing simultaneously (two can fail
simultaneously in five-node Raft cluster)
• Each server is in one of three states:
1. Leader – sole leader that handles all client communications
2. Follower – most servers merely respond to RPCs from leader and
candidates
3. Candidate – a server that noticed absence of leader and is trying to elect
itself to be leader
• Each leader leads for a term
• Term numbers are incremented at each new election
• Once elected, a candidate becomes the sole leader for that term
• All entries for a term are initiated by its leader
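The fault-tolerance figure for a Raft cluster (two simultaneous failures in a five-node cluster) follows from majority arithmetic. A minimal sketch, assuming the cluster only needs a strict majority of servers to stay up; the function names `majority` and `max_failures` are illustrative and not part of Raft itself:

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return cluster_size // 2 + 1


def max_failures(cluster_size: int) -> int:
    """Largest number of simultaneous failures that still leaves a majority."""
    return cluster_size - majority(cluster_size)


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(f"n={n}: majority={majority(n)}, tolerates {max_failures(n)} failure(s)")
```

Note that a four-node cluster tolerates only one failure, the same as a three-node cluster, which is one reason odd cluster sizes are the usual choice.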
Raft Basics §5.1
(continued)
• Elections can result in a split vote
• Term ends with no leader (and no entries in log)
• Another election ensues
• Terms act as a logical clock
• Current term is communicated in all RPCs
• if one server’s current term is smaller than the other’s, then it updates its
current term to the larger value
• if a candidate or leader discovers that its term is out of date, it immediately
reverts to follower state
• if a server receives a request with a stale term number, it rejects the request
• Communication is via RPCs
• RequestVote – candidate to all others
• AppendEntries – leader to all followers
• Failed RPCs are retried
• RPCs done in parallel for best performance
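The three term rules above (adopt a larger term, step down if out of date, reject stale requests) can be sketched as a single check applied to every incoming RPC. This is a minimal sketch under stated assumptions: the `Server` class, the `Role` enum, and the `observe_term` method are hypothetical names for illustration, and networking, voting, and the log are left out.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


@dataclass
class Server:
    current_term: int = 0
    role: Role = Role.FOLLOWER

    def observe_term(self, rpc_term: int) -> bool:
        """Apply the term rules to the term carried by an incoming RPC.

        Returns False when the message is stale and should be rejected.
        """
        if rpc_term < self.current_term:
            # Stale term: reject, leave our own state untouched.
            return False
        if rpc_term > self.current_term:
            # Our term is out of date: adopt the larger term, and a
            # candidate or leader reverts to follower.
            self.current_term = rpc_term
            self.role = Role.FOLLOWER
        return True
```

For example, a leader at term 3 that receives an RPC carrying term 5 moves to term 5 and becomes a follower; if it then receives an RPC carrying term 4, it rejects the message and keeps term 5.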
Leader Election §5.2
§5.4.1
Log A is more up-to-date than log B iff:
1. The term of A’s last log entry is greater than the term of B’s last log entry, or
2. The last entries of A and B have the same term, but A has a longer log than B
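The comparison can be sketched in a few lines. Note that §5.4.1 of the Raft paper defines it on the term of the last log entry, with log length as the tie-breaker, and this sketch follows that definition. As an assumption for illustration, each log is represented as a list holding only the term of each entry, and an empty log is treated as having last term 0; `log_key` and `is_more_up_to_date` are hypothetical helper names.

```python
from typing import List, Tuple


def log_key(log: List[int]) -> Tuple[int, int]:
    """Return (term of last entry, log length); an empty log has last term 0."""
    last_term = log[-1] if log else 0
    return (last_term, len(log))


def is_more_up_to_date(a: List[int], b: List[int]) -> bool:
    """True iff log a is strictly more up-to-date than log b.

    Tuples compare element by element, so the last term decides first
    and the length only breaks ties.
    """
    return log_key(a) > log_key(b)
```

For example, a log with entry terms `[1, 1, 2]` is more up-to-date than `[1, 1, 1, 1]` even though it is shorter, because its last entry has the later term.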
Log Replication §5.3