System Recovery
System failure:
System does not meet requirements, i.e., does not perform its
services as specified
Fault:
Classification of failures
Process failure:
Behaviour: process causes system state to deviate
etc.
Recovery/Design strategies:
Reconstruct content from archive + log of
activities
Design mirrored disk system
Communication Medium Failure:
Behaviour: a site cannot communicate with
another operational site
Errors/Faults: failure of switching nodes or
communication links
Recovery/Design Strategies: reroute, error-resistant communication protocols
error-free state
Approaches to failure recovery:
Forward-error recovery:
Remove errors in process/system state (if errors
can be completely assessed)
Continue process/system forward execution
Backward-error recovery:
Restore process/system to previous error-free
state and restart from there
Main memory (MM): an object is brought to MM to be accessed
Stable storage: storage that maintains information in the event of
system failure
Approaches:
(1.a) Updating-in-place
Principle: every update (write) operation to an object
creates a log in stable storage that can be used to
undo and redo the operation
Log content: object name, old object state, new
object state
Implementation of a recoverable update operation:
Do operation: update object and write log record
Undo operation: log(old) -> object (undoes the
action performed by a do)
Redo operation: log(new) -> object (redoes the
action performed by a do)
Display operation: display log record (optional)
Problem: a do cannot be recovered if the system crashes after
the object is written but before the log record is written
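The do, undo, redo and display operations above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names `LogRecord` and `RecoverableStore` are hypothetical, and an in-memory list stands in for the log on stable storage.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    # Log content as listed above: object name, old state, new state.
    name: str
    old: object
    new: object


@dataclass
class RecoverableStore:
    objects: dict = field(default_factory=dict)  # the objects being updated
    log: list = field(default_factory=list)      # stand-in for the stable-storage log

    def do(self, name, new_value):
        record = LogRecord(name, self.objects.get(name), new_value)
        self.objects[name] = new_value  # update the object in place
        self.log.append(record)         # then write the log record
        return record

    def undo(self, record):
        self.objects[record.name] = record.old  # log(old) -> object

    def redo(self, record):
        self.objects[record.name] = record.new  # log(new) -> object

    def display(self, record):
        return f"{record.name}: {record.old!r} -> {record.new!r}"


store = RecoverableStore({"x": 1})
rec = store.do("x", 2)
store.undo(rec)
after_undo = store.objects["x"]
store.redo(rec)
after_redo = store.objects["x"]
```

Note that `do` writes the object before the log record, which is exactly the ordering that makes a crash between the two writes unrecoverable.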
(1.b) The write-ahead log protocol
Principle: write log record before updating object
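A small sketch of the write-ahead ordering, under stated assumptions: `WriteAheadStore` and the `crash_before_object_write` flag are hypothetical names used only to simulate a crash between the two writes, and a Python list stands in for stable storage.

```python
class WriteAheadStore:
    def __init__(self):
        self.stable_log = []  # assumed to survive a crash
        self.objects = {}     # may miss the last update after a crash

    def update(self, name, new_value, crash_before_object_write=False):
        # Write the log record first (object name, old state, new state).
        self.stable_log.append((name, self.objects.get(name), new_value))
        if crash_before_object_write:
            return  # simulated crash: the object itself was never written
        self.objects[name] = new_value

    def recover(self):
        # Redo every logged update; redoing is safe to repeat because each
        # record carries the complete new state of the object.
        for name, _old, new in self.stable_log:
            self.objects[name] = new


wal = WriteAheadStore()
wal.update("x", 1)
wal.update("x", 2, crash_before_object_write=True)
value_at_crash = wal.objects["x"]
wal.recover()
value_after_recovery = wal.objects["x"]
```

Because the log record always exists before the object changes, the update that was interrupted can still be redone from the log.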
Shadow pages approach
Only a part of the system state is saved, to
minimize recovery overhead
When an object is modified, page containing
object is first copied on stable storage
(shadow page)
If process successfully commits: shadow
page discarded and modified page is made
part of the database
If process fails: shadow page used and the
modified page discarded
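The shadow page rule above can be illustrated with a short sketch. The class `ShadowPagedDatabase` and its method names are hypothetical, and dictionaries stand in for the database pages and for stable storage.

```python
class ShadowPagedDatabase:
    def __init__(self, pages):
        self.pages = dict(pages)  # page id -> current content
        self.shadow = {}          # shadow copies (stand-in for stable storage)

    def modify(self, page_id, content):
        # Copy the page before its first modification (the shadow page).
        if page_id not in self.shadow:
            self.shadow[page_id] = self.pages[page_id]
        self.pages[page_id] = content

    def commit(self):
        # Success: shadow pages are discarded, modified pages stay.
        self.shadow.clear()

    def abort(self):
        # Failure: shadow pages are used, modified pages are discarded.
        self.pages.update(self.shadow)
        self.shadow.clear()


db = ShadowPagedDatabase({"p1": "A"})
db.modify("p1", "B")
db.abort()
after_abort = db.pages["p1"]
db.modify("p1", "C")
db.commit()
after_commit = db.pages["p1"]
```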
Domino Effect
[Figure: checkpoints x1, x2, x3, y1, y2, z1, z2 and message m on a time axis]
Lost messages
[Figure: checkpoints x1 and y1, message m, and a failure on a time axis]
Problem of livelock
[Figure, panels (a) and (b): processes X and Y with checkpoints x1 and y1, messages m1, n1, m2, n2, and a failure on a time axis]
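Consistency of Checkpoint
1. Strongly consistent set of checkpoints
2. Consistent set of checkpoints
[Figure: checkpoints x2, y1, y2, z1, z2, with one set labeled strongly consistent and one set labeled consistent]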
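The two notions can be checked mechanically. The sketch below uses the usual textbook definitions, which are an assumption here because the slide text does not spell them out: a set of checkpoints is consistent if every message recorded as received is also recorded as sent, and strongly consistent if, in addition, no recorded message is still in transit. The dictionary layout is hypothetical.

```python
def _all_messages(checkpoints, key):
    # Union of the message ids recorded under `key` in every checkpoint.
    return set().union(*(c[key] for c in checkpoints.values()))


def is_consistent(checkpoints):
    # No orphan message: everything recorded as received was recorded as sent.
    return _all_messages(checkpoints, "received") <= _all_messages(checkpoints, "sent")


def is_strongly_consistent(checkpoints):
    # Additionally, no message was sent but not yet received.
    return _all_messages(checkpoints, "received") == _all_messages(checkpoints, "sent")


in_transit = {
    "X": {"sent": {"m"}, "received": set()},
    "Y": {"sent": set(), "received": set()},
}
orphan = {
    "X": {"sent": set(), "received": set()},
    "Y": {"sent": set(), "received": {"m"}},
}
```

With these definitions, `in_transit` is consistent but not strongly consistent, while `orphan` is not consistent at all.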
Checkpoint/Recovery Algorithm
Synchronous
Asynchronous
Preliminary (Assumption)
Goal
Synchronous Checkpointing
Assumptions
tentative checkpoint :
a temporary checkpoint
a candidate for permanent checkpoint
permanent checkpoint :
Checkpoint Algorithm
Supplement
Once a process has taken a tentative checkpoint, it shouldn't
send messages until it is informed of the initiator's decision.
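A rough sketch of the tentative/permanent scheme, assuming a two-phase exchange in which the initiator commits only if every process managed to take a tentative checkpoint. The class `Process`, the `can_checkpoint` flag and the function `synchronous_checkpoint` are hypothetical; message passing is replaced by direct method calls.

```python
class Process:
    def __init__(self, name, can_checkpoint=True):
        self.name = name
        self.can_checkpoint = can_checkpoint  # hypothetical: lets a process refuse
        self.state = 0
        self.tentative = None
        self.permanent = None
        self.sending_blocked = False

    def take_tentative(self):
        if not self.can_checkpoint:
            return False
        self.tentative = self.state
        self.sending_blocked = True  # no sends until the decision arrives
        return True

    def receive_decision(self, commit):
        if commit and self.tentative is not None:
            self.permanent = self.tentative  # tentative becomes permanent
        self.tentative = None                # otherwise it is discarded
        self.sending_blocked = False


def synchronous_checkpoint(initiator, others):
    everyone = [initiator] + others
    # Phase 1: request tentative checkpoints and collect the replies.
    replies = [p.take_tentative() for p in everyone]
    commit = all(replies)
    # Phase 2: propagate the initiator's decision.
    for p in everyone:
        p.receive_decision(commit)
    return commit


x, y, z = Process("X"), Process("Y"), Process("Z")
x.state, y.state, z.state = 5, 7, 9
committed = synchronous_checkpoint(x, [y, z])
```

If any process cannot take its tentative checkpoint, all tentative checkpoints are dropped and the previous permanent checkpoints remain in force.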
Diagram of Checkpoint Algorithm
[Figure labels: initiator; request to take a tentative checkpoint; tentative checkpoint; OK; decide to commit; permanent checkpoint; consistent global checkpoint; unnecessary checkpoint]
Optimized Algorithm
Each message is labeled by order of sending
[Figure: processes X and Y with checkpoints x2, x3, y1, y2]
Labeling Scheme
⊥ : smallest label
⊤ : largest label
last_label_rcvdX[Y] :
the label of the last message that X received from Y after X took its
last permanent or tentative checkpoint. If no such message exists, it is ⊥.
first_label_sentX[Y] :
the label of the first message that X sent to Y after X took its last
permanent or tentative checkpoint. If no such message exists, it is ⊥.
ckpt_cohortX :
the set of all processes that may have to take checkpoints
when X decides to take a checkpoint.
Checkpoint requests need to be sent only to the processes
included in ckpt_cohortX
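The labels make it possible to decide who actually needs to checkpoint. The sketch below is an assumption-laden illustration: it represents the smallest label as the integer 0 (real labels start at 1), builds the cohort from the processes X has received a message from since its last checkpoint, and uses the condition commonly given for this labeling scheme, which the slide text itself does not state. Both function names are hypothetical.

```python
BOTTOM = 0  # stand-in for the smallest label; real message labels start at 1


def ckpt_cohort(last_label_rcvd):
    # Processes X has heard from since its last checkpoint
    # (assumed definition: last_label_rcvdX[Y] is above the smallest label).
    return {sender for sender, label in last_label_rcvd.items() if label > BOTTOM}


def must_take_checkpoint(last_label_rcvd_x_from_y, first_label_sent_y_to_x):
    # Assumed condition: Y checkpoints only if X received a message that Y
    # sent after Y's own last checkpoint.
    return last_label_rcvd_x_from_y >= first_label_sent_y_to_x > BOTTOM


cohort = ckpt_cohort({"Y": 3, "Z": BOTTOM})
```

Here only Y is in the cohort, so Z is spared an unnecessary checkpoint request.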
first_label_sentY[X]
Correctness
Recovery Algorithm
Synchronous Recovery
Labeling Scheme
⊥ : smallest label
⊤ : largest label
last_label_rcvdX[Y] :
the label of the last message that X received from Y after X took its
last permanent or tentative checkpoint. If no such message exists, it is ⊥.
first_label_sentX[Y] :
the label of the first message that X sent to Y after X took its last
permanent or tentative checkpoint. If no such message exists, it is ⊥.
roll_cohortX :
the set of all processes that may have to roll back to the
latest checkpoint when process X rolls back.
last_label_sentX[Y] :
the label of the last message that X sent to Y before X took its latest
permanent checkpoint. If no such message exists, it is ⊤.
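A minimal sketch of how these labels can avoid unnecessary rollbacks. The comparison used is the one commonly given for this labeling scheme and is an assumption here, since the slide text does not state it; `must_roll_back` is a hypothetical name, and `math.inf` stands in for the largest label.

```python
import math

TOP = math.inf  # stand-in for the largest label


def must_roll_back(last_label_rcvd_y_from_x, last_label_sent_x_to_y):
    # Assumed condition: Y rolls back only if it received a message that X
    # sent after X's latest permanent checkpoint, i.e. a message whose
    # sending is undone when X rolls back.
    return last_label_rcvd_y_from_x > last_label_sent_x_to_y


needs_rollback = must_roll_back(5, 3)
no_rollback = must_roll_back(3, 3)
```

When X sent nothing to Y before its checkpoint, the largest label on X's side makes the comparison fail, so Y is never asked to roll back on that basis.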
Diagram of Synchronous Recovery
[Figure: processes X and Y with checkpoints x1, x2, y1, y2, z1, z2, a failure, and a rollback marked as unnecessary]
Asynchronous Checkpoint
Characteristic
Preliminary (Assumptions)
Asynchronous Checkpoint / Recovery
Goal
To find the latest consistent set of checkpoints
Assumptions
volatile log :
quick access
lost if the corresponding processor fails
stable log :
slow access
not lost even if processors fail
Preliminary (Definition)
Definition
CkPti : the checkpoint (stable log) that processor i rolls back to
when a failure occurs
RCVDij (CkPti / e ) :
the number of messages received by processor i from
processor j, per the information stored in the checkpoint
CkPti or event e.
SENTij(CkPti / e ) :
the number of messages sent by processor i to processor
j, per the information stored in the checkpoint CkPti or
event e
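These counters allow a simple consistency test: processor i has an orphan message from j whenever RCVDij exceeds SENTji. The sketch below is a centralized simulation under stated assumptions, not the distributed algorithm itself: `find_recovery_line` and the log layout are hypothetical, counts are cumulative per event, and event 0 of every processor is assumed to record zero received messages so that rolling back always terminates.

```python
def find_recovery_line(logs, start):
    """Return, per processor, the index of the event it ends up restarting from.

    logs[p] is a list of events; each event is a dict with cumulative
    counters {"sent": {q: n}, "rcvd": {q: n}}. start[p] is the event index
    processor p initially restarts from.
    """
    line = dict(start)
    changed = True
    while changed:
        changed = False
        for i in logs:
            for j in logs:
                if i == j:
                    continue
                sent = logs[j][line[j]]["sent"].get(i, 0)
                # Roll i back while it has received more from j than j has sent.
                while logs[i][line[i]]["rcvd"].get(j, 0) > sent:
                    line[i] -= 1
                    changed = True
    return line


logs = {
    "X": [{"sent": {}, "rcvd": {}},
          {"sent": {"Y": 1}, "rcvd": {}},
          {"sent": {"Y": 2}, "rcvd": {}}],
    "Y": [{"sent": {}, "rcvd": {}},
          {"sent": {}, "rcvd": {"X": 1}},
          {"sent": {}, "rcvd": {"X": 2}}],
}
# X fails and restarts from event 1; Y is still at event 2.
line = find_recovery_line(logs, {"X": 1, "Y": 2})
```

In this example Y has received two messages from X while X, after its rollback, has only sent one, so Y rolls back one event as well.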
Recovery Algorithm
Asynchronous Recovery
[Figure: processors X, Y, Z with events Ex0 to Ex3, Ey0 to Ey3, Ez0 to Ez2, checkpoints x1, y1, z1, and message labels (Y,2), (X,0), (Y,1), (X,2), (Z,0), (Z,1)]
Comparisons shown in the figure:
X:Y 3 <= 2
Y:X 1 <= 2
X:Z 0 <= 0
Y:Z 1 <= 1
Z:X 0 <= 0
Z:Y 2 <= 1