Unit 4_Deadlock Handling & Recovery Techniques & Failure Classification

Uploaded by

omvati343


Distributed Deadlock & Recovery

Outline
• Deadlock in Distributed Systems
• Recovery in DBMS
• Advanced recovery techniques
– Shadow Paging
– Fuzzy checkpoint
– ARIES
– RAID levels
• Two Phase and Three Phase commit protocols

Deadlock
• A deadlock can occur because transactions wait for
one another. Informally, a deadlock situation is a set
of requests that can never be granted by the
concurrency control mechanism.
• A deadlock can be indicated by a cycle in the wait-
for-graph (WFG).
• In computer science, deadlock refers to a specific
condition when two or more processes are each
waiting for another to release a resource, or more
than two processes are waiting for resources in a
circular chain.
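
A minimal sketch of the wait-for-graph idea, assuming the graph is given as a dictionary that maps each transaction to the transactions it is waiting for. The function name `find_cycle` and the sample transaction names are illustrative only, not part of the original slides.

```python
from typing import Dict, List, Optional


def find_cycle(wfg: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of the wait-for graph as a list of transactions, or None."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    colour = {t: WHITE for t in wfg}
    for targets in wfg.values():
        for t in targets:
            colour.setdefault(t, WHITE)   # include nodes that only appear as targets
    path: List[str] = []

    def visit(t: str) -> Optional[List[str]]:
        colour[t] = GREY
        path.append(t)
        for u in wfg.get(t, []):
            if colour[u] == GREY:         # back edge: u is already on the path
                return path[path.index(u):]
            if colour[u] == WHITE:
                found = visit(u)
                if found is not None:
                    return found
        path.pop()
        colour[t] = BLACK
        return None

    for t in list(colour):
        if colour[t] == WHITE:
            cycle = visit(t)
            if cycle is not None:
                return cycle
    return None


# Hypothetical graphs: an edge T1 -> T2 means "T1 waits for T2".
deadlocked = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
safe = {"T1": ["T2"], "T2": ["T3"], "T3": []}
```

A returned list means the transactions in it are deadlocked; `None` means no cycle was found.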

Necessary Conditions for Deadlock
• Mutual exclusion: A resource may be acquired
exclusively by only one process at a time.
• Hold and wait: Processes currently holding resources
that were granted earlier can request new resources.
• No preemption: Once a process has obtained a
resource, the system cannot remove it from the
process's control until the process has finished using
the resource.
• Circular wait: A circular chain of hold-and-wait
conditions exists in the system.

Deadlock Detection
• Deadlock detection is the process of actually
determining that a deadlock exists and identifying
the processes and resources involved in the
deadlock.
• Detection of a cycle in the WFG proceeds concurrently
with normal operation of the system.

Deadlock Prevention
• The deadlock prevention approach does not
allow any transaction to acquire locks that will
lead to deadlocks. The convention is that
when more than one transaction requests a
lock on the same data item, only one of them
is granted the lock.
• One of the most popular deadlock prevention
methods is pre-acquisition of all the locks.
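
Pre-acquisition can be sketched as an all-or-nothing lock request, assuming a single in-memory lock table. The class `LockTable` and its method names are hypothetical and only illustrate the idea: a transaction that cannot get every lock it needs gets none, so it never holds some locks while waiting for others.

```python
from typing import Dict, Iterable


class LockTable:
    """Toy exclusive-lock table: data item -> owning transaction (an assumption)."""

    def __init__(self) -> None:
        self.owner: Dict[str, str] = {}

    def acquire_all(self, txn: str, items: Iterable[str]) -> bool:
        """Grant every requested lock, or none of them."""
        wanted = set(items)
        # Refuse the whole request if any item is held by another transaction.
        if any(self.owner.get(item, txn) != txn for item in wanted):
            return False
        for item in wanted:
            self.owner[item] = txn
        return True

    def release_all(self, txn: str) -> None:
        """Release every lock held by txn (for example at commit or abort)."""
        self.owner = {i: t for i, t in self.owner.items() if t != txn}
```

Because no transaction holds a lock while requesting another, the hold-and-wait condition cannot arise in this sketch.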

Deadlock Avoidance
• The deadlock avoidance approach handles deadlocks before they
occur. It analyzes the transactions and the locks to determine
whether or not waiting leads to a deadlock.
• There are two algorithms for this purpose, namely wait-die and
wound-wait.
• Let us assume that there are two transactions, T1 and T2, where T1
tries to lock a data item which is already locked by T2. The
algorithms are as follows −
• Wait-Die − If T1 is older than T2, T1 is allowed to wait. Otherwise, if
T1 is younger than T2, T1 is aborted and later restarted.
• Wound-Wait − If T1 is older than T2, T2 is aborted and later
restarted. Otherwise, if T1 is younger than T2, T1 is allowed to wait.
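
The two rules can be written as small decision functions. This is a sketch under the assumption that each transaction carries a start timestamp and that a smaller timestamp means an older transaction; the function names and return strings are illustrative.

```python
def wait_die(requester_ts: int, holder_ts: int) -> str:
    """Wait-Die: an older requester waits, a younger requester is aborted."""
    if requester_ts < holder_ts:          # requester is older
        return "requester waits"
    return "requester aborts"


def wound_wait(requester_ts: int, holder_ts: int) -> str:
    """Wound-Wait: an older requester aborts the holder, a younger one waits."""
    if requester_ts < holder_ts:          # requester is older
        return "holder aborts"
    return "requester waits"
```

With T1 as the requester and T2 as the holder, `wait_die(1, 2)` corresponds to an older T1 waiting, and `wound_wait(1, 2)` to an older T1 aborting T2.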

Recovery in DBMS
• Recovery refers to restoring a system to its normal operational
state.
• Once a failure has occurred, it is essential that the process
where the failure happened can recover to a correct state.
• Possible steps in process recovery:
1. Reclaim the resources allocated to the process.
2. Undo the modifications made to the database.
3. Restart the process.
4. Or restart the process from the point of failure and resume
execution.

Concept of Recovery
[Figure: diagram relating Fault, Error, Valid state, and Erroneous state]
• System failure: The system does not meet its requirements.
• Erroneous system state: A state which could lead to a system
failure by a sequence of valid state transitions.
• Error: The part of the system state which differs from its
intended value.
• Fault: An abnormal physical condition, e.g. design errors,
manufacturing problems, damage, external disturbances.

Classification of Failure
• In a distributed database system, we need to deal with four
types of failures: transaction failures (aborts), site (system)
failures, media (disk) failures, and communication line
failures.
• Some of these are due to hardware and others are due to
software.
• Software failures are typically caused by “bugs” in the code.
Most software failures are soft failures.

1 Transaction Failures
• Transactions can fail for a number of reasons. Failure can be
due to an error in the transaction caused by incorrect input
data, as well as to the detection of a present or potential
deadlock.
• Some concurrency control algorithms do not permit a
transaction to proceed, or even to wait, if the data that it
attempts to access is currently being accessed by another
transaction. This might also be considered a failure.

2 Site (System) Failures
• A system failure is always assumed to result in the loss of
main memory contents. Therefore, any part of the database
that was in main memory buffers is lost as a result of a system
failure.
• In distributed database terminology, system failures are
typically referred to as site failures, since they result in the
failed site being unreachable from other sites in the
distributed system.
• Total failure refers to the simultaneous failure of all sites in
the distributed system.
• Partial failure indicates the failure of only some sites while
the others remain operational.

3 Media Failures
• Media failure refers to the failures of the secondary storage
devices that store the database.
• Such failures may be due to operating system errors, as well
as to hardware faults such as head crashes or controller
failures.
• The important point from the perspective of DBMS reliability
is that all or part of the database that is on the secondary
storage is considered to be destroyed and inaccessible.

4 Communication Failures
• Unlike the three types above, communication failures are unique to the
distributed case.
• There are a number of types of communication failures. The
most common ones are errors in the messages,
improperly ordered messages, lost (or undeliverable)
messages, and communication line failures.
• Lost or undeliverable messages are typically the consequence
of communication line failures or (destination) site failures.
• Detection is facilitated by timers and a timeout mechanism
that keeps track of how long the sender site has gone without
receiving a confirmation from the destination site that a
message was received.
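
The timeout mechanism can be sketched as a table of unconfirmed messages kept at the sender site. The class `AckTracker`, its method names, and the explicit `now` parameter (used instead of a real clock so the behaviour is easy to follow) are assumptions made for illustration.

```python
from typing import Dict, List


class AckTracker:
    """Tracks sent messages that have not yet been confirmed by the destination."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.sent_at: Dict[int, float] = {}   # message id -> send time

    def sent(self, msg_id: int, now: float) -> None:
        self.sent_at[msg_id] = now

    def acknowledged(self, msg_id: int) -> None:
        # A confirmation arrived, so the message is no longer pending.
        self.sent_at.pop(msg_id, None)

    def timed_out(self, now: float) -> List[int]:
        """Ids of messages whose confirmation is overdue at time `now`."""
        return sorted(m for m, t in self.sent_at.items() if now - t > self.timeout)
```

A message reported by `timed_out` may have been lost, or the destination site or line may have failed; the sender cannot tell these cases apart from the timeout alone.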

Methods to control failure
• Transactions affected by the failure must be aborted.
• A site failure message is broadcast to all sites.
• Checking must be done periodically to see whether the failed
site has recovered or not.
• After the failed site restarts, it must initiate a recovery
procedure to abort all partial transactions that were active at
the time of failure.

Different techniques of recoverability
• Additional components and abnormal algorithms can be
added to a system. These components and algorithms attempt
to ensure that occurrences of erroneous states do not result
in later system failure. Ideally, they remove these errors and
restore the system to a “correct” state from which normal
processing can continue. These additional components and
abnormal algorithms are called recovery techniques.
– Backward and Forward Error Recovery
– Log Based Recovery
– Write-Ahead Logging Protocol

Backward and Forward Error Recovery
[Figure: classification tree. Recovery splits into forward-error and backward-error recovery; backward-error recovery splits into operation-based and state-based approaches.]
• Failure recovery: restore an erroneous state to an error-free state
• Approaches to failure recovery:
  – Forward-error recovery:
    • Remove errors in process/system state (if errors can be completely assessed)
    • Continue process/system forward execution
  – Backward-error recovery:
    • Restore process/system to previous error-free state and restart from there
• Comparison: Forward vs. Backward error recovery
  – Backward-error recovery
    (+) Simple to implement
    (+) Can be used as general recovery mechanism
    (-) Performance penalty
    (-) No guarantee that fault does not occur again
    (-) Some components cannot be recovered
  – Forward-error recovery
    (+) Less overhead
    (-) Limited use, i.e. only when impact of faults understood
    (-) Cannot be used as general mechanism for error recovery

Backward-Error Recovery: Basic approach
• Principle: restore process/system to a known, error-free “recovery point”/“checkpoint”.
• System model:
[Figure: CPU and main memory, connected to secondary storage and stable storage]
  – Secondary storage: objects are brought to main memory (MM) to be accessed and written back if modified
  – Stable storage: storage that maintains information in the event of system failure; it stores logs and recovery points
• Approaches:
  (1) Operation-based approach
  (2) State-based approach

(1) The Operation-based Approach
•Principle:
– Record all changes made to state of process (‘audit trail’ or ‘log’) such that process
can be returned to a previous state
– Example: A transaction based environment where transactions update a database
• It is possible to commit or undo updates on a per-transaction basis
• A commit indicates that the transaction on the object was successful and changes are
permanent
(1.a) Updating-in-place
• Principle: every update (write) operation to an object creates a log record in stable storage
that can be used to ‘undo’ and ‘redo’ the operation
• Log content: object name, old object state, new object state
• Implementation of a recoverable update operation:
– Do operation: update object and write log record
– Undo operation: log(old) -> object (undoes the action performed by a do)
– Redo operation: log(new) -> object (redoes the action performed by a do)
– Display operation: display log record (optional)
• Problem: a ‘do’ cannot be recovered if system crashes after write object but before
log record write
(1.b) The write-ahead log protocol
• Principle: write log record before updating object
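
The do/undo/redo operations with write-ahead ordering can be sketched as follows. This is only an illustration: the names `LogRecord` and `RecoverableStore` are made up, a Python list stands in for stable storage, and an object that does not exist yet is assumed to have the old state 0.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LogRecord:
    name: str   # object name
    old: int    # old object state
    new: int    # new object state


class RecoverableStore:
    def __init__(self) -> None:
        self.objects: Dict[str, int] = {}
        self.log: List[LogRecord] = []        # stands in for stable storage

    def do(self, name: str, new: int) -> LogRecord:
        record = LogRecord(name, self.objects.get(name, 0), new)
        self.log.append(record)               # write-ahead: log record first
        self.objects[name] = new              # then update the object
        return record

    def undo(self, record: LogRecord) -> None:
        self.objects[record.name] = record.old    # log(old) -> object

    def redo(self, record: LogRecord) -> None:
        self.objects[record.name] = record.new    # log(new) -> object
```

Because the log record is appended before the object changes, a crash between the two steps leaves a record that can still be used to redo or undo the update.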

(2) State-based Approach
• Principle: establish frequent ‘recovery points’ or ‘checkpoints’ saving the entire
state of the process
• Actions:
– ‘Checkpointing’ or ‘taking a checkpoint’: saving process state
– ‘Rolling back’ a process: restoring a process to a prior state
Note: A process should be rolled back to the most recent ‘recovery point’ to
minimize the overhead and delays in the completion of the process
• Shadow Pages: Special case of state-based approach
– Only a part of the system state is saved to minimize recovery overhead
– When an object is modified, the page containing the object is first copied to stable
storage (shadow page)
– If the process successfully commits: the shadow page is discarded and the modified page is
made part of the database
– If the process fails: the shadow page is used and the modified page is discarded
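
Checkpointing and rolling back can be sketched for a process whose whole state is a dictionary. The class name `CheckpointedProcess` is hypothetical, and the in-memory list of checkpoints stands in for stable storage.

```python
import copy
from typing import Any, Dict, List


class CheckpointedProcess:
    def __init__(self, state: Dict[str, Any]) -> None:
        self.state = state
        self.checkpoints: List[Dict[str, Any]] = []   # stand-in for stable storage

    def checkpoint(self) -> None:
        """Save the entire process state as a new recovery point."""
        self.checkpoints.append(copy.deepcopy(self.state))

    def roll_back(self) -> None:
        """Restore the most recent recovery point."""
        self.state = copy.deepcopy(self.checkpoints[-1])
```

Rolling back to the most recent checkpoint discards only the work done since that point, which is why the note above recommends it.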

Recovery in concurrent systems
• Issue: if one of a set of cooperating processes fails and has to be rolled back to a
recovery point, all processes it communicated with since the recovery point have to be
rolled back.
• Conclusion: In concurrent and/or distributed systems all cooperating processes have to
establish recovery points
• Orphan messages and the domino effect
[Figure: timelines of processes X, Y, and Z with recovery points x1, x2, x3 (X), y1, y2 (Y), and z1, z2 (Z); Y sends message ‘m’ to X]
– Case 1: failure of X after x3: no impact on Y or Z
– Case 2: failure of Y after sending msg. ‘m’
• Y rolled back to y2
• ‘m’ ≡ orphan message
• X rolled back to x2
– Case 3: failure of Z after z2 (domino effect)
• Y has to roll back to y1
• X has to roll back to x1
• Z has to roll back to z1

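Lost messages
[Figure: timelines of processes X and Y with recovery points x1 and y1; X sends message ‘m’ to Y, and Y fails after receiving it]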

• Assume that x1 and y1 are the only recovery points for processes X and Y, respectively
• Assume Y fails after receiving message ‘m’
• Y rolled back to y1, X rolled back to x1
• Message ‘m’ is lost

Note: there is no distinction between this case and the case where message ‘m’ is lost in
communication channel and processes X and Y are in states x1 and y1, respectively

Log Based Recovery
• When a failure occurs, the following operations,
which use the log, are executed:
• UNDO: restore the database to its state prior to
execution.
• REDO: perform the changes to the database
over again.
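
A minimal sketch of UNDO and REDO driven by a log, under several assumptions that are not in the slides: each log entry is a tuple (transaction, item, old value, new value), the set of committed transactions is known, and no two active transactions update the same item. The function name `recover` and the sample data are illustrative.

```python
from typing import Dict, List, Set, Tuple

Update = Tuple[str, str, int, int]   # (transaction, item, old value, new value)


def recover(db: Dict[str, int], log: List[Update], committed: Set[str]) -> Dict[str, int]:
    """Return the database state after undoing losers and redoing winners."""
    db = dict(db)
    # UNDO: scan the log backwards and restore old values of uncommitted work.
    for txn, item, old, _new in reversed(log):
        if txn not in committed:
            db[item] = old
    # REDO: scan the log forwards and reapply the changes of committed work.
    for txn, item, _old, new in log:
        if txn in committed:
            db[item] = new
    return db


# Hypothetical crash: T2 committed, T1 did not. T1's first write reached disk,
# T2's write did not.
log = [("T1", "a", 100, 50), ("T2", "b", 10, 20), ("T1", "c", 0, 50)]
on_disk = {"a": 50, "b": 10, "c": 0}
restored = recover(on_disk, log, committed={"T2"})
```

In this example `restored` holds the old value of `a` again and the new value of `b`.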

Write-Ahead Logging Protocol
• Write-ahead logging (WAL) is a family of techniques for
providing atomicity and durability in database systems.
• In a system using WAL, all modifications are written to a log
before they are applied. Usually both redo and undo information
is stored in the log.
• The purpose of this can be illustrated by an example. Imagine a
program that is in the middle of performing some operation
when the machine it is running on loses power. Upon restart,
that program might well need to know whether the operation it
was performing succeeded, half-succeeded, or failed. If a write-
ahead log is used, the program can check this log and compare
what it was supposed to be doing when it unexpectedly lost
power to what was actually done. On the basis of this
comparison, the program could decide to undo what it had
started, complete what it had started, or keep things as they are.

Advanced recovery techniques
• Shadow Paging
• Fuzzy checkpoint
• ARIES
• RAID levels

Shadow Paging
• It is inconvenient to maintain logs of all
transactions for the purposes of recovery. An
alternative is to use a system of shadow
paging.
• This is where the database is divided into
pages that may be stored in any order on the
disk.
• In order to identify the location of any given
page, we use something called a page table.

Shadow Paging
• During the life of a transaction two page tables
are maintained: a shadow page table
and a current page table.
• When a transaction begins, both of these page
tables point to the same locations (are identical).
• During the lifetime of a transaction, however,
changes may be made (values updated, etc.).
Whenever we update a page in the database we
always write the updated page to a new location.
• We then update the
current page table to reflect the changes that
have been made; the shadow page table is left unchanged.
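
The two page tables can be sketched with lists that map page numbers to disk locations. The class `ShadowPagedDB` is hypothetical; `commit` and `abort` follow the usual shadow-paging scheme and are assumptions, since the slides above do not spell them out. Old locations are never reclaimed in this sketch.

```python
from typing import Dict, List


class ShadowPagedDB:
    def __init__(self, pages: List[str]) -> None:
        self.disk: Dict[int, str] = dict(enumerate(pages))       # location -> contents
        self.shadow_table: List[int] = list(range(len(pages)))   # page no -> location
        self.current_table: List[int] = list(self.shadow_table)  # identical at start
        self.next_free = len(pages)

    def write(self, page_no: int, contents: str) -> None:
        # Never overwrite in place: the updated page goes to a new location.
        self.disk[self.next_free] = contents
        self.current_table[page_no] = self.next_free
        self.next_free += 1

    def read(self, page_no: int) -> str:
        return self.disk[self.current_table[page_no]]

    def commit(self) -> None:
        # Assumed commit step: the current table becomes the new shadow table.
        self.shadow_table = list(self.current_table)

    def abort(self) -> None:
        # Assumed abort step: fall back to the untouched shadow table.
        self.current_table = list(self.shadow_table)
```

Until `commit` is called, the shadow table still points at the original pages, so an abort or crash can simply fall back to it.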

Fuzzy checkpoint
• In a fuzzy checkpoint, the database server does not flush
the modified pages in the shared-memory buffer pool to
disk for certain types of operations, called fuzzy
operations.
• When a fuzzy checkpoint completes, the pages might not
be consistent with each other, because the database
server does not flush all data pages to disk.
• A fuzzy checkpoint completes much more quickly than a
full checkpoint and reduces the amount of physical
logging during heavy update activity. When necessary,
the database server performs a full checkpoint to ensure
the physical consistency of all data on disk.

Fuzzy checkpoint
• Fuzzy Operations
– Inserts
– Updates
– Deletes

• Important:
– Fuzzy checkpoints are disabled for the primary
and secondary servers in a High-Availability Data
Replication pair.

ARIES
• In computer science, Algorithms for Recovery
and Isolation Exploiting Semantics, or ARIES
is a recovery algorithm designed to work with
a no-force, steal database approach; it is used
by IBM DB2, Microsoft SQL Server and many
other database systems.

Principles behind ARIES
• Three main principles lie behind ARIES:
• Write-ahead logging: Any change to an object is first
recorded in the log, and the log must be written to
stable storage before changes to the object are written
to disk.
• Repeating history during Redo: On restart after a
crash, ARIES retraces the actions of a database before
the crash and brings the system back to the exact state
that it was in before the crash. Then it undoes the
transactions still active at crash time.
• Logging changes during Undo: Changes made to the
database while undoing transactions are logged to
ensure such an action isn't repeated in the event of
repeated restarts.

ARIES LSN
[Figure: ARIES log sequence numbers (LSN)]

RAID Levels
RAID: Redundant Array of Inexpensive Disks

Level          Technique
RAID Level 0   Striping
RAID Level 1   Mirroring and performance improvements
RAID Level 2   Bit-level striping with Hamming error-correcting code
RAID Level 3   Byte-level parity
RAID Level 4   Block-level parity
RAID Level 5   Rotating parity
RAID Level 6   Tolerates failure of two disk drives

RAID Level-0
File data: block 0, block 1, block 2, block 3, block 4

Sector   Disk 0    Disk 1
0        block 0   block 1
1        block 2   block 3
2        block 4
3
4
5

RAID Level-1
File data: block 0, block 1, block 2, block 3, block 4

Sector   Disk 0    Disk 1
0        block 0   block 0
1        block 1   block 1
2        block 2   block 2
3        block 3   block 3
4        block 4   block 4
5
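
The block placement shown for RAID Level-0 and RAID Level-1 can be expressed as two small mapping functions. This is a sketch for the two-disk layouts above; the function names are illustrative, and each result is a (disk, sector) pair.

```python
from typing import List, Tuple


def raid0_location(block: int, disks: int = 2) -> Tuple[int, int]:
    """Striping: consecutive blocks go to consecutive disks."""
    return (block % disks, block // disks)


def raid1_locations(block: int, disks: int = 2) -> List[Tuple[int, int]]:
    """Mirroring: every disk stores the block at the same sector."""
    return [(disk, block) for disk in range(disks)]
```

With striping, block 4 lands on disk 0 at sector 2; with mirroring, block 3 is stored at sector 3 on both disks.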

RAID Level - 2
• RAID Level 2 uses a parallel-access
technique. It works at the word (byte) level and
takes data striping to the extreme, writing only
1 bit per strip instead of an arbitrary-size block.
For this reason, it requires a minimum of 8
surfaces to write data to the hard disk.
• In RAID level 2, strips are very small, so when a
block is read, all disks are accessed in parallel.
• Hamming code generation is time consuming;
therefore RAID level 2 is too slow for most
commercial applications.

RAID Level - 2
[Figure: Error Correction Code (ECC)]

RAID Level - 3
[Figure omitted]

RAID Level - 4
[Figure omitted]

RAID Level - 5
[Figure omitted]

RAID Level - 6
[Figure omitted]

Two Phase Commit Protocols
[Figure omitted]

State Transitions in 2PC Protocol
[Figure omitted]

Three Phase Commit Protocols
[Figure omitted]

State Transitions in 3PC Protocol
[Figure omitted]
