
Database Recovery Techniques

This chapter discusses various database recovery techniques to restore systems after failures, focusing on concepts like system logs and commit points. It outlines recovery methods such as deferred update (NO-UNDO/REDO), immediate update (UNDO/REDO), and shadow paging, as well as the widely used ARIES recovery system. The chapter also covers the importance of caching, write-ahead logging, checkpoints, and strategies to handle transaction rollbacks and cascading rollbacks.

Uploaded by

Putta Swamy

Database Recovery Techniques

In this chapter, we discuss some of the techniques that can be used for database recovery in case
of system failure.
In Section 20.1.4 we discussed the different causes of failure, such as system crashes and
transaction errors. In Section 20.2, we introduced some of the concepts that are used by recovery
processes, such as the system log and commit points.
This chapter explains important concepts related to database recovery and gives an overview of
different recovery methods.
 In Section 22.1, we begin with a basic outline of how recovery usually works and introduce
the main types of recovery algorithms. We also explain key ideas like write-ahead logging,
the difference between in-place and shadow updates, and how a system undoes the effects of
a failed or incomplete transaction.
 Section 22.2 covers deferred update recovery, also called the NO-UNDO/REDO method.
Here, changes are not made to the database until the transaction is successfully completed.
 In Section 22.3, we look at immediate update recovery, where changes can be written to the
database even before a transaction is complete. This section includes UNDO/REDO and
UNDO/NO-REDO techniques.
 Section 22.4 discusses a method called shadow paging (or shadowing), which fits into the
NO-UNDO/NO-REDO category, as it avoids both undo and redo operations.
 Section 22.5 presents a widely used and practical recovery system called ARIES.
 In Section 22.6, we briefly look at how recovery works in multidatabase systems (systems
with more than one database).
 Section 22.7 explains how to recover from catastrophic failures, such as disk crashes.
 Finally, Section 22.8 gives a summary of the chapter.
While recovery and concurrency control are closely related, this chapter focuses only on
recovery and does not go into concurrency control. For details on how recovery is implemented
in specific database systems, refer to their user manuals or to the additional reading
suggested at the end of the chapter.
22.1 Recovery Concepts – Simplified Notes
22.1.1 Overview of Recovery and Algorithm Types
 Recovery Goal: Restore the database to its last consistent state after a failure.
 Key Component: The System Log, which tracks all changes made by transactions.
Two Main Failure Scenarios:
1. Catastrophic Failure (e.g., disk crash):
o Restore from a backup.
o Use the log to redo committed transactions.
2. Non-catastrophic Failure (e.g., system crash or power failure):
o Use the log to:
 Undo uncommitted transactions.
 Redo committed transactions not yet written to disk.
Recovery Techniques:
1. Deferred Update (NO-UNDO/REDO):
o No changes are made to the database until a transaction commits.
o Only REDO is needed if the system crashes after commit.
2. Immediate Update (UNDO/REDO):
o Changes are made before commit, but only after being logged.
o Both UNDO (for uncommitted changes) and REDO (for committed but unwritten
changes) may be needed.
3. UNDO/NO-REDO:
o All changes are written to disk before commit.
o Only UNDO is needed if the transaction fails before committing.
✅ UNDO and REDO must be idempotent – repeating them doesn't change the outcome.

22.1.2 Caching (Buffering) of Disk Pages


 Database disk blocks are loaded into main memory buffers (DBMS Cache).
 Each buffer has:
o Dirty Bit: Indicates if data has been changed.
o Pin/Unpin Bit: Pinned pages can't be flushed (written) yet.
Two Ways to Write Back:
1. In-Place Updating: Overwrites the same disk location.
2. Shadowing: Writes to a new location (rarely used).
 Terms:
o BFIM (Before Image): Old value.
o AFIM (After Image): New value.

22.1.3 Write-Ahead Logging (WAL) and Steal/Force Policies


 WAL Rule: The log record for an update must be written to disk before the modified data
page is written to disk.
Log Record Types:
 UNDO log: Records old value (BFIM) – used to undo.
 REDO log: Records new value (AFIM) – used to redo.
Buffer Management Policies:
Policy   | Meaning                                              | Effect on Recovery
Steal    | Allows writing uncommitted data to disk              | Need UNDO
No-Steal | Doesn’t allow uncommitted data to be written to disk | UNDO not needed
Force    | Forces all updates to disk before commit             | REDO not needed
No-Force | Does not force all updates to disk before commit     | Need REDO
✅ Most DBMS use Steal/No-Force, so they need both UNDO and REDO (UNDO/REDO).
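The policy table above can be read as a small decision rule. The following is a minimal Python sketch of that rule, not something taken from the source; the function name `recovery_needs` and its two boolean parameters are assumptions made for illustration.

```python
def recovery_needs(steal: bool, force: bool) -> str:
    """Return the recovery category implied by a steal/force policy pair."""
    # Steal: uncommitted data may already be on disk, so UNDO must be possible.
    undo = "UNDO" if steal else "NO-UNDO"
    # Force: committed data is on disk at commit time, so REDO is never needed.
    redo = "NO-REDO" if force else "REDO"
    return f"{undo}/{redo}"


# The combination the notes describe as the most common one.
assert recovery_needs(steal=True, force=False) == "UNDO/REDO"
```

Under this reading, deferred update corresponds to `recovery_needs(False, False)` and the UNDO/NO-REDO variant to `recovery_needs(True, True)`.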
WAL Protocol Rules:
1. UNDO info must be flushed before overwriting database.
2. All log entries must be written before commit is acknowledged.

22.1.4 Checkpoints and Fuzzy Checkpointing


 A Checkpoint is a record written to the log periodically, at the point when the DBMS has
written all modified (dirty) buffers to disk.
 It saves time during recovery: transactions that committed before the checkpoint do not
have to be redone.
Checkpoint Process:
1. Pause transaction execution.
2. Write all dirty buffers to disk.
3. Write [CHECKPOINT] record to the log.
4. Resume transactions.
Fuzzy Checkpointing:
 Allows transaction processing to continue during checkpoint.
 Log contains [begin_checkpoint] and [end_checkpoint] markers.
 A pointer tracks the last valid checkpoint.

22.1.5 Rollback and Cascading Rollback


 If a transaction fails, all its changes must be undone using UNDO log entries.
 If another transaction has read a value from a rolled-back transaction, it also must be
rolled back — this is called cascading rollback.
❌ Cascading rollback is complex and inefficient, so modern DBMS prevent it using strict or
cascadeless schedules.

22.1 Recovery Concepts Simplified


When a database system encounters a failure, recovery ensures the database is restored to a
consistent state. This involves undoing or redoing certain operations based on information
stored in a system log. Below is a simplified explanation of key recovery concepts.

22.1.1 Recovery Overview and Types of Recovery Algorithms


What is Recovery?
Recovery restores the database to its last consistent state before a failure. The system log (a
record of all changes made by transactions) is critical for this process.
Recovery Strategies:
1. Catastrophic Failure (e.g., disk crash):
o Restore a backup copy of the database from archival storage (like tape).
o Redo operations of committed transactions from the log to bring the database up
to date.
2. Noncatastrophic Failure (e.g., software or transaction errors):
o Analyze the log to identify changes causing inconsistencies.
o Undo uncommitted changes (e.g., reverse updates from a failed transaction).
o Redo committed changes not yet written to disk to ensure consistency.
Main Recovery Policies:
 Deferred Update (NO-UNDO/REDO):
o Updates are not written to the database until a transaction commits.
o Changes are stored in memory or the log before commit.
o If a transaction fails before committing, no changes are on disk, so no undo is
needed.
o Redo may be required to apply committed changes not yet on disk.
 Immediate Update (UNDO/REDO or UNDO/NO-REDO):
o Updates can be written to the database before a transaction commits, but they’re
logged first.
o If a transaction fails, undo reverses its changes.
o Redo may be needed to ensure committed changes are on disk.
o A variation (UNDO/NO-REDO) requires all updates to be written to disk before
commit, so only undo is needed.
Idempotent Operations:
 Undo and redo operations must be repeatable without changing the result. This ensures
recovery works correctly even if interrupted by another failure.
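A short Python sketch may make idempotence concrete. It assumes a log entry laid out as (operation, transaction, item, old_value, new_value) and a dictionary standing in for the database; the function names are illustrative only. Because UNDO and REDO install logged values rather than re-running a computation, repeating them is harmless, whereas replaying a logged increment is not.

```python
def redo(db, entry):
    """Install the after image (AFIM) of the logged write."""
    _op, _txn, item, _old, new = entry
    db[item] = new


def undo(db, entry):
    """Install the before image (BFIM) of the logged write."""
    _op, _txn, item, old, _new = entry
    db[item] = old


def redo_increment(db, item, delta):
    """Counter-example: replaying an increment is NOT idempotent."""
    db[item] += delta


db = {"X": 10}
entry = ("write_item", "T1", "X", 10, 15)

redo(db, entry)
redo(db, entry)          # recovery was interrupted and REDO ran again
assert db["X"] == 15     # same result as a single REDO

undo(db, entry)
undo(db, entry)
assert db["X"] == 10     # same result as a single UNDO
```

This is one reason the log records values (BFIM and AFIM) in these notes: a recovery pass that crashes midway can simply be restarted from the beginning.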

22.1.2 Caching (Buffering) of Disk Blocks


The DBMS keeps a cache of main memory buffers that temporarily holds disk pages (blocks) for
faster access. Key points:
 A cache directory tracks which disk pages are in memory.
 If a page isn’t in the cache, it’s read from disk.
 A dirty bit indicates if a page in the cache was modified (1 = modified, 0 = unchanged).
 A pin-unpin bit prevents modified pages from being written to disk until allowed (e.g.,
until a transaction commits).
 When a buffer is flushed or replaced, its page is written back to disk only if the dirty
bit is 1.
Updating Strategies:
 In-Place Updating: Overwrites the original disk page with new data (most common).
 Shadowing: Writes updated data to a new disk location, keeping the old version (rarely
used).
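The cache directory, dirty bit and pin-unpin bit described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the class names `Buffer` and `Cache`, the dictionary used as the "disk", and the `evict` method are all hypothetical, and only in-place updating is modeled.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    data: str
    dirty: bool = False    # True once the cached page has been modified
    pinned: bool = False   # True while the page must not be written back


class Cache:
    def __init__(self, disk):
        self.disk = disk       # page id -> contents on disk
        self.directory = {}    # cache directory: page id -> Buffer

    def fetch(self, page_id):
        # A page that is not in the cache is read from disk first.
        if page_id not in self.directory:
            self.directory[page_id] = Buffer(self.disk[page_id])
        return self.directory[page_id]

    def update(self, page_id, data):
        buf = self.fetch(page_id)
        buf.data = data
        buf.dirty = True

    def evict(self, page_id):
        """Try to free a buffer; return False if the page is pinned."""
        buf = self.directory[page_id]
        if buf.pinned:
            return False
        if buf.dirty:
            self.disk[page_id] = buf.data   # in-place update of the same page
        del self.directory[page_id]
        return True
```

For example, after `cache.update("P1", "b")` with the buffer pinned, `cache.evict("P1")` returns `False` and the disk copy stays unchanged; once the buffer is unpinned, eviction writes the new value over the old page.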

22.1.3 Write-Ahead Logging, Steal/No-Steal, and Force/No-Force


Write-Ahead Logging (WAL):
 Ensures recovery is possible by logging changes before they’re applied to the database.
 The old value (before image, BFIM) is logged for undo, and the new value (after image,
AFIM) is logged for redo.
 Log entries are written to disk before overwriting the database’s BFIM with the AFIM.
Steal/No-Steal:
 Steal: Allows a modified page to be written to disk before the transaction commits (saves
memory but may require undo).
 No-Steal: Prevents writing modified pages to disk until the transaction commits (no undo
needed but requires more memory).
Force/No-Force:
 Force: All modified pages are written to disk before a transaction commits (no redo
needed).
 No-Force: Modified pages can stay in memory, reducing disk writes but may require
redo.
Common Approach: Most systems use steal/no-force (UNDO/REDO), balancing memory use
and disk I/O efficiency.
WAL Protocol for UNDO/REDO:
1. Log entries for undo (BFIM) must be on disk before overwriting the database.
2. All log entries (undo and redo) must be on disk before a transaction commits.
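The two WAL rules can be illustrated with a toy buffer manager in Python. This is a sketch under several assumptions that are not in the source: log records carry a sequence number (`lsn`), the "disk" is a dictionary, and flushing the log is modeled by advancing a `flushed_lsn` counter. The class and method names are hypothetical.

```python
class WalBufferManager:
    def __init__(self):
        self.log = []          # records: (lsn, txn, item, old_value, new_value)
        self.flushed_lsn = 0   # highest log record known to be on disk
        self.disk = {}         # item -> value stored on disk
        self.cache = {}        # item -> (value, lsn of the last update)

    def write_item(self, txn, item, new_value):
        if item in self.cache:
            old_value = self.cache[item][0]
        else:
            old_value = self.disk.get(item)
        lsn = len(self.log) + 1
        self.log.append((lsn, txn, item, old_value, new_value))
        self.cache[item] = (new_value, lsn)
        return lsn

    def flush_log(self, up_to_lsn):
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)

    def flush_page(self, item):
        value, lsn = self.cache[item]
        # Rule 1: the log record holding the BFIM reaches disk before the
        # database copy is overwritten.
        if lsn > self.flushed_lsn:
            self.flush_log(lsn)
        self.disk[item] = value

    def commit(self, txn):
        # Rule 2: every log record of the transaction is on disk before
        # the commit completes.
        lsn = len(self.log) + 1
        self.log.append((lsn, txn, "commit", None, None))
        self.flush_log(lsn)
```

Note that `commit` does not write any data pages, which corresponds to a no-force policy, while `flush_page` may run before commit, which corresponds to steal.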

22.1.4 Checkpoints and Fuzzy Checkpointing


Checkpoints:
 Periodically, the system writes all modified buffers to disk and adds a checkpoint record
to the log.
 Actions during a checkpoint:
1. Pause transactions briefly.
2. Write all modified cache buffers to disk.
3. Log a checkpoint record (including active transaction IDs).
4. Resume transactions.
 Checkpoints reduce recovery time by marking a point where all prior committed changes
are on disk.
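The checkpoint actions listed above can be sketched in Python as follows. The sketch is single-threaded, so pausing and resuming transactions (steps 1 and 4) are not modeled; the cache layout, the record format `("checkpoint", active_ids)` and both function names are assumptions made for illustration.

```python
def take_checkpoint(log, cache, disk, active_txns):
    """Write all dirty buffers to disk, then log a checkpoint record."""
    for page_id, buf in cache.items():
        if buf["dirty"]:
            disk[page_id] = buf["value"]
            buf["dirty"] = False
    # The record lists the transactions that were active at the checkpoint.
    log.append(("checkpoint", tuple(sorted(active_txns))))


def last_checkpoint(log):
    """Return the log position of the most recent checkpoint, or None."""
    for index in range(len(log) - 1, -1, -1):
        if log[index][0] == "checkpoint":
            return index
    return None
```

During recovery, `last_checkpoint(log)` gives the position from which the commit list has to be built; transactions that committed before that position already have their changes on disk.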
Fuzzy Checkpointing:
 To avoid delays, transactions can resume after logging a begin_checkpoint record.
 A valid checkpoint pointer tracks the latest completed checkpoint.
 An end_checkpoint record is logged once all modified buffers are written to disk.

22.1.5 Transaction Rollback and Cascading Rollback


Transaction Rollback:
 If a transaction fails before committing, its changes on disk are undone using the log’s
BFIMs.
Cascading Rollback:
 Occurs when one transaction (T) fails, and other transactions (e.g., S) read its
uncommitted changes.
 S must also be rolled back, and so on, creating a chain reaction.
 Example: If T3 writes B, T2 reads B, and T3 fails, T2 must be rolled back too.
 Modern systems avoid cascading rollback by enforcing strict or cascadeless schedules,
eliminating the need to log read operations.
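The chain reaction can be computed from "reads-from" information. The Python sketch below assumes a hypothetical mapping `reads_from` from each transaction to the set of transactions whose uncommitted values it has read; the function name is also an assumption.

```python
def cascading_rollback(failed, reads_from):
    """Return every transaction that must be rolled back when `failed` fails."""
    to_abort = {failed}
    changed = True
    while changed:
        changed = False
        for reader, writers in reads_from.items():
            # A reader is dragged in if it read from any doomed transaction.
            if reader not in to_abort and writers & to_abort:
                to_abort.add(reader)
                changed = True
    return to_abort


# T2 read B from T3, and T1 then read a value written by T2.
deps = {"T2": {"T3"}, "T1": {"T2"}, "T4": set()}
assert cascading_rollback("T3", deps) == {"T1", "T2", "T3"}
```

Under a strict or cascadeless schedule the `reads_from` sets for uncommitted writers are always empty, so only the failed transaction itself is returned.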

22.1.6 Actions That Don’t Affect the Database


 Some transaction actions (e.g., generating and printing messages or reports) don’t modify
the database.
 If the transaction fails, the user should not receive these outputs, so they should be
produced only after the transaction commits.
 Solution: Queue such actions as batch jobs, executed only after commit. If the
transaction fails, the jobs are canceled.
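A minimal Python sketch of this queueing idea follows. The `Transaction` class and its `defer`, `commit` and `abort` methods are hypothetical names chosen for the example; a real DBMS would hand the queued work to its own job scheduler.

```python
class Transaction:
    def __init__(self):
        self.pending = []   # actions that must not run before commit

    def defer(self, action):
        """Queue a side effect (e.g., printing a report) for after commit."""
        self.pending.append(action)

    def commit(self):
        actions, self.pending = self.pending, []
        for action in actions:
            action()

    def abort(self):
        # The transaction failed, so the queued jobs are canceled.
        self.pending.clear()


printed = []
failed = Transaction()
failed.defer(lambda: printed.append("report for failed transaction"))
failed.abort()
assert printed == []        # the user never sees this report

ok = Transaction()
ok.defer(lambda: printed.append("report"))
ok.commit()
assert printed == ["report"]
```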


22.2 NO-UNDO/REDO Recovery (Deferred Update)


🔑 Main Idea:
In deferred update, no changes are made to the actual database until a transaction successfully
commits. Until then, all updates are kept:
 In the log (with the new values only — called AFIM),
 And in the main memory (cache).
If a transaction fails before commit, there’s nothing to undo because the database hasn’t been
changed yet.

🧠 Key Points:
 Only REDO is needed (for committed transactions).
 UNDO is not needed — failed transactions haven’t touched the disk.
 Useful for simple/short transactions, since changes stay in memory until commit (can use
up buffer space).

✅ Deferred Update Protocol:


1. No changes to disk until transaction commits (no-steal policy).
2. Before committing:
o All REDO log entries must be written to disk (WAL rule).
3. If a crash occurs after commit but before updates are saved, REDO the changes from the
log.

🔄 Recovery in Multiuser Systems (RDU_M Method):


 Keep two lists:
o Commit list: Transactions committed after the last checkpoint.
o Active list: Transactions that were running but not committed.
 After crash:
o REDO committed transactions (from log).
o Ignore uncommitted ones — they didn’t affect the disk.
📝 How REDO works: Use the log entry [write_item, T, X, new_value] and set item X =
new_value in the database.
💡 Optimization: If multiple REDOs are logged for the same item, only the last one is needed.

🚫 No Cascading Rollback or UNDO Needed


Why?
1. Transactions update the disk only after commit.
2. Transactions can’t read uncommitted data (locks are held until commit).

⚠️ Drawbacks:
 More memory needed (buffers stay pinned).
 Lower concurrency (locked items stay locked until commit).


22.2 NO-UNDO/REDO Recovery (Deferred Update) – Simplified Notes


🌟 What is Deferred Update?
 In this method, the database is not updated on disk until the transaction is completely
successful and reaches the commit point.
 During execution:
o Changes are only kept in main memory (cache) and logged.
o If the transaction fails before committing, it has made no changes to the disk, so
there is nothing to undo.
 Only REDO is needed if the system crashes after commit but before writing changes to
disk.

🔁 How It Works – Step by Step


1. While Transaction is Running:
o All updates are saved in log files and buffers, not in the database.
o Changes are not visible to other transactions.
2. At Commit Point:
o All REDO log entries are force-written to disk.
o After that, changes are written to the database.
3. If the System Crashes:
o Only committed transactions are redone from the log.
o No need to undo anything because no uncommitted changes were written to disk.

✅ Why Only REDO is Needed


 Updates reach the actual database only after commit.
 If crash happens before commit, the transaction is just discarded — nothing needs to be
undone.
 If crash happens after commit, use the REDO log to apply changes to the database.

📦 Deferred Update Protocol


1. No database updates on disk before commit → requires no-steal policy (buffers stay
pinned).
2. A transaction can’t commit until all REDO log entries are force-written to disk →
follows WAL (Write-Ahead Logging).

⚙️ RDU_M (Recovery using Deferred Update - Multiuser) Algorithm


 Maintains two lists:
o Committed transactions (after last checkpoint).
o Active (uncommitted) transactions.
🔄 REDO Process:
 For each committed transaction after the last checkpoint:
o REDO its write operations by setting values from the log (AFIM = After Image).
 Ignore all uncommitted transactions — they are assumed to fail and are resubmitted.
✅ Optimizing REDO:
 If the same item was updated multiple times, only the last update needs to be redone.
 Start REDO from the end of the log, and skip any item that has already been redone.

🔐 Concurrency Control with Deferred Update


 Use Strict Two-Phase Locking (Strict 2PL):
o Keep locks on items until commit.
o This avoids reading uncommitted data, so no cascading rollbacks.

⚠️ Drawbacks of Deferred Update


 All updates must stay in cache until commit → may use too much memory.
 Locks are held until commit → may reduce concurrency.

💡 Main Advantages
1. No need to UNDO anything.
2. No cascading rollbacks.
3. Simple recovery: just REDO committed transactions.

📊 Example Summary from Figures:


 At checkpoint (t1): T1 committed, T3 and T4 active.
 At crash (t2): T2 and T3 committed, T4 and T5 not committed.
 During recovery:
o REDO T2 and T3 (committed after t1).
o Ignore T4 and T5 (not committed).
o No need to redo T1 (committed before t1).

22.2 NO-UNDO/REDO Recovery Based on Deferred Update (Simplified)
What is Deferred Update?
Deferred update is a recovery method where a transaction’s changes are not written to the
database on disk until the transaction successfully completes (reaches its commit point). Until
then, changes are stored in the system log and cache buffers in memory.
Key Idea:
 If a transaction fails before committing, no changes are made to the database on disk, so
there’s no need to undo anything.
 Only REDO is needed to reapply changes from committed transactions if they weren’t
fully written to the database before a system crash.
Why REDO Only?
 The log records only the new values (after images, AFIMs) of data items, as no undo is
needed.
 If the system crashes after a transaction commits but before its changes are written to
disk, the log’s REDO entries are used to apply those changes.
Limitations:
 This method works best for short transactions that modify few items.
 For longer or larger transactions, it can use up a lot of memory because all changes must
stay in cache buffers until the transaction commits, and these buffers are pinned (can’t be
replaced).

Deferred Update Protocol


1. No Disk Updates Until Commit: A transaction cannot write changes to the database on
disk until it reaches its commit point. All modified buffers remain pinned in memory (this
is called a no-steal policy).
2. Log Before Commit: Before a transaction commits, all its REDO-type log entries (new
values of data items) must be written to the log, and the log must be force-written to disk
(following the write-ahead logging protocol).
Result:
 No undo is needed because the database isn’t changed until the transaction commits.
 REDO may be required to apply committed changes that weren’t written to disk before a
crash.
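The two protocol rules can be sketched as a small Python class. This is an illustration under assumptions: the class name `DeferredTxn`, the list used as the log and the dictionary used as the on-disk database are all hypothetical, and appending to the list stands in for force-writing the log.

```python
class DeferredTxn:
    def __init__(self, name, log, db):
        self.name = name
        self.log = log      # shared log, treated as force-written on append
        self.db = db        # database on disk: item -> value
        self.local = {}     # pinned buffers: this transaction's new values

    def write(self, item, value):
        # Rule 1: the database on disk is not touched before commit.
        self.local[item] = value
        self.log.append(("write_item", self.name, item, value))  # AFIM only

    def read(self, item):
        return self.local.get(item, self.db.get(item))

    def commit(self):
        # Rule 2: the REDO entries and the commit record are in the log
        # before any change is applied to the database.
        self.log.append(("commit", self.name))
        self.db.update(self.local)
        self.local = {}

    def abort(self):
        self.local = {}     # nothing on disk to undo
```

If a crash happens between the `commit` record being logged and `db.update` finishing, the log still holds every new value, which is exactly the case REDO covers.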

Recovery in Multiuser Systems (RDU_M Algorithm)


In systems with multiple users, concurrency control (like strict two-phase locking) works with
recovery to ensure correct transaction execution. Strict two-phase locking keeps locks on data
items until the transaction commits, ensuring strict and serializable schedules.
RDU_M (Recovery using Deferred Update in a Multiuser Environment):
 The system maintains two lists:
o Commit List: Transactions that committed since the last checkpoint.
o Active List: Transactions that are still running and haven’t committed.
 During recovery:
o REDO all write operations from committed transactions (in the order they appear
in the log).
o Ignore active transactions (they didn’t commit, so their changes aren’t on disk
and are effectively canceled).
o Canceled transactions can be resubmitted later.
REDO Procedure:
 For each log entry [write_item, T, X, new_value], set the value of item X in the database
to new_value (the after image, AFIM).

Example (Based on Figure 22.2)


 Scenario:
o At time t1, a checkpoint is taken. Transaction T1 has committed, but T3 and T4
haven’t.
o By time t2 (system crash), T2 and T3 have committed, but T4 and T5 haven’t.
 Recovery Actions:
o No need to redo T1’s changes (committed before the checkpoint, already on disk).
o Redo T2 and T3’s write operations (committed after the checkpoint).
o Ignore T4 and T5 (not committed, no changes on disk, effectively canceled).
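The scenario above can be replayed with a short Python sketch of RDU_M. The log layout, the item names and all values are made up for illustration, and `rdu_m` is a hypothetical function name. Note one detail the sketch makes explicit: T3 wrote an item before the checkpoint but committed after it, so its earlier write is still redone.

```python
def rdu_m(log, db):
    """Redo the writes of transactions that committed after the last checkpoint."""
    checkpoints = [i for i, entry in enumerate(log) if entry[0] == "checkpoint"]
    last_cp = checkpoints[-1] if checkpoints else -1
    commit_list = {e[1] for e in log[last_cp + 1:] if e[0] == "commit"}
    for entry in log:                       # forward, in log order
        if entry[0] == "write_item" and entry[1] in commit_list:
            _op, _txn, item, new_value = entry
            db[item] = new_value            # install the AFIM
    return commit_list


log = [
    ("write_item", "T1", "D", 20), ("commit", "T1"),
    ("write_item", "T3", "A", 30),
    ("write_item", "T4", "B", 15),
    ("checkpoint",),                        # time t1
    ("write_item", "T2", "C", 12), ("commit", "T2"),
    ("commit", "T3"),
    ("write_item", "T5", "E", 9),
]                                           # crash at time t2
db = {"A": 1, "B": 2, "C": 3, "D": 20, "E": 4}   # disk contents at the crash

assert rdu_m(log, db) == {"T2", "T3"}
assert db == {"A": 30, "B": 2, "C": 12, "D": 20, "E": 4}
```

T1's value for D was already on disk, and the writes of T4 and T5 never reached it, so B and E keep their old values.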

Optimization
To make recovery faster:
 If an item (e.g., X) was updated multiple times by committed transactions since the last
checkpoint, only redo the last update from the log.
 Track redone items in a list to avoid repeating earlier updates, as the last update
overwrites them.
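This optimization can be sketched in Python as a backward scan with a set of already redone items. The function name `redo_last_updates` and the log layout `("write_item", T, X, new_value)` with a precomputed `commit_list` are assumptions made for the example.

```python
def redo_last_updates(log, commit_list, db):
    """Redo only the last committed update of each item (backward scan)."""
    redone = set()
    for entry in reversed(log):
        if entry[0] != "write_item":
            continue
        _op, txn, item, new_value = entry
        if txn in commit_list and item not in redone:
            db[item] = new_value
            redone.add(item)    # earlier updates of this item are skipped
    return redone
```

With a log in which T1 sets X to 1 and T2 later sets X to 2, both committed, the scan writes X once (the value 2) instead of twice.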

Handling Aborted Transactions


If a transaction is aborted (e.g., due to a deadlock), it can simply be resubmitted, because it
never changed the database on disk and nothing has to be undone.

Benefits of Deferred Update


1. No Undo Needed: Since changes aren’t written to disk until commit, there’s no need to
reverse operations if a transaction fails.
2. No Cascading Rollback: Because items are locked until commit, other transactions can’t
read uncommitted changes, preventing dependency issues.

Drawbacks
1. Limited Concurrency: Items stay locked until the transaction commits, which can slow
down other transactions.
2. High Memory Usage: All changes must stay in pinned cache buffers until commit, which
can strain memory for large transactions.


22.3 Recovery Techniques: Immediate Update


🔑 Main Idea:
In immediate update, a transaction can update the database on disk before it commits.
This is different from deferred update, where nothing is written to disk until commit.

🧠 Key Concepts:
 Since changes can be written before commit, we need a way to undo the effects of a
failed transaction.
 For this, the system must log the old value of any data item that is updated (called BFIM
– Before Image).
 The log must contain both:
o UNDO information (old value),
o REDO information (new value), if commit happens before all updates are
written to disk.

💡 Two Main Techniques:


1. UNDO/NO-REDO
 All changes are written to disk before the transaction commits.
 So, REDO is not needed (data is already on disk).
 Only UNDO is required if a transaction fails.
 This uses a steal/force strategy:
o Steal: changes can go to disk before commit.
o Force: all changes must be on disk before commit.
2. UNDO/REDO (Most Common)
 A transaction can commit before its updates are all saved to disk.
 So, recovery may need both:
o UNDO for failed transactions (undo disk changes),
o REDO for committed ones (redo updates that weren’t written).
 This uses a steal/no-force strategy:
o Steal: allows writing to disk before commit.
o No-force: not all changes need to be on disk at commit.

🔄 Recovery with Concurrency (RIU_M Method)


This recovery method is for multiuser systems using UNDO/REDO with checkpoints and strict
concurrency control (e.g., strict two-phase locking).

RIU_M Recovery Steps:


1. Two transaction lists:
o Committed (after last checkpoint),
o Active (uncommitted at crash time).
2. UNDO all operations of active (uncommitted) transactions:
o Use the UNDO procedure to restore old values (BFIM).
o Do this in reverse log order (latest to earliest).
3. REDO all operations of committed transactions:
o Use the REDO procedure to reapply changes (AFIM).
o Do this in original log order (earliest to latest).

🔁 UNDO Procedure:
 For each [write_item, T, X, old_value, new_value]:
o Set X = old_value in the database.
🔁 REDO Procedure:
 For each [write_item, T, X, new_value]:
o Set X = new_value in the database.
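The RIU_M steps and the UNDO and REDO procedures above can be combined into one Python sketch. Several things here are assumptions rather than part of the source: a single log entry `("write_item", T, X, old_value, new_value)` serves both UNDO and REDO, `("start", T)` and `("commit", T)` records mark transaction boundaries, no explicit abort records exist, and the function name `riu_m` is hypothetical.

```python
def riu_m(log, db):
    """UNDO active transactions, then REDO those committed since the checkpoint."""
    checkpoints = [i for i, e in enumerate(log) if e[0] == "checkpoint"]
    last_cp = checkpoints[-1] if checkpoints else -1
    started = {e[1] for e in log if e[0] == "start"}
    committed = {e[1] for e in log if e[0] == "commit"}
    commit_list = {e[1] for e in log[last_cp + 1:] if e[0] == "commit"}
    active = started - committed

    # Step 2: UNDO in reverse log order, restoring the BFIM.
    for e in reversed(log):
        if e[0] == "write_item" and e[1] in active:
            db[e[2]] = e[3]
    # Step 3: REDO in original log order, installing the AFIM.
    for e in log:
        if e[0] == "write_item" and e[1] in commit_list:
            db[e[2]] = e[4]
    return active, commit_list


log = [
    ("start", "T1"), ("write_item", "T1", "X", 1, 99),
    ("start", "T2"), ("write_item", "T2", "Y", 5, 6), ("commit", "T2"),
]
# At the crash, T1's uncommitted X is on disk (steal) and
# T2's committed Y is not (no-force).
db = {"X": 99, "Y": 5}
assert riu_m(log, db) == ({"T1"}, {"T2"})
assert db == {"X": 1, "Y": 6}
```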

🚀 Optimizing Recovery:
To improve performance:
 For REDO: Only apply the last change to each item (scan log backwards).
 For UNDO: Only undo each item once (scan log forwards, and skip items already
undone).

⚠️ Important Note:
 Because updates happen before commit, recovery must undo failed transaction effects.
 This makes UNDO/REDO more complex, but it's the most flexible and widely used
method.
22.3 Recovery Techniques Based on Immediate Update (Simplified)
What is Immediate Update?
In immediate update recovery, a transaction’s changes can be written to the database on disk
before the transaction reaches its commit point. This is different from deferred update, where
changes wait until commit.
Key Idea:
 Changes are written to the database immediately but must also be logged to allow
recovery.
 If a transaction fails before committing, its changes must be undone (rolled back) using
the before image (BFIM) from the log.
 If a transaction commits but some changes aren’t on disk, they may need to be redone
using the after image (AFIM) from the log.
Why Undo?
 Since changes can be written to disk before a transaction commits, a failed transaction’s
changes must be reversed to restore the database to its previous state.
 This uses a steal strategy, where modified cache buffers can be written to disk before
commit to free up memory.

Two Types of Immediate Update Recovery


1. UNDO/NO-REDO Recovery:
o How it works: All changes by a transaction must be written to the database on
disk before it commits.
o Result: No redo is needed because all committed changes are already on disk.
Only undo is required for failed transactions.
o Strategy: Uses steal/force (steal: buffers can be written before commit; force: all
buffers are written to disk at commit).
2. UNDO/REDO Recovery (Most Common):
o How it works: A transaction can commit before all its changes are written to disk.
o Result: Both undo (for failed transactions) and redo (for committed transactions)
may be needed during recovery.
o Strategy: Uses steal/no-force (steal: buffers can be written before commit; no-
force: not all buffers need to be written at commit).
o This is the most flexible and widely used method.

Recovery in Multiuser Systems (RIU_M Algorithm)


In systems with multiple users, recovery works with concurrency control (e.g., strict two-phase
locking), which ensures strict schedules. This means a transaction can’t read or write an item
until the last transaction that wrote it has committed or aborted, which prevents cascading
rollback. However, deadlocks may still require aborting and undoing transactions.
RIU_M (Recovery using Immediate Update for a Multiuser Environment):
 Requirements: The log includes checkpoints, and the system tracks two lists:
o Committed Transactions: Those that committed since the last checkpoint.
o Active Transactions: Those still running and not yet committed.
 Recovery Steps:
1. Undo Active Transactions: Reverse all write operations of uncommitted (active)
transactions using the log’s BFIMs. Undo in reverse order of how operations
were logged.
2. Redo Committed Transactions: Reapply all write operations of committed
transactions in the order they appear in the log, using the AFIMs.
Undo Procedure:
 For a log entry [write_item, T, X, old_value, new_value], set item X in the database back
to old_value (the BFIM).
 Undo operations in reverse order to ensure the correct sequence.
Redo Procedure:
 Same as in deferred update: Set item X to new_value (the AFIM) from the log entry.

Optimization
To make recovery faster:
 For Redo: Start from the end of the log and redo only the last update for each item.
Track redone items in a list to avoid repeating.
 For Undo: Start from the beginning of the log and apply the earliest undo for each item.
Track undone items to avoid repeating.
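The undo half of this optimization can be sketched in Python as a forward scan that applies only the earliest before image of each item. The sketch assumes log entries of the form `("write_item", T, X, old_value, new_value)`, a precomputed set of active transactions, and a strict schedule, so that at most one uncommitted transaction has written any given item; the function name is hypothetical.

```python
def undo_earliest(log, active, db):
    """Undo each item at most once, using its earliest BFIM (forward scan)."""
    undone = set()
    for entry in log:
        if entry[0] != "write_item":
            continue
        _op, txn, item, old_value, _new = entry
        if txn in active and item not in undone:
            db[item] = old_value    # value from before the first uncommitted write
            undone.add(item)        # later writes of this item are skipped
    return undone
```

If an uncommitted T1 changed X from 1 to 2 and then from 2 to 3, the scan restores X to 1 in a single step instead of undoing both writes in reverse order.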


22.4 Shadow Paging – Simplified Notes


📘 What is Shadow Paging?
Shadow Paging is a recovery technique that allows transactions to be reversed easily if a
failure occurs, without using a log (in single-user environments).

🧱 How It Works
1. Database is split into fixed-size pages (e.g., Page 1, Page 2, ..., Page n).
2. A directory keeps track of where each page is located on disk.
o The directory maps page numbers to physical disk blocks.
o All reads and writes go through this directory.
3. When a transaction starts:
o The current directory is copied and saved as the shadow directory.
o The shadow directory stays unchanged during the transaction.
o The current directory is used for any updates.

✍️ When Updating a Page:


 Instead of overwriting the original page:
o A new page is written to a new location on disk.
o The current directory is updated to point to this new page.
o The shadow directory still points to the original page.
✅ This means we have two versions:
 Shadow directory → old (safe) version
 Current directory → new (modified) version

🔄 Recovery After a Crash


 Just discard the current directory and any newly written pages.
 Use the shadow directory to restore the original state of the database.
 Since the original data was never overwritten, it's safe.
👉 This method requires NO UNDO and NO REDO.

✅ Committing a Transaction
 When a transaction commits:
o The shadow directory is discarded.
o The current directory becomes the new shadow directory.
o Now, the new pages are officially part of the database.

🧩 Advantages
 Simple and clean: just switch directories for commit or recovery.
 No need to undo or redo any operations.

⚠️ Disadvantages
1. Page movement: Updated pages are written to new locations, so pages can become
scattered on disk.
o Makes it hard to keep related data together for performance.
2. Large directories: If the directory is big, saving it to disk every time a transaction
commits can be slow.
3. Garbage collection needed:
o After commit, old pages (no longer in use) must be freed and added back to the
free space list.
4. Atomic switch:
o Switching from the shadow to the current directory must be done as one atomic
step to avoid inconsistency.

🤝 Multiuser Systems
 In multiuser systems with concurrent transactions, logs and checkpoints must still be
incorporated into shadow paging so that it works alongside concurrency control.
 As a result, shadow paging is mainly practical for single-user or simple environments.

22.4 Shadow Paging (Simplified)
What is Shadow Paging?
Shadow paging is a recovery method that doesn’t require a log in a single-user environment. In
a multiuser environment, a log may still be needed for concurrency control. It treats the
database as a collection of fixed-size disk pages (or blocks) and uses a directory to manage
them.

How Shadow Paging Works


 Database Structure: The database is divided into a set number of disk pages (e.g., n
pages).
 Directory: A table with n entries, where each entry points to a specific database page on
disk. This directory is usually kept in main memory for quick access, and all read/write
operations go through it.
 Shadow Directory: When a transaction starts, the current directory (pointing to the latest
database pages) is copied to a shadow directory, which is saved on disk. The shadow
directory remains unchanged during the transaction.
 Current Directory: The transaction uses the current directory, which is updated as
changes are made.
Write Operations:
 When a transaction updates a page, it creates a new copy of that page on a different disk
block instead of overwriting the old one.
 The current directory is updated to point to the new page, while the shadow directory
still points to the old, unchanged page.
 This creates two versions of updated pages: the old version (shadow directory) and the
new version (current directory).

Recovery Process
 If a transaction fails:
o Discard the modified pages and the current directory.
o Restore the shadow directory, which points to the database’s state before the
transaction started.
o This returns the database to its original state without needing to undo or redo
changes.
 If a transaction commits:
o Discard the shadow directory, making the current directory the new official
directory.
 Result: Shadow paging is a NO-UNDO/NO-REDO method because it neither reverses
nor reapplies changes during recovery.
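The directory mechanics described above can be sketched for a single-user setting in Python. Everything named here is an assumption made for illustration: the class `ShadowPagedDB`, the dictionary of disk blocks, and the block allocation counter. In particular, the directory switch in `commit` is a plain assignment, whereas a real system has to make that switch atomic on disk.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        self.blocks = dict(enumerate(pages))    # disk block id -> contents
        self.shadow = list(range(len(pages)))   # page no -> block (saved state)
        self.current = list(self.shadow)        # page no -> block (working state)
        self.next_block = len(pages)

    def read(self, page_no):
        return self.blocks[self.current[page_no]]

    def write(self, page_no, data):
        # The first update of a page goes to a fresh block; the old block,
        # still referenced by the shadow directory, is never overwritten.
        if self.current[page_no] == self.shadow[page_no]:
            self.current[page_no] = self.next_block
            self.next_block += 1
        self.blocks[self.current[page_no]] = data

    def commit(self):
        old_blocks = set(self.shadow) - set(self.current)
        self.shadow = list(self.current)        # current becomes the new shadow
        for block in old_blocks:                # garbage-collect old versions
            del self.blocks[block]

    def abort(self):
        """Also models recovery after a crash: fall back to the shadow directory."""
        new_blocks = set(self.current) - set(self.shadow)
        self.current = list(self.shadow)
        for block in new_blocks:
            del self.blocks[block]
```

Neither `commit` nor `abort` replays or reverses individual writes, which is why the method is classified as NO-UNDO/NO-REDO; the cost shows up instead in block allocation and garbage collection.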
Shadow Paging in Multiuser Environments
 In a multiuser system with concurrent transactions, shadow paging needs logs and
checkpoints to work with concurrency control methods (e.g., locking).
 This ensures transactions don’t interfere with each other.

Drawbacks of Shadow Paging


1. Changing Page Locations: Updated pages are written to new disk locations, making it
hard to keep related pages close together without complex storage management.
2. Overhead for Large Directories: If the directory is large, saving shadow directories to
disk for every commit is time-consuming.
3. Garbage Collection: After a transaction commits, old pages (referenced by the shadow
directory) must be freed and added to a list of available pages for reuse.
4. Atomic Operations: Switching between the current and shadow directories must be done
as a single, uninterrupted operation to avoid errors.
