Chapter 6 Database Management (1)

The document presents an overview of transaction management, concurrency, and recovery in database systems, covering key concepts such as transaction states, ACID properties, and transaction control commands. It explains the importance of transactions in maintaining data integrity and consistency, detailing various transaction states like active, partially committed, failed, aborted, committed, and terminated. Additionally, it discusses operations such as read, write, commit, rollback, and the significance of serializability in concurrent transaction execution.

Uploaded by

yashshende802
Transaction Management and Concurrency and Recovery

Presented by
Prof. Pooja Mhatre

PRAVIN ROHIDAS PATIL COLLEGE OF ENGINEERING AND TECHNOLOGY
Topics to be covered

✔ Transaction Concept
✔ Transaction States
✔ Operations of Transaction
✔ ACID Properties
✔ Transaction Control Commands
✔ Concurrent Execution
✔ Serializability: Conflict and View
✔ Concurrency Control: Lock-based
✔ Timestamp-based Protocol
✔ Recovery System: Log based recovery
✔ Deadlock Handling
Transaction Concept
• A transaction is a set of operations that performs a single
logical unit of work.
• A transaction usually changes the data in the database.
• One of the major uses of DBMS is to protect the user data
from system failures. It is done by ensuring that all the
data is restored to a consistent state when the computer
is restarted after a crash.
• The transaction is any one execution of the user program
in a DBMS.
• One of the important properties of the transaction is that
it contains a finite number of steps.
• Executing the same program multiple times will generate
multiple transactions.
What does a Transaction mean in DBMS?
• A transaction refers to a sequence of one or more operations
(such as read, write, update, or delete) performed on the
database as a single logical unit of work.
• A transaction ensures that either all the operations are
successfully executed (committed) or none of them take
effect (rolled back).
• Transactions are designed to maintain the integrity,
consistency and reliability of the database, even in the case
of system failures or concurrent access.
Transaction State
• A transaction is a set of operations or tasks
performed to complete a logical process,
which may or may not change the data in a
database. To handle different situations, like
system failures, a transaction is divided into
different states.
• A transaction state refers to the current
phase or condition of a transaction during
its execution in a database. It represents the
progress of the transaction and determines
whether it will successfully complete
(commit) or fail (abort).
These are different types of Transaction States :
1. Active State
2. Partially Committed State
3. Failed State
4. Aborted State
5. Committed State
6. Terminated State
Active State
• This is the first state of every transaction. It begins as
soon as the transaction starts, and the transaction's
instructions are executed in this state.
• Operations such as insertion, deletion, or update are
performed during this state.
• During this state, the data records are being
manipulated and are not yet saved to the database;
they remain in a buffer in main memory.
• Example:
• The transaction begins. It reads the balance of Account
A and checks if it has enough funds.
• Read balance of Account A = $1000.
Partially Committed
State
• The transaction has finished its final operation, but the
changes are still not saved to the database.
• After completing all read and write operations, the
modifications are initially stored in main memory or a local
buffer.
• If the changes are made permanent in the database, the
state changes to the "committed state"; in case of a
failure, it changes to the "failed state".
• Example:
• The transaction performs all its operations but hasn’t yet
saved (committed) the changes to the database.
• Deduct $500 from Account A’s balance ($1000 – $500 =
$500) and temporarily update Account B’s balance (add
$500)
Failed State
• If any of the transaction-related operations cause
an error during the active or partially committed
state, further execution of the transaction is
stopped and it is brought into a failed state.
• Here, the database recovery system makes sure
that the database is in a consistent state.
• Example:
• If something goes wrong during the transaction
(e.g., power failure, system crash), the
transaction moves to this state.
• System crashes after deducting $500 from
Account A but before adding it to Account B.
Aborted State
• If a transaction reaches the failed state, the database recovery
system rolls it back, undoing every change it made so that the
database returns to the consistent state it was in before the
transaction started. Once the rollback is complete, the
transaction is in the aborted state.
• In the aborted state, the DBMS recovery system performs one
of two actions:
• Kill the transaction: The system terminates the transaction,
typically when the failure was caused by a logical error in the
transaction that a retry would not fix.
• Restart the transaction: If the failure was caused by a hardware
or software fault rather than by the transaction's own logic, the
system runs it again from the beginning as a new transaction.
• Example:
• The failed transaction is rolled back, and the database is
restored to its original state.
• Account A’s balance is restored to $1000, and no changes are
made to Account B.
Committed State
• This state of transaction is achieved when all the
transaction-related operations have been executed
successfully along with the Commit operation, i.e.
data is saved into the database after the required
manipulations in this state. This marks the
successful completion of a transaction.
• Example:
• The transaction successfully completes, and the
changes are saved permanently in the database.
• Account A’s new balance = $500; Account B’s new
balance = $1500. Changes are written to the
database.
Terminated State
• Once a transaction has either committed or
been aborted and rolled back, the database
is in a consistent state and ready for new
transactions, and the old transaction is
terminated.
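The six states can be read as a small state machine. The sketch below is one possible way to encode the transitions described on the preceding slides; the dictionary layout and the `run` helper are illustrative assumptions, and a restart after an abort is modelled as a brand-new transaction rather than as an edge back to the active state.

```python
# Allowed transitions between the transaction states described above.
# State names follow the slides; the data layout is an illustrative assumption.
TRANSITIONS = {
    "active": {"partially_committed", "failed"},
    "partially_committed": {"committed", "failed"},
    "failed": {"aborted"},
    "aborted": {"terminated"},
    "committed": {"terminated"},
    "terminated": set(),
}


def run(path):
    """Start in 'active', follow `path`, and raise ValueError on an illegal step."""
    state = "active"
    for nxt in path:
        if nxt not in TRANSITIONS[state]:
            raise ValueError(f"illegal transition: {state} -> {nxt}")
        state = nxt
    return state


# A successful transfer and a crashed transfer both end up terminated.
ok = run(["partially_committed", "committed", "terminated"])
crash = run(["failed", "aborted", "terminated"])
```

Calling `run(["committed"])` raises `ValueError`, because a transaction cannot jump from active straight to committed without passing through the partially committed state.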
Operations of Transaction
1) Read(X)
• A read operation reads the value of a database element X
and stores it in a temporary buffer in main memory for
further use, such as displaying the value.
• Example:
• In a banking system, when a user checks their balance,
a Read operation is performed on their account balance:
SELECT balance FROM accounts WHERE account_id = 'A123';
• This retrieves the current balance of the user's account
without changing it.
Operations of Transaction
2) Write(X)
• A write operation writes a value from the buffer in main memory
back to the database. Typically a read operation first brings the
value into the buffer, the value is then changed (for example, by
arithmetic performed at the user's request), and a write operation
stores the modified value back in the database.
• Example:
• In the banking system, if a user withdraws money,
a Write operation is performed to store the updated balance:
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A123';
• This updates the balance of the user's account after the
withdrawal.
Operations of Transaction
3) Commit
• The Commit operation makes all the changes performed by a
transaction permanent and so maintains integrity in the
database.
• A power, hardware, or software failure might interrupt a
transaction before all its operations are completed. Changes
that were never committed are discarded during recovery, so an
interrupted transaction cannot leave the database in an
inconsistent, half-updated state.
• Example:
• After a successful money transfer in a banking system,
a Commit operation finalizes the transaction:
COMMIT;
• Once the transaction is committed, the changes to the
database are permanent, and the transaction is
considered successful.
Operations of Transaction
4) Rollback
• If an error occurs, the Rollback operation undoes all the
changes made by the transaction, reverting the database to its
last consistent state.
• In simple words, a rollback undoes the operations that the
transaction performed before it was interrupted, returning the
database to a safe state and avoiding any ambiguity or
inconsistency.
• Example:
• Suppose during the money transfer process, the system
encounters an issue, like insufficient funds in the sender’s
account. In that case, the transaction is rolled back:
ROLLBACK;

• This will undo all the operations performed so far and ensure
that the database remains consistent.
ACID Properties
• A transaction is a single logical unit of work that
interacts with the database, potentially modifying its
content through read and write operations.
• To maintain database consistency both before and
after a transaction, specific properties, known as ACID
properties must be followed.
Atomicity
• By this, we mean that either the entire
transaction takes place at once or doesn’t happen
at all. There is no midway i.e. transactions do not
occur partially. Each transaction is considered as
one unit and either runs to completion or is not
executed at all. It involves the following two
operations.
• Abort : If a transaction aborts, changes made to
the database are not visible.
• Commit : If a transaction commits, changes
made are visible.
• Atomicity is also known as the ‘All or nothing
rule’.
Atomicity example
• Consider the following transaction T, consisting of two parts
T1 and T2: transfer of 100 from account X to account Y.
• T1 reads X, subtracts 100 and writes X; T2 reads Y, adds 100
and writes Y.
• If the transaction fails after completion of T1 but before
completion of T2 (say, after write(X) but before write(Y)),
then the amount has been deducted from X but not added
to Y. This results in an inconsistent database state.
Therefore, the transaction must be executed in its entirety in
order to ensure the correctness of the database state.
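The all-or-nothing behaviour can be tried out with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: the `accounts` table, the starting balances (X = 500, Y = 200, as used in the consistency slide) and the `fail_midway` switch that simulates a crash are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()


def transfer(conn, amount, fail_midway=False):
    """Move `amount` from X to Y; roll everything back if a step fails."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = 'X'", (amount,)
        )
        if fail_midway:
            # Simulated failure after write(X) but before write(Y).
            raise RuntimeError("simulated crash")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = 'Y'", (amount,)
        )
        conn.commit()
    except RuntimeError:
        conn.rollback()  # undo the partial update to X


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


transfer(conn, 100, fail_midway=True)
after_failure = balances(conn)   # both balances unchanged
transfer(conn, 100)
after_success = balances(conn)   # 100 moved from X to Y
```

After the failed attempt the balances are still 500 and 200; only the second, complete transfer changes them to 400 and 300, so the total of 700 is preserved in both cases.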
Consistency
• Consistency ensures that a database remains in a valid
state before and after a transaction.
• It guarantees that any transaction will take the
database from one consistent state to another,
maintaining the rules and constraints defined for the
data.
• Referring to the example above, with X = 500 and Y = 200
before the transfer:
• The total amount before and after the transaction must
be maintained.
• Total before T occurs = 500 + 200 = 700.
• Total after T occurs = 400 + 300 = 700.
• Therefore, the database is consistent. Inconsistency
occurs if T1 completes but T2 fails.
Isolation
• This property ensures that multiple transactions
can occur concurrently without leading to the
inconsistency of the database state.
• Transactions occur independently without
interference.
• Changes made by a particular transaction will
not be visible to any other transaction until that
transaction has been committed.
• This property ensures that when multiple
transactions run at the same time, the result will
be the same as if they were run one after
another in a specific order.
Isolation example
• Let X = 500, Y = 500.
• Consider two transactions T and T''. Consistent with the figures below, T
multiplies X by 100 and then subtracts 50 from Y, while T'' reads X and Y and
computes their sum.
• Suppose T has been executed till Read(Y) and then T'' starts. As a result, interleaving
of operations takes place: T'' reads the updated value of X but the old value of Y,
and the sum computed by
• T'': (X + Y = 50,000 + 500 = 50,500)
• is thus not consistent with the sum at the end of transaction
• T: (X + Y = 50,000 + 450 = 50,450).
• This results in database inconsistency, a discrepancy of 50 units. Hence, transactions
must take place in isolation, and changes should be visible to other transactions only
after they have been committed.
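The arithmetic of this interleaving can be replayed step by step. The sketch assumes, based on the figures in the example (500 becoming 50,000 and 500 becoming 450), that T multiplies X by 100 and subtracts 50 from Y; the variable names are illustrative.

```python
# Shared database values from the example.
db = {"X": 500, "Y": 500}

# T, first part: X := X * 100, written back (assumed from 500 -> 50,000).
db["X"] = db["X"] * 100
y_local = db["Y"]  # T has executed up to Read(Y)

# T'' runs now, in the middle of T, and sums what it can see.
sum_seen_by_t2 = db["X"] + db["Y"]

# T, second part: Y := Y - 50, written back (assumed from 500 -> 450).
db["Y"] = y_local - 50
sum_after_t = db["X"] + db["Y"]

discrepancy = sum_seen_by_t2 - sum_after_t
```

Here `sum_seen_by_t2` is 50,500 while `sum_after_t` is 50,450, which is the 50-unit discrepancy described in the example.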
Durability
• This property ensures that once the transaction
has completed execution, the updates and
modifications to the database are stored in and
written to disk and they persist even if a system
failure occurs.
• It indicates that permanent changes are made by
the successful execution of a transaction.
• These updates now become permanent and are
stored in non-volatile memory. The effects of the
transaction, thus, are never lost.
• Example:
• After transferring money between two accounts,
if the system crashes, the changes made by the
transaction should still be saved when the
system restarts.
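Durability can be observed on a small scale with a file-backed SQLite database. In this sketch, closing and reopening the connection stands in for a system restart, which is a simplification of a real crash; the file name and the `accounts` table are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bank.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 1000)")
conn.commit()  # committed: this row must survive

conn.execute("UPDATE accounts SET balance = 0 WHERE name = 'A'")
conn.close()   # closed without COMMIT: the update is discarded

conn = sqlite3.connect(path)  # stands in for "the system restarts"
balance_after_restart = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'A'"
).fetchone()[0]
conn.close()
```

The committed balance of 1000 is still there after reopening the file, while the uncommitted update to 0 has left no trace.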
TCL commands
• Transaction Control Language (TCL) is a
critical component of SQL used to manage
transactions and ensure data
integrity in relational databases.
• By using TCL commands, we can control how
changes to the database are committed or
reverted, maintaining consistency across multiple
operations.
• TCL commands:
1. BEGIN TRANSACTION
2. COMMIT
3. ROLLBACK
4. SAVEPOINT
BEGIN TRANSACTION
Command
• The BEGIN TRANSACTION command marks the
beginning of a new transaction.
• All SQL statements that follow this command will be part
of the same transaction until a COMMIT or ROLLBACK is
encountered.
• This command doesn’t make any changes to the
database, it just starts the transaction.
• Syntax:
BEGIN TRANSACTION transaction_name ;
• Example
BEGIN TRANSACTION TransferFunds;
COMMIT

• The COMMIT command is used to save to the database all
the changes that have been made during the current
transaction.
• Once a transaction is committed, it becomes
permanent and cannot be undone.
• This command is typically used at the end of a series
of SQL statements to ensure that all changes made
during the transaction are saved.
• Syntax:
COMMIT;
• Example: Student table
• The following example deletes the records with age = 20
from the table and then COMMITs the change to the database.
DELETE FROM Student WHERE AGE = 20;
COMMIT;
• Output: the rows with AGE = 20 are permanently removed
from the Student table.
ROLLBACK

• The ROLLBACK command is used to undo all
the changes that have been made during
the current transaction but have not yet
been committed.
• This command is useful for reverting the
database to its previous state in case an error
occurs or if the changes made are not desired.
• Syntax:
ROLLBACK;
• Example: Student table
• Delete the records with age = 20 from the table and then
ROLLBACK the changes in the database. In this case, the DELETE
operation is undone, and the changes to the database are not saved.
DELETE FROM Student WHERE AGE = 20;
ROLLBACK;
• Output: the Student table is unchanged; the rows with AGE = 20
are still present.
SAVEPOINT
• The SAVEPOINT command is used to set a point within a
transaction to which we can later roll back.
• This command allows for partial rollbacks within a transaction,
providing more control over which parts of a transaction to undo.
• Syntax:
SAVEPOINT savepoint_name;
• Example
SAVEPOINT SP1; -- Savepoint created.
DELETE FROM Student WHERE AGE = 20; -- Rows deleted.
SAVEPOINT SP2; -- Savepoint created.
ROLLBACK TO SAVEPOINT
• The ROLLBACK TO SAVEPOINT command allows us to roll back the
transaction to a specific savepoint, effectively undoing changes made after
that point.
• Syntax:
ROLLBACK TO SAVEPOINT SAVEPOINT_NAME;
• Example
• The deletion has taken place. Let us assume that we have changed our
mind and decided to ROLLBACK to the savepoint that we identified as SP1,
which was set before the deletion. In this case the DELETE operation is undone, and
the transaction is returned to the state it was in at the SP1 savepoint.
ROLLBACK TO SP1; -- Rollback completed
RELEASE SAVEPOINT
• This command is used to remove a SAVEPOINT that we
have created. Once a SAVEPOINT has been released, we
can no longer use ROLLBACK TO SAVEPOINT to return
to it.
• Releasing a savepoint neither undoes nor commits the
work done since it was set; it only removes the marker.
• Syntax:
RELEASE SAVEPOINT SAVEPOINT_NAME;
• Example
• Once the savepoint SP2 is released, we can no longer roll
back to it.
RELEASE SAVEPOINT SP2; -- Release the second savepoint.
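The savepoint commands can be exercised end to end with SQLite, which accepts `SAVEPOINT`, `ROLLBACK TO SAVEPOINT` and `RELEASE SAVEPOINT`. The sketch below is an illustration, not the slides' own environment: the student rows are invented, and exact syntax (for example `BEGIN TRANSACTION` with a name) differs between database products.

```python
import sqlite3

# isolation_level=None: the sqlite3 module opens no transactions on its own,
# so BEGIN / SAVEPOINT / ROLLBACK TO / COMMIT are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Student (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [("Asha", 19), ("Ravi", 20), ("Meena", 21)],  # hypothetical rows
)


def count(conn):
    return conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]


conn.execute("BEGIN")
conn.execute("SAVEPOINT SP1")
conn.execute("DELETE FROM Student WHERE age = 20")
rows_after_delete = count(conn)

conn.execute("SAVEPOINT SP2")
conn.execute("ROLLBACK TO SAVEPOINT SP1")  # undoes the DELETE and discards SP2
rows_after_rollback = count(conn)

conn.execute("RELEASE SAVEPOINT SP1")      # SP1 can no longer be rolled back to
conn.execute("COMMIT")
rows_after_commit = count(conn)
```

The row count drops to 2 after the `DELETE` and returns to 3 after rolling back to `SP1`, so the committed table still holds all three rows.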
Serializability
• Serializability is the property that ensures that the
concurrent execution of a set of transactions
produces the same result as if these transactions
were executed one after the other without any
overlapping, i.e., serially.
• Why is Serializability Important?
• In a database system, for performance
optimization, multiple transactions often run
concurrently. While concurrency improves
performance, it can introduce several data
inconsistency problems if not managed properly.
• Serializability ensures that even when transactions
are executed concurrently, the database remains
consistent, producing a result that's equivalent to
a serial execution of these transactions.
Schedules and Serializability
Non-serial Schedule
• A schedule in which the operations of the transactions overlap
or are interleaved. Because the transactions carry out actual
database operations, several of them are running at once.
• These transactions may be working on the same data. It is
therefore crucial that a non-serial schedule be serializable, so
that the database is consistent both before and after the
transactions are executed.
[Figure: a serial schedule and a non-serial schedule of Transaction-1 and Transaction-2, built from the operations R(a), W(a), R(b) and W(b)]
Testing for serializability
in DBMS
• Testing for serializability in a DBMS involves
verifying if the interleaved execution of
transactions maintains the consistency of the
database.
• The most common way to test for
serializability is using a precedence graph
(also known as a serializability graph or
conflict graph).
Types of Serializability
• There are two types of serializability used to check
whether a non-serial schedule is serializable: conflict
serializability and view serializability.
Conflict Serializability
• Conflict serializability is based on the concept of conflicts
between operations of different transactions.
• A conflict occurs when two operations:
⮚ Belong to different transactions.
⮚ Access the same data item.
⮚ At least one of the operations is a write.
• In conflict serializability, a schedule (sequence of
operations) is serializable if it can be turned into a serial
schedule (one transaction fully completed before another
starts) by swapping adjacent operations that do not conflict.
• A conflict graph is used to test conflict serializability.
• Each transaction is a node, and a directed edge Ti → Tj is
drawn when an operation of Ti conflicts with, and comes
before, an operation of Tj. If the conflict graph has no
cycles, the schedule is conflict serializable.
Conflict Serializability
Example
• Consider two transactions:
T1:
● Write(A)
● Read(B)
T2:
● Read(A)
● Write(B)
• Let's assume that A and B are shared resources in the
database, and we want to check if the following schedule
of operations is conflict serializable.
• Schedule:
T1: Write(A)
T2: Read(A)
T1: Read(B)
T2: Write(B)
Now, let's check if this schedule is conflict serializable:
Step 1: Identify conflicts
● Write(A) by T1 and Read(A) by T2 conflict (since T1 is writing and T2 is
reading the same data item, A).
● Read(B) by T1 and Write(B) by T2 conflict (since T1 is reading and T2 is
writing the same data item, B).
Step 2: Construct a conflict graph
● Each transaction is a node in the graph.
● Draw a directed edge between transactions if they conflict.
● There is an edge from T1 → T2 for Write(A) by T1 and Read(A) by
T2.
● There is an edge from T1 → T2 for Read(B) by T1 and Write(B) by
T2.
Step 3: Check for cycles in the graph
● The conflict graph has edges:
T1 → T2 (for both A and B).
● There are no cycles in the graph.
Since there are no cycles, the schedule is conflict serializable.
Possible serial order (from the conflict graph):
T1→T2
(i.e., T1 completes all operations before T2 starts).
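The three steps above can be sketched in code: collect an edge for every conflicting pair of operations, then search the graph for a cycle. The tuple layout `(transaction, operation, item)` and the function names are illustrative assumptions.

```python
from itertools import combinations

# Schedule from the example, in execution order: (transaction, operation, item).
schedule = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "R", "B"), ("T2", "W", "B")]


def precedence_edges(schedule):
    """Return the edges Ti -> Tj implied by conflicting pairs of operations."""
    edges = set()
    # combinations() keeps schedule order, so the first element ran earlier.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed conflict graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting" or (nxt not in state and visit(nxt)):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)


edges = precedence_edges(schedule)
conflict_serializable = not has_cycle(edges)
```

For the example schedule the only edge is T1 → T2, so no cycle exists. A schedule such as Read(A) by T1, Write(A) by T2, Write(A) by T1 produces both T1 → T2 and T2 → T1 and is therefore not conflict serializable.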
View Serializability
• View serializability is a more general form of serializability
compared to conflict serializability. It focuses on the view of
the data that transactions read and write, rather than the
conflicts between operations.
• For a schedule to be view serializable, the following
conditions must hold:
⮚ The initial read of each data item in the schedule must be
the same as in the serial schedule.
⮚ The final write of each data item must be the same as in the
serial schedule.
⮚ The read-from relationship (i.e., which transaction reads a
value written by another transaction) must be maintained.
• Unlike conflict serializability, view serializability does not
require the operations to be in conflict.
• It is a more relaxed condition, which allows some schedules
that are not conflict serializable to still be view serializable.
View Serializability Example
• Consider two transactions:
T1:
● Read(A)
● Write(B)
T2:
● Write(A)
● Read(B)
• Let's assume we want to check if the following schedule
is view serializable.
• Schedule:
T1: Read(A)
T2: Write(A)
T1: Write(B)
T2: Read(B)
• Now, let's check if this schedule is
view serializable:
Step 1: Check the initial read
● T1 reads A before T2 writes A, so T1 reads the initial value
of A. In the serial order T1 → T2, T1 also reads the initial
value of A. This condition holds.
Step 2: Check the final write
● The final write of A is by T2 and the final write of B is by T1.
In the serial order T1 → T2, the same transactions perform
the final writes. This condition also holds.
Step 3: Check the read-from relationship
● T1 reads A before T2 writes it, so T1 does not read from T2.
● T2 reads B after T1 writes it, so T2 reads the value of B
written by T1, exactly as in the serial order T1 → T2. This
read-from relationship holds as well.
• Since all the conditions are satisfied, the schedule is view
serializable.
• Equivalent serial order:
T1 → T2
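A brute-force check of the three view-equivalence conditions can be sketched as follows. It compares the schedule with every serial ordering of its transactions, which is only practical for a handful of transactions; the function names and tuple layout are illustrative assumptions.

```python
from itertools import permutations


def view_profile(schedule):
    """Return (reads_from, final_writes) for a schedule of (txn, op, item) steps."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = set()
    for txn, op, item in schedule:
        if op == "R":
            # A source of None means the read sees the item's initial value.
            reads_from.add((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_serial_order(schedule):
    """Return a serial order that is view equivalent to `schedule`, or None."""
    txns = sorted({t for t, _, _ in schedule})
    target = view_profile(schedule)
    for order in permutations(txns):
        serial = [step for t in order for step in schedule if step[0] == t]
        if view_profile(serial) == target:
            return order
    return None


schedule = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "B"), ("T2", "R", "B")]
serial_order = view_serial_order(schedule)
```

Initial reads and read-from pairs are both captured in `reads_from` (an initial read has source `None`), and `last_writer` at the end of the scan gives the final writes. For the example schedule the function finds the order T1, T2.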
Concurrency Control
Protocols
• Concurrency control protocols are the set of rules
which are maintained in order to solve the
concurrency control problems in the database.
• It ensures that the concurrent transactions can
execute properly while maintaining the database
consistency.
• The concurrent execution of a transaction is
provided with atomicity, consistency, isolation,
durability, and serializability via the concurrency
control protocols.
Advantages of
Concurrency
• Waiting Time: The time a transaction spends ready to run but
not yet executing is called waiting time. Concurrency leads to
less waiting time.
• Response Time: The time that elapses before a transaction
receives its first response is called response time.
Concurrency leads to less response time.
• Resource Utilization: The share of system resources, such as
CPU and disk, that is kept busy is called resource utilization.
Multiple transactions can run in parallel in a system, so
concurrency leads to higher resource utilization.
• Efficiency: The amount of output produced in comparison to
the given input is called efficiency. Concurrency leads to
more efficiency.
Disadvantages of
Concurrency
• Overhead: Implementing concurrency control requires additional
overhead, such as acquiring and releasing locks on database objects. This
overhead can lead to slower performance and increased resource
consumption, particularly in systems with high levels of concurrency.
• Deadlocks: Deadlocks can occur when two or more transactions are
waiting for each other to release resources, causing a circular dependency
that can prevent any of the transactions from completing. Deadlocks can
be difficult to detect and resolve, and can result in reduced throughput and
increased latency.
• Reduced concurrency: Concurrency control can limit the number of users
or applications that can access the database simultaneously. This can lead
to reduced concurrency and slower performance in systems with high
levels of concurrency.
• Complexity: Implementing concurrency control can be complex,
particularly in distributed systems or in systems with complex transactional
logic. This complexity can lead to increased development and maintenance
costs.
• Inconsistency: In some cases, concurrency control can lead to
inconsistencies in the database. For example, a transaction that is rolled
back may leave the database in an inconsistent state, or a long-running
transaction may cause other transactions to wait for extended periods,
leading to data staleness and reduced accuracy.
Lock-based concurrency
control
• Lock-based concurrency control is one of the most
commonly used methods in DBMS. It ensures that when a
transaction accesses data, other transactions are prevented
from accessing it at the same time in conflicting ways.
• Locks are used to control concurrent access to data. These
protocols prevent concurrency issues by allowing only
compatible accesses to a specific data item at a time: many
readers, or a single writer.
• Locks help multiple transactions work together smoothly by
managing access to the database items.
• Locking is a common method used to maintain the
serializability of transactions.
• A transaction must acquire a read lock or write lock on a
data item before performing any read or write operations on
it.
Types of Locks
• Shared Lock (S Lock):
Allows multiple transactions to read a data item
but prevents them from writing to it.
• Exclusive Lock (X Lock):
Ensures that only one transaction can read or
write to a data item, preventing other transactions from
accessing it.
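The two lock types can be summarised as a compatibility table plus a lock table that consults it. The `LockTable` class below is a hypothetical, in-memory sketch: it simply reports whether a request could be granted and does not queue waiters or handle lock upgrades.

```python
# (held mode, requested mode) -> can the request be granted to another transaction?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


class LockTable:
    """Hypothetical in-memory lock table: item -> {transaction: mode}."""

    def __init__(self):
        self.held = {}

    def acquire(self, txn, item, mode):
        holders = self.held.setdefault(item, {})
        for other, other_mode in holders.items():
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False  # the caller would have to wait
        holders[txn] = mode
        return True

    def release(self, txn, item):
        self.held.get(item, {}).pop(txn, None)


locks = LockTable()
r1 = locks.acquire("T1", "A", "S")  # first reader
r2 = locks.acquire("T2", "A", "S")  # second reader shares the lock
r3 = locks.acquire("T3", "A", "X")  # writer is refused while readers hold A
locks.release("T1", "A")
locks.release("T2", "A")
r4 = locks.acquire("T3", "A", "X")  # writer succeeds once the readers are gone
```

Only the shared/shared combination is compatible, which is why both readers are granted the lock while the writer has to wait until they release it.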
Types of Lock-Based
Protocols
1. Simplistic Lock Protocol
• It is the simplest method for locking data during a transaction.
• Simple lock-based protocols enable all transactions to obtain a lock
on the data before inserting, deleting, or updating it.
• It will unlock the data item once the transaction is completed.

2. Pre-Claiming Lock Protocol


• The Pre-Claiming Lock Protocol evaluates a transaction to identify
all the data items that require locks.
• Before the transaction begins, it requests the database
management system to grant locks on all necessary data
elements.
• If all the requested locks are successfully acquired, the transaction
proceeds. Once the transaction is completed, all locks are released.
• However, if any of the locks are unavailable, the transaction rolls
back and waits until all required locks are granted before
restarting.
Types of Lock-Based
Protocols
3. Two-phase locking (2PL)
• A transaction is said to follow the Two-Phase Locking
protocol if its locking and unlocking are done in two phases:
• Growing Phase: New locks on data items may be acquired
but none can be released.
• Shrinking Phase: Existing locks may be released but no
new locks can be acquired.

4. Strict Two-Phase Locking Protocol


• Strict Two-Phase Locking requires that, in addition to following
2PL, all Exclusive (X) locks held by the transaction not be
released until after the transaction commits.
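Whether a single transaction's lock requests obey the growing and shrinking phases can be checked with a short scan. This is a minimal sketch of basic 2PL only (it does not check the strict variant); the `("lock" | "unlock", item)` step format is an assumption.

```python
def follows_two_phase_locking(ops):
    """ops: list of ("lock" | "unlock", item) steps issued by one transaction."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True   # the first unlock ends the growing phase
        elif action == "lock" and shrinking:
            return False       # acquiring after releasing breaks 2PL
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

result_ok = follows_two_phase_locking(two_phase)
result_bad = follows_two_phase_locking(not_two_phase)
```

The second sequence fails because it locks B after it has already unlocked A, that is, after the shrinking phase has begun.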
Timestamp based
Concurrency Control
• Timestamp-Based Concurrency Control (TCC) is a
technique used in database management systems to
handle concurrent transactions.
• The idea is to use timestamps to control the order in
which transactions are executed, ensuring that they do
not violate consistency constraints while allowing for
concurrent execution.
• In other words, TCC uses the concept of a timestamp,
assigned to each transaction, to determine whether the
transaction can be allowed to access certain data or
whether it needs to be rolled back.
Key Concepts of Timestamp-
Based Concurrency Control:
1. Timestamp:
Every transaction is assigned a unique timestamp
when it starts. This timestamp is typically based on
the system clock or generated using a logical counter.
The timestamp determines the transaction's "age,"
with older transactions having earlier timestamps.
2. Read and Write Operations:
When a transaction performs a read or write operation
on a data item, the system checks the timestamps and
ensures that the operation is executed in a way that
does not conflict with other transactions.
3. Two Types of Timestamps:
Read Timestamp (RTS): The largest timestamp of any
transaction that has read the data item.
Write Timestamp (WTS): The largest timestamp of any
transaction that has written to the data item.
Key Concepts of Timestamp-
Based Concurrency Control:
4. Rules:
When a transaction tries to access a data item, the system uses the
transaction's timestamp and compares it with the read and write
timestamps of that data item. Based on these comparisons, different rules
are applied.
For Read Operation:
• If a transaction tries to read a data item that has already been written by a
transaction with a greater timestamp (the reader's timestamp is less than
the item's WTS), the read is denied and the reading transaction is rolled
back, because the value it needed has already been overwritten.
• Otherwise the read is allowed, and the item's RTS is set to the larger of its
current RTS and the transaction's timestamp.
For Write Operation:
• If a transaction attempts to write to a data item that has already been
read or written by another transaction with a greater timestamp (its
timestamp is less than the item's RTS or WTS), the write is denied and the
transaction is rolled back, as it would conflict with the other transaction's
outcome.
• Otherwise the write is allowed, and the item's WTS is set to the
transaction's timestamp.
Key Concepts of Timestamp-
Based Concurrency Control:
5. Commitment Protocol:
Once a transaction has been executed successfully, it can
either commit or abort based on the timestamp-based
validation.
6. Abort and Rollback:
If a transaction violates the concurrency rules based on
timestamps (such as reading or writing inconsistent data), it
will be aborted and rolled back, to prevent inconsistency in
the database.
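The read and write checks of basic timestamp ordering, in the form usually given in textbooks, can be sketched as below. The `Item` class, the `Abort` exception and the integer timestamps are illustrative assumptions, and refinements such as the Thomas write rule are left out.

```python
class Item:
    """A data item carrying the two timestamps described above."""

    def __init__(self, value):
        self.value = value
        self.rts = 0  # largest timestamp of any transaction that read the item
        self.wts = 0  # largest timestamp of any transaction that wrote the item


class Abort(Exception):
    """Raised when an operation arrives too late; the transaction must roll back."""


def read(item, ts):
    if ts < item.wts:  # a younger transaction has already overwritten the item
        raise Abort("read rejected")
    item.rts = max(item.rts, ts)
    return item.value


def write(item, ts, value):
    if ts < item.rts or ts < item.wts:  # a younger transaction already used it
        raise Abort("write rejected")
    item.value = value
    item.wts = ts


x = Item(100)
write(x, ts=5, value=200)      # allowed: WTS(x) becomes 5
seen = read(x, ts=7)           # allowed: RTS(x) becomes 7
try:
    write(x, ts=6, value=300)  # rejected: transaction 7 has already read x
    late_write_rejected = False
except Abort:
    late_write_rejected = True
```

The transaction with timestamp 6 is aborted because a younger transaction (timestamp 7) has already read the value it wanted to replace; the item keeps the value 200.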
Database Recovery
System
• A Database Recovery System in a Database Management
System (DBMS) refers to the set of techniques, tools, and
processes used to restore a database to a consistent state
after a failure.
• The goal of a recovery system is to ensure data integrity,
consistency, and durability, even in the face of hardware
crashes, power failures, or software errors.
• Types of Recovery Techniques in DBMS
1. Rollback/Undo Recovery Technique
2. Commit/Redo Recovery Technique
3. CheckPoint Recovery Technique
Rollback/Undo Recovery
Technique
• The rollback/undo recovery technique is based on the
principle of backing out or undoing the effects of a
transaction that has not been completed successfully
due to a system failure or error.
• This technique is accomplished by undoing the changes
made by the transaction using the log records stored in
the transaction log.
• The transaction log contains a record of all the
transactions that have been performed on the
database.
• The system uses the log records to undo the changes
made by the failed transaction and restore the
database to its previous state.
Commit/Redo Recovery Technique
• The commit/redo recovery technique is based on the
principle of reapplying to the database the changes made
by a transaction that has been completed successfully.
• This technique is accomplished by using the log
records stored in the transaction log to redo the
changes of transactions that had committed before the
failure or error but whose updates may not yet have
been written to the database.
• The system uses the log records to reapply those
changes and restore the database to its most recent
consistent state.
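A minimal sketch of redo, assuming a hypothetical log of (transaction id, item, old value, new value) tuples plus a (transaction id, "COMMIT") marker. Only changes of transactions that have a commit marker are reapplied.

```python
# Hypothetical log format; the last record marks T1 as committed.
log = [
    ("T1", "A", 100, 50),
    ("T1", "B", 200, 250),
    ("T1", "COMMIT"),
]
db = {"A": 50, "B": 200}   # crash happened before B's new value reached disk


def redo(db, log):
    # Collect the transactions that have a commit marker in the log.
    committed = {rec[0] for rec in log if rec[1] == "COMMIT"}
    # Reapply their updates in log order.
    for rec in log:
        if rec[1] != "COMMIT" and rec[0] in committed:
            _txn, item, _old, new = rec
            db[item] = new


redo(db, log)
```

Reapplying an update that already reached disk (item A here) is harmless because writing the same "after" value twice gives the same result.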
Checkpoint Recovery Technique
• Checkpoint Recovery is a technique used to improve data
integrity and system stability, especially in databases and
distributed systems.
• It entails preserving the system’s state at regular intervals,
known as checkpoints, at which all ongoing transactions are
either completed or not initiated.
• This saved state, which includes memory and CPU registers,
is kept in stable, non-volatile storage so that it can withstand
system crashes.
• In the event of a breakdown, the system can be restored to
the most recent checkpoint, which reduces data loss and
downtime.
• The frequency of checkpoints is a trade-off: frequent
checkpoints add overhead during normal operation, while
infrequent ones lengthen recovery after a failure.
Log-Based Recovery
• Log-based recovery is a technique in Database Management
Systems (DBMS) that helps ensure data integrity and
consistency in the event of system crashes, power failures, or
other types of failures.
• The transaction log plays a central role in log-based recovery.
• The transaction log records all database operations, including
updates, insertions, and deletions, as well as the start and
commit or abort status of transactions.
• This log serves as a vital tool in recovering the database to a
consistent state after a failure.
• The idea is to keep a record of all the changes made to the
database in a log file. This log can then be used to restore the
database to a consistent state following a failure.
Key Concepts in Log-Based Recovery
1. Write-Ahead Log (WAL):
• The central principle of log-based recovery is the Write-Ahead
Log (WAL) protocol, which ensures that all changes to the
database are first recorded in the log before any changes are
actually made to the database.
• This guarantees that, in case of a crash, the system can recover
by replaying the log entries or undoing the incomplete
transactions.
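The ordering that WAL enforces can be sketched with a toy in-memory store. The class `WalStore` and its fields are hypothetical; a real DBMS would force the log record to stable storage before the data page is written.

```python
class WalStore:
    def __init__(self):
        self.log = []    # stands in for the log on stable storage
        self.data = {}   # stands in for the database pages

    def write(self, txn, item, new):
        old = self.data.get(item)
        # Step 1: append the log record (with before and after values) first.
        self.log.append((txn, item, old, new))
        # Step 2: only then change the data itself.
        self.data[item] = new


store = WalStore()
store.write("T1", "A", 10)
store.write("T1", "A", 20)
```

Because every change is logged before it is applied, the log always holds enough information to redo or undo whatever is found in the data after a crash.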
2. Log Structure:
• The log is typically a sequential record of all operations performed
on the database. Each log entry contains:
● Transaction ID: Identifies the transaction performing the
operation.
● Type of Operation: Whether the operation is a read, write, or
commit.
● Before and After Data: The data before and after the
operation.
● Timestamp: The time when the operation was performed.
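One possible shape for such a log entry, written as a Python dataclass. The field names are illustrative assumptions, not a format used by any particular DBMS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    txn_id: str                   # transaction performing the operation
    op: str                       # e.g. "write" or "commit"
    item: Optional[str] = None    # data item touched, if any
    before: Optional[int] = None  # value before the operation
    after: Optional[int] = None   # value after the operation
    timestamp: float = 0.0        # time the operation was performed


entries = [
    LogEntry("T1", "write", "A", before=100, after=50, timestamp=1.0),
    LogEntry("T1", "commit", timestamp=2.0),
]
```

The entries are frozen because a log is append-only: existing records are never modified.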
3. Log Records Types:
• Update Log Record: Contains the before and after values
for a database item that has been modified.
• Commit Log Record: Marks the point where a transaction
successfully completed and can be committed.
• Abort Log Record: Marks the point where a transaction
was aborted and indicates that its changes need to be
undone.
4. Types of Recovery: There are generally two types of
recovery that use the log:
• Undo: Reverts changes made by transactions that have
not been committed when the system crashes.
• Redo: Re-applies changes made by transactions that have
been committed before the crash but whose changes
might not have been written to the database.
5. Recovery Process:
• After a crash, the recovery process involves scanning the log to determine
which transactions need to be rolled back (undo) and which need to be
redone (redo).
• The recovery process is typically divided into three phases:
● Analysis Phase: Analyzes the log to identify which transactions were
active at the time of the crash, which transactions were committed,
and which were not.
● Redo Phase: Re-applies changes made by committed transactions,
even if they were not written to disk.
● Undo Phase: Rolls back changes made by uncommitted transactions.
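The three phases can be sketched in a few lines. This is a simplified illustration under assumed record shapes (("start", txn), ("update", txn, item, old, new), ("commit", txn)); the function name `recover` is hypothetical, and production algorithms such as ARIES handle many more details.

```python
def recover(log, db):
    # Analysis: find which transactions committed and which were still active.
    started, committed = set(), set()
    for rec in log:
        if rec[0] == "start":
            started.add(rec[1])
        elif rec[0] == "commit":
            committed.add(rec[1])
    active = started - committed

    # Redo: reapply every update of committed transactions, in log order.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _txn, item, _old, new = rec
            db[item] = new

    # Undo: reverse updates of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in active:
            _, _txn, item, old, _new = rec
            db[item] = old
    return committed, active


log = [
    ("start", "T1"), ("update", "T1", "A", 100, 90), ("commit", "T1"),
    ("start", "T2"), ("update", "T2", "B", 200, 300),
]
db = {"A": 100, "B": 300}  # on-disk state at crash: T1's write missing, T2's present
committed, active = recover(log, db)
```

After recovery the committed change to A is present and the uncommitted change to B is gone.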

6. Checkpointing:
• A checkpoint is a mechanism to periodically save the state of the
database to the disk. It provides a way to reduce the amount of work
required during recovery because, after a checkpoint, only transactions
that were active at the checkpoint or began after it need to be
considered for redo or undo.
• Checkpoints are recorded in the log, and after a crash, recovery can start
from the most recent checkpoint rather than from the very beginning of
the log.
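A small sketch of how a checkpoint shortens the scan. It assumes a hypothetical ("checkpoint", active_transactions) record in the log; the function name `records_to_scan` is also an assumption.

```python
def records_to_scan(log):
    # Find the position of the most recent checkpoint record, if any.
    last = -1
    for i, rec in enumerate(log):
        if rec[0] == "checkpoint":
            last = i
    # Without a checkpoint the whole log must be scanned.
    return log if last == -1 else log[last:]


log = [
    ("start", "T1"), ("update", "T1", "A", 1, 2), ("commit", "T1"),
    ("start", "T2"),
    ("checkpoint", ("T2",)),   # hypothetical: lists transactions active at the checkpoint
    ("update", "T2", "B", 5, 6),
    ("start", "T3"), ("commit", "T3"),
]
tail = records_to_scan(log)
```

Here recovery examines four records instead of eight, and T1, which committed before the checkpoint, is not revisited.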
Advantages of Log-Based Recovery
• Data Integrity: Log-based recovery ensures
that the database can be restored to a
consistent state, preventing data corruption.
• Crash Resilience: Even if a system crashes
unexpectedly, the DBMS can recover to a
consistent state using the transaction log.
• Efficient Recovery: Log-based recovery
allows for efficient recovery, as only the
transactions that need to be undone or redone
are processed, reducing the overhead during
recovery.
Deadlock Handling
• What is Deadlock?
• A deadlock is a condition in a multi-user database
environment where transactions are unable to complete
because each is waiting for resources held by other
transactions. This results in a cycle of dependencies
where no transaction can proceed.
• Deadlock handling in a Database Management System
(DBMS) is a critical aspect of ensuring the smooth execution
of transactions in multi-user environments.
• A deadlock occurs when two or more transactions hold
resources and wait for each other to release them, creating a
circular waiting situation where none of the transactions can
proceed.
• There are three main strategies to handle deadlocks in DBMS:
1. Deadlock Prevention
2. Deadlock Avoidance
3. Deadlock Detection and Recovery
Deadlock Prevention
This approach involves ensuring that the system will never enter
a deadlock situation. It is done by making sure that at least one
of the conditions necessary for a deadlock can never hold. The
conditions targeted are:
• No Preemption: A deadlock requires that resources cannot be
forcibly taken from a transaction. Prevention breaks this
condition by allowing preemption: a resource may be taken from a
transaction (typically by rolling it back) so another can proceed.
• Hold and Wait: A transaction must request all the resources
it needs at once. It cannot hold a resource while requesting
others. If all the needed resources are not available, it must
release the ones it holds and wait.
• Circular Wait: A circular chain of transactions must not exist,
where each transaction holds at least one resource needed by
the next transaction in the chain. This can be prevented by
numbering resources and enforcing a strict order in which
transactions must request them (this is called resource
ordering).
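Resource ordering can be illustrated with two threads that transfer in opposite directions. This is a sketch using Python's `threading.Lock` in place of database locks; the account names and the `transfer` function are hypothetical, and it assumes the source and destination are different.

```python
import threading

locks = {"A": threading.Lock(), "B": threading.Lock()}
balances = {"A": 100, "B": 100}


def transfer(src, dst, amount):
    # Always lock in sorted name order, regardless of transfer direction,
    # so no two threads can hold the locks in opposite orders.
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        balances[src] -= amount
        balances[dst] += amount


t1 = threading.Thread(target=lambda: [transfer("A", "B", 1) for _ in range(1000)])
t2 = threading.Thread(target=lambda: [transfer("B", "A", 1) for _ in range(1000)])
t1.start()
t2.start()
t1.join()
t2.join()
```

If each thread instead locked its source first and its destination second, the two could each hold one lock and wait forever for the other.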
Deadlock Avoidance
In deadlock avoidance, the system dynamically analyzes the
resource allocation state to ensure that deadlock cannot
occur. One of the most common methods for deadlock
avoidance is the Banker's Algorithm.
• The Banker's Algorithm grants a resource request only if
the system remains in a safe state afterward. A safe
state is one where there is a sequence of transactions
that can all complete without causing a deadlock. If a
transaction requests resources that might lead the
system into an unsafe state (where deadlock could
occur), the system will delay granting the request.
• The system keeps track of the maximum resources each
transaction might need, and only grants resources when it
can guarantee that all transactions will eventually finish.
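The safety check at the heart of the Banker's Algorithm can be sketched as below. The function name `is_safe` and the list-based layout (one row per transaction, one column per resource type) are assumptions for illustration.

```python
def is_safe(available, allocation, maximum):
    work = list(available)
    # Remaining need of each transaction = declared maximum - current allocation.
    need = [[m - a for m, a in zip(mx, al)] for mx, al in zip(maximum, allocation)]
    finished = [False] * len(allocation)
    order = []
    progressed = True
    while progressed:
        progressed = False
        for i, done in enumerate(finished):
            if not done and all(n <= w for n, w in zip(need[i], work)):
                # Transaction i can finish and return everything it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                order.append(i)
                progressed = True
    return all(finished), order


# One resource type with 12 units in total; three transactions.
safe, order = is_safe([3], [[5], [2], [2]], [[10], [4], [9]])
unsafe, _ = is_safe([2], [[5], [2], [3]], [[10], [4], [9]])
```

In the first call the transactions can finish in the order 1, 0, 2, so the state is safe. In the second, after transaction 1 finishes only 4 units are free, which satisfies neither remaining transaction, so a request leading to that state would be delayed.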
Deadlock Detection and Recovery
This strategy allows deadlocks to occur but continuously monitors
the system for them. If a deadlock is detected, the system takes
action to break the deadlock and allow the system to continue
operating. The steps involved are:
• Detection: The DBMS uses a Wait-for Graph to monitor
transactions and resources. In the wait-for graph, each
transaction is a node, and a directed edge from one transaction
to another indicates that the first transaction is waiting for the
second one. If there is a cycle in this graph, it indicates a
deadlock.
• Recovery: Once a deadlock is detected, the system can resolve
it by:
● Terminating a transaction: The system can choose to
abort one of the transactions involved in the deadlock,
freeing up the resources and allowing the other transactions
to proceed.
● Rollback: The system can roll back one of the transactions
to an earlier state and let it restart. The transaction can then
acquire the necessary resources or choose a different path
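Detection on a wait-for graph amounts to finding a cycle. The sketch below uses a depth-first search over a dictionary that maps each transaction to the transactions it waits for; the function name `find_cycle` and the victim policy are hypothetical choices, not a prescribed method.

```python
def find_cycle(wait_for):
    # wait_for maps a transaction to the transactions it is waiting for.
    visiting, done = set(), set()

    def dfs(node, path):
        visiting.add(node)
        path.append(node)
        for nxt in wait_for.get(node, ()):
            if nxt in visiting:
                # Reached a transaction already on the current path: a cycle.
                return path[path.index(nxt):]
            if nxt not in done:
                cycle = dfs(nxt, path)
                if cycle:
                    return cycle
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for start in wait_for:
        if start not in done:
            cycle = dfs(start, [])
            if cycle:
                return cycle
    return None


graph = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"], "T4": ["T1"]}
cycle = find_cycle(graph)
victim = max(cycle)  # hypothetical policy: abort the transaction with the largest id
```

T4 waits for T1 but is not part of the cycle, so it is not a candidate victim. Real systems usually pick the victim by cost, for example the transaction that has done the least work.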
Comparison of Strategies

Strategy | Description | Pros | Cons
Prevention | Prevents the system from entering a deadlock state. | Ensures no deadlocks occur. | May limit resource usage or flexibility.
Avoidance | Ensures the system avoids unsafe states. | Reduces the chance of deadlock. | Requires knowledge of future resource needs.
Detection & Recovery | Detects deadlocks and recovers from them. | More flexible, allows deadlocks. | Requires overhead to detect and recover.
End of Chapter - 6
