Unit 3_Part 2_Transaction management
DBMS Unit 3 notes by Dr. Deepika Bhatia

The document outlines the syllabus for DBMS Unit 3, covering topics such as database design, transaction management, and ACID properties. It explains the concept of transactions, their operations, states, and the importance of maintaining database consistency through properties like Atomicity, Consistency, Isolation, and Durability. Additionally, it discusses schedules in DBMS, including serial and non-serial schedules, and introduces the concept of serializability and its types.

DBMS UNIT 3-SYLLABUS

Database Design: Dependencies and Normal forms, Functional Dependencies,
1NF, 2NF, 3NF, and BCNF, Higher normal forms - 4NF, 5NF; Transaction management:
ACID properties, Serializability, Concurrency Control (2PL, Timestamp
Protocol), Database Recovery Management - Log based recovery, checkpoints

TOPIC: Transaction management:


Transaction

o A transaction is a set of logically related operations; it contains a group of tasks.


o A transaction is an action, or a series of actions, performed by a single user to
access the contents of the database.

Example: Suppose an employee of a bank transfers Rs 800 from X's account to Y's account.
This small transaction involves several low-level tasks:

X's Account

1. Open_Account(X)
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 800
4. X.balance = New_Balance
5. Close_Account(X)

Y's Account

1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 800
4. Y.balance = New_Balance
5. Close_Account(Y)

Operations of Transaction: Following are the main operations of a transaction:


Read(X): The read operation reads the value of X from the database and stores it in a
buffer in main memory.

Write(X): The write operation writes the value back to the database from the buffer.

Let's take the example of a debit transaction on an account, which consists of the following
operations:

1. R(X);
2. X = X - 500;
3. W(X);

Let's assume the value of X before starting of the transaction is 4000.

o The first operation reads X's value from database and stores it in a buffer.
o The second operation will decrease the value of X by 500. So buffer will contain 3500.
o The third operation will write the buffer's value to the database. So X's final value will
be 3500.

But it may happen that, because of a hardware, software or power failure, the transaction
fails before it finishes all the operations in the set.

For example: If the above debit transaction fails after executing operation 2, then X's value
will remain 4000 in the database, which is not acceptable to the bank.

To solve this problem, we have two important operations:

Commit: It is used to save the work done permanently.

Rollback: It is used to undo the work done.
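
The read/write buffering and the commit/rollback operations described above can be sketched in
a few lines of Python. This is only a toy illustration, not a real DBMS interface; the
database dictionary and the Transaction class are invented for the example.

    # A toy model of Read/Write with commit and rollback.
    # The "database" is just a dictionary; buffered values live in the transaction object.
    database = {"X": 4000}

    class Transaction:
        def __init__(self, db):
            self.db = db
            self.buffer = {}            # values read/modified in main memory

        def read(self, item):           # R(X): copy the value from the database into the buffer
            self.buffer[item] = self.db[item]
            return self.buffer[item]

        def write(self, item, value):   # W(X): change only the buffered copy
            self.buffer[item] = value

        def commit(self):               # make the buffered changes permanent
            self.db.update(self.buffer)

        def rollback(self):             # discard buffered changes; the database is untouched
            self.buffer.clear()

    t = Transaction(database)
    x = t.read("X")
    t.write("X", x - 500)               # debit 500
    t.commit()                          # without this, a failure leaves X = 4000
    print(database["X"])                # 3500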

Topic: What are Transaction Properties, i.e. ACID Properties, in DBMS?

Transaction Properties

A transaction has four properties. These are used to maintain consistency in a database,
before and after the transaction.

Property of Transaction

1. Atomicity

2. Consistency

3. Isolation

4. Durability



Transactions refer to the single logical units of work that access and (possibly) modify the
contents present in any given database. We can access the transactions using the read and write
operations.

If we want to maintain database consistency, then certain properties need to be followed in the
transactions known as the ACID (Atomicity, Consistency, Isolation, Durability) properties. Let
us discuss them in detail.

A – Atomicity
Atomicity means that an entire transaction either takes place all at once or it doesn’t occur at
all. It means that there’s no midway. The transactions can never occur partially. Every
transaction can be considered as a single unit, and they either run to completion or do not get
executed at all. We have the following two operations here:

o Commit: In case a transaction commits, the changes made are visible to us. Thus, atomicity
is also called the 'all or nothing' rule.

o Abort: In case a transaction aborts, the changes made to the database are not visible to us.

Consider this transaction T that consists of T1 and T2: transferring 100 from account A to
account B.



In case the transaction fails when the T1 is completed but the T2 is not completed (say, after
write(A) but before write(B)), then the amount has been deducted from A but not added to B.
This would result in a database state that is inconsistent. Thus, the transaction has to be
executed in its entirety in order to ensure the correctness of the database state.

C – Consistency
Consistency means that we have to maintain the integrity constraints so that any given database
stays consistent both before and after a transaction. If we refer to the example discussed above,
then we have to maintain the total amount, both before and after the transaction.

Total before T occurs = 500 + 200 = 700.

Total after T occurs = 400 + 300 = 700.

Thus, the given database is consistent. Here, an inconsistency would occur if T1 completes
but T2 fails, leaving T incomplete.

I – Isolation
Isolation ensures that multiple transactions can occur concurrently without leading the
database to an inconsistent state. A transaction occurs independently, i.e. without any
interference. Any changes made by a particular transaction are NOT visible to other
transactions until that change has been committed or written to memory.

The property of isolation ensures that executing transactions concurrently results in a state
that is equivalent to the state achieved by executing them serially in some particular order.

Let A = 500, B = 500

Let us consider two transactions here: T and T''



Suppose that T has been executed here till Read(B) and then T’’ starts. As a result, the
interleaving of operations would take place. And due to this, T’’ reads the correct value of A
but incorrect value of B.

T'': (X + B = 50,000 + 500 = 50,500)

Thus, the sum computed here is not consistent with the sum that is obtained at the end of the
transaction:

T: (A + B = 50,000 + 450 = 50,450).

It results in the inconsistency of a database due to the loss of a total of 50 units. The transactions
must, thus, take place in isolation. Also, the changes must only be visible after we have made
them on the main memory.

D – Durability
The durability property states that once the execution of a transaction is complete, the
modifications and updates it made to the database are written to and stored on disk. They
persist even after a system failure. Such updates become permanent and are stored in
non-volatile memory. Thus, the effects of this transaction are never lost.
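
As a concrete, minimal illustration of atomicity, consistency and durability, the sketch below
uses Python's built-in sqlite3 module (any transactional SQL engine would behave similarly);
the table name, account names and amounts are assumptions made up for this example.

    import sqlite3

    conn = sqlite3.connect("bank.db")     # a file on disk: committed changes survive restarts
    conn.execute("CREATE TABLE IF NOT EXISTS accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT OR REPLACE INTO accounts VALUES ('A', 500), ('B', 200)")
    conn.commit()

    try:
        # T: transfer 100 from A to B -- both updates happen or neither (atomicity)
        conn.execute("UPDATE accounts SET balance = balance - 100 WHERE name = 'A'")
        conn.execute("UPDATE accounts SET balance = balance + 100 WHERE name = 'B'")
        conn.commit()                     # changes become permanent (durability)
    except sqlite3.Error:
        conn.rollback()                   # failure mid-way: undo the partial work

    total = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
    print(total)                          # 700 both before and after T (consistency)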

States of Transaction
In a database, the transaction can be in one of the following states -



Active state

o The active state is the first state of every transaction. In this state, the transaction is
being executed.

o For example: Insertion or deletion or updating a record is done here. But all the records
are still not saved to the database.

Partially committed

o In the partially committed state, a transaction executes its final operation, but the data
is still not saved to the database.

o In the total mark calculation example, a final display of the total marks step is executed
in this state.

Committed

A transaction is said to be in a committed state if it executes all its operations successfully. In


this state, all the effects are now permanently saved on the database system.

Failed state

o If any of the checks made by the database recovery system fails, then the transaction is
said to be in the failed state.

o In the example of total mark calculation, if the database is not able to fire a query to
fetch the marks, then the transaction will fail to execute.

Aborted

o If any of the checks fail and the transaction has reached a failed state, then the database
recovery system will make sure that the database is in its previous consistent state. If
not, it will abort or roll back the transaction to bring the database into a consistent
state.

o If the transaction fails in the middle, then all of its executed operations are rolled back
so that the database returns to the consistent state it was in before the transaction began.

o After aborting the transaction, the database recovery module will select one of the two
operations:

1. Re-start the transaction

2. Kill the transaction

SCHEDULES IN DBMS
A schedule is a series of operations from one transaction to another. It is used to preserve
the order of the operations in each of the individual transactions.



1. Serial Schedule

The serial schedule is a type of schedule where one transaction is executed completely before
starting another transaction. In the serial schedule, when the first transaction completes its
cycle, then the next transaction is executed.

For example: Suppose there are two transactions T1 and T2 which have some operations. If there
is no interleaving of operations, then there are the following two possible outcomes:

1. Execute all the operations of T1 followed by all the operations of T2.

2. Execute all the operations of T2 followed by all the operations of T1.

o In the given figure (a), Schedule A shows the serial schedule where T1 is followed by T2.

o In the given figure (b), Schedule B shows the serial schedule where T2 is followed by T1.

2. Non-serial Schedule

o If interleaving of operations is allowed, then there will be a non-serial schedule.

o It contains many possible orders in which the system can execute the individual
operations of the transactions.

o In the given figures (c) and (d), Schedule C and Schedule D are non-serial schedules.
They have interleaving of operations.

3. Serializable schedule

o The serializability of schedules is used to find non-serial schedules that allow the
transaction to execute concurrently without interfering with one another.

o It identifies which schedules are correct when executions of the transaction have
interleaving of their operations.

o A non-serial schedule will be serializable if its result is equal to the result of its
transactions executed serially.

Here,

Schedule A and Schedule B are serial schedules.

Schedule C and Schedule D are non-serial schedules.

Topic: Serializability in DBMS


What is Serializability in DBMS?

In computer science, serializability is a property of a system that describes how different
processes operate on shared data. A concurrent execution is called serializable if the result
it produces is the same as the result of running the operations one after another, i.e. with
no overlapping in the execution of the data. In a DBMS, while data is being written or read,
the DBMS can stop all the other processes from accessing that data.

Testing of Serializability

Serialization Graph is used to test the Serializability of a schedule.

Assume a schedule S. For S, we construct a graph known as the precedence graph. It is a pair
G = (V, E), where V is a set of vertices and E is a set of edges. The set of vertices contains
all the transactions participating in the schedule. The set of edges contains all edges
Ti → Tj for which one of the following three conditions holds:

1. Create an edge Ti → Tj if Ti executes write(Q) before Tj executes read(Q).

2. Create an edge Ti → Tj if Ti executes read(Q) before Tj executes write(Q).

3. Create an edge Ti → Tj if Ti executes write(Q) before Tj executes write(Q).

o If the precedence graph contains an edge Ti → Tj, then in every serial schedule equivalent
to S, all the instructions of Ti must be executed before the first instruction of Tj.

o If the precedence graph for schedule S contains a cycle, then S is non-serializable. If the
precedence graph has no cycle, then S is serializable.

For example:

Explanation:

Read(A): In T1, no subsequent writes to A, so no new edges


Read(B): In T2, no subsequent writes to B, so no new edges
Read(C): In T3, no subsequent writes to C, so no new edges
Write(B): B is subsequently read by T3, so add edge T2 → T3
Write(C): C is subsequently read by T1, so add edge T3 → T1
Write(A): A is subsequently read by T2, so add edge T1 → T2
Write(A): In T2, no subsequent reads to A, so no new edges
Write(C): In T1, no subsequent reads to C, so no new edges
Write(B): In T3, no subsequent reads to B, so no new edges

Precedence graph for schedule S1:

The precedence graph for schedule S1 contains a cycle; that is why schedule S1 is
non-serializable.

Explanation:

Read(A): In T4, no subsequent writes to A, so no new edges

Read(C): In T4, no subsequent writes to C, so no new edges
Write(A): A is subsequently read by T5, so add edge T4 → T5
Read(B): In T5, no subsequent writes to B, so no new edges
Write(C): C is subsequently read by T6, so add edge T4 → T6
Write(B): B is subsequently read by T6, so add edge T5 → T6
Write(C): In T6, no subsequent reads to C, so no new edges
Write(A): In T5, no subsequent reads to A, so no new edges
Write(B): In T6, no subsequent reads to B, so no new edges


Precedence graph for schedule S2:

The precedence graph for schedule S2 contains no cycle; that is why schedule S2 is serializable.
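
A precedence-graph test like the ones above can be sketched in Python. The schedule
representation (a list of (transaction, action, item) tuples) and the function names are
assumptions made for illustration only.

    from collections import defaultdict

    def precedence_graph(schedule):
        """schedule: list of (txn, action, item), action is 'R' or 'W',
        in the order the operations appear."""
        edges = defaultdict(set)
        for i, (ti, ai, x) in enumerate(schedule):
            for tj, aj, y in schedule[i + 1:]:
                # add edge Ti -> Tj for W-R, R-W and W-W pairs on the same item
                if ti != tj and x == y and 'W' in (ai, aj):
                    edges[ti].add(tj)
        return edges

    def has_cycle(edges):
        """Depth-first search for a cycle; a cycle means the schedule is not
        conflict serializable."""
        WHITE, GREY, BLACK = 0, 1, 2
        colour = defaultdict(int)

        def visit(node):
            colour[node] = GREY
            for nxt in edges[node]:
                if colour[nxt] == GREY or (colour[nxt] == WHITE and visit(nxt)):
                    return True
            colour[node] = BLACK
            return False

        return any(colour[n] == WHITE and visit(n) for n in list(edges))

    # S: T1 reads A, T2 writes A, T2 reads B, T1 writes B  ->  edges T1->T2 and T2->T1
    S = [("T1", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T1", "W", "B")]
    g = precedence_graph(S)
    print(dict(g))           # {'T1': {'T2'}, 'T2': {'T1'}}
    print(has_cycle(g))      # True -> not conflict serializable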

Types of Serializability

In a DBMS, all transactions should be arranged in some particular order, even when they run
concurrently. If the transactions are not serializable, they may produce an incorrect result.

In DBMS there are different types of serializability. Each type has some advantages and
disadvantages. The two most common types are

view serializability and conflict serializability.

1. Conflict Serializability

Conflict Serializable Schedule

o A schedule is called conflict serializable if, after swapping its non-conflicting
operations, it can be transformed into a serial schedule.

o A schedule is conflict serializable if it is conflict equivalent to a serial schedule.

Conflicting Operations

Two operations conflict if all of the following conditions are satisfied (see the short check
sketched below):

1. They belong to different transactions.

2. They operate on the same data item.

3. At least one of them is a write operation.
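
A minimal check of these three conditions, assuming the same (transaction, action, item)
representation used in the earlier precedence-graph sketch:

    def conflicts(op1, op2):
        """op = (txn, action, item); action is 'R' or 'W'.
        Two operations conflict iff they come from different transactions,
        touch the same data item, and at least one of them is a write."""
        (t1, a1, x1), (t2, a2, x2) = op1, op2
        return t1 != t2 and x1 == x2 and 'W' in (a1, a2)

    print(conflicts(("T1", "W", "A"), ("T2", "R", "A")))   # True  (W-R on A)
    print(conflicts(("T1", "R", "A"), ("T2", "R", "A")))   # False (read-read never conflicts)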

Example:

Swapping is possible only if S1 and S2 are logically equal.

Here, S1 = S2. That means the operations are non-conflicting.

Here, S1 ≠ S2. That means the operations are conflicting.

Conflict Equivalent

Two schedules are conflict equivalent if one can be transformed into the other by swapping
non-conflicting operations. In the given example, S2 is conflict equivalent to S1 (S1 can be
converted to S2 by swapping non-conflicting operations).

Two schedules are said to be conflict equivalent if and only if:

1. They contain the same set of transactions.

2. Each pair of conflicting operations is ordered in the same way.

Example:



Schedule S2 is a serial schedule because, in this, all operations of T1 are performed before
starting any operation of T2. Schedule S1 can be transformed into a serial schedule by swapping
non-conflicting operations of S1.

After swapping of non-conflicting operations, the schedule S1 becomes:

T1: Read(A), Write(A), Read(B), Write(B)
T2: Read(A), Write(A), Read(B), Write(B)

Since S1 is conflict equivalent to the serial schedule S2, S1 is conflict serializable.

2. View Serializability

View serializability is another notion of serializable execution, in which each transaction
must produce the same results as it would in some proper sequential (serial) execution over
the data items. Unlike conflict serializability, view serializability is judged from what each
transaction reads and which transaction performs the final writes, and it focuses on
preventing inconsistency in the database.

o A schedule is view serializable if it is view equivalent to a serial schedule.

o If a schedule is conflict serializable, then it is also view serializable.

o A view serializable schedule that is not conflict serializable contains blind writes.

View Equivalent

Two schedules S1 and S2 are said to be view equivalent if they satisfy the following conditions:

1. Initial Read

The initial read of both schedules must be the same. Suppose there are two schedules S1 and
S2. If, in schedule S1, a transaction T1 is reading the data item A, then in S2 transaction T1
should also read A.

The above two schedules are view equivalent because the initial read operation in S1 is done
by T1 and in S2 it is also done by T1.

2. Updated Read

In schedule S1, if Ti is reading A, which has been updated by Tj, then in S2 also Ti should
read A as updated by Tj.

The above two schedules are not view equivalent because, in S1, T3 is reading A updated by T2,
while in S2, T3 is reading A updated by T1.

3. Final Write

The final write must be the same in both schedules. In schedule S1, if a transaction T1
updates A last, then in S2 the final write operation should also be done by T1.

The above two schedules are view equivalent because the final write operation in S1 is done
by T3 and in S2 the final write operation is also done by T3.
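
A rough sketch of how the three view-equivalence conditions can be compared programmatically;
the schedule representation and helper names are assumptions for illustration only.

    def view_info(schedule):
        """schedule: list of (txn, action, item). Returns the three things view
        equivalence compares: initial reads, the reads-from relation, final writes."""
        initial_read, reads_from, final_write, last_writer = {}, set(), {}, {}
        for txn, action, item in schedule:
            if action == 'R':
                if item not in last_writer and item not in initial_read:
                    initial_read[item] = txn                       # condition 1: initial read
                if item in last_writer:
                    reads_from.add((txn, item, last_writer[item])) # condition 2: updated read
            else:  # 'W'
                last_writer[item] = txn
                final_write[item] = txn                            # condition 3: final write
        return initial_read, reads_from, final_write

    def view_equivalent(s1, s2):
        return view_info(s1) == view_info(s2)

    S  = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
    S1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
    print(view_equivalent(S, S1))   # False: the final write of A differs (T1 vs T2)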

Example:

Schedule S

With 3 transactions, the total number of possible serial schedules = 3! = 6:

S1 = <T1 T2 T3>
S2 = <T1 T3 T2>
S3 = <T2 T3 T1>
S4 = <T2 T1 T3>
S5 = <T3 T1 T2>
S6 = <T3 T2 T1>

Taking the first serial schedule S1:


Schedule S1

Step 1: Updated Read

In both schedules S and S1, there is no read other than the initial read; that is why we do
not need to check this condition.

Step 2: Initial Read

The initial read operation in S is done by T1 and in S1 it is also done by T1.

Step 3: Final Write

The final write operation in S is done by T3 and in S1 it is also done by T3. So, S and S1 are
view equivalent.

The first schedule S1 satisfies all three conditions, so we do not need to check any other
schedule.

Hence, the view equivalent serial schedule is:

T1 → T2 → T3

Testing of Serializability in DBMS with Examples

Serializability is a property of a DBMS under which each transaction behaves as if it executed
independently and atomically, even though the transactions actually run concurrently. In other
words, if several transactions are executed concurrently, the job of serializability is to
arrange these transactions into an equivalent sequential order.

For a better understanding, let's explain this with an example. Suppose there are two users,
Sona and Archita, and each executes two transactions. Let transactions T1 and T2 be executed
by Sona, and T3 and T4 by Archita. Suppose transaction T1 reads and writes the data item A,
transaction T2 reads the data item B, transaction T3 reads and writes the data item C, and
transaction T4 reads the data item D. Let us schedule the above transactions as below.

T1: Read A → Write A → Read B → Write B
T2: Read B → Write B
T3: Read C → Write C → Read D → Write D
T4: Read D → Write D

Let's first discuss why this schedule is not conflict serializable.

In order for a schedule to be considered serializable, it must first satisfy the conflict
serializability property. In the example schedule above, notice that Transaction 1 (T1) and
Transaction 2 (T2) both read and write data item B. These operations on B conflict, because
the two transactions access and modify the same data item concurrently. Therefore, the given
schedule is not conflict serializable.

However, there is another type of serializability, called view serializability, which our
example does satisfy. View serializability requires that if two transactions cannot see each
other's updates (i.e., one transaction cannot see the effects of another concurrent
transaction), the schedule is considered view serializable. In our example, Transaction 2 (T2)
cannot see any updates made by Transaction 4 (T4) because they do not share common data items.
Therefore, the schedule is view serializable.

It is important to note that conflict serializability is a stronger property than view
serializability: every conflict serializable schedule is also view serializable, but not vice
versa. Conflict serializability requires that all potential conflicts be ordered consistently
before any updates are made, whereas view serializability only requires that if two
transactions cannot see each other's updates, the schedule is view serializable, regardless of
whether there are potential conflicts between them.

All in all, both properties are useful for ensuring correctness of concurrent transactions in
a database management system.

Benefits of Serializability in DBMS

Below are the benefits of using serializability in the database.

1. Predictable execution: Under serializability, transactions behave as if they executed
one at a time, so there are no surprises in the DBMS. All variables are updated as
expected, and there is no data loss or corruption.
2. Easier to reason about and debug: Because each transaction behaves as if it ran alone,
it is much easier to reason about each transaction individually. This makes the debugging
process easier, since we do not have to worry about concurrent interleavings.
3. Reduced costs: Serializability can reduce the cost of the hardware needed for the smooth
operation of the database, and it can also reduce the development cost of the software.

4. Increased performance: In some cases, serializable executions can perform better than
their non-serializable counterparts, since they allow the developer to optimize their code
for performance.

Topic: Concurrency Control in DBMS

DBMS Concurrency Control


Concurrency Control is the management procedure that is required for controlling concurrent
execution of the operations that take place on a database.

But before knowing about concurrency control, we should know about concurrent execution.

Concurrent Execution in DBMS


o In a multi-user system, multiple users can access and use the same database at one time,
which is known as the concurrent execution of the database. It means that the same
database is executed simultaneously on a multi-user system by different users.
o While working on the database transactions, there occurs the requirement of using the
database by multiple users for performing different operations, and in that case,
concurrent execution of the database is performed.
o The thing is that the simultaneous execution that is performed should be done in an
interleaved manner, and no operation should affect the other executing operations, thus
maintaining the consistency of the database. Thus, on making the concurrent execution
of the transaction operations, there occur several challenging problems that need to be
solved.

Problems with Concurrent Execution


In a database transaction, the two main operations are READ and WRITE. These two operations
need to be managed carefully during concurrent execution of transactions, because if they are
interleaved in an uncontrolled manner the data may become inconsistent. The following
problems occur with concurrent execution of operations:

Problem 1: Lost Update Problem (W-W Conflict)


This problem occurs when two different database transactions perform read/write operations
on the same database item in an interleaved manner (i.e., concurrent execution), so that the
value of the item becomes incorrect, making the database inconsistent.


For example:

Consider the below diagram where two transactions TX and TY are performed on the
same account A, where the balance of account A is $300.

o At time t1, transaction TX reads the value of account A, i.e., $300 (only read).
o At time t2, transaction TX deducts $50 from account A that becomes $250 (only
deducted and not updated/write).
o Alternately, at time t3, transaction TY reads the value of account A that will be $300
only because TX didn't update the value yet.



o At time t4, transaction TY adds $100 to account A that becomes $400 (only added but
not updated/write).
o At time t6, transaction TX writes the value of account A that will be updated as $250
only, as TY didn't update the value yet.
o Similarly, at time t7, transaction TY writes the value of account A, so it writes the value
computed at time t4, i.e., $400. This means the value written by TX is lost, i.e., $250 is
lost.

Hence the data becomes incorrect, and the database becomes inconsistent.
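
The timeline above (t1 to t7) can be replayed deterministically in a few lines of Python. This
is only an illustration of the interleaving, not real concurrent code, and the variable names
are made up for the example.

    # Deterministic replay of the lost-update interleaving described above.
    # 'db' stands in for account A; each transaction keeps its own local copy.
    db = {"A": 300}

    tx_local = db["A"]            # t1: TX reads A                        -> 300
    tx_local -= 50                # t2: TX deducts 50 locally             -> 250
    ty_local = db["A"]            # t3: TY reads A (TX has not written)   -> 300
    ty_local += 100               # t4: TY adds 100 locally               -> 400
    db["A"] = tx_local            # t6: TX writes A                       -> 250
    db["A"] = ty_local            # t7: TY writes A                       -> 400, TX's update lost

    print(db["A"])                # 400 instead of the correct 350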

Dirty Read Problem (W-R Conflict)


The dirty read problem occurs when one transaction updates an item of the database, then the
transaction fails, and before the data is rolled back, the updated database item is accessed
by another transaction. This is a Write-Read conflict between the two transactions.

For example:

Consider two transactions TX and TY in the below diagram performing read/write

operations on account A, where the available balance in account A is $300:

o At time t1, transaction TX reads the value of account A, i.e., $300.


o At time t2, transaction TX adds $50 to account A that becomes $350.
o At time t3, transaction TX writes the updated value in account A, i.e., $350.

o Then at time t4, transaction TY reads account A, which will be read as $350.
o Then at time t5, transaction TX rolls back due to a server problem, and the value changes
back to $300 (as it was initially).
o But transaction TY has already read the value $350 for account A, which was never committed.
This is a dirty read and is therefore known as the Dirty Read Problem.

Unrepeatable Read Problem (R-W Conflict)


Also known as the Inconsistent Retrievals Problem, it occurs when, within one transaction, two
different values are read for the same database item.

For example:

Consider two transactions, TX and TY, performing the read/write operations on account
A, having an available balance = $300. The diagram is shown below:

o At time t1, transaction TX reads the value from account A, i.e., $300.
o At time t2, transaction TY reads the value from account A, i.e., $300.
o At time t3, transaction TY updates the value of account A by adding $100 to the
available balance, and then it becomes $400.
o At time t4, transaction TY writes the updated value, i.e., $400.
o After that, at time t5, transaction TX reads the available value of account A, and that
will be read as $400.

o It means that within the same transaction TX, two different values of account A are read,
i.e., $300 initially and, after the update made by transaction TY, $400. This is an
unrepeatable read and is therefore known as the Unrepeatable Read problem.

Thus, in order to maintain consistency in the database and avoid such problems in concurrent
execution, management is needed, and that is where the concept of Concurrency Control comes
into play.

Concurrency Control
Concurrency Control is the working concept that is required for controlling and managing the
concurrent execution of database operations and thus avoiding the inconsistencies in the
database. Thus, for maintaining the concurrency of the database, we have the concurrency
control protocols.

Concurrency Control Protocols


The concurrency control protocols ensure the atomicity, consistency, isolation,
durability and serializability of the concurrent execution of the database transactions.
Therefore, these protocols are categorized as:

1. Lock Based Concurrency Control Protocol

2. Time Stamp Concurrency Control Protocol

3. Validation Based Concurrency Control Protocol

A. Timestamp Ordering Protocol

o The Timestamp Ordering Protocol is used to order the transactions based on their
timestamps. The order of the transactions is simply the ascending order of their
creation timestamps.

o The older transaction has the higher priority, which is why it executes first. To
determine the timestamp of a transaction, this protocol uses system time or a logical
counter.

o The lock-based protocol manages the order between conflicting pairs of transactions at
execution time. Timestamp-based protocols, in contrast, start working as soon as a
transaction is created.

o Let's assume there are two transactions T1 and T2. Suppose transaction T1 entered the
system at time 007 and transaction T2 entered the system at time 009. T1 has the higher
priority, so it executes first, as it entered the system first.

o The timestamp ordering protocol also maintains the timestamps of the last 'read' and
'write' operation on each data item.

The basic timestamp ordering protocol works as follows:

1. Whenever a transaction Ti issues a Read(X) operation, check the following conditions:

o If W_TS(X) > TS(Ti), then the operation is rejected and Ti is rolled back.

o If W_TS(X) <= TS(Ti), then the operation is executed and R_TS(X) is updated to
max(R_TS(X), TS(Ti)).

2. Whenever a transaction Ti issues a Write(X) operation, check the following conditions:

o If TS(Ti) < R_TS(X), then the operation is rejected and Ti is rolled back.

o If TS(Ti) < W_TS(X), then the operation is rejected and Ti is rolled back; otherwise
the operation is executed and W_TS(X) is set to TS(Ti).

Where,

TS(Ti) denotes the timestamp of the transaction Ti.

R_TS(X) denotes the Read time-stamp of data-item X.

W_TS(X) denotes the Write time-stamp of data-item X.
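
A hedged sketch of the basic timestamp ordering checks, using plain dictionaries for R_TS and
W_TS; the class and exception names are assumptions made for illustration only.

    class TimestampError(Exception):
        """Raised when an operation is rejected and the transaction must roll back."""

    class TimestampOrdering:
        def __init__(self):
            self.r_ts = {}   # R_TS(X): largest timestamp of a transaction that read X
            self.w_ts = {}   # W_TS(X): largest timestamp of a transaction that wrote X
            self.db = {}

        def read(self, ts, item):
            # Reject if a younger transaction already wrote X.
            if self.w_ts.get(item, 0) > ts:
                raise TimestampError(f"read({item}) by TS={ts} rejected")
            self.r_ts[item] = max(self.r_ts.get(item, 0), ts)
            return self.db.get(item)

        def write(self, ts, item, value):
            # Reject if a younger transaction already read or wrote X.
            if self.r_ts.get(item, 0) > ts or self.w_ts.get(item, 0) > ts:
                raise TimestampError(f"write({item}) by TS={ts} rejected")
            self.w_ts[item] = ts
            self.db[item] = value

    sched = TimestampOrdering()
    sched.write(ts=7, item="X", value=100)      # T1 (TS = 7) writes X
    sched.read(ts=9, item="X")                  # T2 (TS = 9) reads X: allowed, R_TS(X) = 9
    try:
        sched.write(ts=7, item="X", value=200)  # T1 writes X again: R_TS(X) = 9 > 7 -> rejected
    except TimestampError as err:
        print(err)                              # T1 would be rolled back and restarted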

Advantages and Disadvantages of TO protocol:

o TO protocol ensures serializability since the precedence graph is as follows:

o The TO protocol ensures freedom from deadlock, which means no transaction ever waits.

o But the schedule may not be recoverable and may not even be cascade-free.

Advantages:

High Concurrency: Timestamp-based concurrency control allows for a high degree of
concurrency by ensuring that transactions do not interfere with each other.
Efficient: The technique is efficient and scalable, as it does not require locking and can
handle a large number of transactions.
No Deadlocks: Since there are no locks involved, there is no possibility of deadlocks
occurring.
Improved Performance: By allowing transactions to execute concurrently, the overall
performance of the database system can be improved.

Disadvantages:

Limited Granularity: The granularity of timestamp-based concurrency control is limited
to the precision of the timestamp. This can lead to situations where transactions are
unnecessarily blocked, even if they do not conflict with each other.
Timestamp Ordering: In order to ensure that transactions are executed in the correct
order, the timestamps need to be carefully managed. If not managed properly, it can lead to
inconsistencies in the database.
Timestamp Synchronization: Timestamp-based concurrency control requires that all
transactions have synchronized clocks. If the clocks are not synchronized, it can lead to
incorrect ordering of transactions.
Timestamp Allocation: Allocating unique timestamps for each transaction can be
challenging, especially in distributed systems where transactions may be initiated at
different locations.

B. 2PL (Two-Phase Locking) Protocol

Locking and unlocking of data items should be done in such a way that there is no
inconsistency, no deadlock and no starvation.

Every transaction locks and unlocks data items in two different phases.
• Growing phase − All the locks are acquired in this phase; no locks are released. Once all
changes to data items have been made, the second phase (the shrinking phase) starts.
• Shrinking phase − No new locks are acquired in this phase; all the changes to data items
are recorded (stored) and then the locks are released.
The 2PL locking protocol is represented diagrammatically as follows −

In the growing phase, the transaction reaches a point where all the locks it needs have been
acquired. This point is called the LOCK POINT.
After the lock point has been reached, the transaction enters the shrinking phase.
o The two-phase locking protocol divides the execution of a transaction into three parts.
o In the first part, when the execution of the transaction starts, it seeks permission for
the locks it requires.
o In the second part, the transaction acquires all the locks. The third part starts as
soon as the transaction releases its first lock.
o In the third part, the transaction cannot demand any new locks. It only releases the
already acquired locks.

There are two phases of 2PL:


Growing phase: In the growing phase, new locks on data items may be acquired by the
transaction, but none can be released.
Shrinking phase: In the shrinking phase, existing locks held by the transaction may be
released, but no new locks can be acquired.
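
A minimal sketch of the two-phase rule itself (grow, then shrink, never grow again); it tracks
only the phase discipline, not shared/exclusive lock compatibility, and the class and
exception names are assumptions for illustration.

    class TwoPhaseLockingError(Exception):
        pass

    class Transaction2PL:
        """Tracks only the two-phase rule: once any lock has been released
        (the shrinking phase has begun), no new lock may be acquired."""
        def __init__(self, name):
            self.name = name
            self.locks = set()
            self.shrinking = False

        def lock(self, item):
            if self.shrinking:
                raise TwoPhaseLockingError(f"{self.name}: cannot lock {item} after first unlock")
            self.locks.add(item)          # growing phase

        def unlock(self, item):
            self.shrinking = True         # first unlock: the shrinking phase begins
            self.locks.discard(item)

    t1 = Transaction2PL("T1")
    t1.lock("A"); t1.lock("B")            # growing phase
    t1.unlock("A")                        # shrinking phase starts
    try:
        t1.lock("C")                      # violates 2PL
    except TwoPhaseLockingError as err:
        print(err)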

In the below example, if lock conversion is allowed, then the following can happen:
1. Upgrading of a lock (from S(a) to X(a)) is allowed in the growing phase.
2. Downgrading of a lock (from X(a) to S(a)) must be done in the shrinking phase.
Example:

The following shows how locking and unlocking work with 2PL.
Transaction T1:
o Growing phase: from step 1-3
o Shrinking phase: from step 5-7
o Lock point: at 3
Transaction T2:
o Growing phase: from step 2-6
o Shrinking phase: from step 8-9
o Lock point: at 6

Types: Two phase locking is of 3 types −


1. Strict two-phase locking protocol
A transaction can release a shared lock after the lock point, but it cannot release any
exclusive lock until the transaction commits. This protocol creates a cascadeless schedule.
Cascading schedule: In such a schedule, one transaction depends on another, so if one has to
be rolled back then the other has to be rolled back as well.
2. Rigorous two-phase locking protocol
A transaction cannot release any lock, either shared or exclusive, until it commits.



The 2PL protocol guarantees serializability, but cannot guarantee that deadlock will not
happen.
Example
Let T1 and T2 be two transactions.
T1 = A + B and T2 = B + A

T1 T2

Lock-X(A) Lock-X(B)

Read A; Read B;

Lock-X(B) Lock-X(A)

Here,
Lock-X(B): T1 cannot execute Lock-X(B) since B is locked by T2.
Lock-X(A): T2 cannot execute Lock-X(A) since A is locked by T1.
In the above situation, T1 waits for B and T2 waits for A. The waiting never ends; neither
transaction can proceed unless one of them releases its lock voluntarily. This situation is
called a deadlock.
The wait-for graph is as follows −

Wait-for graph: It is used in the deadlock detection method. Create a node for each
transaction and an edge Ti → Tj if Ti is waiting to lock an item locked by Tj. A cycle in the
WFG indicates that a deadlock has occurred. The WFG is created at regular intervals.
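
A small sketch of deadlock detection on such a wait-for graph, assuming the graph is given as
a dictionary that maps each transaction to the set of transactions it is waiting for:

    def deadlocked(wait_for):
        """wait_for: dict mapping a transaction to the set of transactions it is
        waiting on. A cycle in this wait-for graph means a deadlock."""
        visited, on_stack = set(), set()

        def visit(txn):
            visited.add(txn)
            on_stack.add(txn)
            for nxt in wait_for.get(txn, ()):
                if nxt in on_stack or (nxt not in visited and visit(nxt)):
                    return True
            on_stack.discard(txn)
            return False

        return any(txn not in visited and visit(txn) for txn in list(wait_for))

    # T1 waits for B (held by T2), T2 waits for A (held by T1)
    print(deadlocked({"T1": {"T2"}, "T2": {"T1"}}))   # True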
3. Conservative two-phase locking protocol:
• The transaction must lock all the data items it requires before the transaction begins.
• If any of the data items are not available for locking, then no data items are locked.
• The read and write sets of data items need to be known before the transaction begins,
which is normally not possible.
• The conservative two-phase locking protocol is deadlock-free.
• The conservative two-phase locking protocol does not ensure a strict schedule.

C. Validation Based Protocol

The validation-based protocol is also known as the optimistic concurrency control technique.
In this protocol, a transaction is executed in the following three phases:
1. Read phase: In this phase, the transaction T is read and executed. It reads the values of
the various data items and stores them in temporary local variables. It performs all its
write operations on these temporary variables, without updating the actual database.
2. Validation phase: In this phase, the temporary variable values are validated against the
actual data to check whether serializability would be violated.
3. Write phase: If the transaction passes validation, then the temporary results are written
to the database; otherwise the transaction is rolled back.
Each phase has the following timestamps:
Start(Ti): the time when Ti started its execution.
Validation(Ti): the time when Ti finished its read phase and started its validation phase.
Finish(Ti): the time when Ti finished its write phase.
o This protocol determines the timestamp used for serialization from the timestamp of the
validation phase, as that is the phase which actually decides whether the transaction will
commit or roll back.
o Hence TS(T) = Validation(T).
o Serializability is determined during the validation process; it cannot be decided in
advance.
o While executing transactions, the protocol provides a greater degree of concurrency and
fewer conflicts.
o Thus it leads to transactions with a smaller number of rollbacks.
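
A simplified sketch of the validation test, assuming each transaction is described by its
start/validation/finish timestamps and its read and write sets; real implementations differ in
detail, and all names here are assumptions for illustration.

    def validate(ti, committed):
        """Simplified optimistic validation. ti and every entry of committed are
        dicts with 'start', 'validation', 'finish' timestamps and
        'read_set' / 'write_set' sets of data items."""
        for tj in committed:
            if tj["validation"] >= ti["validation"]:
                continue                           # only older transactions matter
            if tj["finish"] < ti["start"]:
                continue                           # Tj finished before Ti even started
            if tj["finish"] < ti["validation"] and not (tj["write_set"] & ti["read_set"]):
                continue                           # overlap, but Ti never read what Tj wrote
            return False                           # otherwise Ti must be rolled back
        return True

    t_old = {"start": 1, "validation": 3, "finish": 5,
             "read_set": {"A"}, "write_set": {"A"}}
    t_new = {"start": 4, "validation": 6, "finish": None,
             "read_set": {"A"}, "write_set": {"B"}}
    print(validate(t_new, [t_old]))   # False: t_new read A, which t_old wrote after t_new started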

Topic: Thomas Write Rule


The Thomas Write Rule improves the Basic Timestamp Ordering algorithm while still guaranteeing
a serializable order for the protocol.
The Thomas write rules for a Write(X) operation by a transaction T are as follows:
o If TS(T) < R_TS(X), then transaction T is aborted and rolled back, and the operation is
rejected.
o If TS(T) < W_TS(X), then do not execute the write_item(X) operation of the transaction and
continue processing (the outdated write is simply ignored).

o If neither condition 1 nor condition 2 holds, then the WRITE operation of transaction T is
executed and W_TS(X) is set to TS(T).
If we use the Thomas write rule, then some serializable schedules can be permitted that are
not conflict serializable, as illustrated by the schedule in the given figure:

Figure: A Serializable Schedule that is not Conflict Serializable


In the above figure, T1's read of the data item precedes T2's write of the same data item,
which in turn precedes T1's write. This schedule is not conflict serializable.
The Thomas write rule recognises that T2's write is never seen by any transaction. If we
delete this write operation in transaction T2, then a conflict serializable schedule is
obtained, as shown in the figure below.

Figure: A Conflict Serializable Schedule
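
The only difference from the basic timestamp ordering write rule is the second case, where an
obsolete write is ignored rather than causing a rollback. A standalone sketch; the dictionary
parameters and return values are assumptions made for illustration.

    def thomas_write(ts, item, value, r_ts, w_ts, db):
        """Thomas write rule for a Write(X) by a transaction with timestamp ts.
        r_ts / w_ts map items to their read / write timestamps; db holds values."""
        if r_ts.get(item, 0) > ts:
            return "abort"                # TS(T) < R_TS(X): reject and roll back T
        if w_ts.get(item, 0) > ts:
            return "ignored"              # TS(T) < W_TS(X): obsolete write, skip it
        w_ts[item] = ts                   # otherwise perform the write and set W_TS(X)
        db[item] = value
        return "written"

    r_ts, w_ts, db = {}, {"X": 10}, {"X": 1}
    print(thomas_write(5, "X", 99, r_ts, w_ts, db))   # 'ignored': a younger write already happened
    print(db["X"])                                    # still 1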

Topic: Database recovery management

Failure Classification
To find where a problem has occurred, we generalize failures into the following categories:

1. Transaction failure
A transaction failure occurs when a transaction fails to execute or reaches a point from
which it cannot go any further. When a few transactions or processes are affected, this is
called a transaction failure.
Reasons for a transaction failure could be:
1. Logical errors: If a transaction cannot complete due to some code error or an
internal error condition, then a logical error occurs.
2. Syntax error: It occurs when the DBMS itself terminates an active
transaction because the database system is not able to execute it. For
example, the system aborts an active transaction in case of deadlock or
resource unavailability.
2. System crash
A system failure can occur due to power failure or other hardware or software
failure. Example: operating system error.
Fail-stop assumption: In a system crash, non-volatile storage is assumed not to be
corrupted.
3. Disk failure
Disk failure occurs when hard-disk drives or storage drives fail. This was a
common problem in the early days of technology evolution.
Disk failure occurs due to the formation of bad sectors, a disk head crash,
unreachability of the disk, or any other failure which destroys all or part of the disk
storage.

DBMS Recovery Techniques

(A) Log Based Recovery

(i) Immediate Mode (ii) Deferred Mode

(B) Shadow Paging Recovery Method



(C) Checkpoint Recovery Methods

A. Log Based Recovery in DBMS


A log is a sequence of records that contains the history of all updates made to the database.
The log is the most commonly used structure for recording database modifications. Sometimes
the log is also known as the system log.
An update log record has the following fields:
1. Transaction identifier: identifies the transaction that is executing.
2. Data item identifier: identifies the data item that the transaction is updating.
3. The old value of the data item (before the write operation).
4. The new value of the data item (after the write operation).
We denote the various kinds of log records as shown in the following points. This is the basic
format of a log record.
1. <T, Start>: the transaction has started.
2. <T, X, V1, V2>: the transaction has performed a write on data item X. V1 is the value X
had before the write, and V2 is the value X will have after the write.
3. <T, Commit>: the transaction has committed.
4. <T, Abort>: the transaction has aborted.
Consider the data items A and B with initial value 1000 (A = B = 1000).

In the above table, the transaction is written in the left column, and the corresponding log
records are written in the right column.
Key points – the following points should be remembered while studying log based recovery:
1. Whenever a transaction performs a write, it is essential that the log record for that
write is created before the database is modified.
2. Once a log record exists, we can output the modification to the database if required. We
also have the ability to undo a modification that has already been applied to the database.
Log based recovery works in two modes. These modes are as follows:
1. Immediate mode
2. Deferred mode
Log Based Recovery in Immediate Mode
In the immediate mode of log-based recovery, database modifications are performed while the
transaction is still in the active state.
It means that as soon as the transaction executes a WRITE operation, the change is
immediately saved in the database as well. In immediate mode, there is no need to wait for
the execution of the COMMIT statement to update the database.
Explanation
Consider the transaction T1 shown in the above table. The log of this transaction is written
in the second column. So when the values of data items A and B are changed from 1000 to 950
and 1050 respectively, the values of A and B are also updated in the database at that time.
In the case of immediate mode, we need both the old value and the new value of each data item
in the log file.
Now, if the system crashes or fails, the following cases are possible.
Case 1: The system crashes after the transaction executed the Commit statement.
In this case, when the transaction executed the commit statement, the corresponding commit
entry was also made in the log file immediately.
To recover the database, the recovery manager will check the log file; it will find both
<T, Start> and <T, Commit> in the log file, which means that transaction T completed
successfully before the system failed. So the REDO(T) operation will be performed, and the
updated values of data items A and B will be set in the database.
Case 2: The transaction failed before executing the Commit. This means there is no commit
statement in the transaction, as shown in the table given below, so there will be no entry
for Commit in the log file.

So, in this case, when the system fails or crashes, the recovery manager will check the log
file; it will find the <T, Start> entry in the log file but will not find the <T, Commit>
entry.

This means the transaction was not completed successfully before the system failure, so to
ensure the atomicity property the UNDO(T) operation will be performed, because the updated
values were written to the database immediately after each write operation. So the recovery
manager will restore the old values of data items A and B.
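
The REDO/UNDO decisions described in the two cases above can be sketched as follows, assuming
log records are simple tuples; this is an illustration of the idea, not a real recovery
manager.

    def recover_immediate(log, db):
        """log: list of tuples in the formats shown above --
        ('start', T), ('write', T, X, old, new), ('commit', T), ('abort', T).
        Transactions with a commit record are redone; the rest are undone."""
        committed = {rec[1] for rec in log if rec[0] == "commit"}
        # REDO committed transactions in forward order
        for rec in log:
            if rec[0] == "write" and rec[1] in committed:
                _, _, item, _, new = rec
                db[item] = new
        # UNDO uncommitted transactions in reverse order
        for rec in reversed(log):
            if rec[0] == "write" and rec[1] not in committed:
                _, _, item, old, _ = rec
                db[item] = old

    db = {"A": 950, "B": 1000}                                   # crash happened mid-transaction
    log = [("start", "T1"), ("write", "T1", "A", 1000, 950)]     # no commit record for T1
    recover_immediate(log, db)
    print(db)                                 # {'A': 1000, 'B': 1000}: UNDO(T1) restored A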

Log Based Recovery in Deferred Mode: In the deferred mode of log-based recovery, all
modifications to the database are recorded in the log, but the WRITE operations to the
database are deferred until the transaction is partially committed. In other words, in
deferred mode the database is modified only after the commit operation of the transaction is
performed. For database recovery in deferred mode, there are two possible cases.

Case 1: The system fails or crashes after the transaction performed the commit operation. In
this situation, since the transaction performed the commit operation successfully, there will
be an entry for the commit statement in the log file of the transaction.

So after the system failure, when the recovery manager recovers the database, it will check
the log file and find both <T, Start> and <T, Commit>. This means the transaction completed
successfully before the system crash, so the REDO(T) operation will be performed and the
updated values of data items A and B will be set in the database.

Case 2: The transaction failed before executing the Commit. This means there is no commit
statement in the transaction, as shown in the table given below, so there will be no entry
for Commit in the log file.

So, in this case, when the system fails or crashes, the recovery manager will check the log
file and find the <T, Start> entry but not the <T, Commit> entry. This means the transaction
was not completed successfully before the system failure, so to ensure the atomicity property
the recovery manager simply keeps the old values of data items A and B.

Note – In this case of deferred mode, there is no need to perform UNDO(T), because the
updated values of the data items were never written to the database after the WRITE
operations.

In deferred mode, updated values are written only after the transaction commits. So, in this
case, the database still holds the old values of the data items.

B. Shadow Paging Recovery Method

Shadow paging is a commonly used method for database recovery in DBMS. It requires fewer disk
accesses than log-based methods. Here the database is partitioned into a number of
fixed-length blocks known as pages, and two page tables are maintained during the life cycle
of a transaction.

Concept of shadow paging
Now let us see the concept of shadow paging step by step −
• Step 1 − A page is a segment of memory. A page table is an index of pages; each table
entry points to a page on the disk.
• Step 2 − Two page tables are used during the life of a transaction: the current page
table and the shadow page table. The shadow page table is a copy of the current page table.
• Step 3 − When a transaction starts, both tables look identical; only the current table is
updated for each write operation.
• Step 4 − The shadow page table is never changed during the life of the transaction.
• Step 5 − When the current transaction is committed, the shadow page table entry becomes a
copy of the current page table entry, and the disk block with the old data is released.
• Step 6 − The shadow page table is stored in non-volatile memory. If a system crash occurs,
then the shadow page table is copied to the current page table.
The shadow paging scheme is represented diagrammatically as follows −

At the start of the transaction, the two page tables are identical.

Here each entry contains a pointer to a certain block on the disk. The key idea is to maintain
two page tables during the transaction: 1) the current page table and 2) the shadow page
table. When the transaction starts, both tables are identical. During the transaction, all
changes are made through the current page table, while the shadow page table remains as it
was before, so it still describes the state of the database at the start of the transaction.
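
A toy sketch of the two page tables and the copy-on-write behaviour; the list-based 'pages'
and the function names are assumptions made for illustration only.

    # Toy shadow paging: the "database" is a list of pages addressed through a page table.
    pages = ["page0-data", "page1-data", "page2-data"]     # blocks on disk
    shadow_table = [0, 1, 2]             # shadow page table: never modified during the txn
    current_table = list(shadow_table)   # current page table: a copy made when the txn starts

    def write_page(table, page_no, data):
        """Copy-on-write: put the new version in a fresh block and repoint only
        the current page table; the shadow table still sees the old block."""
        pages.append(data)
        table[page_no] = len(pages) - 1

    write_page(current_table, 1, "page1-NEW")

    def commit():
        global shadow_table
        shadow_table = list(current_table)     # atomically adopt the current table

    def abort():
        global current_table
        current_table = list(shadow_table)     # drop all changes: revert to the shadow table

    abort()
    print([pages[i] for i in current_table])   # ['page0-data', 'page1-data', 'page2-data']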

Advantages
The advantages of shadow paging are as follows −
• No need for log records.
• No undo/redo algorithm is needed.
• Recovery is faster.
Disadvantages
The disadvantages of shadow paging are as follows −
• Data is fragmented or scattered.
• Garbage collection problem: database pages containing old versions of modified data
need to be garbage collected after every transaction.
• Concurrent transactions are difficult to execute.

C. Checkpoint Based Recovery Method

When more than one transaction is being executed in parallel, the logs are interleaved. At
recovery time, it would become hard for the recovery system to backtrack through all the logs
and then start recovering. To ease this situation, most modern DBMSs use the concept of
'checkpoints'.
Checkpoint
Keeping and maintaining logs in real time and in a real environment may fill up all the
memory space available in the system. As time passes, the log file may grow too big to be
handled at all. A checkpoint is a mechanism by which all the previous logs are removed from
the system and stored permanently on a storage disk. A checkpoint declares a point before
which the DBMS was in a consistent state and all the transactions were committed.
Recovery via checkpoint
When a system with concurrent transactions crashes and recovers, it behaves in the following
manner −

• The recovery system reads the logs backwards from the end to the last checkpoint.
• It maintains two lists, an undo-list and a redo-list.
• If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn,
Commit>, it puts the transaction in the redo-list.
• If the recovery system sees a log with <Tn, Start> but no commit or abort record, it
puts the transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed. All the
transactions in the redo-list are redone and their logs are retained.
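
A small sketch of how the undo-list and redo-list can be built by scanning the log backwards
to the last checkpoint; for simplicity it assumes that every transaction of interest started
after that checkpoint, and the record formats are assumptions for illustration.

    def checkpoint_recover(log):
        """Scan the log backwards from the end to the most recent checkpoint and
        classify transactions. Records: ('start', T), ('commit', T), ('abort', T),
        ('checkpoint',). Returns (redo_list, undo_list)."""
        redo, undo = set(), set()
        for rec in reversed(log):
            if rec[0] == "checkpoint":
                break
            if rec[0] == "commit":
                redo.add(rec[1])                 # committed after the checkpoint: redo it
            elif rec[0] == "start" and rec[1] not in redo:
                undo.add(rec[1])                 # started but never committed: undo it
        return redo, undo

    log = [("start", "T1"), ("commit", "T1"), ("checkpoint",),
           ("start", "T2"), ("commit", "T2"), ("start", "T3")]
    print(checkpoint_recover(log))               # ({'T2'}, {'T3'})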
