DBDM UNIT 4

Unit 4 of the document focuses on transaction management in databases, explaining the concept of a transaction as a logical unit of processing that involves operations like insert, update, and delete. It outlines the properties of transactions, known as ACID (Atomicity, Consistency, Isolation, Durability), and discusses various transaction states, operations, and types of schedules (serial and non-serial). Additionally, it covers serializability and the importance of maintaining database consistency during concurrent transaction execution.

Uploaded by

Anusha Xavier

DBDM

UNIT - 4 TRANSACTION MANAGEMENT

Database transaction Concepts:

A Database Transaction is a logical unit of processing in a DBMS which entails one or more database
access operations. In a nutshell, database transactions represent real-world events of any enterprise.

A transaction is made to change data in a database which can be done by inserting new data, updating
the existing data, or by deleting the data that is no longer required.

All database access operations between the begin-transaction and end-transaction statements are
considered a single logical transaction in a DBMS. While the transaction is executing, the database
may be temporarily inconsistent. Only when the transaction commits does the database move from one
consistent state to another.
Definition:
A transaction is a set of logically related operations that together make up a group of tasks. It
is an action or series of actions, performed by a single user, that accesses the contents of the
database.
Example: Suppose a bank employee transfers Rs 800 from X's account to Y's account. This small
transaction contains several low-level tasks:

X's Account

Open_Account(X)
Old_Balance = X.balance
New_Balance = Old_Balance - 800
X.balance = New_Balance
Close_Account(X)

Y's Account

Open_Account(Y)
Old_Balance = Y.balance
New_Balance = Old_Balance + 800
Y.balance = New_Balance
Close_Account(Y)

Key facts :
● A transaction is a program unit whose execution may or may not change the contents of a
database.
● The transaction concept in DBMS is executed as a single unit.
● If the database operations do not update the database but only retrieve data, this type of
transaction is called a read-only transaction.
● A successful transaction changes the database from one CONSISTENT STATE to
another.
● DBMS transactions must be atomic, consistent, isolated and durable.
● If the database were in an inconsistent state before a transaction, it would remain in the
inconsistent state after the transaction.

Transaction Operations:
➢ The operations performed in a transaction include one or more of database operations like
insert, delete, update or retrieve data.
➢ A transaction is an atomic process: it is either performed to completion or not
performed at all.
➢ A transaction involving only data retrieval without any data update is called a read-only
transaction.
➢ Each high level operation can be divided into a number of low level tasks or operations. For
example, a data update operation can be divided into three tasks −
● read_item() − reads data item from storage to main memory.
● modify_item() − change value of item in the main memory.
● write_item() − write the modified value from main memory to storage.

Database access is restricted to the read_item() and write_item() operations. For all transactions,
read and write therefore form the basic database operations.
Basic Database Operations:

Read(X): The read operation reads the value of X from the database and stores it in a
buffer in main memory.

Write(X): The write operation writes the value from the buffer back to the database.

Let's take the example of a debit transaction on an account, which consists of the following operations:

1. R(X);
2. X = X - 500;
3. W(X);

Let's assume the value of X before the transaction is 4000.

● The first operation reads X's value from the database and stores it in a buffer.
● The second operation decreases the value of X by 500, so the buffer will contain 3500.
● The third operation writes the buffer's value to the database, so X's final value will be 3500.

However, because of a hardware, software or power failure, the transaction may fail before
finishing all the operations in the set.
For example: If the debit transaction above fails after executing operation 2, then X's value will
remain 4000 in the database, which is not acceptable to the bank.
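
The three steps and the failure case can be traced in a few lines of Python. This is only a minimal sketch, not how a DBMS is implemented: the `database` dictionary stands in for stable storage, `buffer` for main memory, and the `fail_after_step` parameter is a hypothetical switch used purely to simulate a crash.

```python
def debit(database, amount, fail_after_step=None):
    """Run R(X); X = X - amount; W(X), optionally crashing after a given step."""
    buffer = database["X"]              # 1. R(X): copy X from the database into a buffer
    if fail_after_step == 1:
        raise RuntimeError("failure after step 1")
    buffer = buffer - amount            # 2. X = X - amount: only the buffer changes
    if fail_after_step == 2:
        raise RuntimeError("failure after step 2")
    database["X"] = buffer              # 3. W(X): write the buffer back to the database


db = {"X": 4000}
debit(db, 500)
print(db["X"])                          # 3500

db = {"X": 4000}
try:
    debit(db, 500, fail_after_step=2)   # simulated crash before W(X)
except RuntimeError:
    pass
print(db["X"])                          # 4000: the debit never reached the database
```

Because the subtraction happens only in the buffer, a failure before the write leaves the stored value untouched.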

Low Level Operations:


The low level operations performed in a transaction are −
● begin_transaction − A marker that specifies start of transaction execution.
● read_item or write_item − Database operations that may be interleaved with main
memory operations as a part of transaction.
● end_transaction − A marker that specifies end of transaction.
● commit − A signal to specify that the transaction has been successfully completed in
its entirety and will not be undone.
● rollback − A signal to specify that the transaction has been unsuccessful and so all
temporary changes in the database are undone. A committed transaction cannot be
rolled back.
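
In SQL systems these markers correspond roughly to the BEGIN, COMMIT and ROLLBACK statements. The sketch below uses Python's standard `sqlite3` module with an in-memory database. The `account` table and the `balance` helper are made up for illustration, and the mapping of markers to statements in the comments is an approximation.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('X', 4000)")


def balance(name):
    """Read the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM account WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


# A transaction that is rolled back: its temporary change is undone.
conn.execute("BEGIN")                                   # begin_transaction
conn.execute("UPDATE account SET balance = balance - 500 WHERE name = 'X'")
conn.execute("ROLLBACK")                                # rollback
after_rollback = balance("X")

# A transaction that commits: its change becomes permanent.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 500 WHERE name = 'X'")
conn.execute("COMMIT")                                  # commit
after_commit = balance("X")

print(after_rollback, after_commit)                     # 4000 3500
```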
States of Transactions
The various states of a transaction in a DBMS are listed below:

● Active State − A transaction enters the active state when its execution begins. During this
state, read or write operations can be performed.
● Partially Committed State − A transaction goes into the partially committed state after the end
of the transaction.
● Committed State − When a transaction reaches the committed state, it has already completed its
execution successfully. Moreover, all of its changes are recorded in the database permanently.
● Failed State − A transaction is considered failed when any one of the checks fails or if the
transaction is aborted while it is in the active state.
● Terminated State − A transaction reaches the terminated state when it leaves the system and
can’t be restarted.

Let’s study a state transition diagram that highlights how a transaction moves between these various
states.
1. Once a transaction starts execution, it becomes active. It can issue READ or WRITE
operations.
2. Once the READ and WRITE operations complete, the transaction enters the partially
committed state.
3. Next, some recovery protocols need to ensure that a system failure will not result in an
inability to record changes in the transaction permanently. If this check is a success, the
transaction commits and enters into the committed state.
4. If the check is a fail, the transaction goes to the Failed state.
5. If the transaction is aborted while it’s in the active state, it goes to the failed state. The
transaction should be rolled back to undo the effect of its write operations on the database.
6. The terminated state refers to the transaction leaving the system.
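
The transitions listed above can be written down as a small lookup table. This is a sketch for checking one's understanding, not part of any DBMS. The state names are lower-cased versions of the ones in the text, and the edges into `terminated` are an assumption, since the text only says that the transaction leaves the system.

```python
# Allowed moves between transaction states, following the numbered steps.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "committed": {"terminated"},      # assumed: a finished transaction leaves the system
    "failed": {"terminated"},         # assumed: a rolled-back transaction leaves the system
    "terminated": set(),
}


def is_valid_path(states):
    """Return True if the sequence starts in 'active' and every move is allowed."""
    if not states or states[0] != "active":
        return False
    return all(b in TRANSITIONS[a] for a, b in zip(states, states[1:]))


print(is_valid_path(["active", "partially committed", "committed", "terminated"]))  # True
print(is_valid_path(["active", "failed", "terminated"]))                            # True
print(is_valid_path(["active", "committed"]))                                       # False
```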

Transaction properties:

Transactions access data using read and write operations. To maintain consistency in a database
before and after a transaction, every transaction must satisfy four properties. These are called
the ACID properties.
Atomicity

● It states that either all operations of the transaction take place or none do. If any operation
cannot be completed, the transaction is aborted.
● There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated as
one unit and either runs to completion or is not executed at all.

Atomicity involves the following two operations:

Abort: If a transaction aborts, then none of the changes it made are visible.

Commit: If a transaction commits, then all the changes it made are visible.

Example: Let's assume a transaction T consisting of two parts, T1 and T2. Account A holds Rs 600 and
account B holds Rs 300. T transfers Rs 100 from account A to account B.

T1              T2

Read(A)         Read(B)

A := A - 100    B := B + 100

Write(A)        Write(B)

After completion of the transaction, A consists of Rs 500 and B consists of Rs 400.

If the transaction T fails after the completion of T1 but before the completion of T2, then the
amount will be deducted from A but not added to B. This leaves the database in an inconsistent
state. To ensure the correctness of the database state, the transaction must be executed in its entirety.
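
One way to picture the all-or-nothing behaviour is to keep a copy of the old state and restore it on failure. The sketch below does this for the A/B transfer. Real systems use logs rather than full copies, and the `fail_between` flag is a hypothetical switch that simulates a crash between T1 and T2.

```python
import copy


def transfer(accounts, source, target, amount, fail_between=False):
    """Move `amount` from source to target as one all-or-nothing unit."""
    snapshot = copy.deepcopy(accounts)     # state to restore if anything fails
    try:
        accounts[source] -= amount         # T1: deduct from the source account
        if fail_between:
            raise RuntimeError("failure after T1, before T2")
        accounts[target] += amount         # T2: add to the target account
    except RuntimeError:
        accounts.clear()
        accounts.update(snapshot)          # abort: undo the partial change
        return False
    return True                            # commit: both changes are kept


accounts = {"A": 600, "B": 300}
print(transfer(accounts, "A", "B", 100), accounts)   # True {'A': 500, 'B': 400}

accounts = {"A": 600, "B": 300}
print(transfer(accounts, "A", "B", 100, fail_between=True), accounts)
# False {'A': 600, 'B': 300}
```

In both runs the total of the two accounts stays at 900, either because both steps happened or because neither did.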

Consistency

● The integrity constraints are maintained so that the database is consistent before and after the
transaction.
● The execution of a transaction will leave a database in either its prior stable state or a new
stable state.
● The consistency property states that every transaction sees a consistent
database instance.
● The transaction is used to transform the database from one consistent state to another
consistent state.
For example: The total amount must be the same before and after the transaction.

1. Total before T occurs = 600+300=900


2. Total after T occurs= 500+400=900

Therefore, the database is consistent. In the case when T1 is completed but T2 fails, then
inconsistency will occur.

Isolation:

● It states that data being used by one transaction during its execution cannot be used
by a second transaction until the first one is completed.
● In isolation, if the transaction T1 is being executed and is using the data item X, then that
data item can't be accessed by any other transaction T2 until the transaction T1 ends.
● The concurrency control subsystem of the DBMS enforces the isolation property.

Durability:
The durability property concerns the permanence of the database's consistent state. It states that the
changes made by a committed transaction are permanent.
● These changes cannot be lost through the erroneous operation of a faulty transaction or through a
system failure. When a transaction is completed, the database reaches a consistent state, and
that consistent state cannot be lost, even in the event of a system failure.
● The recovery subsystem of the DBMS is responsible for the durability property.

Schedules
A schedule is defined as an execution sequence of transactions.
A schedule maintains the order of the operation in each individual transaction. A schedule is the
arrangement of transaction operations.
A schedule may contain a set of transactions. We already know that a transaction is a set of operations.
To run transactions concurrently, we arrange or schedule their operations in an interleaved fashion.

Definition:
Transactions are sets of instructions that perform operations on a database. When multiple
transactions run concurrently, the order in which their operations are performed must be fixed,
because only one operation can be performed on the database at a time. This sequence of operations
is known as a Schedule, and the process of producing it is known as Scheduling.
Schedules are divided into 2 categories, which are as follows −

● Serial Schedule
● Non-serial Schedule

Serial Schedule
As the name says, all the transactions are executed serially one after the other.

In serial Schedule, a transaction does not start execution until the currently running transaction
finishes execution.

This type of execution of the transaction is also known as non-interleaved execution.

Serial schedules are always recoverable, cascadeless, strict and consistent.

Consider two transactions T1 and T2, which perform some operations. If there is no
interleaving of operations, then there are the following two possible outcomes:

● Execute all T1 operations, followed by all T2 operations.
● Execute all T2 operations, followed by all T1 operations.

In the figure for this example, the Schedule is the serial Schedule where T1 is followed by T2,
i.e. T1 -> T2. Here R(A) denotes reading some data item ‘A’, and W(B) denotes writing/updating
some data item ‘B’.

If n = number of transactions, then the number of serial schedules possible = n!.
Therefore, for the above 2 transactions, the total number of serial schedules possible = 2! = 2.
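
The n! count can be checked by listing the serial orders directly. This is a small sketch using Python's standard library; the helper name `serial_schedules` is made up for illustration.

```python
from itertools import permutations
from math import factorial


def serial_schedules(transactions):
    """Return every serial order of the given transaction names."""
    return list(permutations(transactions))


orders = serial_schedules(["T1", "T2"])
print(orders)                         # [('T1', 'T2'), ('T2', 'T1')]

three = serial_schedules(["T1", "T2", "T3"])
print(len(three) == factorial(3))     # True
```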
Non-serial Schedule
In a non-serial Schedule, multiple transactions execute concurrently/simultaneously, unlike the serial
Schedule, where one transaction must wait for another to complete all its operations.

In a non-serial schedule, a transaction proceeds without waiting for the previous transaction to
complete. The operations of the transactions are interleaved or mixed with each other. Non-serial
schedules are NOT always recoverable, cascadeless, strict and consistent.
In this Schedule, there are two transactions, T1 and T2, executing concurrently. The operations of T1 and T2
are interleaved. So, this Schedule is an example of a Non-Serial Schedule.
Total number of non-serial schedules = Total number of schedules – Total number of serial schedules
Non-serial schedules are further categorized into serializable and non-serializable schedules. Let's now
discuss serializability further.
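
The formula for the number of non-serial schedules needs the total number of schedules. The text does not state how to obtain it. The sketch below assumes the standard counting result: if the transactions have n1, n2, ... operations, and each transaction's own operations stay in order, the total is (n1 + n2 + ...)! / (n1! × n2! × ...).

```python
from math import factorial


def count_schedules(op_counts):
    """Return (total, serial, non_serial) schedule counts.

    op_counts[i] is the number of operations in transaction i.
    """
    total = factorial(sum(op_counts))
    for n in op_counts:
        total //= factorial(n)          # each transaction's internal order is fixed
    serial = factorial(len(op_counts))  # n! orderings of whole transactions
    return total, serial, total - serial


# Two transactions with two operations each.
print(count_schedules([2, 2]))          # (6, 2, 4)
```

So two transactions of two operations each can be arranged in 6 schedules, of which 2 are serial and 4 are non-serial.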

Difference between Serial Schedule and Serializable Schedule

Serializability in DBMS
➢ Serializability is a concept that helps to identify which non-serial schedules are
correct and will maintain the consistency of the database.
➢ A serializable schedule always leaves the database in a consistent state.
➢ A serial schedule is always a serializable schedule because, in a serial Schedule,
a transaction only starts when the other transaction has finished execution.
➢ A non-serial schedule of n transactions is said to be a serializable schedule if it is
equivalent to some serial schedule of those n transactions.
➢ A serial schedule does not allow concurrency; only one transaction executes at a
time, and the other starts when the already running transaction is finished.

Types of Serializability:
1. Conflict Serializability
2. View Serializability
Conflict Serializability

A schedule is called conflict serializable if it can be transformed into a serial schedule by
swapping non-conflicting operations. A pair of operations is conflicting if all of the following
conditions are satisfied:

1. Both belong to separate transactions.
2. They operate on the same data item.
3. At least one of them is a write operation.

In this schedule, Write(A)/Read(A) and Write(B)/Read(B) are conflicting operations, because all
the above conditions hold true for them. By swapping the non-conflicting operations, namely the
2nd read/write pair on data item 'A' and the 1st read/write pair on data item 'B', this non-serial
Schedule can be converted into a serial Schedule. Therefore, it is conflict serializable.
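
A common way to test conflict serializability mechanically is the precedence graph, a technique the text above does not describe. Draw an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj. The schedule is conflict serializable exactly when the graph has no cycle. The sketch below assumes a schedule is a list of `(transaction, "R" or "W", item)` tuples, and the two sample schedules are made up for illustration.

```python
def conflicts(a, b):
    """Two operations conflict if they come from different transactions,
    touch the same data item, and at least one of them is a write."""
    (t1, op1, x1), (t2, op2, x2) = a, b
    return t1 != t2 and x1 == x2 and "W" in (op1, op2)


def is_conflict_serializable(schedule):
    """Build the precedence graph and report whether it is acyclic."""
    nodes = {t for t, _, _ in schedule}
    edges = {t: set() for t in nodes}
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                edges[a[0]].add(b[0])      # a's transaction must come first

    # Repeatedly remove a node with no incoming edge; a cycle blocks this.
    remaining = set(nodes)
    while remaining:
        free = [n for n in remaining
                if not any(n in edges[m] for m in remaining)]
        if not free:
            return False
        remaining.remove(free[0])
    return True


s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
      ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
s2 = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]

print(is_conflict_serializable(s1))   # True: only T1 -> T2 edges
print(is_conflict_serializable(s2))   # False: T1 -> T2 and T2 -> T1 form a cycle
```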

View Serializability:

● A schedule is view serializable if it is view equivalent to a serial schedule.
● If a schedule is conflict serializable, then it is also view serializable.
● A view serializable schedule that is not conflict serializable contains blind writes.
Non-Serializability in DBMS
A non-serial schedule that is not serializable is called a non-serializable schedule.
Non-serializable schedules may or may not be consistent or recoverable. Non-serializable schedules
are divided into two types:

1. Recoverable Schedule
2. Non-recoverable Schedule

Recoverable Schedule
A schedule is recoverable if each transaction commits only after all the transactions from which it has
read have committed. In other words, if some transaction Ty reads a value that has been updated/written
by some other transaction Tx, then the commit of Ty must occur after the commit of Tx.
The schedule shown above is Recoverable since T1 commits before T2, which makes the value
read by T2 correct.
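
The commit-order rule can be checked with a short scan over a schedule. This is a minimal sketch. It assumes a schedule is a list of `(transaction, action, item)` tuples with actions `"R"`, `"W"` and `"C"` (commit), and it ignores aborts for simplicity. The two sample schedules are made up for illustration.

```python
def is_recoverable(schedule):
    """Check that every transaction commits only after the transactions
    it has read from have committed."""
    last_writer = {}       # item -> transaction that wrote it most recently
    reads_from = {}        # reader -> set of writers it depends on
    committed = set()
    for txn, action, item in schedule:
        if action == "W":
            last_writer[item] = txn
        elif action == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif action == "C":
            if not reads_from.get(txn, set()) <= committed:
                return False   # txn commits before a transaction it read from
            committed.add(txn)
    return True


# T2 reads A from T1; T1 commits first, so the schedule is recoverable.
ok = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "C", None), ("T2", "C", None)]
# Same reads, but T2 commits before T1.
bad = [("T1", "W", "A"), ("T2", "R", "A"), ("T2", "C", None), ("T1", "C", None)]

print(is_recoverable(ok))    # True
print(is_recoverable(bad))   # False
```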

Recoverable schedules are further categorized into 3 types:

1. Cascading Schedule
2. Cascadeless Schedule
3. Strict Schedule

Cascading Schedule
If in a schedule, several other dependent transactions are forced to rollback/abort because of the failure
of one transaction, then such a schedule is called a Cascading Schedule or Cascading Rollback or
Cascading Abort. It simply leads to the wastage of CPU time.
Here, Transaction T2 depends on transaction T1, and transaction T3 depends on transaction T2. Thus,
in this Schedule, the failure of transaction T1 will cause transaction T2 to roll back, and a similar case
for transaction T3. Therefore, it is a cascading schedule. If the transactions T2 and T3 had been
committed before the failure of transaction T1, then the Schedule would have been irrecoverable.
Cascadeless Schedule
If in a schedule, a transaction is not allowed to read a data item until the last transaction
that has written it has committed or aborted, then such a schedule is called a Cascadeless Schedule.
It avoids cascading rollback and thus saves CPU time.
To prevent cascading rollbacks, it disallows a transaction from reading uncommitted changes from
another transaction in the same Schedule.
In other words, if some transaction Ty wants to read a value that has been updated or written by some
other transaction Tx, then Ty may read it only after the commit of Tx. Look at the
example shown below.

Here, the updated value of X is read by transaction T2 only after the commit of transaction T1. Hence,
the Schedule is a Cascadeless schedule.

Strict Schedule
If in a schedule, a transaction is allowed neither to read nor to write a data item until the last
transaction that has written it has committed or aborted, then such a schedule is called a Strict Schedule.

Let's say we have two transactions Ta and Tb. If a write operation of transaction Ta precedes a read or
write operation of transaction Tb on the same data item, then the commit or abort operation of
transaction Ta must also precede that read or write of Tb.

A strict Schedule allows only committed values to be read or overwritten. This Schedule imposes more
restrictions than a cascadeless schedule. Consider an example shown below.
Here, transaction Tb reads/writes the value written by transaction Ta only after transaction Ta commits.
Hence, the Schedule is a strict Schedule.
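
The difference between cascadeless and strict schedules can be made concrete with a small checker. This is a simplified sketch. It assumes a schedule is a list of `(transaction, action, item)` tuples with actions `"R"`, `"W"`, `"C"` (commit) and `"A"` (abort), and it tracks only the most recent uncommitted writer of each item. The sample schedules are made up for illustration.

```python
def classify(schedule):
    """Return (cascadeless, strict) for the given schedule."""
    uncommitted_writer = {}    # item -> transaction whose write is still uncommitted
    cascadeless = strict = True
    for txn, action, item in schedule:
        if action in ("C", "A"):
            # Once a transaction finishes, its writes are no longer "dirty".
            for key in [k for k, t in uncommitted_writer.items() if t == txn]:
                del uncommitted_writer[key]
            continue
        dirty = uncommitted_writer.get(item) not in (None, txn)
        if dirty:
            strict = False                 # read or write of an uncommitted value
            if action == "R":
                cascadeless = False        # specifically a dirty read
        if action == "W":
            uncommitted_writer[item] = txn
    return cascadeless, strict


strict_s = [("Ta", "W", "X"), ("Ta", "C", None),
            ("Tb", "R", "X"), ("Tb", "W", "X"), ("Tb", "C", None)]
overwrite = [("T1", "W", "X"), ("T2", "W", "X"), ("T1", "C", None), ("T2", "C", None)]
dirty_read = [("T1", "W", "X"), ("T2", "R", "X"), ("T1", "C", None), ("T2", "C", None)]

print(classify(strict_s))     # (True, True)
print(classify(overwrite))    # (True, False): cascadeless but not strict
print(classify(dirty_read))   # (False, False)
```

The middle case shows why strict is the stronger condition: overwriting an uncommitted value involves no dirty read, yet it still breaks strictness.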

Non-Recoverable Schedule
If a transaction reads a value written by an uncommitted transaction and commits before
the transaction from which it has read the value, then such a schedule is called a
Non-Recoverable schedule.

A non-recoverable schedule means that when there is a system failure, we may not be able to recover to a
consistent database state. If Tj reads a value written by Ti and the commit operation of Ti doesn't
occur before the commit operation of Tj, the schedule is non-recoverable.

Consider the following Schedule involving two transactions T1 and T2. T2 reads the value of A written
by T1 and commits. T1 might later abort, in which case the value read by T2 is wrong, but since
T2 has already committed, it cannot be rolled back. This Schedule is therefore non-recoverable.

Concurrency Control
Concurrency control is part of transaction management in a database management system (DBMS).
It is a procedure that lets simultaneous processes execute without conflicting with each other;
such conflicts occur in multi-user systems.
Concurrency simply means executing multiple transactions at the same time. It is required to
increase time efficiency. If many transactions try to access the same data, inconsistency can arise.
Concurrency control is required to keep the data consistent.
For example, if ATM machines did not support concurrency, multiple people could not withdraw
money at the same time in different places. This is where we need concurrency.
Advantages
The advantages of concurrency control are as follows −

● Waiting time will decrease.
● Response time will decrease.
● Resource utilization will increase.
● System performance and efficiency will increase.
Concurrency control Definition
The simultaneous execution of transactions over shared databases can create several data integrity
and consistency problems.
For example, if many people are using ATM machines at once, the bank servers must apply the updates
in a synchronized, serial order whenever a transaction is done. If not, the database will hold
wrong information and wrong data.
Main problems in using Concurrency
The problems which arise while using concurrency are as follows −

● Lost update − One transaction makes a change and another transaction overwrites
that change. One transaction nullifies the update of another transaction.
● Uncommitted dependency or dirty read problem − One transaction updates a variable and,
before that update is committed, another transaction accesses the same variable. If the
first transaction is then rolled back, the second transaction has worked with a false
value that was never committed. This is a major problem.
● Inconsistent retrievals − One transaction is updating several different variables while
another transaction is reading those variables. The reading transaction sees some values
from before the update and some from after it, so the same data appears inconsistent.
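
The lost update problem is easy to reproduce by interleaving two read-modify-write sequences. In the sketch below, each transaction works on its own private buffer, and a schedule is a list of `(transaction, action, amount)` steps. This representation and the sample amounts are assumptions made for illustration.

```python
def run(schedule, database):
    """Execute the steps in order; each transaction has its own buffer."""
    buffers = {}
    for txn, action, amount in schedule:
        if action == "read":
            buffers[txn] = database["X"]       # copy X into the private buffer
        elif action == "add":
            buffers[txn] += amount             # change only the private buffer
        elif action == "write":
            database["X"] = buffers[txn]       # write the buffer back
    return database["X"]


serial = [("T1", "read", 0), ("T1", "add", -500), ("T1", "write", 0),
          ("T2", "read", 0), ("T2", "add", 200), ("T2", "write", 0)]
interleaved = [("T1", "read", 0), ("T2", "read", 0), ("T1", "add", -500),
               ("T2", "add", 200), ("T1", "write", 0), ("T2", "write", 0)]

print(run(serial, {"X": 4000}))        # 3700
print(run(interleaved, {"X": 4000}))   # 4200: T1's debit has been lost
```

In the interleaved run both transactions read 4000, so T2's final write overwrites T1's result.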
Concurrency control techniques
The concurrency control techniques are as follows −
1. Locking
2. Time Stamping
3. Optimistic
Locking
A lock guarantees exclusive use of a data item to the current transaction. The transaction acquires
a lock before accessing the data item, and releases the lock after it completes.
Types of Locks
The types of locks are as follows −

● Shared Lock [Transaction can read only the data item values]
● Exclusive Lock [Used for both read and write data item values]
1. Shared Lock (S):
A shared lock is also called a Read-only lock. With the shared lock, the data item can be shared
between transactions, because a transaction holding a shared lock never has permission to update
the data item.

For example, consider a case where two transactions are reading the account balance of a person. The
database will let them read by placing a shared lock. However, if another transaction wants to update
that account’s balance, the shared lock prevents it until the reading process is over.

2. Exclusive Lock (X):

With the Exclusive Lock, a data item can be read as well as written. This lock is exclusive and can’t
be held concurrently with any other lock on the same data item. An X-lock is requested using the
lock-x instruction.
Transactions may unlock the data item after finishing the ‘write’ operation.
For example, when a transaction needs to update the account balance of a person, the database allows
this by placing an X lock on the data item. When a second transaction then wants to read or write it,
the exclusive lock prevents this operation.
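
The compatibility rule for the two lock types fits in a few lines: shared locks can coexist, and any combination involving an exclusive lock cannot. The `LockTable` class below is a hypothetical sketch of that rule. It returns False instead of making the requester wait, and it does not handle lock upgrades or repeated requests by the same transaction.

```python
class LockTable:
    """Grants shared (S) and exclusive (X) locks on data items."""

    def __init__(self):
        self.locks = {}        # item -> (mode, set of transactions holding it)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)               # shared locks are compatible
            return True
        return False                       # any combination involving X conflicts

    def release(self, txn, item):
        """Drop txn's lock on item; free the item when no holder is left."""
        _mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]


table = LockTable()
print(table.acquire("T1", "balance", "S"))   # True
print(table.acquire("T2", "balance", "S"))   # True: two readers can share
print(table.acquire("T3", "balance", "X"))   # False: the update must wait
table.release("T1", "balance")
table.release("T2", "balance")
print(table.acquire("T3", "balance", "X"))   # True
print(table.acquire("T1", "balance", "S"))   # False: X excludes readers too
```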
Time Stamping
A timestamp is a unique identifier created by the DBMS that indicates the relative starting time of a
transaction. For every transaction, the DBMS stores the starting time of the transaction as its
timestamp.
A timestamp can be generated using a system clock or a logical counter, and is assigned whenever a
transaction starts. With a logical counter, the counter is incremented after each new timestamp has
been assigned.
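
A logical counter of this kind can be sketched with `itertools.count` from the Python standard library. The `TimestampGenerator` class and its method name are made up for illustration.

```python
import itertools


class TimestampGenerator:
    """Hands out unique, increasing timestamps from a logical counter."""

    def __init__(self):
        self._counter = itertools.count(1)

    def start_transaction(self, name):
        # next() returns the current value and advances the counter,
        # so no two transactions ever share a timestamp.
        return name, next(self._counter)


clock = TimestampGenerator()
t1 = clock.start_transaction("T1")
t2 = clock.start_transaction("T2")
print(t1, t2)            # ('T1', 1) ('T2', 2)
print(t1[1] < t2[1])     # True: T1 started first, so it is the older transaction
```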
Optimistic
It is based on the assumption that conflict is rare and it is more efficient to allow transactions to proceed
without imposing delays to ensure serializability.
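
One common way to realise this idea, used here only as an illustration, is to let a transaction read and compute without locks, then check at the end whether anyone else has written the item in the meantime. The sketch below uses a version number for that check. `VersionedItem`, `optimistic_update` and the `interfere` hook, which simulates another transaction committing in between, are all hypothetical names.

```python
class VersionedItem:
    """A data item whose version increases with every committed write."""

    def __init__(self, value):
        self.value = value
        self.version = 0


def optimistic_update(item, compute, interfere=None):
    """Read without locking, compute, then validate before writing."""
    seen_version = item.version          # read phase: remember what we saw
    new_value = compute(item.value)
    if interfere is not None:
        interfere(item)                  # simulated commit by another transaction
    if item.version != seen_version:     # validation: did anyone else write?
        return False                     # conflict: the caller should restart
    item.value = new_value               # write phase
    item.version += 1
    return True


def other_commit(item):
    """Another transaction deposits 200 and commits."""
    item.value += 200
    item.version += 1


x = VersionedItem(4000)
print(optimistic_update(x, lambda v: v - 500), x.value)                # True 3500
print(optimistic_update(x, lambda v: v - 500, other_commit), x.value)  # False 3700
```

When conflicts are rare, almost every call takes the first path and no transaction is ever delayed by a lock.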
Two Phase Locking Protocol

The Two Phase Locking Protocol, also known as the 2PL protocol, is a method of concurrency control in
DBMS that ensures serializability by applying locks to the data a transaction uses, which blocks other
transactions from accessing the same data simultaneously. The Two Phase Locking protocol helps to
eliminate concurrency problems in DBMS.
This locking protocol divides the execution of a transaction into three different parts.
● In the first part, when the transaction begins to execute, it requests permission for the
locks it needs.
● In the second part, the transaction acquires all the locks. The third part starts as soon as
the transaction releases its first lock.
● In the third part, the transaction cannot demand any new locks. Instead, it only releases the
acquired locks.
The Two-Phase Locking protocol allows each transaction to make a lock or unlock request in two steps:
● Growing Phase: In this phase, a transaction may obtain locks but may not release any locks.
● Shrinking Phase: In this phase, a transaction may release locks but may not obtain any new locks.
It is true that the 2PL protocol offers serializability. However, it does not ensure that deadlocks do not
happen. In the above-given diagram, you can see that local and global deadlock detectors search for
deadlocks and resolve them by returning transactions to their initial states.
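
The growing/shrinking rule can be verified for a given sequence of lock requests. The sketch below assumes the requests are given as `(transaction, "lock" or "unlock", item)` tuples. It only checks the two-phase rule and does not detect deadlocks.

```python
def follows_two_phase_locking(operations):
    """Check each transaction's lock/unlock requests against the 2PL rule."""
    shrinking = set()                 # transactions that have released a lock
    for txn, action, _item in operations:
        if action == "unlock":
            shrinking.add(txn)        # the first release starts the shrinking phase
        elif action == "lock" and txn in shrinking:
            return False              # new lock requested after a release
    return True


good = [("T1", "lock", "A"), ("T1", "lock", "B"),
        ("T1", "unlock", "A"), ("T1", "unlock", "B")]
bad = [("T1", "lock", "A"), ("T1", "unlock", "A"), ("T1", "lock", "B")]

print(follows_two_phase_locking(good))   # True
print(follows_two_phase_locking(bad))    # False
```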
Characteristics of Good Concurrency Protocol

An ideal concurrency control DBMS mechanism has the following objectives:


● Must be resilient to site and communication failures.
● It allows the parallel execution of transactions to achieve maximum concurrency.
● Its storage mechanisms and computational methods should be modest to minimize
overhead.
● It must enforce some constraints on the structure of atomic actions of transactions.
