DMC Theory Study Material 3
UNIT 2
What is a transaction?
A simple understanding:
In everyday terms, a transaction is a completed agreement between a buyer and a seller to exchange goods or services.
In a database, a transaction is any operation that reads data from or writes data to the database.
A transaction is carried out by a set of statements that perform read or write operations on the database.
For example, a set of READ and WRITE statements is executed together to carry out a transaction transferring Rs 500 to your friend's account.
[Diagram: Beginning of transaction (CONSISTENT STATE) -> Transaction in progress (INCONSISTENT STATE): READ Y, WRITE Y = Y - 500, WRITE Y, READ F, WRITE F = F + 500, WRITE F -> End of transaction, completed (CONSISTENT STATE)]
All these SQL statements must execute successfully. If any statement fails, the entire transaction is ROLLED BACK to the previous consistent state (the state before the transaction started).
A successful transaction takes a database from one consistent state to another, that is, from the previous consistent state to the next consistent state.
[Diagram: Consistent State -> (transaction in progress: Inconsistent State) -> Consistent State]
Not all transactions update the database.
For eg: In an employee table (Employee), if we want to find the salary of the employee whose employee number is 1057, the query is a single SELECT statement (e.g. SELECT salary FROM Employee WHERE emp_no = 1057).
In this type of transaction the database was in a consistent state before it was accessed for the READ operation, and it remained in a consistent state because no changes were made to the database.
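A minimal sketch of such a read-only transaction, using Python's built-in sqlite3 module. The column names emp_no and salary, and the salary value, are assumed here for illustration; the source only names the Employee table and employee number 1057.

```python
import sqlite3

# In-memory database standing in for the Employee table from the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (emp_no INTEGER PRIMARY KEY, salary REAL)")
conn.execute("INSERT INTO Employee VALUES (1057, 42000.0)")
conn.commit()

# A READ-only transaction: no data changes, so the database stays consistent.
salary = conn.execute(
    "SELECT salary FROM Employee WHERE emp_no = 1057"
).fetchone()[0]
print(salary)  # 42000.0
conn.close()
```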
[Diagram: READ transaction: Beginning of transaction (CONSISTENT STATE) -> Transaction in progress -> End of transaction (CONSISTENT STATE)]
A transaction can be a single SQL statement or a group of related SQL statements.
For eg:
Consider a group of SQL statements (such as the transfer above) that ends with COMMIT;
When execution ends successfully, the database reaches the NEXT CONSISTENT STATE.
If instead a POWER FAILURE (or any other failure) occurs before the COMMIT, the whole group of statements is ROLLED BACK and the database returns to the previous consistent state.
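The two outcomes above, COMMIT on success and ROLLBACK on failure, can be sketched with sqlite3. The account table, names, and balances are invented for illustration; the simulated power failure is just a raised exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("you", 2000), ("friend", 1000)])
conn.commit()  # initial consistent state

# Successful transfer: both updates succeed, then COMMIT.
conn.execute("UPDATE account SET balance = balance - 500 WHERE name = 'you'")
conn.execute("UPDATE account SET balance = balance + 500 WHERE name = 'friend'")
conn.commit()  # next consistent state

# Failed transfer: a failure mid-transaction triggers ROLLBACK.
try:
    conn.execute("UPDATE account SET balance = balance - 500 WHERE name = 'you'")
    raise RuntimeError("POWER FAILURE")  # simulated failure before COMMIT
except RuntimeError:
    conn.rollback()  # back to the previous consistent state

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances["you"], balances["friend"])  # 1500 1500
```

The half-finished second transfer leaves no trace: the rollback restores the balances written by the last COMMIT.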
PROPERTIES OF TRANSACTION
A = Atomicity: a transaction is either successfully completed in full or aborted entirely.
C = Consistency: a transaction takes the database from one consistent state to another.
I = Isolation: the data used by one transaction cannot be used by another transaction until the first one completes.
D = Durability: once committed, the changes made by a transaction cannot be lost.
In a single-user database system the properties of ISOLATION and SERIALIZABILITY are achieved automatically, as only one transaction executes at a time.
A transaction ends when any of the following occurs:
1. A COMMIT statement is reached. All changes are permanently saved and the database moves to the next consistent state.
2. A ROLLBACK statement is reached. This aborts (undoes) all changes and the database returns to the previous consistent state.
3. The end of the program is reached. This is the same as COMMIT: all changes are permanently saved and the database moves to the next consistent state.
4. The program is abnormally terminated. All changes are aborted and the database is rolled back to the previous consistent state.
TRANSACTION LOG
The DBMS uses a transaction log to keep track of all transactions.
The transaction log is used whenever the database has to be ROLLED BACK to a previous consistent state.
In case of a system failure two things happen:
1. All uncommitted transactions are rolled back
2. All committed transactions which were not written to the physical database are rolled
forward (physically written on the database)
LOST UPDATES
This is a situation where two concurrent transactions update the same data, and one of the updates is lost because it is overwritten by the other transaction.
Example :
Suppose we have a PRODUCT table with a field Product Quantity On Hand (P_QOH).
If the P_QOH of a product is 35, and there are two transactions T1 and T2 which
update the P_QOH like :
Transaction T1: Buy 100 units P_QOH = P_QOH + 100
Transaction T2 : Sell 30 units P_QOH = P_QOH - 30
There are two possibilities for how these transactions can be executed:
1. Serial Execution of Transaction
Time Transaction Action Value
1 T1 Read P_QOH 35
2 T1 P_QOH= P_QOH + 100
3 T1 Write P_QOH 135
4 T2 Read P_QOH 135
5 T2 P_QOH = P_QOH-30
6 T2 Write P_QOH 105
2. LOST Updates
Time Transaction Action Value
1 T1 Read P_QOH 35
2 T2 Read P_QOH 35
3 T1 P_QOH = 35 + 100
4 T2 P_QOH = 35-30
5 T1 Write P_QOH 135
6 T2 Write P_QOH 5
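The two timetables above can be replayed step by step in Python. This is a minimal sketch: the variables stand in for each transaction's private read of P_QOH, and no real concurrency control is involved.

```python
# Serial execution: T1 (+100) runs to completion, then T2 (-30).
p_qoh = 35
t1_read = p_qoh
p_qoh = t1_read + 100      # T1 writes 135
t2_read = p_qoh
p_qoh = t2_read - 30       # T2 writes 105
serial_result = p_qoh

# Interleaved without concurrency control: both read 35 before either writes.
p_qoh = 35
t1_read = p_qoh            # time 1: T1 reads 35
t2_read = p_qoh            # time 2: T2 reads 35
p_qoh = t1_read + 100      # time 5: T1 writes 135
p_qoh = t2_read - 30       # time 6: T2 writes 5, so T1's update is lost
lost_update_result = p_qoh

print(serial_result, lost_update_result)  # 105 5
```

The serial schedule gives the correct 105; the interleaved one ends at 5 because T2 overwrites T1's write.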
UNCOMMITTED DATA
Here transaction T1 executes and is then rolled back, while transaction T2 executes normally and accesses data written by T1.
The problem of uncommitted data arises when T1 rolls back after T2 has already accessed the uncommitted data.
INCONSISTENT RETRIEVALS
Example :
Transaction T1 calculates the total QOH of all the products in the PRODUCT table, while transaction T2 updates the QOH of two products at the same time.
Transaction T1:
SELECT SUM(QOH) FROM PRODUCT;

Transaction T2:
UPDATE PRODUCT SET QOH = QOH + 10 WHERE prod_code = 'P3';
UPDATE PRODUCT SET QOH = QOH - 10 WHERE prod_code = 'P4';
COMMIT;
If transaction T1 executes alone, it computes the correct total:
Prod_code   QOH   Running total
P1          8     8
P2          32    40
P3          15    55
P4          23    78
P5          8     86
P6          6     92
Total QOH = 92
Transaction T2 changes individual quantities but leaves the total unchanged:
Prod_code   QOH (before)   QOH (after)
P1          8              8
P2          32             32
P3          15             15 + 10 = 25
P4          23             23 - 10 = 13
P5          8              8
P6          6              6
Total QOH = 92
Time Transaction Action Value of QOH Total
1 T1 Read QOH for Prod_code = P1 8 8
2 T1 Read QOH for Prod_code = P2 32 40
3 T2(update) Read QOH for Prod_code = P3 15
4 T2 QOH= QOH+10 (15+10)
5 T2 Write QOH for prod_code = P3 25
6 T1 Read QOH for Prod_code = P3 25 65
7 T1 Read QOH for Prod_code = P4 23 88
8 T2 Read QOH for Prod_code = P4 23
9 T2 QOH = QOH -10 (23-10)
10 T2 Write QOH for Prod_code=P4 13
11 T2 Commit
12 T1 Read QOH for prod_code=P5 8 96
13 T1 Read QOH for Prod_code=P6 6 102
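Replaying the interleaved timetable above as a sketch in Python shows why T1's total comes out wrong even though T2's updates cancel out:

```python
# T1 sums QOH while T2 moves 10 units from P4 to P3 mid-scan.
qoh = {"P1": 8, "P2": 32, "P3": 15, "P4": 23, "P5": 8, "P6": 6}
correct_total = sum(qoh.values())  # 92, what T1 should report

total = 0
total += qoh["P1"]   # time 1:     T1 reads P1 (8)
total += qoh["P2"]   # time 2:     T1 reads P2 (32)
qoh["P3"] += 10      # times 3-5:  T2 updates P3 BEFORE T1 reads it
total += qoh["P3"]   # time 6:     T1 reads P3 as 25
total += qoh["P4"]   # time 7:     T1 reads P4 as 23 (not yet updated)
qoh["P4"] -= 10      # times 8-10: T2 updates P4 AFTER T1 read it
total += qoh["P5"]   # time 12:    T1 reads P5 (8)
total += qoh["P6"]   # time 13:    T1 reads P6 (6)

print(correct_total, total)  # 92 102
```

T1 sees P3 after the update but P4 before it, so the extra 10 units are counted twice.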
Here T1 ends with a total of 102, although the correct total is 92: an inconsistent retrieval.
If two transactions T1 and T2 access the same data, such problems arise when there is no serializability; the order in which the two transactions execute also matters.
Serializable transactions are those which, whether interleaved or executed one after the other, produce the same result.
THE SCHEDULER
The scheduler takes care of isolation by not allowing two transactions to update the same data value at the same time.
It therefore has to take care of conflicting operations, i.e. two operations on the same data item of which at least one is a WRITE.
Conflicting operations in concurrent transactions are scheduled using the following methods:
1. Locking Method
2. Time stamping Method
3. Optimistic Method
1. LOCKING METHOD
In the case of concurrent transactions, LOCKS are used to ensure that a data item is used by only one transaction at a time.
For example, if transaction T1 is using a data item, transaction T2 will be denied access to that data item.
If T1 wants to access a data item, it goes through the following steps:
check whether the data item is free (not locked by another transaction)
if free:
    lock the data item
    use it (execute the transaction)
    release the lock (unlock) so that some other transaction can use it
else:
    wait
All lock information and the entire management of locks is handled by a LOCK MANAGER.
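The steps above can be sketched with Python's threading module standing in for a lock manager. The data item, transaction names, and update operations are invented for illustration; a real lock manager is far more elaborate.

```python
import threading

# One lock per data item, as in the check/lock/use/unlock steps above.
locks = {"P_QOH": threading.Lock()}
log = []

def run_transaction(name, item, update):
    lock = locks[item]
    lock.acquire()            # wait here if another transaction holds the lock
    try:
        log.append(f"{name} locked {item}")
        update()              # use the data item (execute the transaction)
    finally:
        lock.release()        # unlock so some other transaction can use it
        log.append(f"{name} unlocked {item}")

data = {"P_QOH": 35}
t1 = threading.Thread(target=run_transaction,
    args=("T1", "P_QOH", lambda: data.update(P_QOH=data["P_QOH"] + 100)))
t2 = threading.Thread(target=run_transaction,
    args=("T2", "P_QOH", lambda: data.update(P_QOH=data["P_QOH"] - 30)))
t1.start(); t2.start(); t1.join(); t2.join()
print(data["P_QOH"])  # 105, whichever order the lock is granted in
```

Because each transaction reads and writes P_QOH only while holding the lock, the lost-update interleaving from the earlier example cannot occur.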
LOCK GRANULARITY
Locks can be applied at different levels of granularity:
Database level locks
Table level locks
Page level locks
Row level lock
Field level lock
DATABASE LEVEL LOCKS
Here the entire database is locked by a transaction.
Imagine a Payroll database having two tables: Table A and Table B.
Transaction T1 uses Table A.
In the case of database level locks, transaction T1 locks the entire database (Tables A and B).
Disadvantage: very slow access in case of multiple transactions.
TABLE LEVEL LOCKS
In the case of table level locks the entire table (all rows of that table) is locked.
If T1 is using one row of the table, it locks the entire table; if transaction T2 then tries to access some other row, it is denied access because the table is locked.
ROW LEVEL LOCKS
Here each row of the table is locked individually. This improves access to data, as multiple transactions can use different rows of the same table at the same time.
But a lot of overhead is involved, as a lock exists for each row of the table.
FIELD LEVEL LOCKS
Here locks are provided for each field (attribute) of the table.
Multiple transactions can use the same row of the table, as long as they use different fields within that row.
Though it gives very fast access to data, it is rarely implemented because it requires a lot of overhead.
LOCK TYPES
Whatever the level of lock, locks are basically of two types:
1. BINARY LOCKS
2. SHARED/EXCLUSIVE LOCKS
BINARY LOCKS
A binary lock has two states: Locked or Unlocked
If a row, a table, or a database is locked by a transaction, no other transaction can use that object.
When a transaction locks an object, it has to unlock that object after finishing its operation, so that some other transaction can use it.
SHARED LOCKS
A shared lock is used when concurrent transactions need only READ access to a common data item. If T1 and T2 both read the same data item and neither updates it, both transactions can be granted access at the same time.
A shared lock is issued if a transaction wants to read data and no exclusive lock is held on that data item by any other transaction.
EXCLUSIVE LOCKS
An exclusive lock is assigned when a transaction performs a conflicting operation, i.e. a WRITE.
If two transactions T1 and T2 need to access a common data item to perform a write or update operation, only one of the transactions is given access.
An exclusive lock is issued if a transaction wants to write (update) data and no other lock of any kind is held on that data item by any other transaction.
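The shared/exclusive rules above amount to a small compatibility matrix, sketched here in Python. The function name and structure are invented for illustration.

```python
# Two shared (read) locks can coexist; an exclusive (write) lock
# is incompatible with any other lock on the same data item.
COMPATIBLE = {
    ("shared", "shared"): True,
    ("shared", "exclusive"): False,
    ("exclusive", "shared"): False,
    ("exclusive", "exclusive"): False,
}

def can_grant(requested, held):
    """Grant `requested` only if compatible with every lock already held."""
    return all(COMPATIBLE[(h, requested)] for h in held)

print(can_grant("shared", ["shared"]))     # True: two readers may coexist
print(can_grant("exclusive", ["shared"]))  # False: a writer must wait
print(can_grant("shared", []))             # True: no locks are held
```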
Problems that can arise due to the use of locks:
1. The transaction is not serializable.
2. Deadlock: where two transactions are waiting for each other to unlock data.
TWO PHASE LOCKING PROTOCOL TO ENSURE SERIALIZABILITY
The two phase locking protocol ensures serializability by defining how the transactions
acquire and release locks.
The two phases are :
1. The GROWING PHASE: In this phase the transaction acquires all the locks it requires. No
UNLOCK (Release of Locks) operation can take place during this phase.
After all the locks are acquired, the transaction is in the LOCKED state
2. The SHRINKING PHASE where the transaction releases all the locks and cannot acquire
any new lock.
Rules for two phase locking :
1. Acquire all locks before the transaction actually begins its operation.
2. Two transactions cannot hold conflicting locks (locks on the same data item where at least one is a WRITE lock).
3. Within the same transaction, no UNLOCK operation can come before a LOCK operation.
4. The actual operation or change to the data can only take place when the transaction is in its LOCKED state (after acquiring all locks).
5. After the transaction completes, release all locks.
Disadvantage: possibility of deadlock. A transaction holds a lock and waits to acquire another lock held by some other transaction, which in turn is also waiting.
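A minimal sketch of the two-phase rule, enforced with a flag that flips at the first unlock. The class and item names are invented for illustration; real DBMSs enforce this inside the lock manager.

```python
# A transaction may not acquire any lock after it has released one:
# growing phase (lock only), then shrinking phase (unlock only).
class TwoPhaseTransaction:
    def __init__(self):
        self.held = set()
        self.shrinking = False   # flips to True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violated: lock after unlock")
        self.held.add(item)      # growing phase

    def unlock(self, item):
        self.shrinking = True    # shrinking phase begins
        self.held.discard(item)

t = TwoPhaseTransaction()
t.lock("X"); t.lock("Y")         # growing phase: acquire all locks
t.unlock("X")                    # shrinking phase begins
try:
    t.lock("Z")                  # illegal under two-phase locking
except RuntimeError as e:
    print(e)                     # 2PL violated: lock after unlock
```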
DEADLOCKS
A deadlock occurs when two transactions wait indefinitely for each other to unlock data.
The basic techniques for controlling deadlocks are:
1. Deadlock prevention. A transaction requesting a new lock is aborted and restarted if there is a possibility that a deadlock could occur.
2. Deadlock detection. The DBMS periodically tests the database for deadlocks. If a deadlock is found, one of the transactions (known as the “victim”) is aborted (rolled back and restarted) and the other transaction continues.
3. Deadlock avoidance. The transaction must acquire all of the locks it needs before it starts execution.
The choice of the best deadlock control method to use depends on the database
environment.
For example, if the probability of deadlocks is low, deadlock detection is recommended.
However, if the probability of deadlocks is high, deadlock prevention is recommended.
If response time is not high on the system’s priority list, deadlock avoidance might be
employed.
CONCURRENCY CONTROL WITH TIME STAMPING METHOD
The scheduler schedules the concurrent transactions by the method of Time stamping.
In this method a global, unique time stamp value is assigned to each transaction.
The order of execution of the transactions is based on the time stamp value.
All database operations (Read and Write) within the same transaction must have the same time
stamp.
For eg, the bank-transfer transaction seen earlier:
1. Read your account balance: READ Y
2. Deduct the amount from your balance: WRITE Y = Y - 500
3. Write the remaining balance to your account: WRITE Y
4. Read your friend's account balance: READ F
5. Add the amount to his account balance: WRITE F = F + 500
6. Write the new updated balance to his account: WRITE F
All six operations form one transaction and carry the same time stamp.
The DBMS executes conflicting operations (WRITE) in time stamp order.
This ensures serializability of the transactions.
If two transactions conflict, one is stopped, rolled back, rescheduled, and assigned a new
time stamp value.
Time stamping method uses a lot of system resources because many transactions might
have to be stopped, rescheduled, and restamped
WAIT/DIE AND WOUND/WAIT SCHEMES
(For deciding which transaction has to be rolled back and which continues execution)
Consider two conflicting transactions T1 and T2, each with a unique time stamp. T1 has a time stamp of 11548789 and T2 has a time stamp of 19562545, so T1 is the older transaction and T2 the newer one.
In the wait/die scheme, if the older transaction requests a data item held by the younger one, it waits; if the younger transaction requests a data item held by the older one, it dies (is rolled back and rescheduled).
In the wound/wait scheme, if the older transaction requests a data item held by the younger one, the younger transaction is wounded (preempted and rolled back); if the younger transaction requests a data item held by the older one, it waits.
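A minimal sketch of the wait/die rule, under which the older transaction waits and the younger one dies. The function name is invented; smaller time stamps are taken to mean older transactions.

```python
def wait_die(requester_ts, holder_ts):
    """Decide the requester's fate when the data item it wants is held."""
    if requester_ts < holder_ts:
        return "wait"   # older transaction waits for the younger one
    return "die"        # younger transaction is rolled back and retried

# T1 (11548789) is older than T2 (19562545):
print(wait_die(11548789, 19562545))  # wait: T1 waits for T2
print(wait_die(19562545, 11548789))  # die: T2 rolls back and is rescheduled
```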
THE OPTIMISTIC METHOD
The optimistic approach requires neither locking nor time stamping techniques.
It is based on the optimistic assumption that the majority of database operations do not conflict.
Using an optimistic approach :
A transaction is executed without restrictions until it is committed.
Each transaction moves through two or three phases, referred to as read, validation, and
write.
During the read phase, the transaction reads the database, executes the needed
computations, and makes the updates to a temporary copy of the database values which is
not accessed by the remaining transactions.
During the validation phase, the transaction is validated to ensure that the changes made
will not affect the integrity and consistency of the database. If the validation test is positive,
the transaction goes to the write phase. If the validation test is negative, the transaction is
restarted and the changes are discarded.
During the write phase, the changes are permanently applied to the database.
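The three phases can be sketched with a version counter standing in for the validation test. The structures and the version-number check are assumptions made for illustration, not the only possible validation rule.

```python
# Read / validation / write phases of the optimistic method.
db = {"value": 10, "version": 0}

def optimistic_update(compute):
    snapshot = dict(db)                       # read phase: private copy
    snapshot["value"] = compute(snapshot["value"])
    if snapshot["version"] != db["version"]:  # validation phase
        return False                          # conflict: changes discarded, restart
    db["value"] = snapshot["value"]           # write phase: apply permanently
    db["version"] += 1
    return True

ok = optimistic_update(lambda v: v + 5)
print(ok, db["value"])  # True 15
```

If another transaction had committed between the read and validation phases, the version numbers would differ and the update would be discarded and restarted.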
DATABASE RECOVERY MANAGEMENT
Need:
Database recovery management is required whenever a transaction has to be aborted. At that point the transaction has to be rolled back: all of its operations have to be undone, and the database has to be brought from its current state (usually an inconsistent state) back to the previous consistent state.
The need for database recovery management can also be due to certain critical events like:
1. Hardware/software failures: this includes problems like a disk crash, motherboard problems, OS problems, loss of data, etc.
The write-ahead-log protocol ensures that transaction logs are always written before any
data is actually updated. This ensures that, in case of a failure, the database can be
recovered to a previous consistent state, using the data in the transaction log.
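The write-ahead idea can be sketched as follows: append a log record (with the old and new values) before touching the data, so the log can always undo the change. The list-and-dict structures are a toy model of the log and the database.

```python
log = []                # stands in for the stable transaction log
data = {"Y": 2000}      # stands in for the database

def wal_write(txn, item, new_value):
    # Write-ahead rule: the log record goes first, then the data update.
    log.append((txn, item, data[item], new_value))  # (txn, item, old, new)
    data[item] = new_value

def undo(txn):
    """Roll back a transaction by replaying its log records backwards."""
    for t, item, old, _new in reversed(log):
        if t == txn:
            data[item] = old

wal_write("T1", "Y", 1500)   # T1 deducts 500 from Y
undo("T1")                   # recovery restores the logged old value
print(data["Y"])  # 2000
```

Because the old value reached the log before the data changed, the rollback can always restore the previous consistent state, which is exactly what the protocol guarantees.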
Redundant transaction logs (several copies of the transaction log) are maintained to ensure that, even if one copy is damaged, the DBMS is still able to recover the data.
Database buffers or temporary storage areas of RAM are used to speed up operations. To
improve processing time, the DBMS reads the data from the physical disk and stores a
copy of it on a “buffer”. When a transaction updates data, it actually updates the copy of
the data in the buffer which is much faster than accessing the physical disk every time.
Later on, with a single operation, the data from the buffers is written to the physical disk.
Database checkpoints are created. A checkpoint is the point at which the DBMS writes all of its updated buffers to disk. While this is happening, the DBMS does not execute any other requests. The checkpoint operation is registered in the transaction log.
To bring the database to a consistent state after a failure the recovery technique uses two
methods :