Unit 4 Concurrency Control
TRANSACTION MANAGEMENT SYSTEMS
Course description:
This course is designed to introduce undergraduate students to the foundations of database systems, focusing on basics such as the relational algebra and data model, schema normalization, query optimization, and transactions.
CO 1. Apply the database management system concepts.
CO 2. Design relational and ER models for database design.
CO 3. Examine issues in data storage and query processing and frame appropriate solutions.
CO 4. Analyze the role and issues like efficiency, privacy, security, ethical responsibility and strategic advantage in data management.
Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Database System Concepts, McGraw-Hill, 6th Edition, 2011.
Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, Addison-Wesley, 5th Edition, 2005.
Raghu Ramakrishnan, Database Management Systems, Tata McGraw-Hill, 3rd Edition, 2006.
Hector Garcia-Molina, Jeffrey D. Ullman and Jennifer Widom, Database Systems: The Complete Book, Prentice Hall, 2003.
A transaction is a unit of program execution that accesses and possibly updates various data items.
E.g., an amount of $50 is transferred from account X to account Y.
A transaction raises many issues that must be dealt with:
There may be hardware and software crashes and failures.
Multiple transactions may be executed in parallel.
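The transfer example can be sketched as a unit of work that either completes entirely or is rolled back. This is a minimal illustration against an in-memory dict standing in for the database; the names `accounts` and `transfer` are illustrative, not from any real DBMS API.

```python
# A toy "database": account balances kept in a dict.
accounts = {"X": 200, "Y": 100}

def transfer(db, src, dst, amount):
    # Remember the old values so the whole unit of work can be undone.
    old = (db[src], db[dst])
    try:
        db[src] -= amount
        if db[src] < 0:
            raise ValueError("insufficient funds")  # simulated failure
        db[dst] += amount
    except Exception:
        db[src], db[dst] = old  # roll back: restore the old values
        raise

transfer(accounts, "X", "Y", 50)
# accounts is now {"X": 150, "Y": 150}: both updates applied, or neither.
```

A real DBMS achieves the same all-or-nothing behavior through logging and recovery rather than by restoring saved values in memory.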
In a multi-user system, multiple users can access and use the same database at the same time, which is known as concurrent execution on the database.
It means that the same database is used simultaneously by different users on a multi-user system.
While working with database transactions, multiple users often need the database for performing different operations, and in that case the transactions are executed concurrently.
This simultaneous execution must be performed in an interleaved manner, and no operation should affect the other executing operations, thus maintaining the consistency of the database.
However, concurrent execution of transaction operations gives rise to several challenging problems that need to be solved.
The dirty read problem occurs when one transaction updates a database item and then fails, and before the update is rolled back, the updated item is read by another transaction. This is a write-read conflict between the two transactions: the uncommitted write of one transaction is read by the other.
For example:
Consider two transactions TX and TY in the below diagram performing read/write operations on account A where the available
balance in account A is $300:
At time t1, transaction TX reads the value of account A, i.e., $300.
At time t2, transaction TX adds $50 to account A that becomes $350.
At time t3, transaction TX writes the updated value in account A, i.e., $350.
Then at time t4, transaction TY reads account A that will be read as $350.
Then at time t5, transaction TX rolls back due to a server problem, and the value of account A changes back to $300 (as initially).
But transaction TY has already read $350, a value that was never committed; this is a dirty read, and hence the situation is known as the Dirty Read Problem.
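The t1-t5 timeline above can be replayed as a small sketch (plain variables, not real DBMS code) to make the anomaly concrete:

```python
A = 300           # committed balance of account A

tx_read = A       # t1: TX reads A -> 300
tx_read += 50     # t2: TX adds $50 -> 350
A = tx_read       # t3: TX writes the uncommitted value 350
ty_read = A       # t4: TY reads A -> 350 (a dirty read)
A = 300           # t5: TX rolls back; A is restored to 300

# TY is now working with 350, a value that never committed.
print(ty_read, A)
```

A real concurrency-control mechanism (e.g., an exclusive lock held by TX until commit) would have made TY wait at t4 instead of reading the dirty value.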
The Unrepeatable Read Problem, also known as the Inconsistent Retrievals Problem, occurs when two different values are read for the same database item within a single transaction.
Consider two transactions, TX and TY, performing the read/write operations on account A, having an available
balance = $300.
At time t1, transaction TX reads the value from account A, i.e., $300.
At time t2, transaction TY reads the value from account A, i.e., $300.
At time t3, transaction TY updates the value of account A by adding $100 to the available balance, and then it becomes
$400.
At time t4, transaction TY writes the updated value, i.e., $400.
After that, at time t5, transaction TX reads the available value of account A, and that will be read as $400.
It means that within the same transaction TX, two different values of account A are read: $300 initially, and $400 after the update made by transaction TY. This is an unrepeatable read and is therefore known as the Unrepeatable Read Problem.
In the fields of databases and transaction processing (transaction management), a schedule (or history) of a system is
an abstract model to describe execution of transactions running in the system.
Often it is a list of operations (actions) ordered by time, performed by a set of transactions that are executed together
in the system.
If the order in time between certain operations is not determined by the system, then a partial order is used.
Examples of such operations are requesting a read operation, reading, writing, aborting, committing,
requesting a lock, locking, etc.
Not all transaction operation types should be included in a schedule, and typically only selected
operation types (e.g., data access operations) are included, as needed to reason about and describe certain
phenomena.
Schedules and schedule properties are fundamental concepts in database concurrency control theory.
[Figure: two timelines of transactions T1 and T2 — one where T2 starts only after the end of T1 (a serial schedule), and one where their operations interleave (not a serial schedule).]
As discussed before, a non-serial schedule may sometimes lead to inconsistency.
Here, under conflict serializability, we discuss how a non-serial schedule can still leave the system consistent.
To ensure a consistent system, we first need to confirm whether a non-serial schedule will produce a consistent state or not.
Since serial schedules are known to be consistent, we can compare the output of the serial schedule with that of the non-serial schedule.
If both outputs are equivalent, we can be sure that the non-serial schedule will produce a consistent system.
To convert a non-serial schedule into a serial one, we need to swap instructions within the non-serial schedule.
Instructions of the same transaction cannot be swapped, so there is no conflict between instructions of the same transaction.
Two instructions can conflict only when they belong to different transactions.
They must also operate on the same data item, and at least one of them must be a write, for the instructions to conflict.
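The standard conflict test — different transactions, same data item, at least one write — can be stated as a tiny predicate. The `(txn, action, item)` tuple encoding below is an assumed representation for illustration:

```python
def conflicts(op1, op2):
    # Two schedule operations conflict iff they come from different
    # transactions, touch the same data item, and at least one is a write.
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return t1 != t2 and x1 == x2 and ("W" in (a1, a2))

print(conflicts(("T1", "R", "A"), ("T2", "W", "A")))  # True  (read-write)
print(conflicts(("T1", "R", "A"), ("T2", "R", "A")))  # False (read-read)
print(conflicts(("T1", "W", "A"), ("T1", "W", "A")))  # False (same txn)
```

Two schedules are conflict equivalent when one can be turned into the other by swapping only adjacent non-conflicting operations under this test.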
As we all know, with a serial schedule the database can never be trapped in an inconsistent state, because no concurrent transactions run in such a schedule.
But with a non-serial schedule, there is a high chance that the database may reach an inconsistent state, because concurrent transactions are allowed.
So, to ensure the consistency of the database, we need to check the view serializability of a given non-serial schedule.
[Figure: precedence graph with nodes T1, T2, T3 and T4.]
If precedence graph is acyclic, the serializability order can be obtained by a topological sorting of the graph.
This is a linear order consistent with the partial order of the graph.
For example, a serializability order for Schedule A would be
T5 → T1 → T3 → T2 → T4
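The topological sort can be sketched with Kahn's algorithm. The edge set below is an assumed example whose only valid ordering is T5, T1, T3, T2, T4; in general an acyclic precedence graph may admit several valid orders.

```python
from collections import deque

def topo_order(nodes, edges):
    # Kahn's algorithm: repeatedly emit a node with no incoming edges.
    indeg = {n: 0 for n in nodes}
    for _, v in edges:
        indeg[v] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for a, b in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    queue.append(b)
    return order  # shorter than nodes if the graph has a cycle

edges = [("T5", "T1"), ("T1", "T3"), ("T3", "T2"), ("T2", "T4")]
print(topo_order(["T1", "T2", "T3", "T4", "T5"], edges))
# ['T5', 'T1', 'T3', 'T2', 'T4']
```

If the returned order is shorter than the node list, some nodes lie on a cycle, i.e., the schedule is not conflict serializable.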
The precedence graph test for conflict serializability cannot be used directly to test for view serializability.
Extension to test for view serializability has cost exponential in the size of the precedence graph.
The problem of checking if a schedule is view serializable falls in the class of NP-complete problems.
Thus existence of an efficient algorithm is extremely unlikely.
However practical algorithms that just check some sufficient conditions for view serializability can still be
used.
For concurrently running transactions, the effect of transaction failures has to be addressed.
Schedules from which the database can recover after such failures are said to be recoverable schedules.
For example, a schedule in which transaction T9 reads a data item written by T8 is recoverable only if T8 commits before T9 commits.
A database must provide a mechanism that will ensure that all possible schedules are
Either conflict or view serializable, and
Are recoverable and preferably cascadeless
A policy in which only one transaction can execute at a time generates serial schedules, but provides a poor degree
of concurrency
Are serial schedules recoverable/cascadeless?
Testing a schedule for serializability after it has executed is a little too late!
Goal – to develop concurrency control protocols that will assure serializability.
Concurrency-control protocols allow concurrent schedules, but ensure that the schedules are conflict/view serializable, and are recoverable and cascadeless.
Concurrency control protocols generally do not examine the precedence graph as it is being created
Instead, a protocol imposes a discipline that avoids nonserializable schedules.
Different concurrency control protocols provide different tradeoffs between the amount of concurrency they allow
and the amount of overhead that they incur.
Some applications are willing to live with weak levels of consistency, allowing schedules that are not serializable
E.g. A read-only transaction that wants to get an approximate total balance of all accounts
E.g. Database statistics computed for query optimization can be approximate
Such transactions need not be serializable with respect to other transactions
The pre-claiming lock protocol evaluates the transaction to list all the data items on which it needs locks.
Before initiating execution of the transaction, it requests the DBMS for locks on all those data items.
If all the locks are granted, this protocol allows the transaction to begin; when the transaction completes, it releases all the locks.
If all the locks are not granted, the transaction rolls back and waits until all the locks are granted.
A lock is granted to a transaction when there is no lock on the data item it requests.
A shared lock can be granted to any transaction requesting a data item on which only shared locks are held.
A lock cannot be granted on a data item on which another transaction holds an exclusive lock.
In that case, the requesting transaction must wait until the conflicting locks are released and it is given access to the data item.
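The granting rules above amount to a lock compatibility check: a request is granted only if it is compatible with every lock currently held on the item. A minimal sketch, using "S" for shared and "X" for exclusive (the table and function names are illustrative):

```python
# Compatibility matrix: only shared-with-shared is compatible.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(requested, held_locks):
    # held_locks: modes already held on the data item by other transactions.
    # An empty list means no lock is held, so any request is granted.
    return all(COMPATIBLE[(h, requested)] for h in held_locks)

print(can_grant("S", []))          # True  (no lock on the item)
print(can_grant("S", ["S", "S"]))  # True  (shared locks coexist)
print(can_grant("X", ["S"]))       # False (must wait for release)
```

When `can_grant` returns False, a real lock manager places the request in a wait queue until the conflicting locks are released.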
Contention: some threads/processes have to wait until a lock (or a whole set of locks) is released.
If one of the threads holding a lock dies, stalls, blocks, or enters an infinite loop, other threads waiting for the lock
may wait forever.
Overhead: the use of locks adds overhead for each access to a resource, even when the chances for collision are
very rare. (However, any chance for such collisions is a race condition.)
Debugging: bugs associated with locks are time dependent and can be very subtle and extremely hard to
replicate, such as deadlocks.
Instability: the optimal balance between lock overhead and lock contention can be unique to the
problem domain (application) and sensitive to design, implementation, and even low-level system
architectural changes. These balances may change over the life cycle of an application and may entail
tremendous changes to update (re-balance).
The first phase of Strict-2PL is the same as in 2PL: the transaction acquires locks as it executes normally.
The only difference between 2PL and Strict-2PL is that Strict-2PL does not release a lock immediately after using it.
Strict-2PL waits until the whole transaction commits, and only then releases all the locks at once.
Thus the Strict-2PL protocol has no gradual shrinking phase of lock release.
Unlike basic 2PL, it does not suffer from cascading aborts.
In the lock-table figure, granted locks are shown in black rectangles and waiting requests in white rectangles.
The type of lock granted or requested is recorded in the lock table.
New requests are added at the end of the queue for a data item.
When an unlock request is processed, the corresponding entry is deleted and later requests are checked to see whether they can now be granted.
Locks are implemented efficiently by the lock manager using such a table.
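A lock table of this kind can be sketched as a hash table keyed by data item, where each value is a queue of requests: granted entries at the front (the black rectangles), waiting entries appended at the end (the white rectangles). This is a simplified illustration that ignores upgrades and fairness; all names are assumptions.

```python
from collections import defaultdict, deque

# item -> queue of (txn, mode, granted) records
lock_table = defaultdict(deque)

def request_lock(item, txn, mode):
    # Grant immediately if the item is free, or if every queued entry is a
    # granted shared lock and the new request is also shared.
    entries = lock_table[item]
    granted = (not entries) or \
              (mode == "S" and all(g and m == "S" for _, m, g in entries))
    entries.append((txn, mode, granted))
    return granted

print(request_lock("A", "T1", "S"))  # True : no lock on A yet
print(request_lock("A", "T2", "S"))  # True : shared coexists with shared
print(request_lock("A", "T3", "X"))  # False: queued behind the holders
```

Note that once T3's exclusive request is waiting, any later shared request also queues (the `all(...)` test sees T3's ungranted entry), which prevents T3 from being starved by a stream of readers.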
In both the wait-die and wound-wait schemes, a rolled-back transaction is restarted with its original timestamp (in wait-die, an older requester waits for a younger holder while a younger requester is rolled back; in wound-wait, an older requester forces a younger holder to roll back while a younger requester waits).
In the timeout-based scheme, deadlocks are not possible:
it is simple to implement,
but starvation is possible.
If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting for Tj to release a data item.
When Ti requests a data item currently held by Tj, the edge Ti → Tj is inserted into the wait-for graph. This edge is removed only when Tj is no longer holding a data item needed by Ti.
The system is in a deadlock state if and only if the wait-for graph has a cycle, so a deadlock-detection algorithm must be invoked periodically to look for cycles.
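The periodic detection step reduces to cycle detection in the wait-for graph, sketched here with depth-first search. The example graphs are assumed illustrations (each key maps a transaction to the transactions it waits for):

```python
def has_cycle(graph):
    # DFS with a "currently on the recursion stack" set: seeing a node that
    # is still being visited means we found a back edge, i.e., a cycle.
    visiting, done = set(), set()

    def dfs(node):
        if node in visiting:
            return True
        if node in done:
            return False
        visiting.add(node)
        for nxt in graph.get(node, []):
            if dfs(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(n) for n in graph)

print(has_cycle({"T1": ["T2"], "T2": ["T3"], "T3": []}))      # False
print(has_cycle({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}))  # True
```

When a cycle is found, the system breaks the deadlock by choosing a victim transaction on the cycle and rolling it back.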
The protocol manages concurrent execution such that the time-stamps determine the serializability order.
In order to assure such behavior, the protocol maintains for each data Q two timestamp values:
W-timestamp(Q) is the largest time-stamp of any transaction that executed write(Q) successfully.
R-timestamp(Q) is the largest time-stamp of any transaction that executed read(Q) successfully.
The timestamp-ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
It guarantees serializability, since all the arcs in the precedence graph are of the form Ti → Tj with TS(Ti) < TS(Tj).
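Using the R- and W-timestamps defined above, the protocol's accept/reject decisions can be sketched as follows. Rejection here just returns False; a real system would roll the transaction back and restart it with a new timestamp. Names and the dict representation are assumptions for illustration.

```python
# Per-item timestamps: largest TS that successfully read / wrote the item.
timestamps = {"Q": {"R": 0, "W": 0}}

def to_read(ts, item):
    q = timestamps[item]
    if ts < q["W"]:               # Ti would read an already-overwritten value
        return False              # reject: roll Ti back
    q["R"] = max(q["R"], ts)
    return True

def to_write(ts, item):
    q = timestamps[item]
    if ts < q["R"] or ts < q["W"]:  # Ti's write arrives too late
        return False                # reject: roll Ti back
    q["W"] = ts
    return True

print(to_write(5, "Q"))   # True : W-timestamp(Q) becomes 5
print(to_read(3, "Q"))    # False: TS=3 < W-timestamp(Q)=5
print(to_read(7, "Q"))    # True : R-timestamp(Q) becomes 7
print(to_write(6, "Q"))   # False: TS=6 < R-timestamp(Q)=7
```

Every accepted operation thus respects timestamp order on each item, which is what forces all precedence-graph arcs to point from older to newer transactions.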
In the deferred-update technique, the database is not physically updated until the transaction reaches its commit point.
All transaction updates are recorded in the local transaction workspace.
If a transaction fails before reaching its commit point, it will not have changed the database in any way, so UNDO is not needed.
The operations may have to be REDOne, because their effect may not yet have been written to the database.
It is therefore a NO-UNDO/REDO algorithm.
In the immediate-update technique, the database may be updated by some operations of a transaction before the transaction reaches its commit point.
These operations are also recorded in a log on disk, so recovery is still possible.
If a transaction fails to reach its commit point, the effect of its operations must be undone, i.e., the transaction must be rolled back.
Hence we require both undo and redo; this technique is an UNDO/REDO algorithm.
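The UNDO/REDO idea can be sketched as replaying a log after a crash: redo the updates of committed transactions in forward order, then undo the updates of uncommitted ones in reverse order. Log records are assumed to be `(txn, item, old_value, new_value)` tuples, with the committed set taken from commit records in the log; this is an illustration, not a production recovery algorithm.

```python
def recover(db, log, committed):
    # REDO phase: reapply committed transactions in forward log order.
    for txn, item, old, new in log:
        if txn in committed:
            db[item] = new
    # UNDO phase: roll back uncommitted transactions in reverse log order.
    for txn, item, old, new in reversed(log):
        if txn not in committed:
            db[item] = old

db = {"A": 999, "B": 999}          # unreliable on-disk state after a crash
log = [("T1", "A", 100, 150),      # T1 committed before the crash
       ("T2", "B", 200, 260)]      # T2 did not commit
recover(db, log, committed={"T1"})
print(db)                          # {'A': 150, 'B': 200}
```

After recovery, A reflects T1's committed update while B is restored to its value before T2's unfinished update.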
Shadow paging
Fourth Unit Summary
This unit focuses on the basic concepts of Transaction Management, transaction recovery, and the ACID properties in a Database Management System.
Topics: Transaction Concepts, ACID properties, Database Recovery Techniques, System Log information, Undoing, Deferred Update, Immediate Update, Shadow paging.
Lesson 1: Introduces transaction recovery and the ACID properties.
Lesson 2: Presents the operations of serializability, such as conflict and serial schedules.
Lesson 3: Presents lock-based protocols such as two-phase locking, with an example and a lock-table implementation.
Lesson 4: Introduces deadlock handling, prevention and detection strategies, and the timestamp-ordering protocol.
Lesson 5: Describes database recovery techniques such as deferred update, immediate update, and shadow paging.
COURSE NAME : DATABASE MANAGEMENT SYSTEMS
© Kalasalingam Academy of Research and Education
Thank You!