
Unit-V

What is Concurrency Control in DBMS?


Concurrency Control is a crucial Database Management System (DBMS)
component. It manages simultaneous operations without them conflicting with
each other. The primary aim is maintaining consistency, integrity, and isolation
when multiple users or applications access the database simultaneously.

In a multi-user database environment, it’s common for numerous users to want to
access and modify the database simultaneously. This is what we call concurrent
execution. Imagine a busy library where multiple librarians are updating book
records simultaneously. Just as multiple librarians shouldn’t try to update the same
record simultaneously, database users shouldn’t interfere with each other’s
operations.

Executing transactions concurrently offers many benefits, like improved system
resource utilization and increased throughput. However, these simultaneous
transactions mustn’t interfere with each other. The ultimate goal is to ensure the
database remains consistent and correct. For instance, if two people try to book the
last seat on a flight at the exact moment, the system must ensure that only one
person gets the seat.

But concurrent execution can lead to various challenges:

 Lost Updates: Consider two users trying to update the same data. If one user
reads a data item and then another user reads the same item and updates it, the
first user’s updates could be lost if they weren’t aware of the second user’s
actions.

 Uncommitted Data: If one user accesses data that another user has updated
but not yet committed (finalized), and then the second user decides to abort
(cancel) their transaction, the first user has invalid data.

 Inconsistent Retrievals: A transaction reads several values from the database,
but another transaction modifies some of those values in the middle of its
operation.

To address these challenges, the DBMS employs concurrency control
techniques. Think of it like traffic rules. Just as traffic rules ensure vehicles don’t
collide, concurrency control ensures transactions don’t conflict.
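The lost-update problem described above can be made concrete with a short sketch. The account balance, amounts, and transaction names here are illustrative assumptions, not taken from the text:

```python
# A minimal sketch of the "lost update" problem: two transactions read the
# same balance, then both write back, and the first update is silently lost.

balance = 100  # shared data item

# T1 and T2 both read the current balance before either writes.
t1_read = balance          # T1 reads 100
t2_read = balance          # T2 reads 100 (unaware of T1)

balance = t1_read + 50     # T1 deposits 50 -> balance is 150
balance = t2_read - 30     # T2 withdraws 30 using its stale read -> 70

# T1's deposit has vanished: the correct serial result would be 120.
print(balance)             # 70, not 120
```

Running the two transactions one after the other (serially) would give 120; the interleaving above silently discards T1's deposit, which is exactly what concurrency control prevents.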

Why is Concurrency Control Needed?

As discussed above, we need concurrency control for the following reasons:

 Ensure Database Consistency: Without concurrency control, simultaneous
transactions could interfere with each other, leading to inconsistent database
states. Proper concurrency control ensures the database remains consistent
even after numerous concurrent transactions.

 Avoid Conflicting Updates: When two transactions attempt to update the
same data simultaneously, one update might overwrite the other without proper
control. Concurrency control ensures that updates don’t conflict and cause
unintended data loss.

 Prevent Dirty Reads: Without concurrency control, one transaction might
read data that another transaction is in the middle of updating (but hasn’t
finalized). This can lead to inaccurate or “dirty” reads, where the data doesn’t
reflect the final, committed state.

 Enhance System Efficiency: By managing concurrent access to the database,
concurrency control allows multiple transactions to be processed in parallel.
This improves system throughput and makes optimal use of resources.

 Protect Transaction Atomicity: For a series of operations within a
transaction, it’s crucial that all operations succeed (commit) or none do (abort).
Concurrency control ensures that transactions are atomic and treated as a single
indivisible unit, even when executed concurrently with others.

Concurrency Control Techniques in DBMS


The various concurrency control techniques are:

 Two-phase locking Protocol

 Time stamp ordering Protocol

 Multi version concurrency control

 Validation concurrency control

Let’s understand each technique in detail, one by one.

1. Two-phase locking Protocol

Two-phase locking (2PL) is a protocol used in database management systems to
control concurrency and ensure transactions are executed in a way that preserves
the consistency of a database. It’s called “two-phase” because, during each
transaction, there are two distinct phases: the Growing phase and the Shrinking
phase.

Figure 1.1: Two-phase locking Protocol

Breakdown of the Two-Phase Locking protocol


 Phases:
 Growing Phase: During this phase, a transaction can obtain (acquire) any
number of locks as required but cannot release any. This phase continues
until the transaction has acquired all the locks it needs and requests no more.

 Shrinking Phase: Once the transaction releases its first lock, the Shrinking
phase starts. During this phase, the transaction can release but not acquire
any more locks.

 Lock Point: The exact moment when the transaction acquires its final lock
(i.e. the end of the Growing phase, just before its first lock release) is
termed the lock point.

The primary purpose of the Two-Phase Locking protocol is to ensure
conflict-serializability, as the protocol ensures a transaction does not interfere with
others in ways that produce inconsistent results.
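The two-phase rule itself can be sketched in a few lines. This is an illustrative toy (not a full lock manager with conflict detection): it only enforces that once a transaction releases any lock, it may not acquire another, so all acquisitions precede all releases. Class and item names are assumptions:

```python
# A minimal sketch of the two-phase rule: after the first unlock
# (the start of the shrinking phase), further lock requests are rejected.

class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False  # becomes True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {item!r} in shrinking phase")
        self.locks.add(item)    # growing phase: acquire freely

    def unlock(self, item):
        self.shrinking = True   # first release starts the shrinking phase
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.lock("A"); t.lock("B")   # growing phase
t.unlock("A")              # shrinking phase begins
try:
    t.lock("C")            # violates 2PL
except RuntimeError as e:
    print(e)               # T1: cannot lock 'C' in shrinking phase
```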

2. Time stamp ordering Protocol

The Timestamp Ordering Protocol is a concurrency control method used in


database management systems to maintain the serializability of transactions. This
method uses a timestamp for each transaction to determine its order in relation to
other transactions. Instead of using locks, it ensures transaction order based on
their timestamps.

Breakdown of the Time stamp ordering protocol

 Read Timestamp (RTS):

 This is the latest or most recent timestamp of a transaction that has read the
data item.

 Every time a data item X is read by a transaction T with timestamp TS, the
RTS of X is updated to TS if TS is more recent than the current RTS of X.

 Write Timestamp (WTS):

 This is the latest or most recent timestamp of a transaction that has written or
updated the data item.

 Whenever a data item X is written by a transaction T with timestamp TS, the
WTS of X is updated to TS if TS is more recent than the current WTS of X.

The timestamp ordering protocol uses these timestamps to determine whether a
transaction’s request to read or write a data item should be granted. The protocol
ensures a consistent ordering of operations based on their timestamps, preventing
the formation of cycles and, therefore, deadlocks.
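The read and write checks of basic timestamp ordering can be sketched as follows. The `DataItem` class and return strings are illustrative assumptions; the checks themselves follow the RTS/WTS rules described above:

```python
# A minimal sketch of the basic timestamp-ordering checks.

class DataItem:
    def __init__(self):
        self.rts = 0  # read timestamp (RTS): youngest reader so far
        self.wts = 0  # write timestamp (WTS): youngest writer so far

def read(item, ts):
    if ts < item.wts:                 # a younger txn already overwrote X
        return "abort"                # T must roll back
    item.rts = max(item.rts, ts)      # record the most recent reader
    return "ok"

def write(item, ts):
    if ts < item.rts or ts < item.wts:  # a younger txn already read/wrote X
        return "abort"
    item.wts = ts
    return "ok"

x = DataItem()
print(write(x, 5))   # ok    (WTS(X) becomes 5)
print(read(x, 3))    # abort (T3 is older than writer T5)
print(read(x, 7))    # ok    (RTS(X) becomes 7)
print(write(x, 6))   # abort (T6 is older than reader T7)
```

Note that a rejected transaction is simply restarted with a fresh timestamp, so no transaction ever waits on another, which is why this scheme cannot deadlock.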

3. Multi version concurrency control


Multi version Concurrency Control (MVCC) is a technique used in database
management systems to handle concurrent operations without conflicts, using
multiple versions of a data item. Instead of locking the items for write operations
(which can reduce concurrency and lead to bottlenecks or deadlocks), MVCC will
create a separate version of the data item being modified.

Figure 1.2: Multi version concurrency control

Breakdown of the Multi version concurrency control (MVCC)

 Multiple Versions: When a transaction modifies a data item, instead of
changing the item in place, it creates a new version of that item. This means
that multiple versions of a database object can exist simultaneously.

 Reads aren’t Blocked: One of the significant advantages of MVCC is that
read operations don’t get blocked by write operations. When a transaction
reads a data item, it sees a version of that item consistent with the last time it
began a transaction or issued a read, even if other transactions are currently
modifying that item.

 Timestamps or Transaction IDs: Each version of a data item is tagged with a
unique identifier, typically a timestamp or a transaction ID. This identifier
determines which version of the data item a transaction sees when it accesses
that item. A transaction will always see its own writes, even if they are
uncommitted.

 Garbage Collection: As transactions create newer versions of data items,
older versions can become obsolete. There’s typically a background process
that cleans up these old versions, a procedure often referred to as “garbage
collection.”

 Conflict Resolution: If two transactions try to modify the same data item
concurrently, the system will need a way to resolve this. Different systems
have different methods for conflict resolution. A common one is that the first
transaction to commit will succeed, and the other transaction will be rolled
back or will need to resolve the conflict before proceeding.
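The version-selection idea behind MVCC can be sketched as a tiny versioned store. Writers append a new version tagged with a commit timestamp, and a reader sees the newest version committed at or before its own snapshot timestamp, so reads never block. The class, keys, and timestamp scheme are illustrative assumptions:

```python
# A minimal sketch of MVCC version selection.

class MVCCStore:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value)

    def write(self, key, value, commit_ts):
        # Append a new version instead of overwriting in place.
        self.versions.setdefault(key, []).append((commit_ts, value))
        self.versions[key].sort()          # keep versions ordered by timestamp

    def read(self, key, snapshot_ts):
        # Newest version with commit_ts <= snapshot_ts; readers never block.
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

store = MVCCStore()
store.write("seat_42", "free", commit_ts=1)
store.write("seat_42", "booked", commit_ts=5)

print(store.read("seat_42", snapshot_ts=3))   # free   (older snapshot)
print(store.read("seat_42", snapshot_ts=9))   # booked (sees latest version)
```

A garbage collector in a real system would prune versions no live snapshot can still see; this sketch keeps them all for clarity.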

4. Validation concurrency control

Validation (or Optimistic) Concurrency Control (VCC) is an advanced
database concurrency control technique. Instead of acquiring locks on data items,
as is done in most traditional (pessimistic) concurrency control techniques,
validation concurrency control allows transactions to work on private copies of
database items and validates the transactions only at the time of commit.

The central idea behind optimistic concurrency control is that conflicts between
transactions are rare, and it’s better to let transactions run to completion and only
check for conflicts at commit time.

Breakdown of Validation Concurrency Control (VCC):

 Phases: Each transaction in VCC goes through three distinct phases:

 Read Phase: The transaction reads values from the database and makes
changes to its private copy without affecting the actual database.
 Validation Phase: Before committing, the transaction checks if the changes
made to its private copy can be safely written to the database without
causing any conflicts.

 Write Phase: If validation succeeds, the transaction updates the actual
database with the changes made to its private copy.

 Validation Criteria: During the validation phase, the system checks for
potential conflicts with other transactions. If a conflict is found, the system can
either roll back the transaction or delay it for a retry, depending on the specific
strategy implemented.
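The three phases can be sketched with a per-item version counter: the read phase records which version was seen, writes go to a private copy, and validation at commit fails if any item read has since been committed by another transaction. The class, version counters, and "first committer wins" policy are illustrative assumptions:

```python
# A minimal sketch of optimistic (validation) concurrency control.

class OptimisticTxn:
    def __init__(self, db):
        self.db = db                    # shared store: {key: (value, version)}
        self.reads = {}                 # key -> version seen during read phase
        self.writes = {}                # private copies, invisible to others

    def read(self, key):                # Read phase
        value, version = self.db[key]
        self.reads[key] = version
        return self.writes.get(key, value)  # a txn sees its own writes

    def write(self, key, value):        # still the read phase: private copy only
        self.writes[key] = value

    def commit(self):                   # Validation + write phases
        for key, seen in self.reads.items():
            if self.db[key][1] != seen: # another txn committed under us
                return False            # validation fails -> roll back / retry
        for key, value in self.writes.items():
            self.db[key] = (value, self.db[key][1] + 1)
        return True

db = {"seats_left": (10, 0)}
t1, t2 = OptimisticTxn(db), OptimisticTxn(db)
t1.write("seats_left", t1.read("seats_left") - 1)
t2.write("seats_left", t2.read("seats_left") - 1)
print(t1.commit())   # True  (first committer wins)
print(t2.commit())   # False (stale read detected; must retry)
```

This mirrors the group-booking example later in the section: both transactions proceed without blocking, and only the commit-time check catches the conflict.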

Real-Life Example
Scenario: A world-famous band, “The Algorithmics,” is about to release tickets
for their farewell concert. Given their massive fan base, the ticketing platform,
EventBriteMax, is expected to face a surge in access requests and must ensure
that ticket sales are processed smoothly without double bookings or system
failures.
 Two-Phase Locking Protocol (2PL):
 Usage: Mainly for premium ticket pre-sales to fan club members. These
sales occur a day before the general ticket release.
 Real-Life Example: When a fan club member logs in to buy a ticket, the
system uses 2PL. It locks the specific seat they choose during the
transaction. Once the transaction completes, the lock is released. This
ensures that no two fan club members can book the same seat at the same
time.
 Timestamp Ordering Protocol:
 Usage: For general ticket sales.

 Real-Life Example: As thousands rush to book their tickets, each
transaction gets a timestamp. If two fans try to book the same seat
simultaneously, the one with the earlier timestamp gets priority. The other
fan receives a message suggesting alternative seats.

 Multi-Version Concurrency Control (MVCC):

 Usage: Implemented in the mobile app version of the ticketing platform.

 Real-Life Example: Fans using the mobile app see multiple versions of the
seating chart. When a fan selects a seat, they’re essentially choosing from a
specific version of the seating database. If their choice conflicts with a
completed transaction, the system offers them the next best seat based on the
latest version of the database. This ensures smooth mobile user experience
without frequent transactional conflicts.

 Validation Concurrency Control:

 Usage: For group bookings where multiple seats are booked in a single
transaction.

 Real-Life Example: A group of friends tries to book 10 seats together. They
choose their seats and proceed to payment. Before finalizing, the system
validates that all 10 seats are still available (i.e., no seat was booked by
another user in the meantime). If there’s a conflict, the group is prompted to
choose a different set of seats. If not, their booking is confirmed.

The concert ticket sales go off without a hitch. Fans rave about the smooth
experience, even with such high demand. Behind the scenes, EventBriteMax’s
effective implementation of the four concurrency control protocols played a crucial
role in ensuring that every fan had a fair chance to purchase their ticket and no
seats were double-booked. The Algorithmics go on to have a fantastic farewell
concert, with not a single problem in the ticketing process.

Multiple Granularity Locking in DBMS

The various concurrency control schemes discussed so far treat each individual
data item as the unit on which synchronization is performed. This has a
drawback: if a transaction Ti needs to access the entire database and a locking
protocol is used, then Ti must lock every item in the database, which is
inefficient. It would be simpler if Ti could lock the entire database with a single
lock. But that approach has its own flaw: if another transaction needs to access
only a few data items, locking the entire database is unnecessary and costs us
concurrency, which was our primary goal in the first place. To strike a balance
between efficiency and concurrency, we use granularity.

Let’s start by understanding what is meant by Granularity.

Granularity

Granularity is the size of the data item allowed to be locked. Multiple granularity
means hierarchically breaking up the database into blocks that can be locked, so
the system can track what needs to be locked and in what fashion. Such a
hierarchy can be represented graphically as a tree.

For example, consider a tree that consists of four levels of nodes. The
highest level represents the entire database. Below it are nodes of type area;
the database consists of exactly these areas. Each area has child nodes called
files, and no file can span more than one area.

Finally, each file has child nodes called records. As before, the file consists of
exactly those records that are its child nodes, and no record can be present in
more than one file. Hence, the levels starting from the top level are:

 database
 area
 file
 record

Figure 1.3: Multi Granularity tree Hierarchy

Consider the above diagram: each node in the tree can be locked individually.
As in the two-phase locking protocol, we use shared and exclusive lock modes.
When a transaction locks a node, in either shared or exclusive mode, the
transaction also implicitly locks all the descendants of that node in the same
lock mode. For example, if transaction Ti gets an explicit lock on file Fc in
exclusive mode, then it has an implicit lock in exclusive mode on all the records
belonging to that file; it does not need to lock the individual records of Fc
explicitly. This is the main difference between tree-based locking and
hierarchical locking for multiple granularities.

Now, with locks on files and records made simple, how does the system
determine whether the root node can be locked? One possibility is to search the
entire tree, but that solution nullifies the whole purpose of the multiple-granularity
locking scheme. A more efficient way to gain this knowledge is to introduce a
new class of lock modes, called intention lock modes.
Intention Mode Lock
In addition to S and X lock modes, there are three additional lock modes with
multiple granularities:
 Intention-Shared (IS): explicit locking at a lower level of the tree but only
with shared locks.
 Intention-Exclusive (IX): explicit locking at a lower level with exclusive or
shared locks.
 Shared & Intention-Exclusive (SIX): the subtree rooted by that node is
locked explicitly in shared mode and explicit locking is being done at a lower
level with exclusive mode locks.
The compatibility matrix for these lock modes is shown below:

Figure 1.4: Lock-mode compatibility matrix


The multiple-granularity locking protocol uses the intention lock modes to ensure
serializability. It requires that a transaction Ti that attempts to lock a node must
follow these protocols:

 Transaction Ti must follow the lock-compatibility matrix.
 Transaction Ti must lock the root of the tree first, and it can lock it in any
mode.
 Transaction Ti can lock a node in S or IS mode only if Ti currently has the
parent of the node locked in either IX or IS mode.
 Transaction Ti can lock a node in X, SIX, or IX mode only if Ti currently has
the parent of the node locked in either IX or SIX mode.
 Transaction Ti can lock a node only if Ti has not previously unlocked any
node (i.e., Ti is two-phase).
 Transaction Ti can unlock a node only if Ti currently has none of the children
of the node locked.

Observe that the multiple-granularity protocol requires that locks be acquired in
top-down (root-to-leaf) order, whereas locks must be released in bottom-up
(leaf-to-root) order.

As an illustration of the protocol, consider the tree given above and the
transactions:

 Say transaction T1 reads record Ra2 in file Fa. Then, T1 needs to lock the
database, area A1, and Fa in IS mode (and in that order), and finally to lock
Ra2 in S mode.

 Say transaction T2 modifies record Ra9 in file Fa. Then, T2 needs to lock the
database, area A1, and file Fa (and in that order) in IX mode, and at last to
lock Ra9 in X mode.

 Say transaction T3 reads all the records in file Fa. Then, T3 needs to lock the
database and area A1 (and in that order) in IS mode, and at last to lock Fa in S
mode.

 Say transaction T4 reads the entire database. It can do so after locking the
database in S mode.

Note that transactions T1, T3, and T4 can access the database concurrently.
Transaction T2 can execute concurrently with T1, but not with either T3 or T4.
This protocol enhances concurrency and reduces lock overhead. Deadlock is still
possible in the multiple-granularity protocol, as it is in the two-phase locking
protocol. These can be eliminated by using certain deadlock elimination
techniques.
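The compatibility matrix and the parent-lock rules above can be sketched in code. The matrix entries below are the standard IS/IX/S/SIX/X compatibility table; the helper functions and their names are illustrative assumptions:

```python
# A minimal sketch of the intention-lock compatibility matrix and the
# parent-mode requirement from the multiple-granularity protocol.

COMPAT = {          # True => the two modes can coexist on the same node
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,
    ("IS", "SIX"): True, ("IS", "X"): False,
    ("IX", "IX"): True,  ("IX", "S"): False,  ("IX", "SIX"): False,
    ("IX", "X"): False,
    ("S", "S"): True,    ("S", "SIX"): False, ("S", "X"): False,
    ("SIX", "SIX"): False, ("SIX", "X"): False,
    ("X", "X"): False,
}

def compatible(m1, m2):
    # The matrix is symmetric, so try both orderings of the pair.
    return COMPAT.get((m1, m2), COMPAT.get((m2, m1), False))

# Parent-mode requirement: S/IS need IS or IX on the parent;
# X/SIX/IX need IX or SIX on the parent.
REQUIRED_PARENT = {"S": {"IS", "IX"}, "IS": {"IS", "IX"},
                   "X": {"IX", "SIX"}, "SIX": {"IX", "SIX"}, "IX": {"IX", "SIX"}}

def can_lock(node_mode, parent_mode):
    return parent_mode in REQUIRED_PARENT[node_mode]

print(compatible("IS", "IX"))      # True: two intention locks coexist
print(compatible("S", "IX"))       # False: reader vs. intending writer
print(can_lock("X", "IX"))         # True: X under an IX parent is legal
print(can_lock("X", "IS"))         # False: an IS parent only allows S/IS below
```

This is why T1, T3, and T4 in the examples can run concurrently (IS and S locks on the database are compatible), while T2's IX lock on file Fa conflicts with T3's S lock on the same file.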

Thomas Write Rule

The Thomas write rule improves the basic timestamp ordering algorithm: instead of
rejecting every outdated write, it ignores obsolete writes, which preserves
serializability while permitting some schedules that basic timestamp ordering
would reject.
The basic Thomas write rules, for a write request on item X by a transaction T
with timestamp TS(T), are as follows:

o If TS(T) < R_TS(X), then transaction T is aborted and rolled back, and the
operation is rejected.

o If TS(T) < W_TS(X), then don't execute the W_item(X) operation of the
transaction and continue processing: the write is obsolete and is simply ignored.

o If neither condition 1 nor condition 2 holds, then execute the WRITE
operation of transaction T and set W_TS(X) to TS(T).
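The three rules can be sketched directly. The `Item` class and the returned status strings are illustrative assumptions; the conditions match the rules above:

```python
# A minimal sketch of the Thomas write rule: an obsolete write (rule 2)
# is ignored instead of aborting the transaction.

class Item:
    def __init__(self):
        self.rts = 0   # R_TS(X)
        self.wts = 0   # W_TS(X)

def thomas_write(item, ts):
    if ts < item.rts:
        return "abort"     # rule 1: a younger txn already read X
    if ts < item.wts:
        return "ignore"    # rule 2: obsolete write, skip it and continue
    item.wts = ts          # rule 3: perform the write, set W_TS(X) = TS(T)
    return "write"

x = Item()
print(thomas_write(x, 10))  # write  (W_TS(X) becomes 10)
print(thomas_write(x, 7))   # ignore (older than W_TS, and no one read X)
x.rts = 12                  # suppose T12 now reads X
print(thomas_write(x, 11))  # abort  (T11 is older than reader T12)
```

The "ignore" branch is exactly what deletes T2's obsolete write in the schedule discussed below, turning a non-conflict-serializable schedule into a serializable one.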

If we use the Thomas write rule, then some serializable schedules can be permitted
that are not conflict serializable, as illustrated by the schedule in the given figure:

Figure: A Serializable Schedule that is not Conflict Serializable

In the above figure, T2's write of the data item falls between T1's read and T1's
write of the same data item. This schedule is not conflict serializable.

The Thomas write rule ensures that T2's write is never seen by any transaction. If
we delete the write operation in transaction T2, a conflict serializable schedule is
obtained, as shown in the figure below.

Figure: A Conflict Serializable Schedule

Recovery with Concurrent Transactions

o Whenever more than one transaction is being executed, their logs become
interleaved. During recovery, it would be difficult for the recovery system
to backtrack through all the logs and then start recovering.

o To ease this situation, the 'checkpoint' concept is used by most DBMSs.

As we have discussed checkpoints in the Transaction Processing Concepts section
of this tutorial, you can go through those concepts again to make things clearer.

