DBMS UNIT 5
Lost Updates: Consider two users trying to update the same data item. If one user
reads the item, another user reads the same item, and both then write their updates
back, the first user's update can be silently overwritten and lost, since neither was
aware of the other's actions.
Uncommitted Data: If one user reads data that another user has updated but not yet
committed (finalized), and the second user then decides to abort (cancel) their
transaction, the first user is left holding invalid data.
Having just discussed what concurrency control is, we can now see why we need it,
for the reasons listed below:
Enhance System Efficiency: By managing concurrent access to the database,
concurrency control allows multiple transactions to be processed in parallel.
This improves system throughput and makes optimal use of resources.
Figure 1.1 Two-phase locking Protocol
Shrinking Phase: Once the transaction releases its first lock, the Shrinking
phase starts. During this phase, the transaction can release but not acquire
any more locks.
Lock Point: The exact moment when the transaction switches from the
Growing phase to the Shrinking phase (i.e. when it releases its first lock) is
termed the lock point.
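To make the two phases concrete, below is a minimal Python sketch, assuming one in-process lock per data item; the class name TwoPhaseTransaction and its methods are invented for illustration and are not any real DBMS API. It simply refuses any lock request made after the lock point.

```python
import threading

class TwoPhaseTransaction:
    """Illustrative transaction wrapper that obeys two-phase locking."""

    def __init__(self):
        self.held = {}          # data item name -> the threading.Lock held on it
        self.shrinking = False  # flips to True at the lock point

    def acquire(self, item, lock):
        # Growing phase only: acquiring after the first release would violate 2PL.
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested in the shrinking phase")
        lock.acquire()
        self.held[item] = lock

    def release(self, item):
        # The first release marks the lock point; from here on the phase is shrinking.
        self.shrinking = True
        self.held.pop(item).release()

# Usage: both locks may be taken, but once one is released, further acquires raise.
a, b = threading.Lock(), threading.Lock()
t = TwoPhaseTransaction()
t.acquire("A", a)
t.acquire("B", b)
t.release("A")                      # lock point reached, shrinking phase begins
# t.acquire("C", threading.Lock())  # would raise: no new locks after the lock point
```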
2. Timestamp Ordering Protocol
Read Timestamp (RTS): This is the latest (most recent) timestamp of a transaction
that has read the data item.
Every time a data item X is read by a transaction T with timestamp TS, the
RTS of X is updated to TS if TS is more recent than the current RTS of X.
Write Timestamp (WTS): This is the latest (most recent) timestamp of a transaction
that has written or updated the data item.
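As a rough sketch of how these timestamps are maintained, the read rule can be written as follows; the DataItem class and read_item function are hypothetical names, and the check that rejects reads arriving "too late" follows the standard timestamp-ordering rule.

```python
class DataItem:
    """A data item carrying the timestamps used by the timestamp-ordering protocol."""
    def __init__(self, value):
        self.value = value
        self.rts = 0   # R_TS(X): latest timestamp of a transaction that read X
        self.wts = 0   # W_TS(X): latest timestamp of a transaction that wrote X

def read_item(x, ts):
    """Read of data item x by a transaction whose timestamp is ts."""
    if ts < x.wts:
        # A transaction with a later timestamp has already written X, so this
        # read arrives too late and the reading transaction must be rolled back.
        raise Exception("abort: read rejected, roll back the transaction")
    # Update R_TS(X) only if this reader is more recent than the current R_TS(X).
    x.rts = max(x.rts, ts)
    return x.value
```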
3. Multiversion Concurrency Control (MVCC)
MVCC maintains multiple versions of a data item. Instead of locking the items for
write operations (which can reduce concurrency and lead to bottlenecks or deadlocks),
MVCC creates a separate version of the data item being modified.
Figure 1.2:
Version Visibility: A transaction's snapshot determines which version of the data
item a transaction sees when it accesses that item. A transaction will always see
its own writes, even if they are uncommitted.
Conflict Resolution: If two transactions try to modify the same data item
concurrently, the system needs a way to resolve the conflict. Different systems
use different methods; a common one is that the first transaction to commit
succeeds, and the other transaction is rolled back or must resolve the conflict
before proceeding.
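Below is a minimal, illustrative Python sketch of these ideas, assuming a simple logical commit counter and first-committer-wins conflict resolution; the MVCCStore class and its methods are made-up names, not a real engine's API.

```python
class MVCCStore:
    """Toy multi-version store: each committed write appends a new version."""

    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value), oldest first
        self.commit_ts = 0   # logical clock advanced on every commit

    def begin(self):
        # The transaction's snapshot is the commit timestamp at its start.
        return {"snapshot": self.commit_ts, "writes": {}}

    def read(self, txn, key):
        # A transaction always sees its own uncommitted writes first ...
        if key in txn["writes"]:
            return txn["writes"][key]
        # ... otherwise the newest version committed at or before its snapshot.
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= txn["snapshot"]:
                return value
        return None

    def write(self, txn, key, value):
        txn["writes"][key] = value    # buffered privately until commit

    def commit(self, txn):
        # First committer wins: abort if any written key gained a newer
        # committed version after this transaction's snapshot was taken.
        for key in txn["writes"]:
            existing = self.versions.get(key, [])
            if existing and existing[-1][0] > txn["snapshot"]:
                raise Exception("write conflict: transaction rolled back")
        self.commit_ts += 1
        for key, value in txn["writes"].items():
            self.versions.setdefault(key, []).append((self.commit_ts, value))
```

Because every write appends a new version rather than overwriting, readers are never blocked by writers, which is the point of MVCC.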
4. Optimistic Concurrency Control
The central idea behind optimistic concurrency control is that conflicts between
transactions are rare, so it is better to let transactions run to completion and
check for conflicts only at commit time.
Read Phase: The transaction reads values from the database and makes
changes to its private copy without affecting the actual database.
Validation Phase: Before committing, the transaction checks if the changes
made to its private copy can be safely written to the database without
causing any conflicts.
Validation Criteria: During the validation phase, the system checks for
potential conflicts with other transactions. If a conflict is found, the system can
either roll back the transaction or delay it for a retry, depending on the specific
strategy implemented.
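A compact Python sketch of the read and validation phases is shown below; the OptimisticTxn class, its bookkeeping, and the particular validation criterion used (compare this transaction's read set against the write sets of transactions that committed after it started) are illustrative assumptions, not a production design.

```python
class OptimisticTxn:
    """Sketch of a validation-based (optimistic) transaction over a plain dict."""

    clock = 0        # logical commit counter shared by every transaction
    committed = []   # global log of (commit_time, set of keys written)

    def __init__(self, db):
        self.db = db
        self.start = OptimisticTxn.clock
        self.read_set = set()
        self.local = {}          # private copy: the read phase writes only here

    def read(self, key):
        self.read_set.add(key)
        return self.local.get(key, self.db.get(key))

    def write(self, key, value):
        self.local[key] = value  # the real database is untouched until commit

    def commit(self):
        # Validation phase: fail if any transaction that committed after we
        # started wrote something that we read.
        for commit_time, written in OptimisticTxn.committed:
            if commit_time > self.start and written & self.read_set:
                raise Exception("validation failed: roll back and retry")
        # Write phase: apply the private copy to the database.
        self.db.update(self.local)
        OptimisticTxn.clock += 1
        OptimisticTxn.committed.append((OptimisticTxn.clock, set(self.local)))
```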
Real-Life Example
Scenario: A world-famous band, “The Algorithmics,” is about to release tickets
for their farewell concert. Given their massive fan base, the ticketing system is
expected to face a surge in access requests.
EventBriteMax, the ticketing platform, must ensure that ticket sales are processed
smoothly, without double bookings or system failures.
Two-Phase Locking Protocol (2PL):
Usage: Mainly for premium ticket pre-sales to fan club members. These
sales occur a day before the general ticket release.
Real-Life Example: When a fan club member logs in to buy a ticket, the
system uses 2PL. It locks the specific seat they choose during the
transaction. Once the transaction completes, the lock is released. This
ensures that no two fan club members can book the same seat at the same
time.
Timestamp Ordering Protocol:
Usage: For general ticket sales.
Real-Life Example: As thousands rush to book their tickets, each
transaction gets a timestamp. If two fans try to book the same seat
simultaneously, the one with the earlier timestamp gets priority. The other
fan receives a message suggesting alternative seats.
Multi-Version Concurrency Control (MVCC):
Real-Life Example: Fans using the mobile app see multiple versions of the
seating chart. When a fan selects a seat, they're essentially choosing from a
specific version of the seating database. If their choice conflicts with a
completed transaction, the system offers them the next best seat based on the
latest version of the database. This ensures a smooth mobile user experience
without frequent transactional conflicts.
Optimistic Concurrency Control:
Usage: For group bookings where multiple seats are booked in a single
transaction.
The concert ticket sales go off without a hitch. Fans rave about the smooth
experience, even with such high demand. Behind the scenes, EventBriteMax’s
effective implementation of the four concurrency control protocols played a crucial
role in ensuring that every fan had a fair chance to purchase their ticket and no
seats were double-booked. The Algorithmics go on to have a fantastic farewell
concert, with not a single problem in the ticketing process.
The concurrency control schemes discussed so far treat each individual data item
as the unit on which synchronization is performed. A drawback of this approach is
that if a transaction Ti needs to access the entire database and a locking protocol
is used, Ti must lock every item in the database, which is inefficient; it would be
simpler if Ti could take a single lock on the whole database. But this second
proposal has its own flaw: if another transaction needs only a few data items,
locking the entire database is unnecessary, and it costs us concurrency, which was
our primary goal in the first place. To strike a balance between efficiency and
concurrency, we use granularity.
Granularity
Granularity is the size of the data item that a lock can cover. Multiple granularity
means hierarchically breaking the database up into lockable units of different
sizes, so the system can track what needs to be locked and at which level. Such a
hierarchy can be represented graphically as a tree.
For example, consider a tree consisting of four levels of nodes. The highest level
represents the entire database. Below it are nodes of type area; the database
consists of exactly these areas. Each area has child nodes called files, and an
area consists of exactly those files that are its child nodes. No file can span
more than one area.
Finally, each file has child nodes called records. As before, the file consists of
exactly those records that are its child nodes, and no record can be present in
more than one file. Hence, the levels starting from the top level are:
database
area
file
record
In the tree above, each node can be locked individually. As in the two-phase
locking protocol, shared and exclusive lock modes are used. When a transaction
locks a node in either shared or exclusive mode, it also implicitly locks all the
descendants of that node in the same lock mode. For example, if transaction Ti
gets an explicit lock on file Fc in exclusive mode, then it holds an implicit
exclusive lock on all the records belonging to that file; it does not need to lock
the individual records of Fc explicitly. This is the main difference between
tree-based locking and hierarchical (multiple-granularity) locking.
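A small Python sketch of this implicit locking is shown below; the Node class, the node names, and the effective_mode helper are made up for illustration, but the walk up the parent chain captures how an explicit lock on file Fc covers all of its records.

```python
class Node:
    """A node in the granularity hierarchy: database > area > file > record."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.lock_mode = None   # 'S' or 'X' when this node is explicitly locked

def effective_mode(node):
    """Return the mode a node is held in, explicitly or implicitly via an ancestor."""
    while node is not None:
        if node.lock_mode is not None:
            return node.lock_mode   # an explicit lock on an ancestor covers this node
        node = node.parent
    return None

# Hypothetical tree mirroring the example: an explicit X lock on file Fc
# implicitly locks every record of Fc without touching the records themselves.
db  = Node("database")
a1  = Node("A1", db)
fc  = Node("Fc", a1)
rc1 = Node("Rc1", fc)

fc.lock_mode = "X"
print(effective_mode(rc1))   # 'X': the record is implicitly locked in exclusive mode
```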
Now, with locks on files and records made simple, how does the system determine
whether the root node can be locked? One possibility is to search the entire tree,
but that solution nullifies the whole purpose of the multiple-granularity locking
scheme. A more efficient way to gain this knowledge is to introduce a new class of
lock modes, called intention lock modes.
Intention Mode Lock
In addition to S and X lock modes, there are three additional lock modes with
multiple granularities:
Intention-Shared (IS): explicit locking at a lower level of the tree but only
with shared locks.
Intention-Exclusive (IX): explicit locking at a lower level with exclusive or
shared locks.
Shared & Intention-Exclusive (SIX): the subtree rooted by that node is
locked explicitly in shared mode and explicit locking is being done at a lower
level with exclusive mode locks.
The compatibility matrix for these lock modes is given below (rows show the mode
already held on a node, columns the mode being requested; a request is granted only
if it is compatible with every mode already held):

        IS    IX    S     SIX   X
  IS    yes   yes   yes   yes   no
  IX    yes   yes   no    no    no
  S     yes   no    yes   no    no
  SIX   yes   no    no    no    no
  X     no    no    no    no    no
As an illustration of the protocol, consider the tree given above and the
transactions:
Say transaction T1 reads record Ra2 in file Fa. Then, T1 needs to lock the
database, area A1, and Fa in IS mode (and in that order), and finally to lock
Ra2 in S mode.
Say transaction T2 modifies record Ra9 in file Fa. Then, T2 needs to lock the
database, area A1, and file Fa (and in that order) in IX mode, and finally to
lock Ra9 in X mode.
Say transaction T3 reads all the records in file Fa. Then, T3 needs to lock the
database and area A1 (and in that order) in IS mode, and finally to lock Fa in S
mode.
Say transaction T4 reads the entire database. It can do so after locking the
database in S mode.
Note that transactions T1, T3, and T4 can access the database concurrently.
Transaction T2 can execute concurrently with T1, but not with either T3 or T4.
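These concurrency claims can be checked mechanically against the compatibility matrix given earlier. The sketch below encodes that matrix as a Python dictionary (the names COMPATIBLE and compatible are illustrative) and tests the modes the four transactions request on the shared nodes.

```python
# Compatibility matrix from above: COMPATIBLE[held] is the set of modes that
# another transaction may hold on the same node at the same time.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

def compatible(held, requested):
    """True if a lock in `requested` mode can be granted alongside `held`."""
    return requested in COMPATIBLE[held]

print(compatible("IS", "IS"))  # True : T1 and T3 coexist on the database node
print(compatible("IS", "S"))   # True : T1/T3 (IS) coexist with T4 (S) on the database
print(compatible("IX", "IS"))  # True : T2 can run with T1 (they touch different records)
print(compatible("IX", "S"))   # False: T2's IX conflicts with T3's S on file Fa
                               #        and with T4's S on the database
```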
This protocol enhances concurrency and reduces lock overhead. Deadlock is still
possible in the multiple-granularity protocol, as it is in the two-phase locking
protocol, and these deadlocks can be dealt with using standard deadlock-handling
techniques.
o If TS(T) < R_TS(X) then transaction T is aborted and rolled back, and
operation is rejected.
o If TS(T) < W_TS(X) then don't execute the W_item(X) operation of the
transaction and continue processing.
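Reusing the DataItem sketch from the timestamp-ordering section above, the three cases can be written out as follows; the write_item function name is illustrative.

```python
def write_item(x, ts, new_value):
    """Write of item x by a transaction with timestamp ts, under the Thomas write rule."""
    if ts < x.rts:
        # A more recent transaction has already read X: the write would be wrong,
        # so the writing transaction is aborted and rolled back.
        raise Exception("abort: write rejected, roll back the transaction")
    if ts < x.wts:
        # A more recent transaction has already written X: this write is obsolete,
        # so it is silently skipped and the transaction continues (Thomas write rule).
        return
    # Otherwise execute the write and advance W_TS(X).
    x.value = new_value
    x.wts = ts
```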
If we use the Thomas write rule, then some serializable schedules that are not
conflict serializable can be permitted, as illustrated by the schedule in the
figure below:
Figure: A Serializable Schedule that is not Conflict Serializable
In the above figure, T2's write of the data item follows T1's read and precedes
T1's write of the same data item. This schedule is not conflict serializable.
The Thomas write rule relies on the fact that T2's write is never read by any
transaction. If we delete the write operation of transaction T2, a
conflict-serializable schedule is obtained, as shown in the figure below.
o Whenever more than one transaction is executed concurrently, their log records
  become interleaved. During recovery, it would be difficult for the recovery
  system to backtrack through all the logs and then start recovering.