Chapter 1 Transaction Management and Concurrency Control Lec 1 and
Chapter 1
Transaction Management and
Concurrency Control
Agenda
Introduction
Transaction and System Concepts
Properties of Transaction
Schedules and Recoverability
Serializability of Schedules
Why do we study transactions?
Introduction
One criterion for classifying a database system is the number of users who can use the system concurrently.
Single-user vs. multi-user systems
A DBMS is single-user if at most one user can use the system at a time.
A DBMS is multi-user if many users can use the system concurrently.
Introduction (cont.)
Concurrent use depends on the computing system (CPU + programming language).
Single-processor computer system (one CPU)
Multiprogramming: the system executes some commands from one process, then suspends that process and executes some commands from the next process.
A process is resumed at the point where it was suspended whenever it gets its turn to use the CPU again.
Hence, concurrent execution of processes is actually interleaved (interleaved execution).
Concurrent Transactions
[Figure: processes A and B plotted against time (t1, t2). Left panel: interleaved processing on a single processor (CPU1). Right panel: parallel processing on two or more processors (CPU1, CPU2).]
Introduction (cont.)
What is a transaction?
Dictionary definition: a business (money) exchange
Special actions: commit, abort
Transaction: Database Read and Write Operations
A database is represented as a collection of named data items.
Read-item (X)
1. Find the address of the disk block that contains item X.
2. Copy the disk block into a buffer in main memory.
3. Copy the item X from the buffer to the program variable named X.
Write-item (X)
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory.
3. Copy item X from the program variable named X into its correct location in the buffer.
4. Store the updated block from the buffer back to disk (either immediately or at some later point in time).
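The read and write steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "disk" is a dictionary of blocks, and the names disk, buffer_pool, find_block, fetch and flush are hypothetical helpers, not part of any real DBMS.

```python
# Hypothetical in-memory model: "disk" maps block ids to blocks,
# and each block maps item names to values.
disk = {0: {"X": 100, "Y": 20}, 1: {"Z": 7}}
buffer_pool = {}          # main-memory copies of disk blocks
dirty = set()             # blocks changed in memory but not yet on disk


def find_block(item):
    """Step 1: find the address of the disk block that contains the item."""
    for block_id, block in disk.items():
        if item in block:
            return block_id
    raise KeyError(item)


def fetch(block_id):
    """Step 2: copy the disk block into a main-memory buffer (once)."""
    if block_id not in buffer_pool:
        buffer_pool[block_id] = dict(disk[block_id])
    return buffer_pool[block_id]


def read_item(item):
    """Read step 3: copy the item from the buffer to a program variable."""
    return fetch(find_block(item))[item]


def write_item(item, value):
    """Write step 3: copy the program variable into the buffer."""
    block_id = find_block(item)
    fetch(block_id)[item] = value
    dirty.add(block_id)   # step 4 is deferred until flush()


def flush():
    """Write step 4: store updated blocks back to disk at a later point."""
    for block_id in sorted(dirty):
        disk[block_id] = dict(buffer_pool[block_id])
    dirty.clear()


X = read_item("X")
write_item("X", X - 50)
value_on_disk_before_flush = disk[0]["X"]   # still the old value
flush()
value_on_disk_after_flush = disk[0]["X"]    # now the updated value
```

The gap between the buffer update and the flush is the reason recovery needs a log: a crash in between loses the change unless it was recorded elsewhere.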
Transaction (example)
Example: fund transfer
A transaction to transfer $50 from account A to account B.
For the user it is one activity.
For the database it is a sequence of operations:
1. read(A)
2. A := A - 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Two main issues to deal with:
Failures of various kinds, such as hardware failures and system crashes
Concurrent execution of multiple transactions
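To see why failures matter for the six-step transfer, here is a minimal sketch in Python. The Crash exception, the crash_after_step parameter and the starting balances are assumptions made only for illustration.

```python
class Crash(Exception):
    """Stands in for a hardware failure or system crash."""


def transfer(db, amount, crash_after_step=None):
    """Run the six steps; optionally fail right after a given step number."""

    def step(n):
        if crash_after_step == n:
            raise Crash(f"failure after step {n}")

    a = db["A"]          # 1. read(A)
    step(1)
    a = a - amount       # 2. A := A - 50
    step(2)
    db["A"] = a          # 3. write(A)
    step(3)
    b = db["B"]          # 4. read(B)
    step(4)
    b = b + amount       # 5. B := B + 50
    step(5)
    db["B"] = b          # 6. write(B)
    step(6)


ok = {"A": 200, "B": 100}
transfer(ok, 50)

broken = {"A": 200, "B": 100}
try:
    transfer(broken, 50, crash_after_step=3)
except Crash:
    pass                 # money has left A but never reached B
```

After the simulated crash, A has been debited but B has not been credited, so the total across both accounts has changed. This is exactly the partial execution a transaction must never leave behind.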
Why is concurrency control needed (during execution of multiple transactions)?
Problems in concurrent execution of transactions (Dirty Read)
Occurs when one transaction updates a database item and then fails for some reason, while another transaction has already read the updated item before it is changed back to its original value.
T1: WRITE(X)
T2: READ(X)
T1: ABORT
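The three-step dirty read scenario can be replayed directly. The sketch below assumes a starting value of 100 for X and an update of -50; both numbers, and the before_image dictionary used for undo, are illustrative only.

```python
db = {"X": 100}
before_image = {}

# T1: WRITE(X) -- T1 updates X in place, remembering the old value for undo.
before_image["X"] = db["X"]
db["X"] = db["X"] - 50

# T2: READ(X) -- T2 sees T1's uncommitted value.
t2_saw = db["X"]

# T1: ABORT -- T1's update is undone.
db["X"] = before_image["X"]
```

T2 ends up holding 50, a value that never officially existed in the database, which is back at 100 after the abort.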
Problems in concurrent execution of transactions (Incorrect Summary)
Occurs when one transaction is calculating an aggregate summary function on a number of database items while other transactions are updating some of these items. The aggregate function may then use some values from before they are updated and others from after they are updated.
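A small worked case of the incorrect summary problem, assuming two hypothetical accounts A and B, a transfer transaction T1, and a summing transaction T3 whose reads fall between T1's two writes:

```python
db = {"A": 200, "B": 100}          # hypothetical balances; true total is 300

# T1 transfers 50 from A to B; T3 sums A and B between T1's two writes.
db["A"] = db["A"] - 50             # T1: write(A)
total = db["A"]                    # T3: reads A after T1's update
total = total + db["B"]            # T3: reads B before T1's update
db["B"] = db["B"] + 50             # T1: write(B)

true_total = db["A"] + db["B"]
```

T3 reports 250 although the accounts hold 300 both before and after the transfer.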
Problems in concurrent execution of transactions (The Unrepeatable Read Problem)
Occurs when a transaction T reads the same item twice and the item is changed by another transaction T' between the two reads. Hence, T receives different values for its two reads of the same item.
How are those problems solved?
The DBMS has a concurrency control subsystem that ensures the database remains in a consistent state despite concurrent execution of transactions.
Other problems
System failures may occur
Types of failures:
System crash
Local errors
Concurrency control enforcement
Disk failure
Physical failures
Consistency
Isolation
Durability
Atomicity and Consistency
Atomicity
Transactions are atomic: they don't have parts (conceptually).
They can't be executed partially; it should not be detectable that they interleave with another transaction.
Consistency
Transactions take the database from one consistent state into another.
In the middle of a transaction the database might not be consistent.
Isolation and Durability
Isolation
The effects of a transaction are not visible to other transactions until it has completed.
From outside, the transaction has either happened or not.
Durability
Once a transaction has completed, its changes are made permanent.
Even if the system crashes, the effects of a transaction must remain in place.
Properties of Transaction (cont.)
Transaction: transfer £50 from account A to account B
Read(A)
A = A - 50
Write(A)
Read(B)
B = B + 50
Write(B)
Atomicity: shouldn't take money from A without giving it to B
Consistency: money isn't lost or gained
Isolation: other queries shouldn't see A or B change until completion
Durability: the money does not go back to A
Who will enforce the ACID
properties?
The transaction manager
It schedules the operations of transactions
COMMIT and ROLLBACK are used to ensure atomicity
Locks or timestamps are used to ensure consistency and
isolation for concurrent transactions (next lectures)
A log is kept to ensure durability in the event of system
failure.
COMMIT and ROLLBACK
Schedules and Recoverability (cont.)
A schedule for a set of transactions must consist of all instructions of those transactions, and must preserve the order in which the instructions appear in each individual transaction.
Example
Transaction T1: r1(X); w1(X); r1(Y); w1(Y); c1
Transaction T2: r2(X); w2(X); c2
A schedule, S:
r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2
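Both requirements on a schedule can be checked mechanically. The following sketch assumes the textual notation used in the example (r1(X), w2(X), c1, separated by semicolons); the helper names parse, txn_of and is_valid_schedule are hypothetical.

```python
def parse(schedule):
    """Split 'r1(X); w1(X); c1' into tokens like 'r1(X)', 'w1(X)', 'c1'."""
    return [op.strip() for op in schedule.split(";") if op.strip()]


def txn_of(op):
    """Transaction number of an operation token, e.g. 'w2(X)' -> '2'."""
    return op.split("(")[0][1:]


def is_valid_schedule(schedule, transactions):
    """True if the schedule holds exactly the operations of the given
    transactions, with each transaction's operations in original order."""
    ops = parse(schedule)
    for txn in transactions:
        expected = parse(txn)
        tid = txn_of(expected[0])
        projected = [op for op in ops if txn_of(op) == tid]
        if projected != expected:
            return False
    return sum(len(parse(t)) for t in transactions) == len(ops)


T1 = "r1(X); w1(X); r1(Y); w1(Y); c1"
T2 = "r2(X); w2(X); c2"
S = "r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2"
reordered = "w1(X); r1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2"

s_is_valid = is_valid_schedule(S, [T1, T2])
reordered_is_valid = is_valid_schedule(reordered, [T1, T2])
```

S passes because projecting it onto T1 or T2 gives back each transaction unchanged; the reordered variant fails because it swaps r1(X) and w1(X).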
Schedule (cont.)
Operations
read(Q,q)
read the value of the database item Q and store it in the local variable q.
write(Q,q)
write the value of the local variable q into the database item Q.
other operations such as arithmetic
commit
rollback
Example: A “Good” Schedule
Example: A “Bad” Schedule
• Another possible
schedule
When do conflicts occur between two operations?
Two operations conflict if they satisfy ALL three conditions:
1. they belong to different transactions AND
2. they access the same item AND
3. at least one is a write_item() operation
Example (possible pairs of operations on item X):
Transaction T1    Transaction T2
Read(X)           Read(X)
Read(X)           Write(X)
Write(X)          Read(X)
Write(X)          Write(X)
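The three conditions translate into a one-line predicate. In this sketch the Op tuple and the single-letter kinds "r" and "w" are assumed encodings; the table is filled in for the four read/write pairs on the same item X.

```python
from collections import namedtuple

Op = namedtuple("Op", "txn kind item")   # kind is "r" (read) or "w" (write)


def conflicts(a, b):
    """Different transactions, same item, at least one write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)


# Evaluate the four combinations of T1 and T2 operations on item X.
table = {
    (k1, k2): conflicts(Op("T1", k1, "X"), Op("T2", k2, "X"))
    for k1 in "rw"
    for k2 in "rw"
}
```

Only the Read(X)/Read(X) pair is conflict-free; every pair involving a write conflicts.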
Serializability of Schedules
What are serializable schedules?
Types of schedules that are always considered to be correct when concurrent transactions are executing.
Suppose that two users (for example, two airline reservations agents) submit to the DBMS transactions T1 and T2 at approximately the same time.
If no interleaving of operations is permitted, there are only two possible outcomes:
Serializability of Schedules (cont.)
1. Execute all the operations of transaction T1 (in sequence) followed by all the operations of transaction T2 (in sequence).
2. Execute all the operations of transaction T2 (in sequence) followed by all the operations of transaction T1 (in sequence).
Serializability of
Schedules(classification)
Serial Schedule
Non-serial schedule
Serializable schedule
Conflict equivalent—all pairs of
view
Serializability of Schedules (classification)
Serial schedule
A schedule where the operations of each transaction are executed consecutively without any interleaved operations from other transactions. A schedule that is not serial is a non-serial schedule.
There is no guarantee that the results of all serial executions of a given set of transactions will be identical.
Serializability of Schedules
The objective of serializability is to find non-serial schedules that allow transactions to execute concurrently without interfering with one another.
S1                  S2
read_item(X);       read_item(X);
X:=X+10;            X:=X*1.1;
write_item(X);      write_item(X);
Schedules S1 and S2 are result equivalent for X=100 but not in general.
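The claim about result equivalence is easy to check by arithmetic. The sketch below assumes S1 adds 10 and S2 multiplies by 1.1; exact fractions are used instead of floats so that the comparison is not disturbed by rounding.

```python
from fractions import Fraction


def s1(x):
    """S1: read_item(X); X := X + 10; write_item(X)."""
    return x + 10


def s2(x):
    """S2: read_item(X); X := X * 1.1; write_item(X)."""
    return x * Fraction(11, 10)


same_for_100 = s1(100) == s2(100)   # 110 in both cases
same_for_200 = s1(200) == s2(200)   # 210 versus 220
```

The two schedules agree only because 10 happens to be 10% of 100. For any other starting value they diverge, which is why result equivalence is not used to define equivalence of schedules.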
Conflict Equivalent Schedules
Let I and J be consecutive instructions by two different transactions within a schedule S.
If I and J do not conflict, we can swap their order to produce a new schedule S'.
The instructions appear in the same order in S and S', except for I and J, whose order
Conflict Equivalent Schedule (example)
Serial Schedule S1
T1                  T2
read_item(A);
write_item(A);
read_item(B);
write_item(B);
                    read_item(A);
                    write_item(A);
                    read_item(B);
                    write_item(B);
(Figure annotations: some pairs of T1 and T2 operations are marked "order matters", others "order doesn't matter".)
Conflict Equivalence (Example)
Schedule S1'
T1 T2
read_item(A);
read_item(B);
(same order as in S1)
write_item(A);
read_item(A);
write_item(A);
Example 2
Consider the schedule S1 shown in Figure (a) on the next slide, containing operations from two concurrently executing transactions T7 and T8. Since the write operation on balx in T8 does not conflict with the subsequent read operation on baly in T7, we can change the order of these operations to produce the equivalent schedule S2 shown in Figure (b). If we also now change the order of the following non-conflicting operations, we produce the equivalent serial schedule S3 shown in Figure (c).
Cont’d……
Change the order of the write(balx) of
T8 with the write(baly) of T7.
Non serial S1
Cont’d
Testing for conflict serializability
Under the constrained write rule (that is, a transaction updates a data item based on
its old value, which is first read by the transaction), a precedence (or serialization)
graph can be produced to test for conflict serializability.
For a schedule S, a precedence graph is a directed graph G = (N, E) that consists of a
set of nodes N and a set of directed edges E, which is constructed as follows:
Create a node for each transaction.
Create a directed edge Ti → Tj, if Tj reads the value of an item written by Ti.
Create a directed edge Ti → Tj, if Tj writes a value into an item after it has been
read by Ti.
Create a directed edge Ti → Tj, if Tj writes a value into an item after it has been
written by Ti.
If an edge Ti → Tj exists in the precedence graph for S, then in any serial schedule
S’ equivalent to S, Ti must appear before Tj. If the precedence graph contains a
cycle the schedule is not conflict serializable.
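The construction above can be sketched as follows. This is an illustrative implementation, not a prescribed one: schedules are assumed to be lists of (transaction, kind, item) tuples, and an edge Ti → Tj is added whenever an operation of Ti precedes a conflicting operation of Tj, which covers the three edge rules listed above.

```python
def precedence_edges(schedule):
    """Edges Ti -> Tj for each pair of conflicting operations, Ti's first."""
    edges = set()
    for i, (ti, kind_i, item_i) in enumerate(schedule):
        for tj, kind_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "w" in (kind_i, kind_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    state = {node: "unvisited" for node in graph}

    def visit(node):
        state[node] = "in progress"
        for nxt in graph[node]:
            if state[nxt] == "in progress":
                return True
            if state[nxt] == "unvisited" and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(state[n] == "unvisited" and visit(n) for n in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y)
interleaved = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"),
               ("T1", "r", "Y"), ("T2", "w", "X"), ("T1", "w", "Y")]
# All of T1, then all of T2.
serial = [("T1", "r", "X"), ("T1", "w", "X"), ("T1", "r", "Y"),
          ("T1", "w", "Y"), ("T2", "r", "X"), ("T2", "w", "X")]
```

For the interleaved schedule the graph has edges in both directions between T1 and T2, so it contains a cycle and the schedule is not conflict serializable. The serial schedule yields the single edge T1 → T2.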
Example
Consider the two transactions shown in the figure below. Transaction T9 is transferring £100 from one account with balance balx to another account with balance baly, while T10 is increasing the balance of these two accounts by 10%.
Cont’d…
Let us draw precedence graph
Exercise
Which of the following schedules are conflict serializable and which are not? For each conflict serializable schedule, give an equivalent serial schedule.
View Equivalence and View
Serializability
Another less restrictive definition of equivalence of schedules is called view
equivalence.
This leads to another definition of serializability called view serializability.
Two schedules S and S’ are said to be view equivalent if the following three
conditions hold:
1. The same set of transactions participates in S and S’, and S and S’ include the same
operations of those transactions.
2. For any operation ri(X) of Ti in S, if the value of X read by the operation has been
written by an operation wj(X) of Tj (or if it is the original value of X before the
schedule started), the same condition must hold for the value of X read by operation
ri(X) of Ti in S’.
3. If the operation wk(Y) of Tk is the last operation to write item Y in S, then wk(Y) of
Tk must also be the last operation to write item Y in S’.
A schedule is view serializable if it is view equivalent to a serial schedule.
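The three conditions for view equivalence can be tested with a short sketch. Schedules are assumed to be lists of (transaction, kind, item) tuples, and the transactions T1, T2, T3 in the example are hypothetical, chosen as a well-known case with blind writes.

```python
from collections import Counter


def view_profile(schedule):
    """Return (reads_from, final_writes) for a list of (txn, kind, item)."""
    writes_seen = Counter()   # (txn, item) -> number of writes so far
    reads_seen = Counter()    # (txn, item) -> number of reads so far
    last_write = {}           # item -> (txn, n) of the most recent write
    reads_from = {}           # (txn, item, n) -> write it read, or None
    for txn, kind, item in schedule:
        key = (txn, item)
        if kind == "r":
            reads_seen[key] += 1
            reads_from[(txn, item, reads_seen[key])] = last_write.get(item)
        elif kind == "w":
            writes_seen[key] += 1
            last_write[item] = (txn, writes_seen[key])
    return reads_from, last_write


def view_equivalent(s, s_prime):
    same_operations = sorted(s) == sorted(s_prime)            # condition 1
    # Conditions 2 and 3: same reads-from relation, same final writes.
    return same_operations and view_profile(s) == view_profile(s_prime)


# r1(X); w2(X); w1(X); w3(X), where T2 and T3 write without reading.
s = [("T1", "r", "X"), ("T2", "w", "X"), ("T1", "w", "X"), ("T3", "w", "X")]
serial_t1_t2_t3 = [("T1", "r", "X"), ("T1", "w", "X"),
                   ("T2", "w", "X"), ("T3", "w", "X")]
serial_t2_t1_t3 = [("T2", "w", "X"), ("T1", "r", "X"),
                   ("T1", "w", "X"), ("T3", "w", "X")]
```

Schedule s is view equivalent to the serial order T1, T2, T3 because T1 still reads the initial value of X and T3 still performs the final write. It is not view equivalent to T2, T1, T3, where T1 would read the value written by T2.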
Cont'd
Every conflict serializable schedule is view serializable, although the converse is not true.
E.g., the schedule below is view serializable, although it is not conflict serializable.
In this example, transactions T12 and T13 do not conform to the constrained write rule; in other words, they perform blind writes.
Any view serializable schedule that is not conflict serializable contains one or more blind writes.
Recoverability
Serializability identifies schedules that maintain the consistency of the database,
assuming that none of the transactions in the schedule fails.
If a transaction fails, the atomicity property requires that we undo the effects of the
transaction.
In addition, the durability property states that once a transaction commits, its changes cannot be undone. This leads to the notion of a recoverable schedule.
Recoverable schedule: a schedule where, for each pair of transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj.
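The definition of a recoverable schedule can be turned into a simple check. The sketch below assumes schedules are lists of (transaction, kind, item) tuples with kind "r", "w" or "c" for commit; aborts are not modelled, and the example schedules are hypothetical.

```python
def is_recoverable(schedule):
    """schedule: list of (txn, kind, item); kind is "r", "w" or "c"."""
    writers = {}        # item -> transactions that have written it so far
    depends_on = {}     # reader -> transactions whose writes it has read
    committed = set()
    for txn, kind, item in schedule:
        if kind == "w":
            writers.setdefault(item, set()).add(txn)
        elif kind == "r":
            others = writers.get(item, set()) - {txn}
            depends_on.setdefault(txn, set()).update(others)
        elif kind == "c":
            if not depends_on.get(txn, set()) <= committed:
                return False   # txn commits before a transaction it read from
            committed.add(txn)
    return True


# T2 reads X written by T1; T1 commits first.
good = [("T1", "w", "X"), ("T2", "r", "X"), ("T1", "c", None), ("T2", "c", None)]
# Same reads, but T2 commits before T1.
bad = [("T1", "w", "X"), ("T2", "r", "X"), ("T2", "c", None), ("T1", "c", None)]
```

In the second schedule T2 has already committed a result based on T1's write, so if T1 later failed there would be no way to undo T2 without violating durability.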
Concurrency control techniques
Serializability can be achieved in several ways.
There are two main concurrency control techniques that allow transactions to execute
safely in parallel subject to certain constraints: locking and timestamp methods.
Locking Methods
What is Locking? A procedure used to control concurrent access to data. When one
transaction is accessing the database, a lock may deny access to other transactions to
prevent incorrect results.
There are several locking variations, but all share the same fundamental characteristic,
namely that a transaction must claim a shared (read) or exclusive (write) lock on a data
item before the corresponding database read or write operation.
Shared lock: If a transaction has a shared lock on a data item, it can read the item but
not update it.
Exclusive lock: If a transaction has an exclusive lock on a data item, it can both read
and update the item.
Cont'd
If a transaction holds the exclusive lock on an item, no other transaction can read or update that data item.
How are locks used?
Any transaction that needs to access a data item must first lock the item (i.e. request a shared or an exclusive lock).
If the item is not already locked by another transaction, the lock will be granted.
If the item is currently locked, the DBMS determines whether the request is compatible with the existing lock. If a shared lock is requested on an item that already has a shared lock on it, the request will be granted; otherwise, the transaction must wait until the existing lock is released.
A transaction continues to hold a lock until it explicitly releases it either during execution or when it terminates (aborts or commits). It is only when the exclusive lock has been released that the effects of the write operation will be made visible to other transactions.
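The granting rules above amount to a small compatibility check. The following is a minimal sketch, not a real lock manager: the LockTable class, the mode codes "S" and "X", and the convention of returning False instead of queueing the waiting transaction are all assumptions made for brevity.

```python
class LockTable:
    def __init__(self):
        self.held = {}   # item -> {txn: "S" (shared) or "X" (exclusive)}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        holders = self.held.setdefault(item, {})
        others = {t: m for t, m in holders.items() if t != txn}
        if not others:
            compatible = True    # not locked by another transaction
        else:
            # Only shared-with-shared is compatible.
            compatible = mode == "S" and all(m == "S" for m in others.values())
        if compatible:
            if holders.get(txn) != "X":   # never downgrade an exclusive lock
                holders[txn] = mode
            return True
        return False

    def release(self, txn, item):
        self.held.get(item, {}).pop(txn, None)


locks = LockTable()
g1 = locks.request("T1", "balx", "S")   # granted: item is unlocked
g2 = locks.request("T2", "balx", "S")   # granted: shared with shared
g3 = locks.request("T3", "balx", "X")   # must wait: shared locks are held
locks.release("T1", "balx")
locks.release("T2", "balx")
g4 = locks.request("T3", "balx", "X")   # granted once the item is free
g5 = locks.request("T1", "balx", "S")   # must wait: exclusive lock is held
```

A real DBMS would place the waiting transaction in a queue rather than simply refusing, which is where deadlock can arise.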
Incorrect locking schedule
Assume the schedule that we have seen earlier. If we schedule the transactions by applying locks, it becomes:
S = {write_lock(T9, balx), read(T9, balx), write(T9, balx), unlock(T9, balx),
write_lock(T10, balx), read(T10, balx), write(T10, balx), unlock(T10, balx),
write_lock(T10, baly), read(T10, baly), write(T10, baly), unlock(T10, baly), commit(T10),
write_lock(T9, baly), read(T9, baly), write(T9, baly), unlock(T9, baly), commit(T9)}
Cont'd (diagrammatically)
If, prior to execution, balx = 100 and baly = 400, the result should be balx = 220, baly = 330 if T9 executes before T10, or balx = 210 and baly = 340 if T10 executes before T9. However, executing schedule S gives balx = 220 and baly = 340, so S is not a serializable schedule. Locking alone therefore still does not guarantee serializability.
To guarantee serializability, we must follow an additional protocol, i.e. two-phase locking (2PL).
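The three outcomes quoted above can be reproduced by replaying the updates in each order. The figure with the transaction bodies is not in the text, so this sketch assumes, as the stated results imply, that T9 adds 100 to balx and subtracts 100 from baly, and that T10 increases each balance by 10%. Integer arithmetic is used for the 10% increase to avoid floating-point rounding.

```python
def t9_balx(db):
    db["balx"] += 100                    # assumed T9 update of balx


def t9_baly(db):
    db["baly"] -= 100                    # assumed T9 update of baly


def t10_balx(db):
    db["balx"] = db["balx"] * 11 // 10   # T10: +10% on balx


def t10_baly(db):
    db["baly"] = db["baly"] * 11 // 10   # T10: +10% on baly


def run(steps):
    db = {"balx": 100, "baly": 400}
    for step in steps:
        step(db)
    return db


t9_then_t10 = run([t9_balx, t9_baly, t10_balx, t10_baly])
t10_then_t9 = run([t10_balx, t10_baly, t9_balx, t9_baly])
schedule_s = run([t9_balx, t10_balx, t10_baly, t9_baly])
```

Schedule S matches neither serial outcome: it takes balx from the T9-first order and baly from the T10-first order.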
Two-phase locking (2PL)
A transaction follows the two-phase locking protocol if all locking operations precede the first unlock operation in the transaction.
According to the rules of this protocol, every transaction can be divided into two phases: a growing phase and a shrinking phase.
Growing phase: the transaction acquires all the locks it needs but cannot release any locks.
Shrinking phase: the transaction releases its locks but cannot acquire any new locks.
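Whether a set of transactions obeys the protocol can be checked by scanning for a lock request that follows an unlock by the same transaction. In this sketch the operations are assumed to be (action, transaction, item) tuples; write_lock and unlock follow the notation of the locking schedule, while read_lock is an assumed name for a shared lock request.

```python
def two_phase_violators(ops):
    """Return the transactions that lock something after their first unlock."""
    shrinking = set()     # transactions that have already released a lock
    violators = set()
    for action, txn, _item in ops:
        if action == "unlock":
            shrinking.add(txn)
        elif action in ("read_lock", "write_lock") and txn in shrinking:
            violators.add(txn)
    return violators


# Lock and unlock operations of the incorrect locking schedule S.
s_locks = [
    ("write_lock", "T9", "balx"), ("unlock", "T9", "balx"),
    ("write_lock", "T10", "balx"), ("unlock", "T10", "balx"),
    ("write_lock", "T10", "baly"), ("unlock", "T10", "baly"),
    ("write_lock", "T9", "baly"), ("unlock", "T9", "baly"),
]

# A two-phase version of T9: both locks are acquired before any release.
t9_two_phase = [
    ("write_lock", "T9", "balx"), ("write_lock", "T9", "baly"),
    ("unlock", "T9", "balx"), ("unlock", "T9", "baly"),
]
```

Both T9 and T10 in S request a lock on baly after releasing balx, so neither is two-phase, which explains why S could produce a non-serializable result.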
Preventing the lost update problem using 2PL
T2 blocks T1 from accessing balx because T2 holds an exclusive lock on it.
e.g.
Preventing the uncommitted dependency (dirty read) problem using 2PL
To prevent this problem occurring, T4 first requests an exclusive lock on balx. It can then proceed to read the value of balx from the database, increment it by £100, and write the new value back to the database. When the rollback is executed, the updates of transaction T4 are undone and the value of balx in the database is returned to its original value of £100. Then the exclusive lock is released by T4 and granted to T3.
Preventing the inconsistent
analysis (incorrect summary)
problem using 2PL
To prevent this problem occurring, T5 must precede its reads by exclusive
locks, and T6 must precede its reads with shared locks. Therefore, when
T5 starts it requests and obtains an exclusive lock on balx. Now, when T6
tries to share lock balx the request is not immediately granted and T6 has
to wait until the lock is released, which is when T5 commits.
Cascading rollback
A situation in which the failure of a single transaction leads to a series of rollbacks.
E.g., consider a schedule consisting of the three transactions shown in the figure below, which conforms to the two-phase locking protocol. All three transactions are executing their database operations when T14 fails and is rolled back. However, since T15 is dependent on T14 (it has read an item that has been updated by T14), T15 must also be rolled back. Similarly, T16 is dependent on T15, so it too must be rolled back.
Cascading rollback (cont'd)
Cascading rollbacks are undesirable since they potentially lead to the undoing of a significant amount of work.
Clearly, it would be useful if we could design protocols that prevent cascading rollbacks.
One way to achieve this with two-phase locking is to leave the release of all locks until the end of the transaction.
In this way, the problem illustrated on the earlier slide would not occur, as T15 would not obtain its exclusive lock until after T14 had completed the rollback. This is called rigorous 2PL.
Another variant of 2PL, called strict 2PL, only holds exclusive locks until the end of the transaction.
Most database systems implement one of these two variants of 2PL.
Concurrency control with index structures
Concurrency control for an index structure can be managed by treating each page of the index as a data item and applying the two-phase locking protocol described earlier.
For searches, obtain shared locks on nodes starting at the root and proceeding downwards along the required path. Release the lock on a (parent) node once a lock has been obtained on the child node.
For insertions, a conservative approach would be to obtain exclusive locks on all nodes as we descend the tree to the leaf node to be modified. This ensures that a split in the leaf node can propagate all the way up the tree to the root. However, if a child node is not full, the lock on the parent node can be released.
Latches
DBMSs also support another type of lock called a latch.
It is held for a much shorter duration than a normal lock.
A latch can be used before a page is read from, or written to, disk to
ensure that the operation is atomic.
For example, a latch would be obtained to write a page from the database
buffers to disk, the page would then be written to disk, and the latch
immediately unset.
Deadlock
An impasse that may result when two (or more) transactions are each waiting
for locks to be released that are held by the other.
Once deadlock occurs, the applications involved cannot resolve the problem.
Instead, the DBMS has to recognize that deadlock exists and break the
deadlock in some way.
Unfortunately, there is only one way to break deadlock: abort one or more of
the transactions.
The figure on the next slide shows two transactions, T17 and T18, that are deadlocked because each is waiting for the other to release a lock on an item it holds. At time t2, transaction T17 requests and obtains an exclusive lock on item balx, and at time t3 transaction T18 obtains an exclusive lock on item baly. Then at t6, T17 requests an exclusive lock on item baly. Since T18 holds a lock on baly, transaction T17 waits. Meanwhile, at time t7, T18 requests a lock on item balx, which is held by transaction T17. Neither transaction can continue because each is waiting for a lock it cannot obtain until the other completes.
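One common way for a DBMS to recognize such an impasse is a wait-for graph, in which an edge records that one transaction is waiting for a lock held by another. The slides do not describe this structure, so the sketch below is only an assumed illustration. It further simplifies by letting each transaction wait for at most one other transaction, as with the exclusive locks in the T17/T18 scenario.

```python
def find_deadlock(waits_for):
    """waits_for: {txn: txn it is waiting on}. Return a cycle or None."""
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        # Follow the chain of waits until it ends or revisits a transaction.
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            return path[path.index(current):]
    return None


# T17 waits for baly (held by T18); T18 waits for balx (held by T17).
deadlocked = {"T17": "T18", "T18": "T17"}
cycle = find_deadlock(deadlocked)

# After T18 is aborted and its locks are released, T17 no longer waits.
after_abort = {}
no_cycle = find_deadlock(after_abort)
```

A cycle in the graph means deadlock; aborting any transaction on the cycle removes its edges and lets the others proceed.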
Example
In the figure below we may decide to abort transaction T18. Once this is complete, the locks held by transaction T18 are released and T17 is able to continue again. Deadlock should be transparent to the user, so the DBMS should automatically restart the aborted transaction(s).
cont’d…
There are three general techniques for handling deadlock: timeouts,
deadlock prevention, and deadlock detection and recovery.
With timeouts, the transaction that has requested a lock waits for at most a
specified period of time.
Using deadlock prevention, the DBMS looks ahead to determine if a
transaction would cause deadlock, and never allows deadlock to occur.
Using deadlock detection and recovery, the DBMS allows deadlock to
occur but recognizes occurrences of deadlock and breaks them.
Since it is more difficult to prevent deadlock than to use timeouts or to test for deadlock and break it when it occurs, systems generally avoid the deadlock prevention method.