Topic 3 Concurrency Control

The document discusses transactions and concurrency control in databases, emphasizing the importance of ACID properties (Atomicity, Consistency, Isolation, Durability) for maintaining data integrity. It outlines various concurrency problems such as lost updates and uncommitted updates, and introduces concurrency control techniques like locking methods and timestamping to manage simultaneous operations. Additionally, it explains the concept of serializability and the use of precedence graphs to ensure that transactions can execute concurrently without interfering with each other.

Uploaded by

Nidhi Sood
Copyright
© © All Rights Reserved

Transactions & Concurrency Control

Bassam Hammo

Transactions

• A transaction is an action, or a series of actions, carried out by a single user or an application program, which reads or updates the contents of a database.

Transactions

• A transaction is a ‘logical unit of work’ on a database
  • Each transaction does something in the database
  • No part of it alone achieves anything of use or interest
• Transactions are the unit of recovery, consistency, and integrity as well
• ACID properties
  • Atomicity
  • Consistency
  • Isolation
  • Durability

Atomicity and Consistency

• Atomicity
  • Transactions are atomic – they don’t have parts (conceptually)
  • They can’t be executed partially; it should not be detectable that they interleave with another transaction
• Consistency
  • Transactions take the database from one consistent state into another
  • In the middle of a transaction the database might not be consistent

[Figure: Atomicity]

[Figure: Consistency. A transaction Ti takes a consistent database to a consistent database.]

Isolation and Durability

• Isolation
  • The effects of a transaction are not visible to other transactions until it has completed
  • From outside the transaction has either happened or not
  • To me this actually sounds like a consequence of atomicity…
• Durability
  • Once a transaction has completed, its changes are made permanent
  • Even if the system crashes, the effects of a transaction must remain in place

[Figure: Isolation]

[Figure: Global Recovery]

Example of a transaction

• Transfer 50 JD from account A to account B:
    Read(A)
    A = A - 50
    Write(A)
    Read(B)
    B = B + 50
    Write(B)
• Atomicity - shouldn’t take money from A without giving it to B
• Consistency - money isn’t lost or gained
• Isolation - other queries shouldn’t see A or B change until the transaction completes
• Durability - the money does not go back to A
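
The transfer can be sketched with Python's built-in sqlite3 module. This is only an illustration, not part of the original slides: the table name `account`, the helper names `transfer` and `balances`, and the starting balances are assumptions. The `with conn:` block commits if the body finishes and rolls back if an exception escapes, so the debit of A is never kept without the credit to B.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            (bal,) = conn.execute("SELECT balance FROM account WHERE name = ?",
                                  (src,)).fetchone()
            if bal < 0:
                raise ValueError("insufficient funds")
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False  # the debit above has been rolled back

def balances(conn):
    """Return {account name: balance} for every row."""
    return dict(conn.execute("SELECT name, balance FROM account"))

# Hypothetical starting state: A holds 100 JD, B holds 20 JD.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()
```

Calling `transfer(conn, "A", "B", 50)` moves the money. A later `transfer(conn, "A", "B", 500)` fails and leaves both balances exactly as they were.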

The Transaction Manager

• The transaction manager enforces the ACID properties
  • It schedules the operations of transactions
  • COMMIT and ROLLBACK are used to ensure atomicity
• Locks or timestamps are used to ensure consistency and isolation for concurrent transactions (next lectures)
• A log is kept to ensure durability in the event of system failure (discussed)

Concurrency

• Large databases are used by many people
  • Many transactions to be run on the database
  • It is desirable to let them run at the same time as each other
  • Need to preserve isolation
• If we don’t allow for concurrency then transactions are run sequentially
  • Have a queue of transactions
  • Long transactions (e.g. backups) will make others wait for long periods

Concurrency Problems

• In order to run transactions concurrently we interleave their operations
  • Each transaction gets a share of the computing time
• This leads to several sorts of problems
  • Lost updates
  • Uncommitted updates
  • Inconsistent analysis
• All arise because isolation is broken

Lost Update

• T1 and T2 read X, both modify it, then both write it out
  • The net effect of T1 and T2 should be no change on X
  • Only T2’s change is seen, however, so the final value of X has increased by 5

T1            T2
Read(X)
X = X - 5
              Read(X)
              X = X + 5
Write(X)
              Write(X)
COMMIT
              COMMIT
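
The lost update can be replayed in a few lines of Python. This is a toy model under stated assumptions: the function `run`, the tuple encoding of steps, and the starting value 100 are made up for illustration, and each transaction works on a private copy of X between its read and its write.

```python
def run(schedule, x):
    """Replay (transaction, operation) steps; each transaction keeps a private copy of X."""
    local = {}
    for txn, op in schedule:
        if op == "read":
            local[txn] = x          # copy the shared value
        elif op == "write":
            x = local[txn]          # overwrite the shared value
        else:
            local[txn] += op        # an integer delta applied to the private copy
    return x

# The interleaving from the slide: both reads happen before either write.
interleaved = [("T1", "read"), ("T1", -5), ("T2", "read"), ("T2", +5),
               ("T1", "write"), ("T2", "write")]
# A serial schedule: T1 runs to completion, then T2.
serial = [("T1", "read"), ("T1", -5), ("T1", "write"),
          ("T2", "read"), ("T2", +5), ("T2", "write")]

print(run(interleaved, 100))  # 105: T1's update is lost
print(run(serial, 100))       # 100: no net change
```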

Uncommitted Update

• T2 sees the change to X made by T1, but T1 is rolled back
  • The change made by T1 is undone on rollback
  • It should be as if that change never happened

T1            T2
Read(X)
X = X - 5
Write(X)
              Read(X)
              X = X + 5
              Write(X)
ROLLBACK
              COMMIT

Inconsistent analysis

• T1 doesn’t change the sum of X and Y, but T2 sees a change
  • T1 consists of two parts – take 5 from X and then add 5 to Y
  • T2 sees the effect of the first, but not the second

T1            T2
Read(X)
X = X - 5
Write(X)
              Read(X)
              Read(Y)
              Sum = X+Y
Read(Y)
Y = Y + 5
Write(Y)

Need for concurrency control

• Transactions running concurrently may interfere with each other, causing various problems (lost updates etc.)
• Concurrency control: the process of managing simultaneous operations on the database without having them interfere with each other.

Schedules

• A schedule is a sequence of the operations by a set of concurrent transactions that preserves the order of operations in each of the individual transactions
• A serial schedule is a schedule where operations of each transaction are executed consecutively without any interleaved operations from other transactions (each transaction commits before the next one is allowed to begin)

The Scheduler

• The scheduler component of a DBMS must ensure that the individual steps of different transactions preserve consistency.

Serial schedules

• Serial schedules are guaranteed to avoid interference and keep the database consistent
• However, databases need concurrent access, which means interleaving operations from different transactions

Serializability

• The objective of serializability is to find nonserial schedules that allow transactions to execute concurrently without interfering with one another.
• In other words, we want to find nonserial schedules that are equivalent to some serial schedule. Such a schedule is called serializable.

Uses of Serializability
• Being serializable means the schedule is equivalent to some serial schedule
  • Serial schedules are correct
  • Therefore, serializable schedules are also correct schedules
• Serializability is hard to test
  • Use a precedence graph (PG)
• We need methods (or protocols) to enforce serializability
  • Two-phase locking (2PL)
  • Timestamp ordering (TSO)

Conflict Serialisability

• Conflict serialisable schedules are the main focus of concurrency control
• They allow for interleaving and at the same time they are guaranteed to behave as a serial schedule
• Important questions:
  • How to determine whether a schedule is conflict serialisable
  • How to construct conflict serialisable schedules

Conflicting Operations

No.  Case                                        Conflict  Non-Conf
1    Ii & Ij operate on different data items               X
2    Ii = Read(Q) & Ij = Read(Q)                           X
3    Ii = Read(Q) & Ij = Write(Q)                X
4    Ii = Write(Q) & Ij = Write(Q)               X
5    Ii = Write(Q) & Ij = Read(Q)                X

Two operations on the same data item conflict only when at least one of them is a Write.

Precedence Graph (PG)

• Precedence graph
  • Used to test for conflict serializability of a schedule
  • A directed graph G=(V,E)
  • V: a finite set of transactions
  • E: a set of arcs from Ti to Tj if an action of Ti comes first and conflicts with one of Tj’s actions

More on PG

• The serialization order is obtained through topological sorting
• A schedule S is conflict serializable iff there is no cycle in the precedence graph (acyclic)
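
The test can be sketched in Python using the standard-library `graphlib` module (Python 3.9 or later). The encoding of a schedule as `(transaction, "R" or "W", item)` triples and the function names are assumptions made for this sketch, and the two sample schedules are illustrative rather than copied from a slide.

```python
from graphlib import TopologicalSorter, CycleError

def conflicts(a, b):
    """Different transactions, same item, and at least one of the two is a write."""
    (ta, kind_a, item_a), (tb, kind_b, item_b) = a, b
    return ta != tb and item_a == item_b and "W" in (kind_a, kind_b)

def precedence_graph(schedule):
    """Return the set of arcs (Ti, Tj): an action of Ti comes first and conflicts with Tj."""
    arcs = set()
    for i, earlier in enumerate(schedule):
        for later in schedule[i + 1:]:
            if conflicts(earlier, later):
                arcs.add((earlier[0], later[0]))
    return arcs

def serial_order(schedule):
    """Return an equivalent serial order, or None if the precedence graph has a cycle."""
    sorter = TopologicalSorter({t: set() for t, _, _ in schedule})
    for ti, tj in precedence_graph(schedule):
        sorter.add(tj, ti)  # ti must come before tj
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

# Hypothetical schedules used only to exercise the functions.
ok = [("T1", "W", "X"), ("T2", "R", "Y"), ("T1", "R", "Y"), ("T2", "R", "X")]
bad = [("T1", "R", "X"), ("T2", "R", "X"), ("T1", "W", "X"), ("T2", "W", "X")]
```

For `ok` the only arc is T1 → T2, so `serial_order(ok)` gives the order T1, T2. For `bad` (the lost update pattern) the arcs T1 → T2 and T2 → T1 form a cycle and `serial_order(bad)` returns `None`.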

Serialization Graph
• Consider the schedule S:

Time T1 T2
t1 Write(X)
t2 Read(Y)
t3 Read(Y)
t4 Read(X)

The precedence graph is T1 → T2.

Thus, S is conflict equivalent to the serial schedule T1, T2.

Serialization Graph
• Consider the schedule:

Time T1 T2 T3
t1 Read(X)
t2 Write(Y)
t3 Write(X)
t4 Read(X)
t5 Read(Y)

The precedence graph over T1, T2 and T3 contains a cycle.
Hence the schedule is NOT conflict serializable.

Serialization Graph
• Consider the schedule:

Time T1 T2
t1 read(balx)
t2 read(balx)
t3 write(balx)
t4 read(baly)
t5 write(baly)
t6 read(baly)
t7 write(baly)

The precedence graph over T1 and T2 contains a cycle.
Hence the schedule is NOT conflict serializable.

Serialization Graph
• Consider the following PG:

[Figure: precedence graph over T1, T2 and T3]

Cycle T1 → T2 → T1
Cycle T1 → T2 → T3 → T1

Concurrency Control Techniques

• How can the DBMS ensure serializability?
• Two basic concurrency control techniques:
  • Locking methods
  • Timestamping

Locking

• A transaction uses locks to deny access to other transactions and so prevent incorrect updates.
• Generally, a transaction must claim a
  • read (shared), or
  • write (exclusive)
  lock on a data item before it reads or writes the item.
• A lock prevents another transaction from modifying the item or, in the case of a write lock, even reading it.

[Figure: Locking (Lock Table, Serializable Schedule)]

Two-Phase Locking Protocol

• Each transaction issues lock and unlock requests in 2 phases:
  • Growing phase
    • A transaction may obtain locks, but may not release any lock
  • Shrinking phase
    • A transaction may release locks, but may not obtain any new locks
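
The two-phase rule itself is small enough to sketch as a Python class. The class name `TwoPhaseTransaction` and its methods are hypothetical; the sketch only tracks which phase a transaction is in and does not model waiting or lock modes.

```python
class TwoPhaseTransaction:
    """Tracks held locks and rejects any lock request made after the first unlock."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {item} in the shrinking phase")
        self.held.add(item)

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True  # the first release ends the growing phase
```

A transaction that locks X and Y, unlocks X, and then asks for Z gets a `RuntimeError`, because the request arrives in the shrinking phase.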

2PL Protocol
• Basics of locking:
  • Each transaction T must obtain an S (shared) lock on an object before reading it, and an X (exclusive) lock on an object before writing it.
  • If an X lock is granted on object O, no other lock (X or S) may be granted on O at the same time.
  • If an S lock is granted on object O, no X lock may be granted on O at the same time.
  • Conflicting locks are expressed by the compatibility matrix:

       S     X
  S    √     --
  X    --    --
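
The compatibility matrix translates directly into a lookup table. The following is a minimal sketch, assuming a hypothetical `LockTable` class in which a refused request simply returns `False` instead of putting the transaction to sleep.

```python
# (held mode, requested mode) -> may both be granted at once?
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockTable:
    def __init__(self):
        self.granted = {}  # item -> list of (transaction, mode)

    def request(self, txn, item, mode):
        """Grant the lock and return True, or return False if the caller must wait."""
        holders = self.granted.setdefault(item, [])
        for other, held_mode in holders:
            if other != txn and not COMPATIBLE[(held_mode, mode)]:
                return False
        holders.append((txn, mode))
        return True

    def release_all(self, txn):
        """Drop every lock held by `txn`."""
        for holders in self.granted.values():
            holders[:] = [(t, m) for t, m in holders if t != txn]
```

Two transactions can hold S locks on the same object together, while an X request is refused until the other holders have released their locks.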

Basics of Locking

• A transaction does not request the same lock twice.
• A transaction does not need to request an S lock on an object for which it already holds an X lock.
• If a transaction has an S lock and needs an X lock, it must wait until all other S locks (except its own) are released.
• After a transaction has released one of its locks (unlock), it may not request any further locks (2PL: growing phase / shrinking phase).
• Using strict two-phase locking (strict 2PL), a transaction releases all its locks at the end of its execution.
• (Strict) 2PL allows only serializable schedules.

Preventing Lost Update Problem Using 2PL

Time  T1                    T2
t1                          start
t2    start                 lock-X(balx)
t3    lock-X(balx)          read(balx)
t4    wait                  balx=balx + 100
t5    wait                  write(balx)
t6    wait                  commit/unlock(balx)
t7    read(balx)
t8    balx=balx - 10
t9    write(balx)
t10   commit/unlock(balx)

Preventing Uncommitted Dependency Problem using 2PL

Time  T1                    T2
t1                          start
t2                          lock-X(balx)
t3                          read(balx)
t4    start                 balx=balx + 100
t5    lock-X(balx)          write(balx)
t6    wait                  rollback/unlock(balx)
t7    read(balx)
t8    balx=balx - 10
t9    write(balx)
t10   commit/unlock(balx)

Preventing Inconsistent Analysis Problem using 2PL

Time  T1                          T2
t1                                start
t2    start                       sum=0
t3    lock-X(balx)
t4    read(balx)                  lock-S(balx)
t5    balx=balx - 10              wait
t6    write(balx)                 wait
t7    lock-X(balz)                wait
t8    read(balz)                  wait
t9    balz=balz + 10              wait
t10   write(balz)                 wait
t11   commit/unlock(balx,balz)    wait
t12                               read(balx)
t13                               sum=sum+balx
t14                               lock-S(baly)
t15                               read(baly)
t16                               sum=sum+baly
t17                               lock-S(balz)
t18                               read(balz)
t19                               sum=sum+balz
t20                               commit/unlock(balx,baly,balz)

Locking methods: problems

• Deadlock: May result when two (or more) transactions are each waiting for locks held by the other to be released.

Deadlock

Consider the following partial schedule:

Time  T1            T2
t1    lock-S(A)
t2                  lock-S(B)
t3                  read(B)
t4    read(A)
t5    lock-X(B)
t6                  lock-X(A)

The transactions are now deadlocked

Deadlock Example

Time  T1                T2
t1    start
t2    lock-X(balx)      start
t3    read(balx)        lock-X(baly)
t4    balx=balx - 10    read(baly)
t5    write(balx)       baly=baly + 100
t6    lock-X(baly)      write(baly)
t7    wait              lock-X(balx)
t8    wait              wait
t9    wait              wait
t10   ..                ..

Deadlock Detection

• Given a schedule, we can detect deadlocks which will happen in this schedule using a wait-for graph (WFG).

Precedence/Wait-For Graphs

• Precedence graph
  • Each transaction is a vertex
  • Arcs from T1 to T2 if
    • T1 reads X before T2 writes X
    • T1 writes X before T2 reads X
    • T1 writes X before T2 writes X
• Wait-for Graph
  • Each transaction is a vertex
  • Arcs from T2 to T1 if
    • T1 read-locks X then T2 tries to write-lock it
    • T1 write-locks X then T2 tries to read-lock it
    • T1 write-locks X then T2 tries to write-lock it

Example

Operation       Lock request
T1 Read(X)      S-lock(X)
T2 Read(Y)      S-lock(Y)
T1 Write(X)     X-lock(X)
T2 Read(X)      tries S-lock(X)
T3 Read(Z)      S-lock(Z)
T3 Write(Z)     X-lock(Z)
T1 Read(Y)      S-lock(Y)
T3 Read(X)      tries S-lock(X)
T1 Write(Y)     tries X-lock(Y)

Wait-for graph: T2 → T1, T3 → T1, T1 → T2
Precedence graph: T1 → T2, T1 → T3, T2 → T1
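
Detecting a deadlock amounts to finding a cycle in the wait-for graph. The sketch below is one way to do that in Python with a depth-first search; the function name `find_cycle` and the dictionary encoding are assumptions. The sample graph encodes the waits that follow from the lock requests in the example: T2 and T3 wait for T1's lock on X, and T1 waits for T2's lock on Y.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph {txn: set of txns it waits for}, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in visiting:                       # back on the current path: a cycle
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in sorted(waits_for.get(node, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in sorted(waits_for):
        cycle = visit(start)
        if cycle:
            return cycle
    return None

wfg = {"T1": {"T2"}, "T2": {"T1"}, "T3": {"T1"}}
print(find_cycle(wfg))  # ['T1', 'T2', 'T1']
```

T3 is blocked too, but it is not part of the cycle; aborting T1 or T2 is enough to let the others proceed.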

Solution

• Only one way to break deadlock: abort one or more of the transactions.
• Deadlock should be transparent to the user, so the DBMS should restart the transaction(s).

Deadlock Prevention

• Deadlocks can arise with 2PL
  • Deadlock is less of a problem than an inconsistent DB
  • We can detect and recover from deadlock
  • It would be nice to avoid it altogether
• Conservative 2PL
  • All locks must be acquired before the transaction starts
  • Hard to predict what locks are needed
  • Low ‘lock utilisation’ - transactions can hold on to locks for a long time, but not use them much

Deadlock Prevention

• We impose an ordering on the resources
  • Transactions must acquire locks in this order
  • Transactions can be ordered on the last resource they locked
• This prevents deadlock
  • If T1 is waiting for a resource from T2 then that resource must come after all of T1’s current locks
  • All the arcs in the wait-for graph point ‘forwards’ - no cycles

Example of resource ordering

• Suppose resource order is: X < Y
• This means, if you need locks on X and Y, you first acquire a lock on X and only after that a lock on Y
  • (even if you want to write to Y before doing anything to X)
• It is impossible to end up in a situation where T1 is waiting for a lock on X held by T2, and T2 is waiting for a lock on Y held by T1.
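
The same idea carries over to ordinary thread locks. The sketch below uses Python's `threading` module as a stand-in for database locks; the names `locks`, `balance` and `move` are made up for illustration. Both threads need X and Y, and both acquire them in the order X < Y regardless of which one they update first.

```python
import threading

locks = {"X": threading.Lock(), "Y": threading.Lock()}
balance = {"X": 100, "Y": 100}

def move(src, dst, amount, repeat=1000):
    """Move `amount` from src to dst, always locking in the global order X < Y."""
    for _ in range(repeat):
        first, second = sorted((src, dst))
        with locks[first], locks[second]:
            balance[src] -= amount
            balance[dst] += amount

# The two threads touch the resources in opposite directions.
t1 = threading.Thread(target=move, args=("X", "Y", 1))
t2 = threading.Thread(target=move, args=("Y", "X", 1))
t1.start()
t2.start()
t1.join()
t2.join()
```

If each thread instead locked `src` first and `dst` second, one could hold X while waiting for Y and the other hold Y while waiting for X.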

Timestamp

• Transactions can be run concurrently using a variety of techniques
  • We looked at using locks to prevent interference
• An alternative is timestamping
  • Requires less overhead in terms of tracking locks or detecting deadlock
  • Determines the order of transactions before they are executed

Timestamp

• Each transaction has a timestamp, TS, and if T1 starts before T2 then TS(T1) < TS(T2)
  • Can use the system clock or an incrementing counter to generate timestamps
• Each resource has two timestamps
  • R(X), the largest timestamp of any transaction that has read X
  • W(X), the largest timestamp of any transaction that has written X

Timestamp Protocol

• If T tries to read X
  • If TS(T) < W(X) then T is rolled back and restarted with a later timestamp
  • If TS(T) ≥ W(X) then the read succeeds and we set R(X) to be max(R(X), TS(T))
• If T tries to write X
  • If TS(T) < W(X) or TS(T) < R(X) then T is rolled back and restarted with a later timestamp
  • Otherwise the write succeeds and we set W(X) to TS(T)
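
The two rules can be written down almost literally in Python. This is a sketch under assumptions: the class names `TimestampTable` and `Rollback` are hypothetical, and a resource that has never been read or written is treated as having timestamp 0.

```python
class Rollback(Exception):
    """Raised when a transaction must be restarted with a later timestamp."""

class TimestampTable:
    def __init__(self):
        self.R = {}  # item -> largest timestamp of any transaction that has read it
        self.W = {}  # item -> largest timestamp of any transaction that has written it

    def read(self, ts, item):
        if ts < self.W.get(item, 0):
            raise Rollback(f"read of {item} by TS={ts} arrives too late")
        self.R[item] = max(self.R.get(item, 0), ts)

    def write(self, ts, item):
        if ts < self.W.get(item, 0) or ts < self.R.get(item, 0):
            raise Rollback(f"write of {item} by TS={ts} arrives too late")
        self.W[item] = ts
```

A caller that catches `Rollback` would undo the transaction's work and rerun it with a fresh, larger timestamp.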

Timestamp Example 1

• Given T1 and T2 we will assume
  • The transactions make alternate operations
  • Timestamps are allocated from a counter starting at 1
  • T1 goes first

T1            T2
Read(X)       Read(X)
Read(Y)       Read(Y)
Y=Y+X         Z=Y-X
Write(Y)      Write(Z)

Timestamp Example 1

Timestamps after each step (- means not yet set):

Step  Operation              R(X)  R(Y)  R(Z)  W(X)  W(Y)  W(Z)  TS(T1)  TS(T2)
0     (initial state)        -     -     -     -     -     -     -       -
1     T1 Read(X)             1     -     -     -     -     -     1       -
2     T2 Read(X)             2     -     -     -     -     -     1       2
3     T1 Read(Y)             2     1     -     -     -     -     1       2
4     T2 Read(Y)             2     2     -     -     -     -     1       2
5     T1 Y=Y+X               2     2     -     -     -     -     1       2
6     T2 Z=Y-X               2     2     -     -     -     -     1       2
7     T1 Write(Y) rejected   2     2     -     -     -     -     1       2
8     T1 restarted           2     2     -     -     -     -     3       2
9     T2 Write(Z)            2     2     -     -     -     2     3       2
10    T1 Read(X)             3     2     -     -     -     2     3       2
11    T1 Read(Y)             3     3     -     -     -     2     3       2
12    T1 Y=Y+X               3     3     -     -     -     2     3       2
13    T1 Write(Y)            3     3     -     -     3     2     3       2

At step 7, TS(T1) = 1 < R(Y) = 2, so T1 is rolled back and restarted with timestamp 3.

Timestamp ordering – example 2
• Consider the following concurrent schedule of T1 (TS = 10) and T2 (TS = 20)

T1 (TS = 10)      T2 (TS = 20)
1. Read(X)
2. X = X - k
3. Write(X)
                  1. Read(X)
                  2. X = X * 1.01
                  3. Write(X)
4. Read(Y)
5. Y = Y + k
6. Write(Y)
                  4. Read(Y)
                  5. Y = Y * 1.01
                  6. Write(Y)

Checks performed at each read and write:

T1 Read(X):   TS(T1) > WTS(X) = 0, read allowed; RTS(X) ← 10
T1 Write(X):  TS(T1) > WTS(X) = 0; TS(T1) = RTS(X) = 10; write allowed; WTS(X) ← 10
T2 Read(X):   TS(T2) > WTS(X) = 10, read allowed; RTS(X) ← 20
T2 Write(X):  TS(T2) > WTS(X) = 10; TS(T2) = RTS(X) = 20; write allowed; WTS(X) ← 20
T1 Read(Y):   TS(T1) > WTS(Y) = 0, read allowed; RTS(Y) ← 10
T1 Write(Y):  TS(T1) > WTS(Y) = 0; TS(T1) = RTS(Y) = 10; write allowed; WTS(Y) ← 10
T2 Read(Y):   TS(T2) > WTS(Y) = 10, read allowed; RTS(Y) ← 20
T2 Write(Y):  TS(T2) > WTS(Y) = 10; TS(T2) = RTS(Y) = 20; write allowed; WTS(Y) ← 20

Timestamps over time (initial value, then the value after each read or write above):

RTS(X):  0   10  10  20  20  20  20  20  20
WTS(X):  0   0   10  10  20  20  20  20  20
RTS(Y):  0   0   0   0   0   10  10  20  20
WTS(Y):  0   0   0   0   0   0   10  10  20

Thomas’ write rule

• A write-write conflict may be acceptable in many cases
• Suppose T1 does a write(X) and then T2 does a write(X), and no transaction accesses X in between
  • Then T2 only overwrites a value that has never been used
  • In such a case, it can be argued that the write is acceptable

Thomas’ write rule

• In timestamp ordering, this is referred to as the Thomas write rule:
  • If a transaction T issues a write(X):
    • If TS(T) < RTS(X) then the write is rejected and T has to abort
    • Else if TS(T) < WTS(X) then the write is ignored
    • Else, allow the write, and update WTS(X) accordingly
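
The three cases of the rule map onto a short Python function. The function name `thomas_write`, the string return values and the use of plain dictionaries for RTS and WTS are assumptions made for this sketch; missing entries are treated as timestamp 0.

```python
def thomas_write(ts, item, rts, wts):
    """Apply Thomas' write rule; rts and wts map items to timestamps (default 0)."""
    if ts < rts.get(item, 0):
        return "abort"    # a later transaction has already read the item
    if ts < wts.get(item, 0):
        return "ignore"   # obsolete write: a later write has already replaced it
    wts[item] = ts
    return "write"
```

With RTS(X) = 5 and WTS(X) = 10, a write by a transaction with timestamp 3 aborts, one with timestamp 7 is silently skipped, and one with timestamp 12 goes through and sets WTS(X) to 12.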

Timestamp

• The protocol means that transactions with higher times take precedence
  • Equivalent to running transactions in order of their final time values
  • Transactions don’t wait - no deadlock
• Problems
  • Long transactions might keep getting restarted by new transactions - starvation
  • Rolls back old transactions, which may have done a lot of work

Optimistic concurrency control

• 2PL & TSO are pessimistic protocols
  • They assume transactions will have problems
• The optimistic point of view:
  • Assume there is no problem and let the transaction execute
  • But before commit, do a final check
  • Abort only when a problem is discovered
  • This is the basis for optimistic concurrency control

Optimistic concurrency control

• Each transaction T is divided into 3 phases:
  1. Read and execution: T reads from the database and executes. However, T only writes to a temporary location (not to the database itself)
  2. Validation: T checks whether there is a conflict with another transaction, and aborts if necessary
  3. Write: T actually writes the values in the temporary location to the database
• Each transaction must follow the same order
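
The three phases can be sketched in Python as follows. The classes `Store` and `OptimisticTxn` are hypothetical, and validation by per-item version counters is a simplification chosen for this sketch; the slides do not say how the validation check is carried out.

```python
class Store:
    """Shared data plus a version counter per item, bumped on every committed write."""
    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}

class OptimisticTxn:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # item -> version seen at first read
        self.writes = {}         # temporary location for pending writes

    def read(self, key):
        # Phase 1: read from the database (or from our own pending write)
        if key in self.writes:
            return self.writes[key]
        self.read_versions.setdefault(key, self.store.version[key])
        return self.store.data[key]

    def write(self, key, value):
        # Phase 1: writes go to the temporary location only
        self.writes[key] = value

    def commit(self):
        # Phase 2: validation, everything we read must be unchanged
        for key, seen in self.read_versions.items():
            if self.store.version[key] != seen:
                return False     # conflict: abort, the database was never touched
        # Phase 3: write the buffered values to the database
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True
```

If two transactions both read X = 100 and buffer an update, the first to commit succeeds and the second fails validation, so the lost update described earlier cannot occur.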
