DBMS UNIT 5 Part 2
CONCURRENCY CONTROL
Lock-Based Protocols
Deadlock handling
Multiple Granularity
Timestamp-Based Protocols
Validation-Based Protocols
Multiversion Schemes
Lock-Based Protocols
Lock-compatibility matrix:

        S       X
S       true    false
X       false   false

Two-phase locking protocol:
– First Phase (growing): can acquire locks; cannot release any lock
– Second Phase (shrinking):
□ can release a lock-S or a lock-X
□ cannot acquire new locks
Drawbacks of the tree protocol:
□ The protocol does not guarantee recoverability or cascade freedom
– commit dependencies must be introduced to ensure recoverability
□ Transactions may have to lock data items that they do not access
– increased locking overhead and additional waiting time
– potential decrease in concurrency
□ Schedules not possible under two-phase locking are possible under the
tree protocol, and vice versa.
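The S/X compatibility test behind the lock-compatibility matrix can be sketched as a small lock-manager check. This is a minimal illustrative sketch, not code from the slides; the class and method names are assumptions.

```python
# COMPAT[requested][held] is True when the two modes can be held on the
# same item by different transactions (the S/X lock-compatibility matrix)
COMPAT = {
    "S": {"S": True,  "X": False},
    "X": {"S": False, "X": False},
}

class LockManager:
    def __init__(self):
        self.held = {}  # item -> list of (txn, mode)

    def can_grant(self, txn, item, mode):
        # grantable iff compatible with every lock held by *other* txns
        return all(COMPAT[mode][m]
                   for t, m in self.held.get(item, []) if t != txn)

    def lock(self, txn, item, mode):
        if not self.can_grant(txn, item, mode):
            return False          # caller must wait (or handle deadlock)
        self.held.setdefault(item, []).append((txn, mode))
        return True

lm = LockManager()
assert lm.lock("T1", "A", "S")      # granted
assert lm.lock("T2", "A", "S")      # S is compatible with S
assert not lm.lock("T3", "A", "X")  # X conflicts with the held S locks
```

Under two-phase locking, each transaction would issue all its `lock` calls before its first unlock.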
******END******
Deadlock Handling
Consider the partial schedule:

T1                          T2
lock-X on X
write(X)
                            lock-X on Y
                            write(Y)
                            wait for lock-X on X
wait for lock-X on Y

Neither T1 nor T2 can make progress: T1 waits for T2 to release its lock on Y, while T2 waits for T1 to release its lock on X. Such a situation is called a deadlock. To handle it, either T1 or T2 must be rolled back and its locks released.
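The deadlock above corresponds to a cycle in the wait-for graph (T1 waits for T2, T2 waits for T1). Deadlock detection is then cycle detection on that graph; a minimal depth-first-search sketch (function name and graph encoding are illustrative):

```python
def has_deadlock(wait_for):
    """wait_for: dict mapping txn -> set of txns it is waiting on."""
    WHITE, GRAY, BLACK = 0, 1, 2           # unvisited / on stack / done
    color = {t: WHITE for t in wait_for}

    def dfs(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:            # back edge: cycle
                return True
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in list(wait_for))

assert has_deadlock({"T1": {"T2"}, "T2": {"T1"}})     # the schedule above
assert not has_deadlock({"T1": {"T2"}, "T2": set()})  # T2 can finish
```

A real system would run such a check periodically and pick a victim transaction to roll back.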
Multiple Granularity
Compatibility matrix with intention lock modes (IS = intention-shared, IX = intention-exclusive, SIX = shared and intention-exclusive; ✓ = compatible, × = incompatible):

        IS      IX      S       SIX     X
IS      ✓       ✓       ✓       ✓       ×
IX      ✓       ✓       ×       ×       ×
S       ✓       ×       ✓       ×       ×
SIX     ✓       ×       ×       ×       ×
X       ×       ×       ×       ×       ×
Multiple Granularity Locking Scheme
1. Transaction Ti must observe the lock-compatibility function.
2. The root of the tree must be locked first, and may be locked in
any mode.
3. A node Q can be locked by Ti in S or IS mode only if the parent
of Q is currently locked by Ti in either IX or IS mode.
4. A node Q can be locked by Ti in X, SIX, or IX mode only if the
parent of Q is currently locked by Ti in either IX or SIX mode.
5. Ti can lock a node only if it has not previously unlocked any
node (that is, Ti is two-phase).
6. Ti can unlock a node Q only if none of the children of Q are
currently locked by Ti.
Observe that locks are acquired in root-to-leaf order, whereas they
are released in leaf-to-root order.
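Rules 2-4 above reduce to a single parent-mode check; a minimal sketch (function name is illustrative, not from the slides):

```python
def parent_allows(mode, parent_mode, is_root):
    """May Ti lock a node in `mode`, given the mode Ti holds on its parent?"""
    if is_root:                        # rule 2: root may be locked in any mode
        return True
    if mode in ("S", "IS"):            # rule 3: parent must be IX or IS
        return parent_mode in ("IX", "IS")
    if mode in ("X", "SIX", "IX"):     # rule 4: parent must be IX or SIX
        return parent_mode in ("IX", "SIX")
    return False

assert parent_allows("X", None, is_root=True)        # root, any mode
assert parent_allows("S", "IS", is_root=False)       # rule 3
assert parent_allows("X", "IX", is_root=False)       # rule 4
assert not parent_allows("X", "IS", is_root=False)   # IS on parent not enough
```

A full implementation would apply this check along the whole root-to-leaf path and also enforce rules 1, 5, and 6.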
******END******
Timestamp-Based Protocols
Example use of the protocol: a partial schedule for several data items for transactions T1-T5 with timestamps 1, 2, 3, 4, 5:

T1          T2          T3          T4          T5
                                                read(X)
            read(Y)
read(Y)
            write(Y)
            write(Z)
                                                read(Z)
read(X)
abort
                        read(X)
                                    write(Z)
                                    abort
                                                write(Y)
                                                write(Z)
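The standard timestamp-ordering read/write tests that drive schedules like the one above can be sketched as follows. This is an illustrative summary of the usual R-timestamp/W-timestamp rules; the names are assumptions, not from the slides.

```python
class Item:
    def __init__(self):
        self.r_ts = 0   # largest timestamp of any successful read
        self.w_ts = 0   # largest timestamp of any successful write

def ts_read(item, ts):
    if ts < item.w_ts:                    # Ti would read an overwritten value
        return "rollback"
    item.r_ts = max(item.r_ts, ts)
    return "ok"

def ts_write(item, ts):
    if ts < item.r_ts or ts < item.w_ts:  # value needed earlier, or obsolete write
        return "rollback"
    item.w_ts = ts
    return "ok"

x = Item()
assert ts_write(x, 3) == "ok"
assert ts_read(x, 2) == "rollback"   # ts 2 tries to read a value written "later"
assert ts_read(x, 5) == "ok"
assert ts_write(x, 4) == "rollback"  # ts 5 has already read x
```

A rolled-back transaction is restarted with a new, larger timestamp.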
Correctness of Timestamp-Ordering Protocol
The timestamp-ordering protocol guarantees serializability, since all arcs in the precedence graph are of the form

    transaction with smaller timestamp  →  transaction with larger timestamp

Thus, there will be no cycles in the precedence graph.
Recoverability and Cascade Freedom
Problem with the timestamp-ordering protocol: suppose Ti aborts, but Tj has read a data item written by Ti. Then Tj must also abort, and so must any transaction that has read a data item written by Tj, leading to cascading rollback.
Solution 1:
□ A transaction is structured such that its writes are all performed at the
end of its processing
□ All writes of a transaction form an atomic action; no transaction may
execute while a transaction is being written
□ A transaction that aborts is restarted with a new timestamp
Validation-Based Protocol
Execution of transaction Ti is done in three phases: a read phase, a validation phase, and a write phase. The three phases of concurrently executing transactions can be
interleaved, but each transaction must go through the three phases in that
order.
□ Assume for simplicity that the validation and write phases occur
together, atomically and serially
i.e., only one transaction executes validation/write at a time.
Validation test for transaction Tj:
If for all Ti with TS(Ti) < TS(Tj) one of the following conditions
holds:
□ finish(Ti) < start(Tj)
□ start(Tj) < finish(Ti) < validation(Tj) and the set of data items
written by Ti does not intersect with the set of data items read by
Tj.
then validation succeeds and Tj can be committed. Otherwise,
validation fails and Tj is aborted.
Justification: either the first condition is satisfied and there is no
overlapped execution, or the second condition is satisfied and
□ the writes of Tj do not affect reads of Ti, since they occur after Ti
has finished its reads;
□ the writes of Ti do not affect reads of Tj, since Tj does not read
any item written by Ti.
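The validation test above can be sketched directly, checking Tj against one earlier transaction Ti. A minimal illustrative sketch; the function and parameter names are assumptions.

```python
def validate(tj_start, tj_validation, ti_finish, ti_write_set, tj_read_set):
    """Validation test for Tj against one Ti with TS(Ti) < TS(Tj)."""
    # condition 1: Ti finished before Tj started (no overlapped execution)
    if ti_finish < tj_start:
        return True
    # condition 2: Ti finished before Tj's validation, and Ti wrote
    # nothing that Tj read
    if tj_start < ti_finish < tj_validation and not (ti_write_set & tj_read_set):
        return True
    return False

# Ti finished at time 5, before Tj started at 7: no overlap
assert validate(7, 12, 5, {"A"}, {"A", "B"})
# overlapped, but Ti wrote only C while Tj read A and B
assert validate(3, 12, 5, {"C"}, {"A", "B"})
# overlapped and Ti wrote A, which Tj read: validation fails
assert not validate(3, 12, 5, {"A"}, {"A", "B"})
```

Tj commits only if the test succeeds against every Ti with a smaller timestamp; otherwise Tj is aborted.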
Schedule Produced by Validation
T14                     T15
read(B)
                        read(B)
                        B := B - 50
                        read(A)
                        A := A + 50
read(A)
(validate)
display(A + B)
                        (validate)
                        write(B)
                        write(A)
Multiversion Schemes
Observe that
□ reads always succeed: a read is never rejected; it simply returns an appropriate version
□ in multiversion two-phase locking, a committing update transaction Ti sets the timestamps of the versions it created to ts-counter + 1 and then increments ts-counter by 1
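Why reads always succeed in a multiversion scheme can be shown in a few lines: a read simply returns the version with the largest write timestamp not exceeding the reader's timestamp. An illustrative sketch with versions encoded as (w_ts, value) pairs:

```python
def mv_read(versions, ts):
    """versions: list of (w_ts, value); assumes at least one w_ts <= ts."""
    eligible = [(w, v) for w, v in versions if w <= ts]
    return max(eligible)[1]       # version with the largest qualifying w_ts

history = [(1, "v1"), (4, "v4"), (9, "v9")]
assert mv_read(history, 5) == "v4"   # sees the version written at ts 4
assert mv_read(history, 9) == "v9"
```

No lock or timestamp test can reject this read, which is why only writes can cause rollbacks in multiversion timestamp ordering.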
RECOVERY SYSTEM
Failure classification
Storage structure
Recovery and atomicity
Recovery algorithm
Log-based recovery
Shadow paging
Recovery with concurrent transactions
Buffer management
Failure with loss of non-volatile storage
Early lock release and logical undo operations
Failure Classification
Transaction failure:
□ Logical errors: transaction cannot complete due to some internal error
condition
□ System errors: the database system must terminate an active transaction
due to an error condition (e.g., deadlock)
System crash: a power failure or other hardware or software failure
causes the system to crash.
□ Fail-stop assumption: non-volatile storage contents are assumed to not
be corrupted by system crash
Database systems have numerous integrity checks to prevent
corruption of disk data
Disk failure: a head crash or similar disk failure destroys all or part of disk
storage
□ Destruction is assumed to be detectable: disk drives use checksums to
detect failures
Storage Structure
Volatile storage:
□ does not survive system crashes
□ examples: main memory, cache memory
Nonvolatile storage:
□ survives system crashes
□ examples: disk, tape, flash memory
Stable storage:
□ a mythical form of storage that survives all failures
□ approximated by maintaining multiple copies on distinct nonvolatile media
Stable-Storage Implementation
Protecting storage media from failure during data transfer (one solution):
□ Execute output operation as follows (assuming two copies of each
block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same
information onto the second physical block.
3. The output is completed only after the second write successfully
completes.
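The two-copy output rule above can be sketched with dicts standing in for the two physical copies of each block; the names are illustrative.

```python
def output_block(copy1, copy2, block_id, data):
    copy1[block_id] = data   # step 1: write the first physical copy
    # step 2 runs only after step 1 completes; a crash between the two
    # writes leaves copy2 holding the old, consistent value
    copy2[block_id] = data   # step 2: write the second physical copy
    return True              # step 3: the output is complete only now

c1, c2 = {}, {}
output_block(c1, c2, "B1", b"new")
assert c1["B1"] == c2["B1"] == b"new"
```

Recovery after a crash compares the two copies: if they differ (or one fails its checksum), the undamaged or second copy is used to restore the other.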
Data Access
Physical blocks are the blocks residing on disk; buffer blocks are the blocks residing temporarily in main memory.
Block movements between disk and main memory are initiated through two operations:
□ input(B) transfers the physical block B to main memory
□ output(B) transfers the buffer block B to the disk, replacing the appropriate physical block there
We assume, for simplicity, that each data item fits in, and is stored
inside, a single block.
Data Access (Cont.)
Transaction transfers data items between system buffer blocks and its
private work-area using the following operations :
□ read(X) assigns the value of data item X to the local variable xi.
□ write(X) assigns the value of local variable xi to data item X in the
buffer block.
□ both these commands may necessitate the issue of an input(BX)
instruction before the assignment, if the block BX in which X resides is
not already in memory.
Transactions
□ must perform read(X) before accessing X for the first time (subsequent reads can be from the local copy)
□ write(X) can be executed at any time before the transaction commits
[Figure: example of data access. Buffer blocks A and B are brought from disk into the memory buffer via input(A) and written back via output(B); read(X) and write(Y) copy values between the buffer blocks and the transactions' private work areas (x1, x2, y1).]
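The read/write/input interplay described above can be sketched as follows. A minimal illustrative model, assuming blocks are dicts and output(B) is deferred; none of these names come from a real buffer manager.

```python
disk = {"BX": {"X": 100}, "BY": {"Y": 7}}
memory = {}                  # buffer: block id -> block contents
work_area = {}               # a transaction's private work area

def input_block(b):
    memory[b] = dict(disk[b])          # transfer block from disk to buffer

def read(x, block):
    if block not in memory:            # issue input(B) first if needed
        input_block(block)
    work_area[x] = memory[block][x]    # assign item to local variable

def write(x, block):
    if block not in memory:
        input_block(block)
    memory[block][x] = work_area[x]    # updates the buffer copy only;
                                       # output(B) to disk happens later

read("X", "BX")
work_area["X"] += 1
write("X", "BX")
assert memory["BX"]["X"] == 101 and disk["BX"]["X"] == 100  # not yet output
```

The gap between the buffer update and the eventual output(B) is exactly what recovery algorithms must bridge.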
******END******
Recovery and Atomicity
We study two approaches:
□ log-based recovery
□ shadow-paging
Immediate Database Modification Example

Log                         Write           Output
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                            A = 950
                            B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                            C = 600
                                            BB, BC
<T1 commit>
                                            BA
Note: BX denotes the block containing X.
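Recovery from update records like <T0, A, 1000, 950> follows the usual redo/undo rule: redo updates of committed transactions with the new value, undo the rest with the old value. A simplified sketch (tuples stand in for log records; real algorithms scan and checkpoint more carefully):

```python
def recover(log):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = {}
    for kind, txn, *rest in log:                 # forward pass: redo
        if kind == "update" and txn in committed:
            item, old, new = rest
            db[item] = new
    for kind, txn, *rest in reversed(log):       # backward pass: undo
        if kind == "update" and txn not in committed:
            item, old, new = rest
            db[item] = old
    return db

log = [("start", "T0"), ("update", "T0", "A", 1000, 950),
       ("update", "T0", "B", 2000, 2050), ("commit", "T0"),
       ("start", "T1"), ("update", "T1", "C", 700, 600)]
assert recover(log) == {"A": 950, "B": 2050, "C": 700}
```

Matching the example above: T0 committed, so A and B get their new values; T1 did not, so C is restored to 700.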
Example of Checkpoints
[Figure: transactions T1-T4 shown against checkpoint time Tc and failure time Tf. T1 can be ignored (its updates are already on disk because of the checkpoint); T2 and T3 are redone; T4 is undone.]
Shadow Paging
□ Shadow paging maintains two page tables during the lifetime of a transaction: the current page table and the shadow page table.
To start with, both the page tables are identical. Only current page
table is used for data item accesses during execution of the
transaction.
Whenever any page is about to be written for the first time
□ A copy of this page is made onto an unused page.
To commit a transaction:
1. Flush all modified pages in main memory to disk
2. Output current page table to disk
3. Make the current page table the new shadow page table, as follows:
□ keep a pointer to the shadow page table at a fixed (known)
location on disk.
□ to make the current page table the new shadow page table, simply
update the pointer to point to current page table on disk
Once pointer to shadow page table has been written, transaction is
committed.
No recovery is needed after a crash - new transactions can start right
away, using the shadow page table.
Pages not pointed to from current/shadow page table should be freed
(garbage collected).
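The key point of the commit sequence above is that the only atomic step is updating the single pointer at a fixed disk location. An illustrative sketch, with a dict standing in for disk contents (all names are assumptions):

```python
disk = {"pt_shadow": {"P1": "old"},   # shadow page table (on disk)
        "root_ptr": "pt_shadow"}      # pointer at a fixed, known location

def commit(current_table):
    # steps 1-2: flush modified pages and write the current page table
    disk["pt_current"] = dict(current_table)
    # step 3: atomically redirect the fixed-location pointer; a crash
    # before this line leaves root_ptr at the old, consistent state
    disk["root_ptr"] = "pt_current"

current = {"P1": "new"}               # P1 was copied and updated
commit(current)
assert disk[disk["root_ptr"]]["P1"] == "new"
```

Because the pointer write is the commit point, a crash at any earlier moment simply leaves the shadow table in force and no recovery work is needed.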
Shadow Paging (Cont.)
Disadvantages:
□ Copying the entire page table is very expensive
– can be reduced by using a page table structured like a B+-tree: no need to copy the entire tree, only the paths that lead to updated leaf nodes
□ Commit overhead is high even with the above extension
– need to flush every updated page, and the page table
□ Data gets fragmented (related pages get separated on disk)
□ After every transaction completes, the pages containing old versions of modified data need to be garbage collected
□ Hard to extend the algorithm to allow transactions to run concurrently; easier to extend log-based schemes
******END******
Recovery with Early Lock Release and
Logical Undo Operations
Recovery with Early Lock Release
Support for high-concurrency locking techniques, such as those used
for B+-tree concurrency control, which release locks early
Supports “logical undo”
Recovery based on “repeating history”, whereby recovery executes
exactly the same actions as normal processing
Logical Undo Logging
Operations like B+-tree insertions and deletions release locks early.
They cannot be undone by restoring old values (physical undo), since once a lock is released,
other transactions may have updated the B+-tree.
Instead, insertions (resp. deletions) are undone by executing a deletion (resp. insertion)
operation (known as logical undo).
For such operations, undo log records should contain the undo operation to
be executed
Such logging is called logical undo logging, in contrast to physical undo logging
Operations are called logical operations
Other examples:
□ delete of tuple, to undo insert of tuple
– allows early lock release on space-allocation information
□ subtract amount deposited, to undo deposit
– allows early lock release on bank balance
Physical Redo
Redo information is logged physically (that is, new value for each
write) even for operations with logical undo
Logical redo is very complicated since database state on disk may not be “operation
consistent” when recovery starts
Physical redo logging does not conflict with early lock release
Operation Logging
Operation logging is done as follows:
1. When operation starts, log <Ti, Oj, operation-begin>. Here Oj is a unique identifier of the
operation instance.
2. While operation is executing, normal log records with physical redo and physical undo
information are logged.
3. When the operation completes, <Ti, Oj, operation-end, U> is logged, where U contains
the information needed to perform a logical undo.
Example: insert of (key, record-id) pair (K5, RID7) into index I9
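The three logging steps above, applied to the (K5, RID7) insert into index I9, can be sketched as follows. The record layout and helper name are illustrative; U is the logical-undo operation, here a delete of the inserted pair.

```python
log = []

def run_operation(txn, op_id, physical_steps, undo_op):
    # step 1: mark the start of the operation instance
    log.append((txn, op_id, "operation-begin"))
    # step 2: normal physical redo/undo records while the operation runs
    for item, old, new in physical_steps:
        log.append((txn, item, old, new))
    # step 3: operation-end record carrying the logical undo info U
    log.append((txn, op_id, "operation-end", undo_op))

run_operation("Ti", "O1",
              [("I9-leaf", "old-bytes", "new-bytes")],   # physical record(s)
              ("delete", "I9", "K5", "RID7"))            # logical undo info U

assert log[0] == ("Ti", "O1", "operation-begin")
assert log[-1][-1] == ("delete", "I9", "K5", "RID7")
```

During rollback, if the operation-end record is found, recovery executes the logical undo (the delete) instead of restoring the old bytes, which is safe even after the index locks were released early.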