Unit-IV:
Transactions
By Dr. Jagtap
Ref. Database Concepts by Korth
ACID Properties
A transaction is a unit of program execution that
accesses and possibly updates various data items.
To preserve the integrity of data the database
system must ensure:
• Atomicity. Either all operations of the transaction are properly
reflected in the database or none are.
• Consistency. Execution of a transaction in isolation preserves
the consistency of the database.
• Isolation. Although multiple transactions may execute
concurrently, each transaction must be unaware of other
concurrently executing transactions. Intermediate transaction
results must be hidden from other concurrently executed
transactions.
– That is, for every pair of transactions Ti and Tj, it appears to Ti that
either Tj finished execution before Ti started, or Tj started
execution after Ti finished.
• Durability. After a transaction completes successfully, the
changes it has made to the database persist, even if there are
system failures.
Transaction State
• Active – the initial state; the transaction stays in this state while it
is executing
• Partially committed – after the final statement has been executed.
• Failed -- after the discovery that normal execution can no longer
proceed.
• Aborted – after the transaction has been rolled back and the
database restored to its state prior to the start of the transaction.
Two options after it has been aborted:
– restart the transaction
• can be done only if there was no internal logical error
– kill the transaction
• Committed – after successful completion.
Transaction State (Cont.)
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the
system. Advantages are:
– increased processor and disk utilization, leading to better
transaction throughput
• E.g. one transaction can be using the CPU while another is
reading from or writing to the disk
– reduced average response time for transactions: short
transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve isolation
– that is, to control the interaction among the concurrent
transactions in order to prevent them from destroying the
consistency of the database
Schedules
• Schedule – a sequence of instructions that specifies the
chronological order in which instructions of concurrent
transactions are executed
– a schedule for a set of transactions must consist of all
instructions of those transactions
– must preserve the order in which the instructions
appear in each individual transaction.
• A transaction that successfully completes its execution
will have a commit instruction as the last statement
– by default, a transaction is assumed to execute a commit
instruction as its last step
• A transaction that fails to successfully complete its
execution will have an abort instruction as the last
statement
Schedule 1
• Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance
from A to B.
• A serial schedule in which T1 is followed by T2 :
Schedule 2
• A serial schedule where T2 is followed by T1
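The two serial schedules can be simulated with a short sketch. The starting balances A = 1000 and B = 2000 are assumed for illustration and are not given on the slides; integer arithmetic is used so that 10% is exact.

```python
def t1(db):
    """T1: transfer $50 from A to B."""
    a = db["A"]      # read(A)
    a -= 50
    db["A"] = a      # write(A)
    b = db["B"]      # read(B)
    b += 50
    db["B"] = b      # write(B)


def t2(db):
    """T2: transfer 10% of the balance of A from A to B."""
    a = db["A"]      # read(A)
    temp = a // 10   # 10% of A (integer balances assumed)
    db["A"] = a - temp
    b = db["B"]      # read(B)
    db["B"] = b + temp


def run_serial(order, initial):
    """Run the transactions one after another on a copy of the database."""
    db = dict(initial)
    for txn in order:
        txn(db)
    return db


initial = {"A": 1000, "B": 2000}            # assumed starting balances
schedule_1 = run_serial([t1, t2], initial)  # T1 followed by T2
schedule_2 = run_serial([t2, t1], initial)  # T2 followed by T1
print(schedule_1)  # {'A': 855, 'B': 2145}
print(schedule_2)  # {'A': 850, 'B': 2150}
```

The two serial orders give different final balances, but both preserve the sum A + B, which is the consistency requirement here.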
Serializability
• Basic Assumption – Each transaction preserves database
consistency.
• Thus serial execution of a set of transactions preserves
database consistency.
• A (possibly concurrent) schedule is serializable if it is
equivalent to a serial schedule. Different forms of schedule
equivalence give rise to the notions of:
1. conflict serializability
2. view serializability
Conflicting Instructions
• Instructions li and lj of transactions Ti and Tj respectively, conflict
if and only if there exists some item Q accessed by both li and lj,
and at least one of these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.
• A conflict between li and lj forces a (logical) temporal order
between them.
– If li and lj are consecutive in a schedule and they do not
conflict, their results would remain the same even if they
had been interchanged in the schedule.
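The conflict rule can be written as a small predicate. This is a minimal sketch; the `Op` record and its field names are illustrative and not part of the slides.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: int     # i in Ti
    action: str  # "R" for read, "W" for write
    item: str    # data item, e.g. "Q"


def conflicts(a: Op, b: Op) -> bool:
    """Two instructions conflict iff they belong to different transactions,
    access the same item, and at least one of them is a write."""
    return (a.txn != b.txn
            and a.item == b.item
            and "W" in (a.action, b.action))


print(conflicts(Op(1, "R", "Q"), Op(2, "R", "Q")))  # False
print(conflicts(Op(1, "R", "Q"), Op(2, "W", "Q")))  # True
```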
Conflict Serializability
• If a schedule S can be transformed into a schedule S´ by a
series of swaps of non-conflicting instructions, we say
that S and S´ are conflict equivalent.
• We say that a schedule S is conflict serializable if it is
conflict equivalent to a serial schedule
Conflict Serializability (Cont.)
• Schedule 3 can be transformed into Schedule 6, a serial schedule
where T2 follows T1, by a series of swaps of non-conflicting
instructions. Therefore Schedule 3 is conflict serializable.
[Figures: Schedule 3, Schedule 6]
Second Method using Precedence Graph
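The precedence-graph test can be sketched as follows: add an edge Ti → Tj for every pair of conflicting instructions in which Ti's instruction comes first, then look for a cycle. The schedule notation `R1(A)`, `W2(A)` follows the examples in these slides; the helper names `parse`, `precedence_edges` and `has_cycle` are illustrative.

```python
from itertools import combinations


def parse(schedule):
    """Turn 'R1(A) , W2(A)' into [('R', 1, 'A'), ('W', 2, 'A')]."""
    ops = []
    for tok in schedule.replace(",", " ").split():
        txn, item = tok[1:].rstrip(")").split("(")
        ops.append((tok[0], int(txn), item))
    return ops


def precedence_edges(ops):
    """Edge (i, j) for each conflicting pair where Ti's operation comes first."""
    edges = set()
    # combinations() keeps schedule order, so the first op of each pair is earlier.
    for (a1, t1, x1), (a2, t2, x2) in combinations(ops, 2):
        if t1 != t2 and x1 == x2 and "W" in (a1, a2):
            edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "visiting" or "done"

    def visit(u):
        state[u] = "visiting"
        for v in graph[u]:
            if state.get(v) == "visiting":
                return True
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(visit(u) for u in graph if u not in state)


ops = parse("R1(A) , W2(A) , R3(A) , W1(A) , W3(A)")
print(has_cycle(precedence_edges(ops)))  # True: not conflict serializable
```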
View Serializability-
If a given schedule is view equivalent to
some serial schedule, then it is called a view
serializable schedule.
Thumb Rule 1
“Initial readers must be the same for all the data items.”
For each data item X, if transaction Ti reads X from the database initially in schedule S1,
then in schedule S2 also, Ti must perform the initial read of X from the database.
Thumb Rule 2
“Write-read sequence must be the same.”
If transaction Ti reads a data item that has been updated by transaction Tj in schedule S1,
then in schedule S2 also, transaction Ti must read the same data item as updated by
transaction Tj.
Thumb Rule 3
“Final writers must be the same for all the data items.”
For each data item X, if X is last updated by transaction Ti in schedule S1, then
in schedule S2 also, X must be last updated by transaction Ti.
How to check whether a schedule is
view serializable
Method 1: All conflict serializable schedules are
view serializable.
A view serializable schedule may or may not be
conflict serializable.
Method 2: If the schedule is not conflict serializable and
contains no blind write, it is not view serializable.
Method 3: By using the above three conditions,
write all the dependencies. Then, draw a graph
using those dependencies. If there exists no cycle
in the graph, then the schedule is view serializable,
otherwise not.
Irrecoverable Schedules-
If in a schedule,
a transaction performs a dirty read from an
uncommitted transaction and commits before the
transaction from which it has read the value, then
such a schedule is known as an irrecoverable
schedule.
W1(B) , W2(B) (T1 → T2)
W1(B) , W3(B) (T1 → T3)
W1(B) , W4(B) (T1 → T4)
W2(B) , W3(B) (T2 → T3)
W2(B) , W4(B) (T2 → T4)
W3(B) , W4(B) (T3 → T4)
The given schedule S is conflict serializable.
Hence we conclude that the given schedule is also view serializable.
R1(A) , W3(A) (T1 → T3)
R2(A) , W3(A) (T2 → T3)
R2(A) , W1(A) (T2 → T1)
W3(A) , W1(A) (T3 → T1)
There exists a cycle in the precedence graph.
Therefore, the given schedule S is not conflict serializable.
There exists a blind write W3(A) in the given schedule S.
T1 performs the initial read of A and T3 is the first to update A.
So, T1 must execute before T3.
Thus, we get the dependency T1 → T3.
The final update on A is made by transaction T1.
So, T1 must execute after all other transactions.
Thus, we get the dependency (T2, T3) → T1.
There exists no write-read sequence.
The dependencies T1 → T3 and T3 → T1 form a cycle. Therefore, the given
schedule S is not view serializable.
Check whether the given schedule S is view serializable or not. If
yes, then give the serial schedule.
S : R1(A) , W2(A) , R3(A) , W1(A) , W3(A)
R1(A) , W2(A) (T1 → T2)
R1(A) , W3(A) (T1 → T3)
W2(A) , R3(A) (T2 → T3)
W2(A) , W1(A) (T2 → T1)
W2(A) , W3(A) (T2 → T3)
R3(A) , W1(A) (T3 → T1)
W1(A) , W3(A) (T1 → T3)
There exists a cycle in the precedence graph.
Therefore, the given schedule S is not conflict serializable.
There exists a blind write W2(A) in the given schedule S.
T1 performs the initial read of A and T2 is the first to update A.
So, T1 must execute before T2.
Thus, we get the dependency T1 → T2.
The final update on A is made by transaction T3.
So, T3 must execute after all other transactions.
Thus, we get the dependency (T1, T2) → T3.
From the write-read sequence,
we get the dependency T2 → T3.
There exists no cycle in the dependency graph. Therefore, the given
schedule S is view serializable.
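For small schedules, the three thumb rules can also be checked by brute force: try every serial order and keep those with the same initial reads, the same write-read pairs and the same final writers. This is a sketch only, exponential in the number of transactions; the function names are illustrative.

```python
from itertools import permutations


def view_profile(ops):
    """Summarise what the three thumb rules compare.

    reads_from maps (txn, k) -> (item, writer) for the k-th read of txn,
    where writer is None for an initial read from the database.
    final_writer maps item -> the transaction that wrote it last.
    """
    final_writer = {}
    reads_from = {}
    reads_seen = {}
    for action, txn, item in ops:
        if action == "R":
            k = reads_seen.get(txn, 0)
            reads_seen[txn] = k + 1
            reads_from[(txn, k)] = (item, final_writer.get(item))
        else:
            final_writer[item] = txn
    return reads_from, final_writer


def view_serial_orders(ops):
    """Return every serial order that is view equivalent to the schedule."""
    txns = sorted({txn for _, txn, _ in ops})
    target = view_profile(ops)
    orders = []
    for order in permutations(txns):
        serial = [op for t in order for op in ops if op[1] == t]
        if view_profile(serial) == target:
            orders.append(order)
    return orders


# S : R1(A) , W2(A) , R3(A) , W1(A) , W3(A)
S = [("R", 1, "A"), ("W", 2, "A"), ("R", 3, "A"), ("W", 1, "A"), ("W", 3, "A")]
print(view_serial_orders(S))  # [(1, 2, 3)]
```

For the schedule S above, the only view-equivalent serial order found is T1, T2, T3, which agrees with the dependency graph.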
Recoverable Schedules-
If in a schedule,
a transaction performs a dirty read from an
uncommitted transaction and its commit operation is delayed
until the uncommitted transaction either commits or rolls back,
then such a schedule is known as a recoverable schedule.
Cascading Schedule-
If in a schedule, failure of one transaction causes several other
dependent transactions to roll back or abort, then such a
schedule is called a cascading schedule; the effect is known as a
cascading rollback or cascading abort.
It wastes CPU time, since work already done must be undone.
Here,
Transaction T2 depends on transaction T1.
Transaction T3 depends on transaction T2.
Transaction T4 depends on transaction T3.
In this schedule,
The failure of transaction T1 causes the transaction T2 to rollback.
The rollback of transaction T2 causes the transaction T3 to rollback.
The rollback of transaction T3 causes the transaction T4 to rollback.
Such a rollback is called a cascading rollback.
Cascadeless Schedule-
If in a schedule, a transaction is not allowed to read a
data item until the last transaction that has written it
has committed or aborted, then such a schedule is
called a cascadeless schedule.
Strict Schedule-
If in a schedule, a transaction is allowed neither to read nor to write a
data item until the last transaction that has written it has committed or
aborted, then such a schedule is called a strict schedule.
In other words,
Strict schedule allows only committed read and write operations.
Clearly, strict schedule implements more restrictions than cascadeless
schedule.
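The four classes (irrecoverable, recoverable, cascadeless, strict) can be told apart by scanning a schedule once. This is a minimal sketch that assumes every transaction eventually commits (aborts are not modelled) and uses `("C", txn, None)` as an illustrative commit record.

```python
def classify(schedule):
    """schedule: list of (action, txn, item), action in {"R", "W", "C"}."""
    committed = set()
    last_writer = {}     # item -> transaction that wrote it most recently
    dirty_reads = set()  # (reader, writer) pairs where writer was uncommitted
    recoverable = cascadeless = strict = True
    for action, txn, item in schedule:
        if action == "C":
            # Committing before a transaction we read dirty data from.
            if any(writer not in committed
                   for reader, writer in dirty_reads if reader == txn):
                recoverable = False
            committed.add(txn)
            continue
        writer = last_writer.get(item)
        uncommitted = (writer is not None and writer != txn
                       and writer not in committed)
        if uncommitted:
            strict = False           # touched an item with an uncommitted writer
            if action == "R":
                cascadeless = False  # dirty read
                dirty_reads.add((txn, writer))
        if action == "W":
            last_writer[item] = txn
    if not recoverable:
        return "irrecoverable"
    if not cascadeless:
        return "recoverable"
    if not strict:
        return "cascadeless"
    return "strict"


print(classify([("W", 1, "A"), ("R", 2, "A"), ("C", 2, None), ("C", 1, None)]))
# irrecoverable
```

The function returns the most restrictive class the schedule satisfies, which reflects that strict schedules are cascadeless and cascadeless schedules are recoverable.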
Concurrency Control
• A database must provide a mechanism that will ensure that all
possible schedules are
– either conflict or view serializable, and
– are recoverable and preferably cascadeless
• A policy in which only one transaction can execute at a time
generates serial schedules, but provides a poor degree of
concurrency
– Are serial schedules recoverable/cascadeless?
• Testing a schedule for serializability after it has executed is a little
too late!
• Goal – to develop concurrency control protocols that will assure
serializability.
Concurrency Control vs. Serializability Tests
• Concurrency-control protocols allow concurrent schedules, but
ensure that the schedules are conflict/view serializable, and are
recoverable and cascadeless.
• Concurrency control protocols generally do not examine the
precedence graph as it is being created
– Instead a protocol imposes a discipline that avoids nonserializable
schedules.
– We study such protocols in Chapter 16.
• Different concurrency control protocols provide different tradeoffs
between the amount of concurrency they allow and the amount of
overhead that they incur.
• Tests for serializability help us understand why a concurrency
control protocol is correct.
Concurrency Control
• Lock-Based Protocols
• Timestamp-Based Protocols
Lock-Based Protocols
• A lock is a mechanism to control concurrent access to a data item
• Data items can be locked in two modes :
1. exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
• Lock requests are made to concurrency-control manager. Transaction can
proceed only after request is granted.
Lock-Based Protocols (Cont.)
• Lock-compatibility matrix
• A transaction may be granted a lock on an item if the requested
lock is compatible with locks already held on the item by other
transactions
• Any number of transactions can hold shared locks on an item,
– but if any transaction holds an exclusive lock on the item, no other
transaction may hold any lock on the item.
• If a lock cannot be granted, the requesting transaction is made
to wait till all incompatible locks held by other transactions have
been released. The lock is then granted.
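The compatibility rule described above (only S is compatible with S) can be sketched as a lookup table. The names `COMPATIBLE` and `can_grant` are illustrative.

```python
# (mode already held, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_by_others):
    """A lock is granted only if it is compatible with every lock
    already held on the item by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)


print(can_grant("S", ["S", "S"]))  # True
print(can_grant("X", ["S"]))       # False
```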
Lock-Based Protocols (Cont.)
• Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
• Locking as above is not sufficient to guarantee serializability — if A
and B get updated in-between the read of A and B, the displayed
sum would be wrong.
• A locking protocol is a set of rules followed by all transactions
while requesting and releasing locks. Locking protocols restrict the
set of possible schedules.
Pitfalls of Lock-Based Protocols
• Consider the partial schedule
• Neither T3 nor T4 can make progress— executing lock-S(B)
causes T4 to wait for T3 to release its lock on B, while executing
lock-X(A) causes T3 to wait for T4 to release its lock on A.
• Such a situation is called a deadlock.
– To handle a deadlock, one of T3 or T4 must be rolled back
and its locks released.
Pitfalls of Lock-Based Protocols (Cont.)
• The potential for deadlock exists in most locking protocols. Deadlocks
are a necessary evil.
• Starvation is also possible if concurrency control manager is badly
designed. For example:
– A transaction may be waiting for an X-lock on an item, while a
sequence of other transactions request and are granted an S-lock on
the same item.
– The same transaction is repeatedly rolled back due to deadlocks.
• Concurrency control manager can be designed to prevent starvation.
Implementation of Locking
• A lock manager can be implemented as a separate
process to which transactions send lock and unlock
requests
• The lock manager replies to a lock request by sending a
lock grant message (or a message asking the transaction
to roll back, in case of a deadlock)
• The requesting transaction waits until its request is
answered
• The lock manager maintains a data-structure called a
lock table to record granted locks and pending
requests
• The lock table is usually implemented as an in-memory
hash table indexed on the name of the data item being
locked
Lock Table
• Black rectangles indicate granted locks,
white ones indicate waiting requests
• Lock table also records the type of lock
granted or requested
• New request is added to the end of the
queue of requests for the data item, and
granted if it is compatible with all earlier
locks
• Unlock requests result in the request being
deleted, and later requests are checked to
see if they can now be granted
• If transaction aborts, all waiting or granted
requests of the transaction are deleted
– lock manager may keep a list of locks
held by each transaction, to implement
this efficiently
Deadlock Handling
• System is deadlocked if there is a set of transactions such that
every transaction in the set is waiting for another transaction
in the set.
• Deadlock prevention protocols ensure that the system will
never enter into a deadlock state. Some prevention strategies :
– Require that each transaction locks all its data items before it
begins execution (predeclaration).
– Impose partial ordering of all data items and require that a
transaction can lock data items only in the order specified by the
partial order (graph-based protocol).
More Deadlock Prevention Strategies
• Following schemes use transaction timestamps for
the sake of deadlock prevention alone.
• wait-die scheme — non-preemptive
– older transaction may wait for younger one to release
data item. Younger transactions never wait for older
ones; they are rolled back instead.
– a transaction may die several times before acquiring
needed data item
• wound-wait scheme — preemptive
– older transaction wounds (forces rollback of) younger
transaction instead of waiting for it. Younger
transactions may wait for older ones.
– may be fewer rollbacks than wait-die scheme.
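Both schemes reduce to one timestamp comparison when a transaction requests an item held by another. A minimal sketch, assuming a smaller timestamp means an older transaction; the function names and return strings are illustrative.

```python
def wait_die(requester_ts, holder_ts):
    """Non-preemptive: an older requester waits, a younger requester dies."""
    if requester_ts < holder_ts:
        return "requester waits"
    return "requester rolled back"


def wound_wait(requester_ts, holder_ts):
    """Preemptive: an older requester wounds the holder, a younger one waits."""
    if requester_ts < holder_ts:
        return "holder rolled back"
    return "requester waits"


# T5 (older, timestamp 5) requests an item held by T9 (younger, timestamp 9)
print(wait_die(5, 9))    # requester waits
print(wound_wait(5, 9))  # holder rolled back
```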
Deadlock prevention (Cont.)
• In both the wait-die and the wound-wait schemes, a rolled
back transaction is restarted with its original
timestamp. Older transactions thus have precedence
over newer ones, and starvation is hence avoided.
• Timeout-Based Schemes:
– a transaction waits for a lock only for a specified amount of
time. After that, the wait times out and the transaction is
rolled back.
– thus deadlocks are not possible
– simple to implement; but starvation is possible. Also
difficult to determine good value of the timeout interval.
Deadlock Detection
• Deadlocks can be described as a wait-for graph, which
consists of a pair G = (V,E),
– V is a set of vertices (all the transactions in the system)
– E is a set of edges; each element is an ordered pair Ti → Tj.
• If Ti → Tj is in E, then there is a directed edge from Ti to Tj,
implying that Ti is waiting for Tj to release a data item.
• When Ti requests a data item currently being held by Tj,
then the edge Ti → Tj is inserted in the wait-for graph. This
edge is removed only when Tj is no longer holding a data
item needed by Ti.
• The system is in a deadlock state if and only if the wait-for
graph has a cycle. Must invoke a deadlock-detection
algorithm periodically to look for cycles.
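Deadlock detection then amounts to finding a cycle in the wait-for graph. A minimal sketch using depth-first search; the graph is a dictionary from each transaction to the set of transactions it waits for, and the transaction names are made up for illustration.

```python
def find_deadlock(wait_for):
    """Return the transactions on one cycle of the wait-for graph, or None."""
    state = {}  # node -> "visiting" or "done"
    path = []   # current depth-first path

    def visit(u):
        state[u] = "visiting"
        path.append(u)
        for v in wait_for.get(u, ()):
            if state.get(v) == "visiting":
                return path[path.index(v):]  # cycle found
            if v not in state:
                cycle = visit(v)
                if cycle:
                    return cycle
        state[u] = "done"
        path.pop()
        return None

    for u in list(wait_for):
        if u not in state:
            cycle = visit(u)
            if cycle:
                return cycle
    return None


no_cycle = {"T1": {"T2", "T3"}, "T3": {"T2"}, "T2": {"T4"}}
with_cycle = {"T1": {"T2", "T3"}, "T3": {"T2"}, "T2": {"T4"}, "T4": {"T3"}}
print(find_deadlock(no_cycle))            # None
print(sorted(find_deadlock(with_cycle)))  # ['T2', 'T3', 'T4']
```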
Deadlock Detection (Cont.)
Wait-for graph without a cycle Wait-for graph with a cycle
Deadlock Recovery
• When deadlock is detected :
– Some transaction will have to be rolled back (made a
victim) to break the deadlock. Select as victim the
transaction that will incur minimum cost.
– Rollback -- determine how far to roll back transaction
• Total rollback: Abort the transaction and then restart it.
• More effective to roll back transaction only as far as
necessary to break deadlock.
– Starvation happens if same transaction is always
chosen as victim. Include the number of rollbacks in
the cost factor to avoid starvation
The Two-Phase Locking Protocol
• This is a protocol which ensures conflict-
serializable schedules.
• Phase 1: Growing Phase
–transaction may obtain locks
–transaction may not release locks
• Phase 2: Shrinking Phase
–transaction may release locks
–transaction may not obtain locks
Growing Phase: In this phase the transaction can only acquire
locks, but cannot release any lock. The transaction enters the
growing phase as soon as it acquires the first lock it wants.
From now on it has no option but to keep acquiring all the
locks it would need. It cannot release any lock at this phase
even if it has finished working with a locked data item.
Ultimately the transaction reaches a point where all the locks it
may need have been acquired. This point is called the Lock Point.
Shrinking Phase: After Lock Point has been reached, the
transaction enters the shrinking phase. In this phase the
transaction can only release locks, but cannot acquire any new
lock. The transaction enters the shrinking phase as soon as it
releases the first lock after crossing the Lock Point. From now
on it has no option but to keep releasing all the acquired locks.
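Whether one transaction's lock requests obey the two phases can be checked mechanically: once any lock has been released, no further lock may be acquired. A minimal sketch; the `("lock" | "unlock", item)` encoding is illustrative and ignores lock modes.

```python
def is_two_phase(actions):
    """actions: ('lock' | 'unlock', item) pairs in the order issued."""
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True   # the shrinking phase has begun
        elif kind == "lock" and shrinking:
            return False       # acquiring after a release breaks 2PL
    return True


# Lock/unlock order of the earlier T2 example: lock A, unlock A, lock B, unlock B
t2_actions = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(is_two_phase(t2_actions))  # False

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
print(is_two_phase(two_phase))   # True
```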
• Two-phase locking does not ensure freedom from deadlocks
• Strict Two Phase Locking Protocol
In this protocol, a transaction may release its shared
locks after the Lock Point has been reached, but it cannot
release any of its exclusive locks until the transaction
commits. This protocol helps in creating cascadeless
schedules.
• Rigorous Two Phase Locking Protocol
In this protocol, a transaction is not
allowed to release any lock (either shared or exclusive)
until it commits. This means that until the transaction
commits, other transactions might acquire a shared lock on
a data item on which the uncommitted transaction has a
shared lock, but cannot acquire any lock on a data item on
which the uncommitted transaction has an exclusive lock.
Timestamp-Based Protocols
A timestamp is a tag that can be attached to any transaction or any data
item, which denotes a specific time on which the transaction or data item
had been activated in any way. We, who use computers, must all be familiar
with the concepts of “Date Created” or “Last Modified” properties of files
and folders. Well, timestamps are things like that.
A timestamp can be implemented in two ways. The simplest one is to
directly assign the current value of the clock to the transaction or the data
item. The other policy is to attach the value of a logical counter that keeps
incrementing as new timestamps are required.
The timestamp of a transaction denotes the time when it was first
activated. The timestamp of a data item can be of the following two types:
W-timestamp(Q): the largest timestamp of any transaction that has
successfully written the data item Q.
R-timestamp(Q): the largest timestamp of any transaction that has
successfully read the data item Q.
These two timestamps are updated each time a successful read/write
operation is performed on the data item Q.
Timestamp-Based Protocols (Cont.)
• The timestamp ordering protocol ensures that any conflicting read
and write operations are executed in timestamp order.
• Suppose a transaction Ti issues a read(Q)
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q
that was already overwritten.
■ Hence, the read operation is rejected, and Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed,
and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)).
Timestamp-Based Protocols (Cont.)
• Suppose that transaction Ti issues write(Q).
1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was
needed previously, and the system assumed that that value would
never be produced.
■ Hence, the write operation is rejected, and Ti is rolled back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete
value of Q.
■ Hence, this write operation is rejected, and Ti is rolled back.
3. Otherwise, the write operation is executed, and W-timestamp(Q) is set
to TS(Ti).
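The read and write rules can be sketched as two functions over a data item that carries its R-timestamp and W-timestamp. This is an illustrative sketch, not a full scheduler: the `Item` and `Rollback` names are made up, timestamps start at 0, and the read test uses the strict comparison TS(Ti) < W-timestamp(Q).

```python
class Rollback(Exception):
    """Raised when the protocol rejects an operation."""


class Item:
    def __init__(self, value):
        self.value = value
        self.r_ts = 0  # R-timestamp(Q)
        self.w_ts = 0  # W-timestamp(Q)


def read(item, ts):
    if ts < item.w_ts:
        raise Rollback("value needed was already overwritten")
    item.r_ts = max(item.r_ts, ts)
    return item.value


def write(item, ts, value):
    if ts < item.r_ts:
        raise Rollback("a younger transaction already read the old value")
    if ts < item.w_ts:
        raise Rollback("attempt to write an obsolete value")
    item.value = value
    item.w_ts = ts


q = Item(10)
print(read(q, 5))      # 10, and R-timestamp(Q) becomes 5
write(q, 7, 20)        # accepted, W-timestamp(Q) becomes 7
try:
    write(q, 3, 99)    # TS = 3 < R-timestamp(Q) = 5
except Rollback as exc:
    print("rolled back:", exc)
```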
Correctness of Timestamp-Ordering Protocol
• The timestamp-ordering protocol guarantees
serializability since all the arcs in the precedence
graph are of the form:
(transaction with smaller timestamp) → (transaction with larger timestamp)
Thus, there will be no cycles in the precedence
graph
• Timestamp protocol ensures freedom from
deadlock as no transaction ever waits.
• But the schedule may not be cascade-free, and
may not even be recoverable.
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Basic Steps in Query Processing (Cont.)
• Parsing and translation
– translate the query into its internal form. This is then translated
into relational algebra.
– Parser checks syntax, verifies relations
• Evaluation
– The query-execution engine takes a query-evaluation plan,
executes that plan, and returns the answers to the query.
Basic Steps in Query Processing: Optimization
• A relational algebra expression may have many equivalent expressions
– E.g., σ_{salary<75000}(Π_{salary}(instructor)) is equivalent to
Π_{salary}(σ_{salary<75000}(instructor))
• Each relational algebra operation can be evaluated using one of several
different algorithms
– Correspondingly, a relational-algebra expression can be evaluated in
many ways.
• Annotated expression specifying detailed evaluation strategy is called an
evaluation-plan.
– E.g., can use an index on salary to find instructors with salary <
75000,
– or can perform complete relation scan and discard instructors with
salary ≥ 75000
Basic Steps: Optimization (Cont.)
• Query Optimization: Amongst all equivalent evaluation plans choose the
one with lowest cost.
– Cost is estimated using statistical information from the
database catalog
• e.g. number of tuples in each relation, size of tuples, etc.
Measures of Query Cost
• Cost is generally measured as total elapsed time for answering query
– Many factors contribute to time cost
• disk accesses, CPU, or even network communication
• Typically disk access is the predominant cost, and is also relatively easy
to estimate. Measured by taking into account
– Number of seeks * average-seek-cost
– Number of blocks read * average-block-read-cost
– Number of blocks written * average-block-write-cost
• Cost to write a block is greater than cost to read a block
– data is read back after being written to ensure that the
write was successful
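The disk-access estimate above is a weighted sum. A minimal sketch; the per-operation costs in the example (in microseconds) are made-up figures, not measurements.

```python
def disk_cost(seeks, blocks_read, blocks_written,
              avg_seek, avg_block_read, avg_block_write):
    """Estimated disk cost, in the same time unit as the three averages."""
    return (seeks * avg_seek
            + blocks_read * avg_block_read
            + blocks_written * avg_block_write)


# Hypothetical plan: 10 seeks, 500 blocks read, 50 blocks written,
# with assumed costs of 4000, 100 and 200 microseconds respectively.
print(disk_cost(10, 500, 50, 4000, 100, 200))  # 100000
```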
Equivalence Rules (Cont.)
The set operations union and intersection are commutative
E1 ∪ E2 = E2 ∪ E1
E1 ∩ E2 = E2 ∩ E1
(set difference is not commutative).
Set union and intersection are associative.
(E1 ∪ E2) ∪ E3 = E1 ∪ (E2 ∪ E3)
(E1 ∩ E2) ∩ E3 = E1 ∩ (E2 ∩ E3)
The selection operation distributes over ∪, ∩ and –.
σ_θ(E1 – E2) = σ_θ(E1) – σ_θ(E2)
and similarly for ∪ and ∩ in place of –
Also: σ_θ(E1 – E2) = σ_θ(E1) – E2
and similarly for ∩ in place of –, but not for ∪
The projection operation distributes over union
Π_L(E1 ∪ E2) = (Π_L(E1)) ∪ (Π_L(E2))
Query Optimization
• Introduction
• Transformation of Relational Expressions
Query Optimization
• Alternative ways of evaluating a given query
– Equivalent expressions
– Different algorithms for each operation
Performance Tuning
• Adjusting various parameters and design choices to improve system performance
for a specific application.
• Tuning is best done by
1. identifying bottlenecks, and
2. eliminating them.
• Can tune a database system at 3 levels:
– Hardware -- e.g., add disks to speed up I/O, add memory to increase buffer
hits, move to a faster processor.
– Database system parameters -- e.g., set buffer size to avoid paging of buffer,
set checkpointing intervals to limit log size. System may have automatic
tuning.
– Higher level database design, such as the schema, indices and transactions
(more later)
Tunable Parameters
• Tuning of hardware
• Tuning of schema
• Tuning of indices
• Tuning of materialized views
• Tuning of transactions
Tuning of Hardware
• Even well-tuned transactions typically require a few I/O operations
– Typical disk supports about 100 random I/O operations per second
– Suppose each transaction requires just 2 random I/O operations. Then to
support n transactions per second, we need to stripe data across n/50 disks
(ignoring skew)
• Number of I/O operations per transaction can be reduced by keeping more data in
memory
– If all data is in memory, I/O needed only for writes
– Keeping frequently used data in memory reduces disk accesses, reducing
number of disks required, but has a memory cost
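The n/50 estimate for the number of disks follows from dividing the required I/O rate by what one disk can deliver. A small sketch of that arithmetic, using the slide's figures as defaults and ignoring skew; the function name is illustrative.

```python
import math


def disks_needed(tps, ios_per_txn=2, iops_per_disk=100):
    """Disks needed to sustain `tps` transactions per second."""
    return math.ceil(tps * ios_per_txn / iops_per_disk)


print(disks_needed(1000))  # 20, i.e. n/50 for n = 1000
print(disks_needed(1020))  # 21, rounded up to a whole disk
```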
Tuning the Database Design
• Schema tuning
– Vertically partition relations to isolate the data that is accessed most often --
only fetch needed information.
• E.g., split account into two, (account-number, branch-name) and
(account-number, balance).
• Branch-name need not be fetched unless required
– Improve performance by storing a denormalized relation
• E.g., store join of account and depositor; branch-name and balance
information is repeated for each holder of an account, but join need not
be computed repeatedly.
• better to use materialized views
Tuning the Database Design (Cont.)
• Index tuning
– Create appropriate indices to speed up slow queries/updates
– Speed up slow updates by removing excess indices (tradeoff between queries
and updates)
– Choose type of index (B-tree/hash) appropriate for most frequent types of
queries.
– Choose which index to make clustered
Tuning the Database Design (Cont.)
Materialized Views
• Materialized views can help speed up certain queries
– Particularly aggregate queries
• Overheads
– Space
– Time for view maintenance
• Immediate view maintenance: done as part of the update transaction
– time overhead paid by update transaction
• Deferred view maintenance: done only when required
– update transaction is not affected, but system time is spent on
view maintenance
Tuning of Transactions
• Basic approaches to tuning of transactions
– Improve set orientation
– Reduce lock contention
• Rewriting of queries to improve performance was important in the past, but
smart optimizers have made this less important
• Communication overhead and query handling overheads are a significant part of
the cost of each call
– Combine multiple embedded SQL/ODBC/JDBC queries into a single
set-oriented query
• Set orientation -> fewer calls to database
• E.g. tune program that computes total salary for each department using
a separate SQL query by instead using a single query that computes
total salaries for all departments at once (using group by)
– Use stored procedures: avoids re-parsing and re-optimization
of query
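The difference between one query per department and a single set-oriented query can be seen with Python's built-in sqlite3 module. The `instructor` table, its columns and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT, dept_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [("Ann", "Physics", 90000), ("Bob", "Physics", 80000), ("Cy", "Music", 40000)],
)

# Untuned: one call to the database per department.
depts = [row[0] for row in conn.execute("SELECT DISTINCT dept_name FROM instructor")]
per_dept = {
    d: conn.execute(
        "SELECT SUM(salary) FROM instructor WHERE dept_name = ?", (d,)
    ).fetchone()[0]
    for d in depts
}

# Tuned: a single set-oriented query using GROUP BY.
grouped = dict(
    conn.execute("SELECT dept_name, SUM(salary) FROM instructor GROUP BY dept_name")
)

print(per_dept == grouped)  # True
```

Both versions return the same totals; the second makes one call instead of one call per department plus one.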
Tuning of Transactions (Cont.)
• Reducing lock contention
• Long transactions (typically read-only) that examine large parts of a relation
result in lock contention with update transactions
– E.g. large query to compute bank statistics and regular bank transactions
RECOVERY METHODS
Log-Based Recovery
The log is a sequence of records. The log of each transaction is
maintained in stable storage so that, if a failure occurs,
the database can be recovered from it.
Every operation performed on the database is
recorded in the log, and the log record must be written
before the corresponding change is applied to the
database.
Let's assume there is a transaction to modify the City of a student. The following logs are
written for this transaction.
When the transaction is initiated, then it writes 'start' log.
<Tn, Start>
When the transaction modifies the City from 'Noida' to 'Bangalore', then another log is
written to the file.
<Tn, City, 'Noida', 'Bangalore' >
When the transaction is finished, then it writes another log to indicate the end of the
transaction.
<Tn, Commit>
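Log records of this shape are enough to repair the database after a crash. The sketch below is one possible recovery pass, assuming changes may already have reached the database before commit: updates of transactions without a commit record are undone using the old value, then updates of committed transactions are redone using the new value. The tuple encoding and the `recover` function are illustrative.

```python
def recover(db, log):
    """Bring `db` to a consistent state using the log records."""
    committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "commit"}
    # Undo pass (backwards): restore old values of uncommitted transactions.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _txn, item, old, _new = rec
            db[item] = old
    # Redo pass (forwards): reapply new values of committed transactions.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _txn, item, _old, new = rec
            db[item] = new
    return db


full_log = [("Tn", "start"), ("Tn", "City", "Noida", "Bangalore"), ("Tn", "commit")]
# Crash after commit, before the new value reached the database:
print(recover({"City": "Noida"}, full_log))           # {'City': 'Bangalore'}
# Crash before commit, after the new value reached the database:
print(recover({"City": "Bangalore"}, full_log[:2]))   # {'City': 'Noida'}
```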
There are two approaches to modify the database:
Deferred database modification:
•In the deferred modification technique, the
transaction does not modify the database until it has
committed.
•All the log records are created and stored in
stable storage, and the database is updated when the
transaction commits.
Immediate database modification:
•In the immediate modification technique, database
modification occurs while the transaction is still active.
•The database is modified immediately after
every operation; each log record is followed by the actual
database modification.
SHADOW PAGING
Shadow paging is a recovery technique used to
recover the database. In this technique, the database is
considered to be made up of fixed-size logical units of storage
called pages. Pages are mapped onto physical
blocks of storage with the help of a page table, which has one
entry for each logical page of the database. This method uses two
page tables, named the current page table and the shadow page table.
CHECK POINTS
A checkpoint marks a point before which the DBMS was
in a consistent state and all transactions had
committed. Checkpoints are recorded while
transactions execute. After execution, the
transaction log files are created.
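Under the definition above, everything before the most recent checkpoint is already committed and consistent, so recovery only needs to look at the log records written after it. A minimal sketch, assuming a checkpoint is stored in the log as the hypothetical tuple `("checkpoint",)`:

```python
def records_after_last_checkpoint(log):
    """Return the part of the log that recovery still has to examine."""
    last = -1
    for index, record in enumerate(log):
        if record == ("checkpoint",):
            last = index
    return log[last + 1:]


full_log = [
    ("T1", "start"),
    ("T1", "write", "City", "Noida", "Bangalore"),
    ("T1", "commit"),
    ("checkpoint",),
    ("T2", "start"),
    ("T2", "write", "Grade", "B", "A"),
]
to_examine = records_after_last_checkpoint(full_log)
print(to_examine)
```

If the log contains no checkpoint, the whole log is returned.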

Unit-IV_transaction.pptx

  • 1. Unit-IV: Transactions By Dr. Jagtap Ref. Database Concepts by Korth
  • 2. ACID Properties A transaction is a unit of program execution that accesses and possibly updates various data items. To preserve the integrity of data the database system must ensure: • Atomicity. Either all operations of the transaction are properly reflected in the database or none are. • Consistency. Execution of a transaction in isolation preserves the consistency of the database. • Isolation. Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executed transactions. – That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj, finished execution before Ti started, or Tj started execution after Ti finished. • Durability. After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures. Ref. Database Concepts by Korth
  • 3. Transaction State • Active – the initial state; the transaction stays in this state while it is executing • Partially committed – after the final statement has been executed. • Failed -- after the discovery that normal execution can no longer proceed. • Aborted – after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. Two options after it has been aborted: – restart the transaction • can be done only if no internal logical error – kill the transaction • Committed – after successful completion. Ref. Database Concepts by Korth
  • 4. Transaction State (Cont.) Ref. Database Concepts by Korth
  • 5. Concurrent Executions • Multiple transactions are allowed to run concurrently in the system. Advantages are: – increased processor and disk utilization, leading to better transaction throughput • E.g. one transaction can be using the CPU while another is reading from or writing to the disk – reduced average response time for transactions: short transactions need not wait behind long ones. • Concurrency control schemes – mechanisms to achieve isolation – that is, to control the interaction among the concurrent transactions in order to prevent them from destroying the consistency of the database Ref. Database Concepts by Korth
  • 6. Schedules • Schedule – a sequence of instructions that specifies the chronological order in which instructions of concurrent transactions are executed – a schedule for a set of transactions must consist of all instructions of those transactions – must preserve the order in which the instructions appear in each individual transaction. • A transaction that successfully completes its execution will have a commit instruction as the last statement – by default a transaction is assumed to execute a commit instruction as its last step • A transaction that fails to successfully complete its execution will have an abort instruction as the last statement Ref. Database Concepts by Korth
  • 7. Schedule 1 • Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. • A serial schedule in which T1 is followed by T2 : Ref. Database Concepts by Korth
  • 8. Schedule 2 • A serial schedule where T2 is followed by T1 Ref. Database Concepts by Korth
  • 9. Serializability • Basic Assumption – Each transaction preserves database consistency. • Thus serial execution of a set of transactions preserves database consistency. • A (possibly concurrent) schedule is serializable if it is equivalent to a serial schedule. Different forms of schedule equivalence give rise to the notions of: 1. conflict serializability 2. view serializability Ref. Database Concepts by Korth
  • 10. Conflicting Instructions • Instructions li and lj of transactions Ti and Tj respectively, conflict if and only if there exists some item Q accessed by both li and lj, and at least one of these instructions wrote Q. 1. li = read(Q), lj = read(Q). li and lj don’t conflict. 2. li = read(Q), lj = write(Q). They conflict. 3. li = write(Q), lj = read(Q). They conflict. 4. li = write(Q), lj = write(Q). They conflict. • A conflict between li and lj forces a (logical) temporal order between them. – If li and lj are consecutive in a schedule and they do not conflict, their results would remain the same even if they had been interchanged in the schedule. Ref. Database Concepts by Korth
  • 11. Conflict Serializability • If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent. • We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule Ref. Database Concepts by Korth
  • 12. Conflict Serializability (Cont.) • Schedule 3 can be transformed into Schedule 6, a serial schedule where T2 follows T1, by series of swaps of non-conflicting instructions. Therefore Schedule 3 is conflict serializable. Schedule 3 Schedule 6 Ref. Database Concepts by Korth
  • 17. Second Method using Precedence Graph
  • 20. View Serializability- If a given schedule is found to be view equivalent to some serial schedule, then it is called as a view serializable schedule. Thumb Rule 1 “Initial readers must be same for all the data items”. For each data item X, if transaction Ti reads X from the database initially in schedule S1, then in schedule S2 also, Ti must perform the initial read of X from the database. Thumb Rule 2 “Write-read sequence must be same.”. If transaction Ti reads a data item that has been updated by the transaction Tj in schedule S1, then in schedule S2 also, transaction Ti must read the same data item that has been updated by the transaction Tj. Thumb Rule 3 “Final writers must be same for all the data items”. For each data item X, if X has been updated at last by transaction Ti in schedule S1, then in schedule S2 also, X must be updated at last by transaction Ti.
  • 21. How to check whether a schedule is view serializable Method 1 : All conflict serializable schedules are view serializable. A view serializable schedule may or may not be conflict serializable. Method 2 : If a schedule is not conflict serializable and contains no blind write, then it is not view serializable. Method 3: By using the above three conditions, write all the dependencies. Then, draw a graph using those dependencies. If there exists no cycle in the graph, then the schedule is view serializable, otherwise not.
  • 22. Irrecoverable Schedules- If in a schedule, A transaction performs a dirty read operation from an uncommitted transaction And commits before the transaction from which it has read the value then such a schedule is known as an Irrecoverable Schedule.
  • 24. W1(B) , W2(B) (T1 → T2) W1(B) , W3(B) (T1 → T3) W1(B) , W4(B) (T1 → T4) W2(B) , W3(B) (T2 → T3) W2(B) , W4(B) (T2 → T4) W3(B) , W4(B) (T3 → T4) The given schedule S is conflict serializable. Hence we conclude that the given schedule is also view serializable.
  • 26. R1(A) , W3(A) (T1 → T3) R2(A) , W3(A) (T2 → T3) R2(A) , W1(A) (T2 → T1) W3(A) , W1(A) (T3 → T1) There exists a cycle in the precedence graph. Therefore, the given schedule S is not conflict serializable. There exists a blind write W3 (A) in the given schedule S.
  • 27. T1 firstly reads A and T3 firstly updates A. So, T1 must execute before T3. Thus, we get the dependency T1 → T3. Final updation on A is made by the transaction T1. So, T1 must execute after all other transactions. Thus, we get the dependency (T2, T3) → T1. There exists no write-read sequence.
  • 28. Check whether the given schedule S is view serializable or not. If yes, then give the serial schedule. S : R1(A) , W2(A) , R3(A) , W1(A) , W3(A)
  • 29. R1(A) , W2(A) (T1 → T2) R1(A) , W3(A) (T1 → T3) W2(A) , R3(A) (T2 → T3) W2(A) , W1(A) (T2 → T1) W2(A) , W3(A) (T2 → T3) R3(A) , W1(A) (T3 → T1) W1(A) , W3(A) (T1 → T3) There exists a cycle in the precedence graph. Therefore, the given schedule S is not conflict serializable. There exists a blind write W2 (A) in the given schedule S.
  • 30. T1 firstly reads A and T2 firstly updates A. So, T1 must execute before T2. Thus, we get the dependency T1 → T2. Final updation on A is made by the transaction T3. So, T3 must execute after all other transactions. Thus, we get the dependency (T1, T2) → T3. From write-read sequence, we get the dependency T2 → T3 There exists no cycle in the dependency graph. Therefore, the given schedule S is view serializable.
  • 31. Recoverable Schedules- If, in a schedule, a transaction performs a dirty read operation from an uncommitted transaction and its commit operation is delayed until the uncommitted transaction either commits or rolls back, then such a schedule is known as a Recoverable Schedule.
  • 33. Cascading Schedule- If in a schedule, failure of one transaction causes several other dependent transactions to rollback or abort, then such a schedule is called as a Cascading Schedule or Cascading Rollback or Cascading Abort. It simply leads to the wastage of CPU time.
  • 34. Here, Transaction T2 depends on transaction T1. Transaction T3 depends on transaction T2. Transaction T4 depends on transaction T3. In this schedule, The failure of transaction T1 causes the transaction T2 to rollback. The rollback of transaction T2 causes the transaction T3 to rollback. The rollback of transaction T3 causes the transaction T4 to rollback. Such a rollback is called as a Cascading Rollback.
  • 35. Cascadeless Schedule- If in a schedule, a transaction is not allowed to read a data item until the last transaction that has written it is committed or aborted, then such a schedule is called as a Cascadeless Schedule.
  • 37. Strict Schedule- If in a schedule, a transaction is neither allowed to read nor write a data item until the last transaction that has written it is committed or aborted, then such a schedule is called as a Strict Schedule. In other words, Strict schedule allows only committed read and write operations. Clearly, strict schedule implements more restrictions than cascadeless schedule.
  • 38. Concurrency Control • A database must provide a mechanism that will ensure that all possible schedules are – either conflict or view serializable, and – are recoverable and preferably cascadeless • A policy in which only one transaction can execute at a time generates serial schedules, but provides a poor degree of concurrency – Are serial schedules recoverable/cascadeless? • Testing a schedule for serializability after it has executed is a little too late! • Goal – to develop concurrency control protocols that will assure serializability. Ref. Database Concepts by Korth
  • 39. Concurrency Control vs. Serializability Tests • Concurrency-control protocols allow concurrent schedules, but ensure that the schedules are conflict/view serializable, and are recoverable and cascadeless . • Concurrency control protocols generally do not examine the precedence graph as it is being created – Instead a protocol imposes a discipline that avoids nonseralizable schedules. – We study such protocols in Chapter 16. • Different concurrency control protocols provide different tradeoffs between the amount of concurrency they allow and the amount of overhead that they incur. • Tests for serializability help us understand why a concurrency control protocol is correct. Ref. Database Concepts by Korth
  • 40. Ref. Database Concepts by Korth Concurrency Control • Lock-Based Protocols • Timestamp-Based Protocols
  • 41. Lock-Based Protocols • A lock is a mechanism to control concurrent access to a data item • Data items can be locked in two modes : 1. exclusive (X) mode. Data item can be both read as well as written. X-lock is requested using lock-X instruction. 2. shared (S) mode. Data item can only be read. S-lock is requested using lock-S instruction. • Lock requests are made to concurrency-control manager. Transaction can proceed only after request is granted. Ref. Database Concepts by Korth
  • 42. Lock-Based Protocols (Cont.) • Lock-compatibility matrix Ref. Database Concepts by Korth • A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions • Any number of transactions can hold shared locks on an item, – but if any transaction holds an exclusive on the item no other transaction may hold any lock on the item. • If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held by other transactions have been released. The lock is then granted.
  • 43. Lock-Based Protocols (Cont.) • Example of a transaction performing locking: T2: lock-S(A); read (A); unlock(A); lock-S(B); read (B); unlock(B); display(A+B) • Locking as above is not sufficient to guarantee serializability — if A and B get updated in-between the read of A and B, the displayed sum would be wrong. • A locking protocol is a set of rules followed by all transactions while requesting and releasing locks. Locking protocols restrict the set of possible schedules. Ref. Database Concepts by Korth
  • 44. Pitfalls of Lock-Based Protocols • Consider the partial schedule • Neither T3 nor T4 can make progress: executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A. • Such a situation is called a deadlock. – To handle a deadlock one of T3 or T4 must be rolled back and its locks released. Ref. Database Concepts by Korth
  • 45. Pitfalls of Lock-Based Protocols (Cont.) • The potential for deadlock exists in most locking protocols. Deadlocks are a necessary evil. • Starvation is also possible if concurrency control manager is badly designed. For example: – A transaction may be waiting for an X-lock on an item, while a sequence of other transactions request and are granted an S-lock on the same item. – The same transaction is repeatedly rolled back due to deadlocks. • Concurrency control manager can be designed to prevent starvation. Ref. Database Concepts by Korth
  • 46. Implementation of Locking • A lock manager can be implemented as a separate process to which transactions send lock and unlock requests • The lock manager replies to a lock request by sending a lock grant message (or a message asking the transaction to roll back, in case of a deadlock) • The requesting transaction waits until its request is answered • The lock manager maintains a data-structure called a lock table to record granted locks and pending requests • The lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked Ref. Database Concepts by Korth
  • 47. Lock Table [Figure: lock table showing granted and waiting requests] • Black rectangles indicate granted locks, white ones indicate waiting requests • Lock table also records the type of lock granted or requested • New request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks • Unlock requests result in the request being deleted, and later requests are checked to see if they can now be granted • If transaction aborts, all waiting or granted requests of the transaction are deleted – lock manager may keep a list of locks held by each transaction, to implement this efficiently Ref. Database Concepts by Korth
  • 48. Deadlock Handling • System is deadlocked if there is a set of transactions such that every transaction in the set is waiting for another transaction in the set. • Deadlock prevention protocols ensure that the system will never enter into a deadlock state. Some prevention strategies : – Require that each transaction locks all its data items before it begins execution (predeclaration). – Impose partial ordering of all data items and require that a transaction can lock data items only in the order specified by the partial order (graph-based protocol). Ref. Database Concepts by Korth
  • 49. More Deadlock Prevention Strategies • Following schemes use transaction timestamps for the sake of deadlock prevention alone. • wait-die scheme — non-preemptive – older transaction may wait for younger one to release data item. Younger transactions never wait for older ones; they are rolled back instead. – a transaction may die several times before acquiring needed data item • wound-wait scheme — preemptive – older transaction wounds (forces rollback) of younger transaction instead of waiting for it. Younger transactions may wait for older ones. – may be fewer rollbacks than wait-die scheme. Ref. Database Concepts by Korth
  • 50. Deadlock prevention (Cont.) • Both in wait-die and in wound-wait schemes, a rolled back transaction is restarted with its original timestamp. Older transactions thus have precedence over newer ones, and starvation is hence avoided. • Timeout-Based Schemes: – a transaction waits for a lock only for a specified amount of time. After that, the wait times out and the transaction is rolled back. – thus deadlocks are not possible – simple to implement; but starvation is possible. Also difficult to determine good value of the timeout interval. Ref. Database Concepts by Korth
  • 51. Deadlock Detection • Deadlocks can be described as a wait-for graph, which consists of a pair G = (V,E), – V is a set of vertices (all the transactions in the system) – E is a set of edges; each element is an ordered pair Ti → Tj. • If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting for Tj to release a data item. • When Ti requests a data item currently being held by Tj, then the edge Ti → Tj is inserted in the wait-for graph. This edge is removed only when Tj is no longer holding a data item needed by Ti. • The system is in a deadlock state if and only if the wait-for graph has a cycle. Must invoke a deadlock-detection algorithm periodically to look for cycles. Ref. Database Concepts by Korth
  • 52. Deadlock Detection (Cont.) Wait-for graph without a cycle Wait-for graph with a cycle Ref. Database Concepts by Korth
  • 53. Deadlock Recovery • When deadlock is detected : – Some transaction will have to be rolled back (made a victim) to break deadlock. Select as victim the transaction that will incur minimum cost. – Rollback -- determine how far to roll back transaction • Total rollback: Abort the transaction and then restart it. • More effective to roll back transaction only as far as necessary to break deadlock. – Starvation happens if same transaction is always chosen as victim. Include the number of rollbacks in the cost factor to avoid starvation Ref. Database Concepts by Korth
  • 54. The Two-Phase Locking Protocol • This is a protocol which ensures conflict- serializable schedules. • Phase 1: Growing Phase –transaction may obtain locks –transaction may not release locks • Phase 2: Shrinking Phase –transaction may release locks –transaction may not obtain locks Ref. Database Concepts by Korth
  • 55. Growing Phase: In this phase the transaction can only acquire locks, but cannot release any lock. The transaction enters the growing phase as soon as it acquires the first lock it wants. From now on it has no option but to keep acquiring all the locks it would need. It cannot release any lock at this phase even if it has finished working with a locked data item. Ultimately the transaction reaches a point where all the locks it may need have been acquired. This point is called the Lock Point. Shrinking Phase: After the Lock Point has been reached, the transaction enters the shrinking phase. In this phase the transaction can only release locks, but cannot acquire any new lock. The transaction enters the shrinking phase as soon as it releases the first lock after crossing the Lock Point. From now on it has no option but to keep releasing all the acquired locks.
  • 56. • Two-phase locking does not ensure freedom from deadlocks • Strict Two Phase Locking Protocol: In this protocol, a transaction may release all the shared locks after the Lock Point has been reached, but it cannot release any of the exclusive locks until the transaction commits. This protocol helps in creating cascadeless schedules. • Rigorous Two Phase Locking Protocol: In this protocol, a transaction is not allowed to release any lock (either shared or exclusive) until it commits. This means that until the transaction commits, other transactions might acquire a shared lock on a data item on which the uncommitted transaction has a shared lock, but cannot acquire any lock on a data item on which the uncommitted transaction has an exclusive lock. Ref. Database Concepts by Korth
  • 57. Timestamp-Based Protocols A timestamp is a tag that can be attached to any transaction or any data item, which denotes a specific time on which the transaction or data item had been activated in any way. We, who use computers, must all be familiar with the concepts of “Date Created” or “Last Modified” properties of files and folders. Well, timestamps are things like that. A timestamp can be implemented in two ways. The simplest one is to directly assign the current value of the clock to the transaction or the data item. The other policy is to attach the value of a logical counter that keeps incrementing as new timestamps are required. The timestamp of a transaction denotes the time when it was first activated. The timestamp of a data item can be of the following two types: W-timestamp (Q): This means the latest time when the data item Q has been written into. R-timestamp (Q): This means the latest time when the data item Q has been read from. These two timestamps are updated each time a successful read/write operation is performed on the data item Q. Ref. Database Concepts by Korth
  • 58. Timestamp-Based Protocols (Cont.) • The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order. • Suppose a transaction Ti issues a read(Q) 1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten. ■ Hence, the read operation is rejected, and Ti is rolled back. 2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)). Ref. Database Concepts by Korth
  • 59. Timestamp-Based Protocols (Cont.) • Suppose that transaction Ti issues write(Q). 1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the system assumed that that value would never be produced. ■ Hence, the write operation is rejected, and Ti is rolled back. 2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. ■ Hence, this write operation is rejected, and Ti is rolled back. 3. Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti). Ref. Database Concepts by Korth
  • 60. Correctness of Timestamp-Ordering Protocol • The timestamp-ordering protocol guarantees serializability since all the arcs in the precedence graph are of the form: transaction with smaller timestamp → transaction with larger timestamp. Thus, there will be no cycles in the precedence graph • Timestamp protocol ensures freedom from deadlock as no transaction ever waits. • But the schedule may not be cascade-free, and may not even be recoverable. Ref. Database Concepts by Korth
  • 61. Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation Ref. Database Concepts by Korth
  • 62. Ref. Database Concepts by Korth Basic Steps in Query Processing (Cont.) • Parsing and translation – translate the query into its internal form. This is then translated into relational algebra. – Parser checks syntax, verifies relations • Evaluation – The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.
  • 63. Basic Steps in Query Processing : Optimization • A relational algebra expression may have many equivalent expressions – E.g., σ_salary<75000(Π_salary(instructor)) is equivalent to Π_salary(σ_salary<75000(instructor)) • Each relational algebra operation can be evaluated using one of several different algorithms – Correspondingly, a relational-algebra expression can be evaluated in many ways. • Annotated expression specifying detailed evaluation strategy is called an evaluation-plan. – E.g., can use an index on salary to find instructors with salary < 75000, – or can perform complete relation scan and discard instructors with salary ≥ 75000 Ref. Database Concepts by Korth
  • 64. Basic Steps: Optimization (Cont.) • Query Optimization: Amongst all equivalent evaluation plans choose the one with lowest cost. – Cost is estimated using statistical information from the database catalog • e.g. number of tuples in each relation, size of tuples, etc. Ref. Database Concepts by Korth
  • 65. Measures of Query Cost • Cost is generally measured as total elapsed time for answering query – Many factors contribute to time cost • disk accesses, CPU, or even network communication • Typically disk access is the predominant cost, and is also relatively easy to estimate. Measured by taking into account – Number of seeks * average-seek-cost – Number of blocks read * average-block-read-cost – Number of blocks written * average-block-write-cost • Cost to write a block is greater than cost to read a block – data is read back after being written to ensure that the write was successful Ref. Database Concepts by Korth
  • 66. Equivalence Rules (Cont.) The set operations union and intersection are commutative: E1 ∪ E2 = E2 ∪ E1 and E1 ∩ E2 = E2 ∩ E1 (set difference is not commutative). Set union and intersection are associative: (E1 ∪ E2) ∪ E3 = E1 ∪ (E2 ∪ E3) and (E1 ∩ E2) ∩ E3 = E1 ∩ (E2 ∩ E3). The selection operation distributes over ∪, ∩ and –: σ_θ(E1 – E2) = σ_θ(E1) – σ_θ(E2), and similarly for ∪ and ∩ in place of –. Also: σ_θ(E1 – E2) = σ_θ(E1) – E2, and similarly for ∩ in place of –, but not for ∪. The projection operation distributes over union: Π_L(E1 ∪ E2) = (Π_L(E1)) ∪ (Π_L(E2))
  • 67. Ref. Database Concepts by Korth Query Optimization • Introduction • Transformation of Relational Expressions
  • 68. Query Optimization • Alternative ways of evaluating a given query – Equivalent expressions – Different algorithms for each operation Ref. Database Concepts by Korth
  • 71. Ref. Database Concepts by Korth Performance Tuning • Adjusting various parameters and design choices to improve system performance for a specific application. • Tuning is best done by 1. identifying bottlenecks, and 2. eliminating them. • Can tune a database system at 3 levels: – Hardware -- e.g., add disks to speed up I/O, add memory to increase buffer hits, move to a faster processor. – Database system parameters -- e.g., set buffer size to avoid paging of buffer, set checkpointing intervals to limit log size. System may have automatic tuning. – Higher level database design, such as the schema, indices and transactions (more later)
  • 72. Ref. Database Concepts by Korth Tunable Parameters • Tuning of hardware • Tuning of schema • Tuning of indices • Tuning of materialized views • Tuning of transactions
  • 73. Tuning of Hardware • Even well-tuned transactions typically require a few I/O operations – Typical disk supports about 100 random I/O operations per second – Suppose each transaction requires just 2 random I/O operations. Then to support n transactions per second, we need to stripe data across n/50 disks (ignoring skew) • Number of I/O operations per transaction can be reduced by keeping more data in memory – If all data is in memory, I/O needed only for writes – Keeping frequently used data in memory reduces disk accesses, reducing number of disks required, but has a memory cost Ref. Database Concepts by Korth
  • 74. Ref. Database Concepts by Korth Tuning the Database Design • Schema tuning – Vertically partition relations to isolate the data that is accessed most often -- only fetch needed information. • E.g., split account into two, (account-number, branch-name) and (account-number, balance). • Branch-name need not be fetched unless required – Improve performance by storing a denormalized relation • E.g., store join of account and depositor; branch-name and balance information is repeated for each holder of an account, but join need not be computed repeatedly. • better to use materialized views
  • 75. Tuning the Database Design (Cont.) • Index tuning – Create appropriate indices to speed up slow queries/updates – Speed up slow updates by removing excess indices (tradeoff between queries and updates) – Choose type of index (B-tree/hash) appropriate for most frequent types of queries. – Choose which index to make clustered
  • 76. Tuning the Database Design (Cont.) Materialized Views • Materialized views can help speed up certain queries – Particularly aggregate queries • Overheads – Space – Time for view maintenance • Immediate view maintenance: done as part of the update transaction – the time overhead is paid by the update transaction • Deferred view maintenance: done only when required – the update transaction is not affected, but system time is spent on view maintenance
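The difference between immediate and deferred view maintenance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `BranchTotals`, its methods, and the sample branch name are hypothetical, and the view is recomputed from scratch rather than maintained incrementally as a real system would try to do.

```python
class BranchTotals:
    """Toy materialized view: total balance per branch (hypothetical example)."""

    def __init__(self, immediate):
        self.accounts = {}        # base relation: account number -> (branch, balance)
        self.view = {}            # materialized view: branch -> total balance
        self.immediate = immediate
        self.stale = False

    def set_balance(self, account, branch, balance):
        self.accounts[account] = (branch, balance)
        if self.immediate:
            self._refresh()       # immediate: the update pays the maintenance cost
        else:
            self.stale = True     # deferred: only note that the view is out of date

    def _refresh(self):
        # Full recomputation, chosen for brevity over incremental maintenance.
        self.view = {}
        for branch, balance in self.accounts.values():
            self.view[branch] = self.view.get(branch, 0) + balance
        self.stale = False

    def total(self, branch):
        if self.stale:
            self._refresh()       # deferred: the view is refreshed when it is needed
        return self.view.get(branch, 0)
```

With `immediate=True` every call to `set_balance` is slower but the view is always current; with `immediate=False` updates stay cheap and the first query after an update pays for the refresh.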
  • 77. Tuning of Transactions • Basic approaches to tuning of transactions – Improve set orientation – Reduce lock contention • Rewriting of queries to improve performance was important in the past, but smart optimizers have made this less important • Communication overhead and query handling overheads are a significant part of the cost of each call – Combine multiple embedded SQL/ODBC/JDBC queries into a single set-oriented query • Set orientation -> fewer calls to database • E.g. tune a program that computes the total salary for each department using a separate SQL query per department by instead using a single query that computes total salaries for all departments at once (using group by) – Use stored procedures: avoids re-parsing and re-optimization of the query
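The department-salary example can be sketched with Python's built-in `sqlite3` module. The `employee` table, its columns, and the sample rows are assumptions made for illustration. SQLite runs in-process, so the per-call communication cost the slide refers to is not visible here; the point is only the shape of the two programs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Asha", "Sales", 100), ("Ravi", "Sales", 150), ("Meera", "IT", 200)],
)

def totals_one_query_per_dept(conn):
    """Untuned version: one call to the database for every department."""
    depts = [row[0] for row in conn.execute("SELECT DISTINCT dept FROM employee")]
    totals = {}
    for dept in depts:
        (total,) = conn.execute(
            "SELECT SUM(salary) FROM employee WHERE dept = ?", (dept,)
        ).fetchone()
        totals[dept] = total
    return totals

def totals_set_oriented(conn):
    """Tuned version: a single set-oriented query using GROUP BY."""
    return dict(conn.execute("SELECT dept, SUM(salary) FROM employee GROUP BY dept"))
```

Both functions return the same totals; against a client-server database the second issues one call instead of one call per department plus one more to list the departments.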
  • 78. Tuning of Transactions (Cont.) • Reducing lock contention • Long transactions (typically read-only) that examine large parts of a relation result in lock contention with update transactions – E.g. large query to compute bank statistics and regular bank transactions
  • 79. RECOVERY METHODS Log-Based Recovery The log is a sequence of records. The log of each transaction is kept in stable storage so that the database can be recovered from it if a failure occurs. Every operation performed on the database is recorded in the log, and each log record must be written to stable storage before the corresponding change is applied to the database.
  • 80. Let's assume there is a transaction that modifies the City of a student. The following log records are written for this transaction. When the transaction starts, it writes a 'start' record: <Tn, Start> When the transaction changes the City from 'Noida' to 'Bangalore', it writes another record holding the old and new values: <Tn, City, 'Noida', 'Bangalore'> When the transaction finishes, it writes a record to mark its end: <Tn, Commit> There are two approaches to modifying the database:
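The three log records above can be produced by a small Python sketch. The list `log` stands in for the log on stable storage and the dictionary `db` for the database; both, along with the helper `write_item`, are hypothetical names used only for illustration.

```python
log = []                      # stands in for the log kept on stable storage
db = {"City": "Noida"}        # stands in for the database

def write_item(txn, item, new_value):
    """Record <txn, item, old value, new value> first, then change the database."""
    log.append((txn, item, db[item], new_value))  # log record is written first
    db[item] = new_value                          # only then is the data modified

log.append(("Tn", "start"))
write_item("Tn", "City", "Bangalore")
log.append(("Tn", "commit"))
```

Writing the log record before touching `db` mirrors the rule that logging must precede the actual modification.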
  • 81. Deferred database modification: • In the deferred modification technique, a transaction does not modify the database until it has committed. • All log records are created and stored in stable storage, and the database is updated only when the transaction commits. Immediate database modification: • In the immediate modification technique, the database is modified while the transaction is still active. • The database is updated as each operation executes, with the log record for a change written before the change itself.
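A minimal sketch of how recovery could use such a log, assuming update records of the form `(txn, item, old, new)` and marker records `(txn, "start")` and `(txn, "commit")`. The function `recover` and this record layout are illustrative assumptions, not a description of any particular DBMS.

```python
def recover(log, db):
    """Undo the updates of uncommitted transactions, then redo committed ones."""
    committed = {rec[0] for rec in log if rec[1:] == ("commit",)}

    # Undo pass: needed under immediate modification, because an active
    # transaction may already have changed the database before the crash.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _txn, item, old, _new = rec
            db[item] = old

    # Redo pass: needed under both schemes, because a committed change
    # may not have reached the database before the crash.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _txn, item, _old, new = rec
            db[item] = new
    return db
```

Under deferred modification the undo pass finds nothing to restore, since uncommitted transactions never wrote to the database, so only the new values in the log are needed.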
  • 82. SHADOW PAGING Shadow paging is a recovery technique in which the database is treated as a set of fixed-size logical units of storage called pages. Pages are mapped to physical blocks of storage by a page table, which holds one entry for each logical page of the database. The method uses two page tables: the current page table and the shadow page table.
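The role of the two page tables can be sketched as follows, assuming the usual copy-on-write behaviour in which a modified page is written to a fresh block and only the current page table is changed. The class `ShadowPagedDB` and its methods are hypothetical names for illustration.

```python
class ShadowPagedDB:
    """Toy model of shadow paging with a current and a shadow page table."""

    def __init__(self, pages):
        self.blocks = list(pages.values())                         # physical blocks
        self.shadow = {name: i for i, name in enumerate(pages)}    # page -> block
        self.current = dict(self.shadow)                           # working copy

    def read(self, page):
        return self.blocks[self.current[page]]

    def write(self, page, data):
        # Never overwrite the old block: store the new version in a fresh
        # block and repoint only the current page table at it.
        self.blocks.append(data)
        self.current[page] = len(self.blocks) - 1

    def commit(self):
        # The current table becomes the new shadow table.
        self.shadow = dict(self.current)

    def abort(self):
        # Discard the current table; the shadow table still maps the old blocks.
        self.current = dict(self.shadow)
```

Because the shadow page table is never modified during the transaction, recovering from a failure amounts to going back to it, which `abort` imitates here.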
  • 83. CHECKPOINTS A checkpoint marks a point before which the DBMS was in a consistent state and all transactions were committed. Checkpoints are recorded while transactions execute, and the log records written during execution are stored in transaction log files.
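One consequence of this definition is that recovery only has to examine the log written after the most recent checkpoint. The sketch below assumes a checkpoint is logged as the record `("checkpoint",)` and, as stated above, that no transaction is still active when it is taken; the function name and record layout are hypothetical.

```python
def records_to_replay(log):
    """Return the log records written after the most recent checkpoint."""
    last_checkpoint = max(
        (i for i, rec in enumerate(log) if rec == ("checkpoint",)),
        default=-1,               # no checkpoint yet: the whole log is needed
    )
    return log[last_checkpoint + 1:]
```

For a log that contains a checkpoint followed by the records of a later transaction, only that later transaction's records are returned.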