
CO4 Notes Transaction

Collections of database operations that form a single logical unit are called transactions. A transaction ensures that the database changes from one consistent state to another. Transactions have four main properties: [1] Atomicity - all operations are completed or none are, [2] Consistency - the transaction does not violate any integrity constraints, [3] Isolation - transactions do not interfere with each other, [4] Durability - completed transactions persist even after failures.

Uploaded by

Kîrãñ Kûmãr
Copyright
© All Rights Reserved

CO4

Transaction

Collections of operations that form a single logical unit of work are called transactions.
A transaction is normally issued by a single user or application program to read or update the
contents of the database. When a transaction completes successfully, the database is updated.
We come across many such transactions in day-to-day life. A successful transaction changes
the database from one CONSISTENT STATE to another.
• For example, a transfer of funds from your bank account to your friend’s account looks
like a single operation from your standpoint, but within the database system, it consists
of several operations.
• A transaction includes one or more database access operations—these can include
insertion, deletion, update or retrieval operations.
• One way of specifying the transaction boundaries is by specifying explicit begin
transaction and end transaction statements in an application program; in this case, all
database access operations between the two are considered as forming one transaction.
• Usually, a transaction is initiated by a user program written in a high-level data-
manipulation language (typically SQL), or programming language (for example, C++,
or Java), with embedded database accesses in JDBC or ODBC.
Note: The following SIX operations are logically related and together form a single unit that
constitutes a transaction.
1. Read your account balance.
2. Deduct the amount from your balance.
3. Write the remaining balance to your account.
4. Read your friend’s account balance.
5. Add the amount to his account balance.
6. Write the new updated balance to his account.
Suppose account A has a balance of Rs 1000 and its owner wants to transfer Rs 800 to account B,
which has a balance of Rs 2000. This small transaction consists of several low-level tasks:
• Initially, A’s balance of Rs 1000 is stored in the database, that is, on the hard disk.
• Before any arithmetic can be performed, A’s available balance has to be read with a
read(A) operation. This read is performed on the hard disk (the database).
• The read brings the value Rs 1000 into main memory (RAM), where the arithmetic
operation A = A - 800 can be performed.
• Once Rs 1000 is in RAM, the CPU subtracts Rs 800 from it.
• To add Rs 800 to B’s balance of Rs 2000, B’s balance likewise has to be read and
brought into RAM.
• Once B’s balance of Rs 2000 is in RAM, the CPU performs the addition B = B + 800.
• All of these operations are carried out in RAM only, not in the database (hard disk).
• Finally, to save the changes in the database, a commit operation is used.
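
The steps above can be sketched in a few lines of Python. This is only an illustration, not how a real DBMS is built: the dictionaries `disk` and `buffer` and the helper names `read`, `write` and `commit` are assumptions made for this example. The sketch shows that the changes live only in the buffer until the commit copies them to the database.

```python
# "disk" stands in for the database on the hard disk, "buffer" for RAM.
# Both names are hypothetical and used only for this illustration.
disk = {"A": 1000, "B": 2000}
buffer = {}


def read(item):
    """Copy an item from the disk into the main-memory buffer."""
    buffer[item] = disk[item]


def write(item, value):
    """Update the buffered copy only; the disk is not touched."""
    buffer[item] = value


def commit():
    """Make the buffered changes permanent by copying them to the disk."""
    disk.update(buffer)
    buffer.clear()


read("A")                       # bring A's balance into RAM
write("A", buffer["A"] - 800)   # A = A - 800, computed in RAM
read("B")                       # bring B's balance into RAM
write("B", buffer["B"] + 800)   # B = B + 800, computed in RAM

before_commit = dict(disk)      # the disk still holds the old balances
commit()
after_commit = dict(disk)       # the disk now holds the new balances
```

Before the commit the disk still holds 1000 and 2000; after it, 200 and 2800.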

Transaction Timing Diagram:

Transaction Operations:

• BEGIN_TRANSACTION: This marks the beginning of transaction execution.
• READ or WRITE: These specify read or write operations on the database items that are
executed as part of a transaction.
• END_TRANSACTION: This specifies that READ and WRITE transaction operations
have ended and marks the end of transaction execution. However, at this point it may
be necessary to check whether the changes introduced by the transaction can be
permanently applied to the database (committed) or whether the transaction has to be
aborted because it violates serializability.
• COMMIT_TRANSACTION: This signals a successful end of the transaction so that
any changes (updates) executed by the transaction can be safely committed to the
database and will not be undone.
• ROLLBACK (or ABORT): This signals that the transaction has ended unsuccessfully,
so that any changes or effects that the transaction may have applied to the database must
be undone.
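
As a hedged illustration of these operations, the sketch below uses Python's standard `sqlite3` module, where the SQL statements BEGIN, COMMIT and ROLLBACK play the roles of BEGIN_TRANSACTION, COMMIT_TRANSACTION and ROLLBACK. The table name `account`, the helper functions and the "insufficient funds" rule are assumptions made for this example only.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the transaction boundaries below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 1000), ('B', 2000)")


def balances(conn):
    """Return all balances as a dict, e.g. {'A': 1000, 'B': 2000}."""
    return dict(conn.execute("SELECT name, balance FROM account"))


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return True if committed."""
    conn.execute("BEGIN")                       # BEGIN_TRANSACTION
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        (left,) = conn.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)
        ).fetchone()
        if left < 0:                            # hypothetical business rule
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute("COMMIT")                  # COMMIT_TRANSACTION
        return True
    except Exception:
        conn.execute("ROLLBACK")                # ROLLBACK (ABORT)
        return False
```

Calling `transfer(conn, "A", "B", 800)` commits and leaves 200 and 2800; a further transfer of 500 from A is rolled back and leaves the balances unchanged.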

Transaction States:

A transaction goes through several different states during its life cycle. These states are
called transaction states, and they are as follows:
1. Active state
2. Partially committed state
3. Committed state
4. Failed state
5. Aborted state
6. Terminated state

State Transition Diagram:

1. Active State

• This is the first state in the life cycle of a transaction.
• A transaction is in the active state as long as its instructions are being executed.
• All the changes made by the transaction so far are stored in a buffer in main memory.

2. Partially Committed State

A transaction starts in the active state. When it finishes its final statement, it enters the partially
committed state. At this point, the transaction has completed its execution, but it is still possible
that it may have to be aborted, since the actual output may still be temporarily residing in main
memory, and thus a hardware failure may preclude its successful completion.

3. Committed State

After all the changes made by the transaction have been successfully stored into the database,
it enters into a committed state. Now, the transaction is considered to be fully committed.

NOTE:
After a transaction has entered the committed state, it cannot be rolled back. In other words,
the changes that have been made by the transaction cannot be undone, because the system has
been updated to a new consistent state. The only way to reverse the changes is to carry out
another transaction, called a compensating transaction, that performs the reverse operations.

4. Failed State

When a transaction is getting executed in the active state or partially committed state and some
failure occurs due to which it becomes impossible to continue the execution, it enters into a
failed state.

5. Aborted State

After the transaction has failed and entered into a failed state, all the changes made by it have
to be undone. To undo the changes made by the transaction, it becomes necessary to roll back
the transaction. After the transaction has rolled back completely, it enters into an aborted state.

6. Terminated State

This is the last state in the life cycle of a transaction. After entering the committed state or
aborted state, the transaction finally enters into a terminated state where its life cycle finally
comes to an end.
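
The life cycle described above can be summarised as a small state machine. The sketch below is one possible encoding, assuming only the transitions named in these notes; the class name `Transaction`, the method `move_to` and the table `ALLOWED` are hypothetical.

```python
# Allowed transitions, taken from the state descriptions in these notes.
ALLOWED = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "committed": {"terminated"},
    "failed": {"aborted"},
    "aborted": {"terminated"},
    "terminated": set(),
}


class Transaction:
    """Tracks the state of one transaction and rejects illegal moves."""

    def __init__(self):
        self.state = "active"          # every transaction starts here
        self.history = ["active"]

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
        self.history.append(new_state)
```

For instance, a committed transaction can only move to the terminated state; trying to move it to the aborted state raises an error, which mirrors the note that a committed transaction cannot be rolled back.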

ACID PROPERTIES OF TRANSACTION

The following properties are called the ACID properties of a transaction:


• Atomicity
• Consistency
• Isolation
• Durability

To understand the ACID properties of a transaction, let Ti be a transaction that transfers $50
from account A to account B. This transaction can be defined as:

Ti: read(A);
    A := A - 50;
    write(A);
    read(B);
    B := B + 50;
    write(B).

Atomicity:

Either all operations of the transaction are reflected properly in the database, or none are. It is
the responsibility of the transaction recovery subsystem of a DBMS to ensure atomicity. If a
transaction fails to complete for some reason, such as a system crash in the midst of transaction
execution, the recovery technique must undo any effects of the transaction on the database.

Suppose that, just before the execution of transaction Ti, the values of accounts A and B are
$1000 and $2000, respectively. Now suppose that, during the execution of transaction Ti , a
failure occurs that prevents Ti from completing its execution successfully.

Further, suppose that the failure happened after the write(A) operation but before the write(B)
operation. In this case, the values of accounts A and B reflected in the database are $950 and
$2000. The system destroyed $50 as a result of this failure. In particular, we note that the sum
A + B is no longer preserved. Thus, because of the failure, the state of the system no longer
reflects a real state of the world that the database is supposed to capture. We term such a state
an inconsistent state. We must ensure that such inconsistencies are not visible in a database
system.

Note, however, that the system must at some point be in an inconsistent state. Even if
transaction Ti is executed to completion, there exists a point at which the value of account A is
$950 and the value of account B is $2000, which is clearly an inconsistent state. This state,
however, is eventually replaced by the consistent state where the value of account A is $950
and the value of account B is $2050. Thus, if the transaction never started or was guaranteed
to complete, such an inconsistent state would not be visible except during the execution of the
transaction. That is the reason for the atomicity requirement: if the atomicity property is
present, either all actions of the transaction are reflected in the database, or none are.
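
A minimal sketch of how atomicity can be enforced is given below. It assumes a simple in-memory undo log that remembers old values and restores them after a failure; real recovery subsystems keep such logs on stable storage. The names `SimulatedCrash`, `transfer` and `crash_after_write_a` are hypothetical.

```python
class SimulatedCrash(Exception):
    """Stands in for a failure in the middle of the transaction."""


def transfer(db, amount, crash_after_write_a=False):
    """Transfer `amount` from A to B; undo everything if a crash occurs."""
    undo_log = []                       # (item, old value) pairs
    try:
        a = db["A"]                     # read(A)
        undo_log.append(("A", a))
        db["A"] = a - amount            # write(A)
        if crash_after_write_a:
            raise SimulatedCrash()      # failure before write(B)
        b = db["B"]                     # read(B)
        undo_log.append(("B", b))
        db["B"] = b + amount            # write(B)
        return True
    except SimulatedCrash:
        # Restore old values in reverse order so no partial effect remains.
        for item, old_value in reversed(undo_log):
            db[item] = old_value
        return False
```

With A = 1000 and B = 2000, a crash after write(A) leaves both balances unchanged instead of the inconsistent 950 and 2000; without a crash, the result is 950 and 2050.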

Consistency:

The consistency requirement here is that the sum of A and B be unchanged by the execution of
the transaction. Without the consistency requirement, money could be created or destroyed by
the transaction! It can be verified easily that, if the database is consistent before an execution
of the transaction, the database remains consistent after the execution of the transaction.
Ensuring consistency for an individual transaction is the responsibility of the application
programmer who codes the transaction. This task may be facilitated by automatic testing of
integrity constraints.

Isolation:

Even if the consistency and atomicity properties are ensured for each transaction, if several
transactions are executed concurrently, their operations may interleave in some undesirable
way, resulting in an inconsistent state. For example, as we saw earlier, the database is
temporarily inconsistent while the transaction to transfer funds from A to B is executing, with
the deducted total written to A and the increased total yet to be written to B. If a second
concurrently running transaction reads A and B at this intermediate point and computes A+B,
it will observe an inconsistent value. Furthermore, if this second transaction then performs
updates on A and B based on the inconsistent values that it read, the database may be left in an
inconsistent state even after both transactions have completed.

Ensuring the isolation property is the responsibility of a component of the database system
called the concurrency-control system.

Durability:

Once the execution of the transaction completes successfully, and the user who initiated the
transaction has been notified that the transfer of funds has taken place, it must be the case that
no system failure can result in a loss of data corresponding to this transfer of funds. The
durability property guarantees that, once a transaction completes successfully, all the updates
that it carried out on the database persist, even if there is a system failure after the transaction
completes execution.

Multiple Transaction Processing or Need of concurrent execution

In a multi-user system, multiple users can access and use the same database at the same time.
This is known as concurrent execution: transactions submitted by different users are executed
simultaneously against the same database, each performing its own operations.

Concurrent execution is carried out in an interleaved manner, and no operation should affect
the other executing operations, so that the consistency of the database is maintained.
Interleaving the operations of concurrent transactions therefore raises several challenging
problems that need to be solved.

Transaction-processing systems usually allow multiple transactions to run concurrently.


Allowing multiple transactions to update data concurrently causes several complications with
consistency of the data. Ensuring consistency in spite of concurrent execution of transactions
requires extra work; it is far easier to insist that transactions run serially—that is, one at a time,
each starting only after the previous one has completed. However, there are two good reasons
for allowing concurrency:
1) Improved throughput and resource utilization.

• While a read or write on behalf of one transaction is in progress on one disk, another
transaction can be running in the CPU, while another disk may be executing a read or
write on behalf of a third transaction. All of this increases the throughput of the system.
• Correspondingly, the processor and disk utilization also increase; in other words, the
processor and disk spend less time idle.

2) Reduced waiting time.


• There may be a mix of transactions running on a system, some short and some long. If
transactions run serially, a short transaction may have to wait for a preceding long
transaction to complete, which can lead to unpredictable delays in running a transaction.
• Concurrent execution reduces the unpredictable delays in running transactions.
Moreover, it also reduces the average response time: the average time for a transaction
to be completed after it has been submitted.

Multiple Transaction Processing Problems or Need of Concurrency Protocols

Lost Update Problem:

This problem occurs when two transactions that access the same database items have their
operations interleaved in a way that makes the value of some database items incorrect. Suppose
that transactions T1 and T2 are submitted at approximately the same time, and suppose that
their operations are interleaved; then the final value of item X is incorrect because T2 reads the
value of X before T1 changes it in the database, and hence the updated value resulting from T1
is lost.

For example, if X = 80 at the start, N = 5 (T1 transfers 5 rupees from account X to account Y),
and M = 4 (T2 adds 4 rupees to account X), the final result should be X = 79. However, in the
interleaving of operations shown in Figure, it is X = 84, because T2 computes 80 + 4 from the
old value it read and overwrites T1's result of 75, so the update in T1 is lost.
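
The lost update can be reproduced with a short simulation. The interleaving below is one order that produces the stated result and is an assumption, since the original figure is not reproduced here; the initial value of Y and the function names are also hypothetical.

```python
def run_lost_update(x=80, y=10, n=5, m=4):
    """Interleave T1 (move n from X to Y) and T2 (add m to X)."""
    db = {"X": x, "Y": y}
    t1_x = db["X"]      # T1: read(X)
    t1_x -= n           # T1: X := X - N
    t2_x = db["X"]      # T2: read(X), still the old value
    t2_x += m           # T2: X := X + M
    db["X"] = t1_x      # T1: write(X)
    t1_y = db["Y"]      # T1: read(Y)
    db["X"] = t2_x      # T2: write(X), overwrites T1's update
    t1_y += n           # T1: Y := Y + N
    db["Y"] = t1_y      # T1: write(Y)
    return db


def run_serial(x=80, y=10, n=5, m=4):
    """Run T1 to completion, then T2, for comparison."""
    db = {"X": x, "Y": y}
    db["X"] -= n
    db["Y"] += n
    db["X"] += m
    return db
```

The interleaved run ends with X = 84, while the serial run ends with the correct X = 79.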

The Temporary Update (or Dirty Read) Problem

This problem occurs when one transaction updates a database item and then the transaction
fails for some reason. Meanwhile, the updated item is accessed (read) by another transaction
before it is changed back (or rolled back) to its original value.

Figure shows an example where T1 updates item X and then fails before completion, so the
system must roll back X to its original value. Before it can do so, however, transaction T2 reads
the temporary value of X, which will not be recorded permanently in the database because of
the failure of T1. The value of item X that is read by T2 is called dirty data because it has been
created by a transaction that has not completed and committed yet; hence, this problem is also
known as the dirty read problem.
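
A small simulation makes the dirty read concrete. The values X = 80 and N = 5 and the function name are assumptions chosen for illustration; the point is only that T2 sees a value that never becomes permanent.

```python
def dirty_read_demo(x=80, n=5):
    """Return (value T2 read, value of X after T1 is rolled back)."""
    db = {"X": x}
    original = db["X"]          # T1: read(X)
    db["X"] = original - n      # T1: write(X), not yet committed
    seen_by_t2 = db["X"]        # T2: read(X), reads the dirty value
    db["X"] = original          # T1 fails, X is rolled back
    return seen_by_t2, db["X"]
```

Here T2 reads 75 even though the database ends up holding 80 again.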

Unrepeatable Read Problem

The unrepeatable read problem occurs when a transaction reads the same data item more than
once and obtains different values, because another transaction has modified the item between
the reads. Example: Consider two transactions A and B performing read/write operations on a
data item DT in the database DB. The current value of DT is 1000. Transactions A and B both
initially read the value of DT as 1000. Transaction A then modifies the value of DT from 1000
to 1500, and transaction B reads the value again and finds it to be 1500. Transaction B thus
finds two different values of DT in its two read operations.

Phantom Read Problem

In the phantom read problem, data is read through two different read operations in the same
transaction. In the first read operation, a value of the data is obtained, but in the second
operation, an error is obtained saying the data does not exist. More generally, a phantom read
occurs when repeating the same query within one transaction returns a different set of rows,
because another transaction has inserted or deleted rows in between. For example: Consider
two transactions A and B performing read/write operations on a data item DT in the database
DB. The current value of DT is 1000. Transaction B initially reads the value of DT as 1000.
Transaction A then deletes DT from the database DB, and when transaction B reads the value
again, it gets an error saying that DT does not exist in the database DB.

SCHEDULES

When transactions are executing concurrently in an interleaved fashion, then the order of
execution of operations from all the various transactions is known as a schedule. A schedule S
of n transactions T1, T2, … , Tn is an ordering of the operations of the transactions. Operations
from different transactions can be interleaved in the schedule S. However, for each transaction
Ti that participates in the schedule S, the operations of Ti in S must appear in the same order in
which they occur in Ti.
Let T1 and T2 be two transactions that transfer funds from one account to another. Transaction
T1 transfers $50 from account A to account B. Transaction T2 transfers 10 percent of the
balance from account A to account B.

Serial Schedules

Each serial schedule consists of a sequence of instructions from various transactions, where the
instructions belonging to one single transaction appear together in that schedule. For a set of n
transactions, there exist n factorial (n!) different valid serial schedules.
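
The n! serial schedules can be enumerated directly. The sketch below does this for the two transfer transactions T1 and T2, assuming initial balances of $1000 and $2000; the function names `t1`, `t2` and `run_serial` are hypothetical.

```python
from itertools import permutations


def t1(db):
    """T1: transfer $50 from A to B."""
    db["A"] -= 50
    db["B"] += 50


def t2(db):
    """T2: transfer 10 percent of A's balance from A to B."""
    temp = db["A"] * 10 // 100
    db["A"] -= temp
    db["B"] += temp


def run_serial(order, start):
    """Run the transactions one after another on a copy of `start`."""
    db = dict(start)
    for txn in order:
        txn(db)
    return db


start = {"A": 1000, "B": 2000}
# One entry per serial schedule: 2! = 2 entries for two transactions.
results = {
    " then ".join(txn.__name__ for txn in order): run_serial(order, start)
    for order in permutations([t1, t2])
}
```

The two serial orders give different final balances (855 and 2145 versus 850 and 2150), but both preserve the sum A + B = 3000, so both are consistent.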

Non Serial Schedules

When the database system executes several transactions concurrently, the corresponding
schedule no longer needs to be serial. If two transactions are running concurrently, the operating
system may execute one transaction for a little while, then perform a context switch, execute
the second transaction for some time, and then switch back to the first transaction for some
time, and so on. Suppose that the two transactions are executed concurrently. One possible
schedule appears in Figure.

After this execution takes place, we arrive at the same state as the one in which the transactions
are executed serially in the order T1 followed by T2. The sum A + B is indeed preserved.

But not all concurrent executions result in a correct state. To illustrate, consider the schedule
of Figure. After the execution of this schedule, we arrive at a state where the final values of
accounts A and B are $950 and $2100, respectively. This final state is inconsistent, since the
sum A + B is not preserved by the execution of the two transactions: assuming the initial
balances of $1000 and $2000 used earlier, the sum has grown from $3000 to $3050.
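
The simulation below shows one interleaving of T1 and T2 that ends in this inconsistent state. The exact order of steps is an assumption, chosen because it reproduces the final values $950 and $2100 from initial balances of $1000 and $2000; the original figure is not reproduced here.

```python
def run_bad_schedule(a=1000, b=2000):
    """Interleave T1 (move $50) and T2 (move 10% of A) incorrectly."""
    db = {"A": a, "B": b}
    t1_a = db["A"]              # T1: read(A)
    t1_a -= 50                  # T1: A := A - 50
    t2_a = db["A"]              # T2: read(A), still the old value
    temp = t2_a * 10 // 100     # T2: temp := A * 0.1
    t2_a -= temp                # T2: A := A - temp
    db["A"] = t2_a              # T2: write(A)
    t2_b = db["B"]              # T2: read(B)
    db["A"] = t1_a              # T1: write(A), overwrites T2's write
    t1_b = db["B"]              # T1: read(B)
    t1_b += 50                  # T1: B := B + 50
    db["B"] = t1_b              # T1: write(B)
    t2_b += temp                # T2: B := B + temp
    db["B"] = t2_b              # T2: write(B), overwrites T1's write
    return db
```

The run ends with A = 950 and B = 2100, so A + B is 3050 instead of 3000.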
If control of concurrent execution is left entirely to the operating system, many possible
schedules, including ones that leave the database in an inconsistent state, such as the one just
described, are possible. It is the job of the database system to ensure that any schedule that is
executed will leave the database in a consistent state. The concurrency-control component of
the database system carries out this task.

Serializability of Schedules

We can ensure consistency of the database under concurrent execution by making sure that any
schedule that is executed has the same effect as a schedule that could have occurred without
any concurrent execution. That is, the schedule should, in some sense, be equivalent to a serial
schedule. Such schedules are called serializable schedules.

Serial schedules are always serializable, but if steps of multiple transactions are interleaved, it
is harder to determine whether a schedule is serializable.

There are different forms of schedule equivalence, but we focus on a particular form called
conflict serializability.

Conflict Serializability:

Let us consider a schedule S in which there are two consecutive instructions, I and J, of
transactions Ti and Tj. If I and J refer to different data items, then we can swap I and J without
affecting the results of any instruction in the schedule. However, if I and J refer to the same
data item Q, then the order of the two steps may matter. Since we are dealing with only read
and write instructions, there are four cases that we need to consider:

• I = read(Q), J = read(Q). The order of I and J does not matter, since the same value of
Q is read by Ti and Tj , regardless of the order.
• I = read(Q), J = write(Q). If I comes before J, then Ti does not read the value of Q that
is written by Tj in instruction J. If J comes before I, then Ti reads the value of Q that is
written by Tj. Thus, the order of I and J matters.
• I = write(Q), J = read(Q). The order of I and J matters for reasons similar to those of
the previous case.
• I = write(Q), J = write(Q). Since both instructions are write operations, the order of
these instructions does not affect either Ti or Tj. However, the value obtained by the
next read(Q) instruction of S is affected, since the result of only the latter of the two
write instructions is preserved in the database. If there is no other write(Q) instruction
after I and J in S, then the order of I and J directly affects the final value of Q in the
database state that results from schedule S.
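
The four cases reduce to a single rule: two instructions of different transactions conflict when they access the same data item and at least one of them is a write. A minimal sketch of that rule follows; the `Op` tuple and the function name `conflicts` are hypothetical.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str      # transaction name, e.g. "Ti"
    action: str   # "read" or "write"
    item: str     # data item, e.g. "Q"


def conflicts(i, j):
    """True if the order of instructions i and j matters."""
    return (
        i.txn != j.txn
        and i.item == j.item
        and "write" in (i.action, j.action)
    )
```

Only the read/read case, or instructions on different items, returns False, which is exactly when the two instructions can be swapped.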
Since the write(A) instruction of T2 in schedule 3 of Figure does not conflict with the read(B)
instruction of T1, we can swap these instructions. We continue to swap nonconflicting
instructions:
• Swap the read(B) instruction of T1 with the read(A) instruction of T2.
• Swap the write(B) instruction of T1 with the write(A) instruction of T2.
• Swap the write(B) instruction of T1 with the read(A) instruction of T2.

If a schedule S can be transformed into a schedule S’ by a series of swaps of nonconflicting
instructions, we say that S and S’ are conflict equivalent. The concept of conflict
equivalence leads to the concept of conflict serializability. We say that a schedule S is
conflict serializable if it is conflict equivalent to a serial schedule. Thus, the following schedule
is conflict serializable, since it is conflict equivalent to a serial schedule.

Testing For Conflict Serializability Through Precedence Graph:

The precedence graph gives a simple and efficient method for determining the conflict
serializability of a schedule.

Consider a schedule S. We construct a directed graph, called a precedence graph, from S. This
graph consists of a pair G = (V, E), where V is a set of vertices and E is a set of edges. The set
of vertices consists of all the transactions participating in the schedule. The set of edges consists
of all edges Ti → Tj for which one of three conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).

If the precedence graph for S has a cycle, then schedule S is not conflict serializable. If the
graph contains no cycles, then the schedule S is conflict serializable.
For example, the precedence graph for schedule 1 in Figure contains the single edge T1 → T2,
since all the instructions of T1 are executed before the first instruction of T2 is executed.
Similarly, Figure 14.10b shows the precedence graph for schedule 2 with the single edge T2
→T1, since all the instructions of T2 are executed before the first instruction of T1 is executed.
The precedence graph for the following schedule appears in Figure. It contains the edge T1
→ T2, because T1 executes read(A) before T2 executes write(A). It also contains the edge
T2 → T1, because T2 executes read(B) before T1 executes write(B). Since the graph contains a
cycle, the schedule is not conflict serializable.
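
The test can be sketched in Python as follows. An edge Ti → Tj is added whenever an operation of Ti precedes a conflicting operation of Tj (same item, different transactions, at least one write), and a depth-first search then looks for a cycle. The function names and the two sample schedules are assumptions made for illustration; the interleaved one contains the read(A)/write(A) and read(B)/write(B) conflicts mentioned above.

```python
def precedence_edges(schedule):
    """schedule: list of (transaction, action, item) in execution order."""
    edges = set()
    for pos, (ti, act_i, item_i) in enumerate(schedule):
        for tj, act_j, item_j in schedule[pos + 1:]:
            if ti != tj and item_i == item_j and "write" in (act_i, act_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(node):
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting or (nxt not in done and dfs(nxt)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(node) for node in graph if node not in done)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


serial = [
    ("T1", "read", "A"), ("T1", "write", "A"),
    ("T1", "read", "B"), ("T1", "write", "B"),
    ("T2", "read", "A"), ("T2", "write", "A"),
    ("T2", "read", "B"), ("T2", "write", "B"),
]
interleaved = [
    ("T1", "read", "A"), ("T2", "read", "A"), ("T2", "write", "A"),
    ("T2", "read", "B"), ("T1", "write", "A"), ("T1", "read", "B"),
    ("T1", "write", "B"), ("T2", "write", "B"),
]
```

For `serial` the graph has the single edge T1 → T2 and the schedule is conflict serializable; for `interleaved` the graph has both T1 → T2 and T2 → T1, so it contains a cycle and the schedule is not conflict serializable.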
