AdvChapter 3

Chapter 3 discusses transaction processing and management in databases, defining a transaction as a logical unit of work that can involve multiple operations. It outlines the ACID properties (Atomicity, Consistency, Isolation, Durability) that transactions must adhere to, and explains the importance of serializability and recoverability in maintaining database integrity during concurrent transactions. The chapter also details the roles of various components in a DBMS, including the transaction manager, scheduler, and recovery manager, in ensuring consistent and reliable transaction processing.

Uploaded by

qqq

Chapter 3: Transaction Processing and Management

Introduction

Transaction: An action, or series of actions, carried out by a single
user or application program, which reads or updates the contents of
the database.
A transaction is a logical unit of work on the database. It may be
an entire program, a part of a program, or a single command (for
example, the SQL command INSERT or UPDATE), and it may involve
any number of operations on the database.
In the database context, the execution of an application program
can be thought of as one or more transactions with non-database
processing taking place in between.
To illustrate the concepts of a transaction, we examine two relations
from the instance of the DreamHome rental database:
Staff (staffNo, fName, lName, position, sex, DOB, salary, branchNo)
PropertyForRent (propertyNo, street, city, postcode, type, rooms,
rent, ownerNo, staffNo, branchNo)
Figure. Example of transactions.
Transaction A:
    read(staffNo = x, salary)
    salary = salary * 1.1
    write(staffNo = x, salary)
Transaction B:
    delete(staffNo = x)
    for all PropertyForRent records, pno
    begin
        read(propertyNo = pno, staffNo)
        if (staffNo = x) then
        begin
            staffNo = newStaffNo
            write(propertyNo = pno, staffNo)
        end
    end

A simple transaction against this database is to update the salary of
a particular member of staff given the staff number, x. At a high
level, we could write this transaction as shown in the Figure above
(transaction A). In this chapter we denote a database read or write
operation on a data item x as read(x) or write(x).
Additional qualifiers may be added as necessary; for example, in
the Figure above, we have used the notation read(staffNo = x, salary)
to indicate that we want to read the data item salary for the tuple
with primary key value x. In this example, we have a transaction
consisting of two database operations (read and write) and a
non-database operation (salary = salary * 1.1).
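As a rough illustration, transaction A can be sketched in Python with the standard sqlite3 module. The in-memory table, the two columns kept from Staff, the sample staff number 'SG14' and the function name raise_salary are all assumptions made for this sketch and are not part of the chapter.

```python
import sqlite3

# Hypothetical in-memory copy of the Staff relation (only two columns kept).
# isolation_level=None lets us issue BEGIN / COMMIT / ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary REAL)")
conn.execute("INSERT INTO Staff VALUES ('SG14', 18000)")  # sample row, assumed


def raise_salary(conn, x):
    """Transaction A: read the salary for staffNo = x, add 10%, write it back."""
    conn.execute("BEGIN")
    try:
        row = conn.execute(
            "SELECT salary FROM Staff WHERE staffNo = ?", (x,)
        ).fetchone()                      # read(staffNo = x, salary)
        if row is None:
            raise LookupError(f"no member of staff with staffNo {x!r}")
        salary = row[0] * 1.1             # non-database operation
        conn.execute(
            "UPDATE Staff SET salary = ? WHERE staffNo = ?", (salary, x)
        )                                 # write(staffNo = x, salary)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")          # leave the database as it was
        raise
    return salary


new_salary = raise_salary(conn, "SG14")
```

The two SQL statements correspond to the read and write operations, and the multiplication in between is the non-database operation.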
A more complicated transaction is to delete the member of staff
with a given staff number x, as shown in the Figure above
(transaction B).
In this case, as well as having to delete the tuple in the Staff
relation, we also need to find all the PropertyForRent tuples that this
member of staff managed and reassign them to a different member
of staff, newStaffNo say. If all these updates are not made,
referential integrity will be lost and the database will be in an
inconsistent state: a property will be managed by a member of
staff who no longer exists in the database.
A transaction should always transform the database from one
consistent state to another, although we accept that consistency
may be violated while the transaction is in progress.
For example, during transaction B in the Figure above, there may be
some moment when one tuple of PropertyForRent contains the new
value, newStaffNo, and another still contains the old one, x. However,
at the end of the transaction, all necessary tuples should have the
new value, newStaffNo.
A transaction can have one of two outcomes. If it completes
successfully, the transaction is said to have committed and the
database reaches a new consistent state. On the other hand, if the
transaction does not execute successfully, the transaction is
aborted. If a transaction is aborted, the database must be restored
to the consistent state it was in before the transaction started. Such
a transaction is rolled back or undone.

A committed transaction cannot be aborted. If we decide that the
committed transaction was a mistake, we must perform another
compensating transaction to reverse its effects. However, an
aborted transaction that is rolled back can be restarted later and,
depending on the cause of the failure, may successfully execute and
commit at that time.
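The two outcomes can be sketched with the staff-deletion transaction, again using Python's sqlite3 module. The reduced tables, the sample keys, the function names and the deliberately late existence check are all assumptions chosen so that the rollback has work to undo; this is not the chapter's own code.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual BEGIN/COMMIT
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE PropertyForRent (propertyNo TEXT PRIMARY KEY, staffNo TEXT)"
)
conn.executemany("INSERT INTO Staff VALUES (?)", [("SG14",), ("SG37",)])
conn.executemany(
    "INSERT INTO PropertyForRent VALUES (?, ?)",
    [("PG4", "SG14"), ("PG16", "SG14"), ("PG21", "SG37")],
)


def snapshot(conn):
    """Return the current staff numbers and the property -> staff mapping."""
    staff = [r[0] for r in conn.execute("SELECT staffNo FROM Staff ORDER BY staffNo")]
    managed = dict(conn.execute("SELECT propertyNo, staffNo FROM PropertyForRent"))
    return staff, managed


def delete_staff(conn, x, new_staff_no):
    """Delete staff member x and reassign their properties, all or nothing."""
    conn.execute("BEGIN")
    try:
        conn.execute("DELETE FROM Staff WHERE staffNo = ?", (x,))
        conn.execute(
            "UPDATE PropertyForRent SET staffNo = ? WHERE staffNo = ?",
            (new_staff_no, x),
        )
        # Checked last on purpose, so that a failure has updates to undo.
        found = conn.execute(
            "SELECT 1 FROM Staff WHERE staffNo = ?", (new_staff_no,)
        ).fetchone()
        if found is None:
            raise ValueError("newStaffNo does not exist")
        conn.execute("COMMIT")
        return "committed"
    except Exception:
        conn.execute("ROLLBACK")   # restore the state before the transaction
        return "aborted"


before = snapshot(conn)
first_try = delete_staff(conn, "SG14", "SG99")    # unknown replacement: aborted
after_abort = snapshot(conn)
second_try = delete_staff(conn, "SG14", "SG37")   # restarted with valid input
after_commit = snapshot(conn)
```

After the aborted attempt the snapshot equals the one taken before it, and the restarted transaction then commits, which mirrors the point that a rolled-back transaction may succeed later.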
Figure. State transition diagram for a transaction.
    begin transaction -> ACTIVE
    ACTIVE -> PARTIALLY COMMITTED (end transaction)
    PARTIALLY COMMITTED -> COMMITTED (commit)
    ACTIVE -> FAILED (abort)
    PARTIALLY COMMITTED -> FAILED (abort)
    FAILED -> ABORTED

The keywords BEGIN TRANSACTION, COMMIT, and ROLLBACK (or
their equivalent) are available in many data manipulation
languages to delimit transactions. If these delimiters are not used,
the entire program is usually regarded as a single transaction, with
the DBMS automatically performing a COMMIT when the program
terminates correctly and a ROLLBACK if it does not.
The Figure above shows the state transition diagram for a transaction.
Note that in addition to the obvious states of ACTIVE, COMMITTED,
and ABORTED, there are two other states:
• PARTIALLY COMMITTED, which occurs after the final statement
has been executed. At this point, it may be found that the
transaction has violated serializability.
Alternatively, the system may fail and any data updated by
the transaction may not have been safely recorded on
secondary storage. In such cases, the transaction would go
into the FAILED state and would have to be aborted. If the
transaction has been successful, any updates can be safely
recorded and the transaction can go to the COMMITTED state.
• FAILED, which occurs if the transaction cannot be committed
or the transaction is aborted while in the ACTIVE state,
perhaps due to the user aborting the transaction or as a result
of the concurrency control protocol aborting the transaction to
ensure serializability.
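The five states and the moves between them can be written down as a small lookup table. This is only a sketch: the event names, in particular "rollback" for the step from FAILED to ABORTED, are assumed labels and are not taken from the figure.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# (current state, event) -> next state. Event names are assumptions.
TRANSITIONS = {
    (State.ACTIVE, "end_transaction"): State.PARTIALLY_COMMITTED,
    (State.ACTIVE, "abort"): State.FAILED,
    (State.PARTIALLY_COMMITTED, "commit"): State.COMMITTED,
    (State.PARTIALLY_COMMITTED, "abort"): State.FAILED,
    (State.FAILED, "rollback"): State.ABORTED,
}


def step(state, event):
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state.name}") from None


def run(events):
    """Begin a transaction (ACTIVE) and apply the events in order."""
    state = State.ACTIVE
    for event in events:
        state = step(state, event)
    return state


committed = run(["end_transaction", "commit"])
failed_late = run(["end_transaction", "abort", "rollback"])
```

COMMITTED and ABORTED have no outgoing entries in the table, which reflects the statement that a committed transaction cannot be aborted.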
Properties of Transactions
There are properties that all transactions should possess. The four
basic, or so-called ACID, properties of a transaction are (Haerder
and Reuter, 1983):
• Atomicity: The ‘all or nothing’ property. A transaction is an
indivisible unit that is either performed in its entirety or is not
performed at all. It is the responsibility of the recovery
subsystem of the DBMS to ensure atomicity.
• Consistency: A transaction must transform the database from
one consistent state to another consistent state. It is the
responsibility of both the DBMS and the application
developers to ensure consistency. The DBMS can ensure
consistency by enforcing all the constraints that have been
specified on the database schema, such as integrity and
enterprise constraints. However, in itself this is insufficient to
ensure consistency.
For example, suppose we have a transaction that is intended
to transfer money from one bank account to another, and the
programmer makes an error in the transaction logic so that it
debits one account but credits the wrong account. The
database is then in an inconsistent state. However, the DBMS
would not have been responsible for introducing this
inconsistency and would have had no ability to detect the
error.
• Isolation: Transactions execute independently of one another.
In other words, the partial effects of incomplete transactions
should not be visible to other transactions. It is the
responsibility of the concurrency control subsystem to ensure
isolation.
• Durability: The effects of a successfully completed
(committed) transaction are permanently recorded in the
database and must not be lost because of a subsequent
failure. It is the responsibility of the recovery subsystem to
ensure durability.
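A short sketch of atomicity and DBMS-enforced consistency, using Python's sqlite3 module and a bank transfer. The Account table, its CHECK constraint, the account keys and the function names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual BEGIN/COMMIT
conn.execute(
    "CREATE TABLE Account ("
    " accNo TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"   # schema constraint
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A", 100), ("B", 50)])


def balances(conn):
    """Return {accNo: balance} for every account."""
    return dict(conn.execute("SELECT accNo, balance FROM Account ORDER BY accNo"))


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates happen or neither."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Account SET balance = balance + ? WHERE accNo = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE Account SET balance = balance - ? WHERE accNo = ?",
            (amount, src),
        )   # raises IntegrityError if the balance would go negative
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")   # the credit already applied is undone too
        return False


too_much = transfer(conn, "A", "B", 500)
after_failed = balances(conn)
ok = transfer(conn, "A", "B", 30)
after_ok = balances(conn)
```

The constraint catches the overdraft, but it would not catch a transfer that credits the wrong account: that update satisfies every constraint in the schema, which is why consistency also depends on the application developer.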
Second assignment discussion questions:
• What is a transaction?
• Describe the properties of a transaction.
• Discuss the basic states of a transaction.

Topic: DBMS transaction subsystem, Serializability and Recovery

The figure below shows the architecture of a DBMS, identifying four
high-level database modules that handle transactions, concurrency
control, and recovery.
The transaction manager coordinates transactions on behalf of
application programs. It communicates with the scheduler, the
module responsible for implementing a particular strategy for
concurrency control. The scheduler is sometimes referred to as the
lock manager if the concurrency control protocol is locking based.
The objective of the scheduler is to maximize concurrency without
allowing concurrently executing transactions to interfere with one
another, and so compromise the integrity or consistency of the
database.

Figure. DBMS transaction subsystem. Components shown: transaction
manager, scheduler, buffer manager, recovery manager, access
manager, file manager, system manager, and the database and system
catalog.
If a failure occurs during the transaction, then the database could be
inconsistent. It is the task of the recovery manager to ensure that
the database is restored to the state it was in before the start of the
transaction, and therefore a consistent state. Finally, the
buffer manager is responsible for the efficient transfer of data
between disk storage and main memory.
Serializability and Recoverability
The objective of a concurrency control protocol is to schedule
transactions in such a way as to avoid any interference between
them, and hence prevent the types of problem described in the
previous section.

One obvious solution is to allow only one transaction to execute at a
time: one transaction is committed before the next transaction is
allowed to begin.
However, the aim of a multi-user DBMS is also to maximize the
degree of concurrency or parallelism in the system, so that
transactions that can execute without interfering with one another
can run in parallel. For example, transactions that access different
parts of the database can be scheduled together without
interference. In this section, we examine serializability as a means
of helping to identify those executions of transactions that are
guaranteed to ensure consistency. First, we give some definitions.
Schedule: A sequence of the operations by a set of concurrent
transactions that preserves the order of the operations in each of
the individual transactions.
A transaction comprises a sequence of operations consisting of read
and/or write actions to the database, followed by a commit or abort
action. A schedule S consists of a sequence of the operations from a
set of n transactions T1, T2, . . . , Tn, subject to the constraint that
the order of operations for each transaction is preserved in the
schedule. Thus, for each transaction Ti in schedule S, the order of
the operations in Ti must be the same in schedule S.
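The ordering constraint can be checked mechanically. In the sketch below a schedule is a list of (transaction, action, item) tuples. This representation and the function name are assumptions for illustration only.

```python
def preserves_order(schedule, transactions):
    """True if each transaction's operations appear in `schedule`
    in the same order (and the same number) as in the transaction itself."""
    for name, ops in transactions.items():
        # Project the schedule onto this one transaction.
        projected = [(action, item) for t, action, item in schedule if t == name]
        if projected != ops:
            return False
    return True


transactions = {
    "T1": [("read", "x"), ("write", "x")],
    "T2": [("read", "x"), ("write", "x")],
}

interleaved = [
    ("T1", "read", "x"),
    ("T2", "read", "x"),
    ("T1", "write", "x"),
    ("T2", "write", "x"),
]

# Not a schedule: T1's write has been moved in front of its read.
reordered = [
    ("T1", "write", "x"),
    ("T1", "read", "x"),
    ("T2", "read", "x"),
    ("T2", "write", "x"),
]

interleaved_is_schedule = preserves_order(interleaved, transactions)
reordered_is_schedule = preserves_order(reordered, transactions)
```

Only the relative order inside each transaction is constrained. How the operations of different transactions interleave is left open, and that is the freedom the scheduler exploits.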
Serial schedule: A schedule where the operations of each
transaction are executed consecutively without any interleaved
operations from other transactions.
In a serial schedule, the transactions are performed in serial order.
For example, if we have two transactions T1 and T2, serial order
would be T1 followed by T2, or T2 followed by T1. Thus, in serial
execution there is no interference between transactions, since only
one is executing at any given time. However, there is no guarantee
that the results of all serial executions of a given set of transactions
will be identical. In banking, for example, it matters whether interest
is calculated on an account before a large deposit is made or after.
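The banking remark can be made concrete with two toy transactions run in both serial orders. The 10% interest rate, the deposit of 1000 and the opening balance of 100 are assumed figures.

```python
from itertools import permutations


def add_interest(balance):
    """Hypothetical transaction: add 10% interest (integer arithmetic)."""
    return balance * 11 // 10


def deposit(balance):
    """Hypothetical transaction: deposit 1000."""
    return balance + 1000


def run_serial(order, balance):
    """Execute the transactions one after another, with no interleaving."""
    for txn in order:
        balance = txn(balance)
    return balance


# Final balance for each of the two possible serial orders.
results = {
    tuple(txn.__name__ for txn in order): run_serial(order, 100)
    for order in permutations([add_interest, deposit])
}
```

Interest first gives 1110 and deposit first gives 1210. Both outcomes come from serial executions, so both count as correct even though they differ.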

Non-serial schedule: A schedule where the operations from a set
of concurrent transactions are interleaved.
The problems described above resulted from the mismanagement of
concurrency, which left the database in an inconsistent state in the
first two examples and presented the user with the wrong result in
the third. Serial execution prevents such problems occurring. No
matter which serial schedule is chosen, serial execution never
leaves the database in an inconsistent state, so every serial
execution is considered correct, although different results may be
produced. The objective of serializability is to find non-serial
schedules that allow transactions to execute concurrently without
interfering with one another, and thereby produce a database state
that could be produced by a serial execution.
If a set of transactions executes concurrently, we say that the
(non-serial) schedule is correct if it produces the same results as
some serial execution. Such a schedule is called serializable. To
prevent inconsistency from transactions interfering with one
another, it is essential to guarantee serializability of concurrent
transactions. In serializability, the ordering of read and write
operations is important:
• If two transactions only read a data item, they do not conflict
and order is not important.
• If two transactions either read or write completely separate
data items, they do not conflict and order is not important.
• If one transaction writes a data item and another either reads
or writes the same data item, the order of execution is
important.
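The three rules reduce to one test on a pair of operations. The sketch below assumes each operation is a (transaction, action, item) tuple, and the function name is hypothetical.

```python
def conflicts(op1, op2):
    """True if the relative order of the two operations matters:
    different transactions, same data item, at least one write."""
    t1, action1, item1 = op1
    t2, action2, item2 = op2
    return t1 != t2 and item1 == item2 and "write" in (action1, action2)


both_read = conflicts(("T1", "read", "x"), ("T2", "read", "x"))
separate_items = conflicts(("T1", "write", "x"), ("T2", "write", "y"))
read_write = conflicts(("T1", "read", "x"), ("T2", "write", "x"))
write_write = conflicts(("T1", "write", "x"), ("T2", "write", "x"))
```

Only the last two pairs conflict. A scheduler is free to swap the order of any non-conflicting pair without changing the result.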
Recoverability and Schedules

Recoverability
Serializability identifies schedules that maintain the consistency of
the database, assuming that none of the transactions in the
schedule fails. An alternative perspective examines the
recoverability of transactions within a schedule.

If a transaction fails, the atomicity property requires that we undo
the effects of the transaction. In addition, the durability property
states that once a transaction commits, its changes cannot be
undone (without running another, compensating, transaction).
Recoverable schedule: A schedule where, for each pair of
transactions Ti and Tj, if Tj reads a data item previously written by
Ti, then the commit operation of Ti precedes the commit operation
of Tj.
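The definition can be turned into a simple check over a schedule. In this sketch a schedule is a list of (transaction, action, item) tuples with item set to None for a commit. Following the wording of the definition, every earlier writer of an item counts, and aborts are ignored. The representation and the function name are assumptions.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all transactions
    whose writes it has read have already committed."""
    writers = {}          # item -> set of transactions that have written it
    read_from = {}        # reader Tj -> set of writers Ti it depends on
    committed = set()
    for txn, action, item in schedule:
        if action == "write":
            writers.setdefault(item, set()).add(txn)
        elif action == "read":
            earlier = writers.get(item, set()) - {txn}
            read_from.setdefault(txn, set()).update(earlier)
        elif action == "commit":
            # Tj may commit only if every Ti it read from has committed.
            if not read_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True


good = [
    ("T1", "write", "x"),
    ("T2", "read", "x"),
    ("T1", "commit", None),
    ("T2", "commit", None),
]

bad = [
    ("T1", "write", "x"),
    ("T2", "read", "x"),
    ("T2", "commit", None),   # T2 commits before T1, which it read from
    ("T1", "commit", None),
]

good_is_recoverable = is_recoverable(good)
bad_is_recoverable = is_recoverable(bad)
```

In the second schedule, if T1 failed after T2's commit, T2 would have committed a value that has to be undone, which durability forbids.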

Transactions and Recovery


Transactions represent the basic unit of recovery in a database
system. It is the role of the recovery manager to guarantee two of
the four ACID properties of transactions, namely atomicity and
durability, in the presence of failures.
The recovery manager has to ensure that, on recovery from failure,
either all the effects of a given transaction are permanently
recorded in the database or none of them are.
The situation is complicated by the fact that database writing is not
an atomic (single-step) action, and it is therefore possible for a
transaction to have committed but for its effects not to have been
permanently recorded in the database, simply because they have
not yet reached the database.
Consider again the first example of this chapter, in which the salary
of a member of staff is being increased. To implement the read
operation, the DBMS carries out the following steps:
• find the address of the disk block that contains the record with
primary key value x;
• transfer the disk block into a database buffer in main memory;
• copy the salary data from the database buffer into the
variable salary.
For the write operation, the DBMS carries out the following steps:
• find the address of the disk block that contains the record with
primary key value x;
• transfer the disk block into a database buffer in main memory;
• copy the salary data from the variable salary into the
database buffer;
• write the database buffer back to disk.
The database buffers occupy an area in main memory from which
data is transferred to and from secondary storage. It is only once
the buffers have been flushed to secondary storage that any
update operations can be regarded as permanent.
This flushing of the buffers to the database can be triggered by a
specific command (for example, transaction commit) or
automatically when the buffers become full. The explicit writing of
the buffers to secondary storage is known as force-writing.
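The read and write steps and the delayed flush can be imitated with a toy buffer manager. Disk blocks are modelled as Python dictionaries, and the class, its method names and the sample record are assumptions, not a description of any real DBMS.

```python
class BufferManager:
    """Toy buffer manager: disk blocks are dicts, buffers are in-memory copies."""

    def __init__(self, disk, index):
        self.disk = disk        # block address -> {primary key: salary}
        self.index = index      # primary key -> block address
        self.buffers = {}       # block address -> buffered copy of the block
        self.dirty = set()      # blocks changed in memory but not yet on disk

    def _buffer_for(self, key):
        addr = self.index[key]                   # find the block address
        if addr not in self.buffers:             # transfer block into a buffer
            self.buffers[addr] = dict(self.disk[addr])
        return addr

    def read(self, key):
        addr = self._buffer_for(key)
        return self.buffers[addr][key]           # copy buffer -> variable

    def write(self, key, value):
        addr = self._buffer_for(key)
        self.buffers[addr][key] = value          # copy variable -> buffer
        self.dirty.add(addr)

    def flush(self):
        """Force-write every modified buffer back to disk."""
        for addr in self.dirty:
            self.disk[addr] = dict(self.buffers[addr])
        self.dirty.clear()


disk = {0: {"SG14": 18000}}
bm = BufferManager(disk, index={"SG14": 0})
salary = bm.read("SG14")
bm.write("SG14", salary + 1800)
on_disk_before_flush = disk[0]["SG14"]   # update exists only in the buffer
bm.flush()
on_disk_after_flush = disk[0]["SG14"]    # update is now on secondary storage
```

Between write and flush the new salary exists only in main memory. A failure in that window is the case the recovery manager has to handle.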
If a failure occurs between writing to the buffers and flushing the
buffers to secondary storage, the recovery manager must determine
the status of the transaction that performed the write at the time of
failure. If the transaction had issued its commit, then to ensure
durability the recovery manager would have to redo that
transaction’s updates to the database (also known as roll forward).
On the other hand, if the transaction had not committed at the time
of failure, then the recovery manager would have to undo
(rollback) any effects of that transaction on the database to
guarantee transaction atomicity. If only one transaction has to be
undone, this is referred to as partial undo.
A partial undo can be triggered by the scheduler when a transaction
is rolled back and restarted as a result of the concurrency control
protocol, as described in the previous section. A transaction can also
be aborted unilaterally, for example, by the user or by an exception
condition in the application program. When all active transactions
have to be undone, this is referred to as global undo.
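Redo and undo can be sketched over a simple log. The chapter does not describe a log at this point, so the log format below (write records carrying a before-image and an after-image) is an assumption, as are the function name and the sample values.

```python
def recover(disk, log):
    """Bring `disk` to a state with all committed updates and no others.

    Log records (assumed format):
      ("start", T), ("write", T, item, before, after), ("commit", T)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo (rollback): scan backwards, restoring before-images written by
    # transactions that had not committed at the time of failure.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, before, _ = rec
            disk[item] = before

    # Redo (rollforward): scan forwards, reapplying after-images written by
    # committed transactions, in case they never reached the disk.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _, after = rec
            disk[item] = after
    return disk


log = [
    ("start", "T1"),
    ("write", "T1", "x", 100, 110),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "y", 50, 0),
    # failure happens here: T2 never committed
]

# State of the disk at the time of failure: T1's update was still in the
# buffers, while T2's uncommitted update had already been flushed.
disk_at_failure = {"x": 100, "y": 0}
recovered = recover(dict(disk_at_failure), log)
```

T1's committed update is rolled forward to preserve durability and T2's uncommitted update is rolled back to preserve atomicity. Undoing only T2 corresponds to a partial undo.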
