
Week 10-1 Transactions

The document discusses transactions and concurrency control in distributed systems. It defines key concepts like atomicity, consistency, isolation, and durability (ACID) properties of transactions. Transactions group related operations so that either all operations are performed or none are. Concurrency control techniques like locking and timestamp ordering aim to achieve serial equivalence, where concurrent transactions appear to execute serially to avoid problems like lost updates or inconsistent retrievals. Locking involves transactions acquiring locks on objects before read/write operations and releasing locks after to ensure serial ordering of conflicting operations between transactions.

Uploaded by

laliaga30

ITEC801

Distributed Systems
Transactions
Coulouris Chapter 16,17

Transaction

All effects of the execution of an operation
Jim Gray and Andreas Reuter

A transaction is an indivisible unit of work. It may consist of many
operations over many distributed resources. The key is that either
everything is completed or the system is left in the state it was in before
anything was done. This relates to the concept of a contract in
programming: the system starts in a consistent state, and the transaction
must do everything needed to leave it in another consistent state. While the
transaction is in progress the system may be in an inconsistent state.

The definition of transaction comes from the book Transaction Processing
by Jim Gray and Andreas Reuter. Jim Gray was the world's foremost
authority on transaction processing. Tragically, he went out sailing one day
in 2007 and never returned.
Transaction Processing is the most extensive (and expensive) book on the
subject. Only buy it if you are working very deeply in this area.
http://www.theregister.co.uk/2007/04/30/jim_gray_tribute/

Transaction

A transaction is a series of actions, all of which must be done or none of
them.

To maintain correctness in a system, it makes sense that dependent actions
are grouped so that either all or none of them are done.
If only a subset of the actions is done, the system is left in an
inconsistent state.

Atomic Operations

An atomic operation is one that is indivisible: either all of it is done or
none of it is.
We now know that atoms are made of subatomic particles, which are
themselves made of other subatomic particles that we can separate in a
particle accelerator (atom smasher).
However, we get the word atom from very ancient philosophy.
Atma is the Sanskrit word for the indivisible soul, expressing unity
rather than duality.
Transactions are examples of indivisible operations, even though the
actions in them can be quite complex, like subatomic particles.

ACID Properties

The Acid Queen
Tina Turner playing The Acid Queen in the Who's movie Tommy.

ACID Properties
Atomicity
Consistency
Isolation
Durability

Atomicity
All or Nothing
All changes to system state are completed or none are done.

Either all steps of a transaction are done or none.
A transaction is a whole single unit of work. It cannot be divided and its
substeps cannot be done alone or in part.
In a distributed system, all changes must be made at all participants.

Consistency

A transaction is a correct transformation of state.

If the system starts in a valid state, it will be left in a valid state at
the end of the transaction. This is why atomicity is important, since the
system may be in an inconsistent state while the transaction is in progress.
A transaction is a program that correctly transforms state. This is key to
the notion of program correctness.

Isolation

Although transactions are operating concurrently, each transaction is
independent of the others.

This means that for any transaction T, it appears that each of the other
transactions occurred either before or after T.

Durability

The effects of a completed transaction cannot be undone.

Once a transaction completes, its effects are stored and survive any
failures.

pH Test?
ACID is controversial
C.J. Date: not strong enough for databases
(C.J. Date, Database Systems, Eighth Edition)
Distributed systems: too strong for most distributed systems

pH, or the ACID test: is ACID good enough?
Chris Date believes that for databases it is not strong enough, particularly
consistency, which he believes should maintain correctness at every update,
not just after a series of updates. However, multiple updates may be done
in multi-assignments.
At the other extreme, for distributed systems the ACID properties might be
too strong to be practical; most distributed applications are not databases.
ACID, however, is a good way to think of systems and a very useful
paradigm.

Operations of the Account interface


deposit (amount)
deposit amount in the account
withdraw (amount)
withdraw amount from the account
balance -> amount
the balance of the account
set_balance (amount)
set the balance of the account to amount

Operations of the Branch interface


create (name) -> account
create a new account with a given name
lookup (name) -> account
the account with the given name
branch_total -> amount
total of all the balances at the branch

A client's banking transaction


Transaction T:
a.withdraw (100)
b.deposit (100)
c.withdraw (200)
b.deposit (200)

Transfers happen by withdrawals followed by deposits
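
The all-or-nothing behaviour of transaction T can be sketched in Python. This is only a minimal illustration under stated assumptions: the `run_transaction` helper, the `InsufficientFunds` exception, the no-overdraft rule and the starting balances are all hypothetical and are not part of the Account interface above.

```python
class InsufficientFunds(Exception):
    """Raised when a withdrawal would overdraw an account (hypothetical rule)."""


def run_transaction(balances, operations):
    """Apply every (kind, account, amount) step, or none of them."""
    tentative = dict(balances)              # work on a private copy
    for kind, account, amount in operations:
        if kind == "withdraw":
            if tentative[account] < amount:
                # Abort: the shared balances have not been touched.
                raise InsufficientFunds(account)
            tentative[account] -= amount
        elif kind == "deposit":
            tentative[account] += amount
        else:
            raise ValueError(f"unknown operation: {kind}")
    balances.update(tentative)              # commit: publish all changes together


# Transaction T from the slide, run against illustrative starting balances.
balances = {"a": 100, "b": 200, "c": 300}
transaction_t = [
    ("withdraw", "a", 100),
    ("deposit", "b", 100),
    ("withdraw", "c", 200),
    ("deposit", "b", 200),
]
run_transaction(balances, transaction_t)
```

Working on a copy and publishing it in one step is just one simple way to get atomicity in a single process; a real server would also need durability and concurrency control.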


Transactions
The isolation requirement would be met by executing
transactions serially, one at a time
Not very efficient
Servers should maximise concurrency

Co-ordinators
Each transaction is created and managed by a co-ordinator
Each transaction gets an identifier (TID)
The TID may be implicitly or explicitly associated with each
operation
When the client invokes close_transaction, the co-ordinator saves
the result of the transaction

Operations in Coordinator interface


open_transaction -> transaction
starts a new transaction and delivers a unique TID trans. This
identifier will be used in the other operations in the transaction.
close_transaction (trans) -> (commit, abort)
ends a transaction: a commit return value indicates that the
transaction has committed; an abort return value indicates that it
has aborted.
abort_transaction (trans)
aborts the transaction.


Transaction life histories

Successful         Aborted by client   Aborted by server
openTransaction    openTransaction     openTransaction
operation          operation           operation
operation          operation           operation
                                       (server aborts transaction)
operation          operation           operation ERROR reported to client
closeTransaction   abortTransaction

Example
Initial balances
A $100
B $200
C $300


The lost update problem

Transaction T                    Transaction U
balance = b.balance              balance = b.balance
b.set_balance (balance*1.1)      b.set_balance (balance*1.1)
a.withdraw (balance/10)          c.withdraw (balance/10)

Interleaved execution:

Transaction T                         Transaction U
balance = b.balance            $200
                                      balance = b.balance            $200
                                      b.set_balance (balance*1.1)    $220
b.set_balance (balance*1.1)    $220
a.withdraw (balance/10)        $80
                                      c.withdraw (balance/10)        $280
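
The arithmetic behind the lost update can be replayed in a few lines of Python. This is a sketch under assumptions: the `run` helper and the schedule encoding are hypothetical, only account B is modelled, and the 10% increase is computed in whole dollars (`* 11 // 10`) to avoid floating-point rounding.

```python
def run(schedule):
    """Replay (transaction, step) pairs against account B, starting at $200."""
    b = 200
    local_balance = {}                      # each transaction's private copy
    for txn, step in schedule:
        if step == "read":                  # balance = b.balance
            local_balance[txn] = b
        elif step == "write":               # b.set_balance (balance*1.1)
            b = local_balance[txn] * 11 // 10
    return b


# Both transactions read $200 before either writes: one increase is lost.
interleaved = [("T", "read"), ("U", "read"), ("U", "write"), ("T", "write")]
# T runs to completion before U starts.
serial = [("T", "read"), ("T", "write"), ("U", "read"), ("U", "write")]

lost = run(interleaved)     # 220: U's update is overwritten by T
correct = run(serial)       # 242: both 10% increases are applied
```

Two 10% increases should leave B at $242; the interleaving leaves it at $220 because T writes a value computed from a stale read.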


The inconsistent retrievals problem

A and B initially $200

Transaction V            Transaction W
a.withdraw (100)         branch.total
b.deposit (100)

Interleaved execution:

Transaction V                  Transaction W
a.withdraw (100)       $100
                               total = a.balance            $100
                               total = total + b.balance    $300
                               total = total + c.balance
b.deposit (100)        $300

Serial Equivalence

An interleaving of the operations of transactions in which the combined
effect is the same as if the transactions had been performed one at a time
in some order is a serially equivalent interleaving.

See the next slide for an example.

A serially equivalent interleaving of T and U

Transaction T                    Transaction U
balance = b.balance              balance = b.balance
b.set_balance (balance*1.1)      b.set_balance (balance*1.1)
a.withdraw (balance/10)          c.withdraw (balance/10)

Interleaved execution:

Transaction T                         Transaction U
balance = b.balance            $200
b.set_balance (balance*1.1)    $220
                                      balance = b.balance            $220
                                      b.set_balance (balance*1.1)    $242
a.withdraw (balance/10)        $80
                                      c.withdraw (balance/10)        $278

Conflicting Operations
A pair of operations conflicts when their combined effect
depends upon the order in which they are executed.
For two transactions to be serially equivalent it is
necessary and sufficient that all pairs of conflicting
operations of the two transactions be executed in the
same order at all of the objects they both access


Read and Write Operation Conflict Rules

Operations of different
transactions              Conflict   Reason
read     read             No         Because the effect of a pair of read
                                     operations does not depend on the order
                                     in which they are executed
read     write            Yes        Because the effect of a read and a write
                                     operation depends on the order of their
                                     execution
write    write            Yes        Because the effect of a pair of write
                                     operations depends on the order of their
                                     execution
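
The conflict rules can be turned into a small check for serial equivalence of two transactions. This is a sketch under assumptions: operations are encoded as hypothetical `(transaction, kind, object)` tuples, each withdraw or set_balance is simplified to a single write, and the check only handles a schedule with exactly two transactions.

```python
from itertools import combinations


def conflicts(first, second):
    """Operations conflict if they belong to different transactions, touch
    the same object, and at least one of them is a write."""
    (txn_1, kind_1, obj_1), (txn_2, kind_2, obj_2) = first, second
    return txn_1 != txn_2 and obj_1 == obj_2 and "write" in (kind_1, kind_2)


def serially_equivalent(schedule):
    """True when every conflicting pair is ordered the same way, i.e. the
    same transaction goes first in all of them (two transactions only)."""
    goes_first = {
        earlier[0]
        for earlier, later in combinations(schedule, 2)   # pairs in schedule order
        if conflicts(earlier, later)
    }
    return len(goes_first) <= 1


# An interleaving in which both transactions read b before either writes it.
lost_update_schedule = [
    ("T", "read", "b"), ("U", "read", "b"),
    ("U", "write", "b"), ("T", "write", "b"),
    ("T", "write", "a"), ("U", "write", "c"),
]
# An interleaving in which T finishes with b before U touches it.
ordered_schedule = [
    ("T", "read", "b"), ("T", "write", "b"),
    ("U", "read", "b"), ("U", "write", "b"),
    ("T", "write", "a"), ("U", "write", "c"),
]
```

In the first schedule T's read precedes U's write while U's read precedes T's write, so the conflicting pairs disagree on the order; in the second, T is first in every conflicting pair.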


Concurrency Control
Serial equivalence is a criterion for concurrency control
protocols
Three main approaches
Locking
Optimistic concurrency control
Timestamp ordering


Locking
Coulouris 16.4


Locking
One way to achieve serial equivalence of transactions is
to serialize access to the objects involved
This can be done using locks
The server locks an object that is to be used by a client
transaction
If another client requests access to an already locked
object, that request is suspended until the object is
unlocked

Transactions T and U with exclusive locks

Transaction T                    Transaction U
bal = b.balance                  bal = b.balance
b.set_balance (bal*1.1)          b.set_balance (bal*1.1)
a.withdraw (bal/10)              c.withdraw (bal/10)

Transaction T                            Transaction U
Operations                Locks          Operations                Locks
open_transaction
bal = b.balance           lock B
b.set_balance (bal*1.1)                  open_transaction
a.withdraw (bal/10)       lock A         bal = b.balance           waits for T's
                                                                   lock on B
close_transaction         unlock A, B
                                                                   lock B
                                         b.set_balance (bal*1.1)
                                         c.withdraw (bal/10)       lock C
                                         close_transaction         unlock B, C

Locks
Serial equivalence requires that all of a transaction's
accesses to a particular object be serialized with respect to
accesses by other transactions
All pairs of conflicting operations of two transactions must be executed in
the same order
A transaction is not allowed any new locks after it has
released a lock

Two-Phase Locking
A growing phase during which locks are acquired
A shrinking phase during which locks are released


Strict Two-Phase Locking
Other transactions that need to read or write an object
must be delayed until the transactions that wrote the
same object have committed or aborted
Locks are held until all object updates are written to
permanent storage

Granularity
Servers have many objects
Want to lock the minimum number of objects for each
transaction
You wouldn't want to lock all accounts to do an
operation on one

Many Readers/Single Writer
No problem with having many transactions reading the
same object
Read locks/write locks
Can promote a read lock to a write lock if no other transaction holds a lock
(including read locks) on the object

Lock compatibility

For one object                 Lock requested
                               read      write
Lock already set    none       OK        OK
                    read       OK        wait
                    write      wait      wait

Summary of locks in strict two-phase locking

1. When an operation accesses an object within a transaction:
(a) If the object is not already locked then
    it is locked and the operation proceeds.
(b) If the object has a conflicting lock set by another transaction then
    the transaction must wait until it is unlocked.
(c) If the object has a non-conflicting lock set by another transaction then
    the lock is shared and the operation proceeds.
(d) If the object has already been locked in the same transaction then
    the lock will be promoted if necessary and the operation proceeds.
    (Where promotion is prevented by a conflicting lock, rule (b) is used.)
2. When a transaction is committed or aborted
    the server unlocks all objects it locked for the transaction
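
These rules can be sketched as a small lock table in Python. The following is an illustration only, under stated assumptions: the `LockManager` class and its method names are hypothetical, and instead of suspending a thread it simply returns "wait" when a conflicting lock is held.

```python
class LockManager:
    """Non-blocking sketch: acquire() reports "wait" rather than suspending."""

    def __init__(self):
        self._locks = {}    # object id -> (lock type, set of holder TIDs)

    def acquire(self, tid, obj, mode):
        """mode is "read" or "write"; returns "granted" or "wait"."""
        if obj not in self._locks:
            # Object not yet locked: lock it and proceed.
            self._locks[obj] = (mode, {tid})
            return "granted"
        held, holders = self._locks[obj]
        if tid in holders:
            # Already locked by this transaction: promote if necessary.
            if mode == "write" and held == "read":
                if holders - {tid}:
                    return "wait"           # other readers block the promotion
                self._locks[obj] = ("write", {tid})
            return "granted"
        if held == "read" and mode == "read":
            # Non-conflicting lock held by another transaction: share it.
            holders.add(tid)
            return "granted"
        # Conflicting lock held by another transaction.
        return "wait"

    def release_all(self, tid):
        """On commit or abort, unlock every object locked for tid."""
        for obj in list(self._locks):
            _, holders = self._locks[obj]
            holders.discard(tid)
            if not holders:
                del self._locks[obj]
```

Releasing locks only in `release_all` is what makes the sketch strict: no lock is given up before the transaction commits or aborts.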

Lock Implementation
Handled by a separate object in the server called the lock
manager
Will hold a set of locks, each associated with a particular
object
Identifier of the locked object
Transaction identifiers of the transactions currently holding the lock
Lock type

Methods of lock are synchronised so that the threads
attempting to acquire or release a lock will not interfere
with one another

Problems with Locks


Deadlocks Coulouris 16.4.1


Problems with Locks
Heavy handed
Pessimistic: takes care of worst-case scenarios where many processes
contend for the same resources
Thus quite a bit of overhead
But what if we have relatively few conflicts?

Segue to next section!

Optimistic Locking
Coulouris 16.5


Optimistic Concurrency Control
For the case of relatively few conflicts, we can be more
optimistic: problems only occur infrequently
Lower overhead
Avoid problems of deadlocks and cascading aborts

Optimistic Concurrency Control
Basic idea is to start the transaction without locking
Acquire objects: note their cycle number
Process
Validate objects
if the cycle number of any object has changed: abort
this is a read-write conflict
if the objects validate: commit changes
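
The acquire, process and validate steps can be sketched in Python as follows. This is a minimal single-process illustration under assumptions: the `VersionedStore` class and its methods are hypothetical, the cycle number is modelled as a counter incremented on every committed write, and the 10% increase is computed in whole dollars.

```python
class VersionedStore:
    """Each object carries a cycle number that changes on every commit."""

    def __init__(self, values):
        self._data = {key: (value, 0) for key, value in values.items()}

    def read(self, key):
        """Acquire: return (value, cycle number) without taking a lock."""
        return self._data[key]

    def commit(self, cycles_seen, writes):
        """Validate the cycle numbers noted at read time, then apply writes.

        Returns True on commit, False when validation fails (abort).
        """
        for key, seen in cycles_seen.items():
            if self._data[key][1] != seen:
                return False                # object changed since it was read
        for key, value in writes.items():
            cycle = self._data[key][1]
            self._data[key] = (value, cycle + 1)
        return True


store = VersionedStore({"b": 200})
t_value, t_cycle = store.read("b")          # transaction T reads b
u_value, u_cycle = store.read("b")          # transaction U reads b
u_ok = store.commit({"b": u_cycle}, {"b": u_value * 11 // 10})   # U commits first
t_ok = store.commit({"b": t_cycle}, {"b": t_value * 11 // 10})   # T fails validation
```

Here T is aborted instead of silently overwriting U's update; T would then typically be restarted from its first read.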


Timestamps
Coulouris 16.6


Timestamp Ordering
Pessimistic like locking
However, timestamps will abort the transaction on access
Locks will make the process wait

With timestamps, if a process tries to access an object that is already being
accessed, its transaction is aborted. Locks at least will make a process wait.
Thus timestamp ordering is even more pessimistic.

Transactions

Transactions

A transaction is a correct transformation of state

We have already introduced the notion of a transaction as an indivisible
operation applied to a system.
Originally, transactions were simple, flat transactions.

Flat Transactions

Flat transactions just begin, do their work, commit, and finish.

Non-Flat Transactions

There are other kinds of applications that don't fit flat transactions.
This is somewhat of a research topic, but there are some examples.

Trip Planning

A trip can comprise several legs.
To book the entire trip, each leg must be booked.
However, each leg must be confirmed before committing to the whole trip.
If one or more legs can't be confirmed, the whole trip must be cancelled, or
maybe tried for another day.

Bulk Updates

UGH!

Bulk updates are somewhat the reverse situation.
All updates together could be taken as a single transaction.
However, if something goes wrong, we don't want to
back everything out and start again.
Rather, we must start where we left off.

Spheres of Control

Spheres of Control
Research that formed the basis
of transactions in the 1970s
A Spheres of Control system
uses a hierarchy of ADTs

Spheres of Control are the theoretical basis for transactions.
An Abstract Data Type is in effect a black box.
An ADT's internal workings are invisible.
Only outputs from the ADT are observed.

Spheres of Control
An output from an ADT is a commitment
Commits that output is correct

An output from an ADT is a commitment that the output is permanent
and correct.
Thus a commit in a transaction allows its effects to be seen by the outside
world (other processes and transactions).
Prior to commitment, the results are not available.

Spheres of Control
Process Control: Constrains
dependencies on other processes
Process Atomicity: Amount of
processing that has a single identity
Process Commitment: Changes
made by a process are internal
until changes committed

Process control ensures that the information required by an atomic process
is not modified by others.
Process atomicity is the single unit of work (all its actions) that is
considered.
Process commitment: While a process is underway, no effects of that
process are seen by the others until the process is complete.

Spheres of Control

The important point about spheres of control is that they can be nested
and chained.
When an inner sphere completes, its results are released to the next outer
sphere, but not beyond.
Thus when an inner sphere commits, those results are only available to the
outer world if the outer sphere commits.
This is important to understand transactions beyond flat transactions.

Distributed Transactions
Coulouris Chapter 17

Distributed Transactions
The previous section assumed that all the objects
that the transactions were dealing with were stored on one
server
In the case of locking, that one server managed all the
locks
This is not always true in practice and we need to
consider distributed transactions

Savepoints

The first enhancement we consider to flat transactions is savepoints.
An individual transaction might not need to be completely rolled back.
If we fail to acquire a lock, or for some other reason, we can roll the
transaction back to the previous savepoint.
This is an important optimisation for long-lived transactions, particularly
those in a distributed environment.
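
Rolling back to a savepoint rather than to the start can be sketched in Python. This is an in-memory illustration only: the `SavepointTransaction` class, its method names and the trip-leg keys are hypothetical, and nothing here is durable.

```python
class SavepointTransaction:
    """Keeps snapshots of tentative state so work can be partially undone."""

    def __init__(self, state):
        self._state = dict(state)           # tentative, uncommitted state
        self._savepoints = []               # snapshots, oldest first

    def write(self, key, value):
        self._state[key] = value

    def savepoint(self):
        """Record the current tentative state and return its identifier."""
        self._savepoints.append(dict(self._state))
        return len(self._savepoints) - 1

    def rollback_to(self, savepoint_id):
        """Undo everything done after the savepoint; keep earlier work."""
        self._state = dict(self._savepoints[savepoint_id])
        del self._savepoints[savepoint_id + 1:]

    def commit(self):
        return dict(self._state)


txn = SavepointTransaction({})
txn.write("leg1", "booked")
after_leg1 = txn.savepoint()
txn.write("leg2", "booked")                 # suppose this step later fails
txn.rollback_to(after_leg1)                 # leg1 is kept, leg2 is undone
txn.write("leg2", "rebooked")
result = txn.commit()
```

Only the work after the savepoint is repeated, which is the saving for long-lived transactions.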


Chained Transactions

Chained transactions are a variation on savepoints.
When an inner transaction is completed, its work is committed, whereas
savepoints are volatile.
This means previous transactions in the chain cannot be rolled back by an
outer transaction.
This is useful for bulk updates.

Distributed transactions

[Figure: (a) Flat transaction and (b) Nested transactions, each showing a
client's transaction T spread over several servers]

Nested banking transaction

T := open_transaction
open_sub_transaction
a.withdraw (10)
open_sub_transaction
b.withdraw (20)
open_sub_transaction
c.deposit (10)
open_sub_transaction
d.deposit (20)
close_transaction

[Figure: the client's transaction T with its subtransactions running at
servers X, Y and Z]

A generalization of savepoints is nested transactions: rather than being flat
with a single controlling transaction, these can be hierarchically organized.
In nested transactions, when inner transactions commit, their results are
only known to the outer transaction.
If the outer transaction rolls back, the effects of the inner transactions are
also rolled back, even if they have already committed.
If an inner transaction fails, the outer transaction can retry it.
This is the reason for nested transactions: we don't have to redo the
previous transactions before a failure.
But note that what we do depends on the application.
Consider again the travel application.

Multi-Level Transactions

A further generalization of nested transactions.
In this case an earlier nested transaction cannot be rolled back, but if its
effects need to be undone, a compensating transaction is applied.
That is, compensating transactions undo the work of previously committed
transactions.
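
Compensating transactions can be sketched in Python as follows. This is an illustration under assumptions: the `run_with_compensation` helper, the `book`, `cancel` and `unavailable` functions and the leg names are all hypothetical. Each step takes effect immediately, so undoing it means running a new action, not restoring a snapshot.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order.

    Each action is treated as committed as soon as it returns. If a later
    action fails, the compensations of the completed steps run in reverse.
    """
    compensations = []
    try:
        for action, compensate in steps:
            action()
            compensations.append(compensate)
    except Exception:
        for compensate in reversed(compensations):
            compensate()
        return False
    return True


bookings = []


def book(leg):
    return lambda: bookings.append(leg)


def cancel(leg):
    return lambda: bookings.remove(leg)


def unavailable():
    raise RuntimeError("leg cannot be confirmed")


# The second leg fails, so the first booking is cancelled by its compensation.
trip_ok = run_with_compensation([
    (book("leg 1"), cancel("leg 1")),
    (unavailable, lambda: None),
])
```

Unlike a rollback, the cancelled booking was visible to others while it existed; whether that is acceptable depends on the application.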


Distributed Transactions
Each server applies concurrency control as per what
we've already seen
But distributed transactions must also be serialized
globally

Co-ordinator
Atomicity requires either all servers commit results of the
transaction or they all abort
One server takes on the role of co-ordinator to achieve
this


Two-Phase Commit
Coulouris 17.3
Tanenbaum 8.5

When two people get married, the celebrant first asks the crowd if there is
anyone who objects to these two getting married. The celebrant then asks the
bride if she takes this man, and asks the groom if he takes this woman.
If everyone agrees, the celebrant commits the marriage and the two are
now married.
That is exactly what two-phase commit does. The master controller is the
celebrant, and the other participants are the resources that have
participated in the transaction.

Distributed banking transaction

T := open_transaction
a.withdraw (4)
c.deposit (4)
b.withdraw (3)
d.deposit (3)
close_transaction

[Figure: the client sends openTransaction and closeTransaction to the
coordinator; the participants at BranchX, BranchY and BranchZ each join the
transaction when one of their accounts is used, e.g. b.withdraw (T, 3)]

Note: the coordinator is in one of the servers, e.g. BranchX

Two-Phase Commit Protocol
A communication sequence to enable distributed atomic
transactions
Requires a co-ordinator
Does not have to be the same co-ordinator for all
transactions

Operations for two-phase commit protocol


can_commit? (trans) -> (Yes, No)
Call from coordinator to participant to ask whether it can commit a transaction.
Participant replies with its vote.
commit (trans)
Call from coordinator to participant to tell participant to commit its part of a
transaction.
abort (trans)
Call from coordinator to participant to tell participant to abort its part of a
transaction.
have_committed (trans, participant)
Call from participant to coordinator to confirm that it has committed the
transaction.
get_decision (trans) -> (Yes, No)
Call from participant to coordinator to ask for the decision on a transaction after
it has voted Yes but has still had no reply after some delay. Used to recover from
server crash or delayed messages.

The two-phase commit protocol

Phase 1 (voting phase):
1. The coordinator sends a can_commit? request to each of the participants in
   the transaction.
2. When a participant receives a can_commit? request it replies with its vote
   (Yes or No) to the coordinator. Before voting Yes, it prepares to commit by
   saving objects in permanent storage. If the vote is No the participant
   aborts immediately.
Phase 2 (completion according to outcome of vote):
3. The coordinator collects the votes (including its own).
   (a) If there are no failures and all the votes are Yes the coordinator
       decides to commit the transaction and sends a commit request to each
       of the participants.
   (b) Otherwise the coordinator decides to abort the transaction and sends
       abort requests to all participants that voted Yes.
4. Participants that voted Yes are waiting for a commit or abort request from
   the coordinator. When a participant receives one of these messages it acts
   accordingly and, in the case of commit, makes a have_committed call as
   confirmation to the coordinator.
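
The failure-free path of the protocol can be sketched in Python. This is a single-process illustration under assumptions: the `Participant` class and the `two_phase_commit` function are hypothetical, method calls stand in for messages, the coordinator's own vote is taken to be Yes, and timeouts, crashes, permanent storage and the have_committed confirmation are not modelled.

```python
class Participant:
    """A server holding part of the transaction's tentative state."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self._can_commit = can_commit
        self.state = "active"

    def can_commit(self, trans):
        """Phase 1: vote. A Yes vote would first save objects durably."""
        self.state = "prepared" if self._can_commit else "aborted"
        return self._can_commit

    def commit(self, trans):
        self.state = "committed"

    def abort(self, trans):
        self.state = "aborted"


def two_phase_commit(trans, participants):
    """Coordinator side of two-phase commit, assuming no failures."""
    # Phase 1 (voting): ask every participant for its vote.
    votes = {p: p.can_commit(trans) for p in participants}
    # Phase 2 (completion): commit only if every vote is Yes.
    if all(votes.values()):
        for p in participants:
            p.commit(trans)
        return "commit"
    for p, voted_yes in votes.items():
        if voted_yes:                       # No-voters have already aborted
            p.abort(trans)
    return "abort"


branches = [Participant("BranchX"), Participant("BranchY"), Participant("BranchZ")]
outcome = two_phase_commit("T", branches)
```

A single No vote is enough to abort everywhere, which is how atomicity is kept across servers.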

Communication in two-phase commit protocol

Coordinator                                        Participant
step  status                  message              step  status
1     prepared to commit      can_commit? ->
      (waiting for votes)
                              <- Yes               2     prepared to commit
                                                         (uncertain)
3     committed               commit ->
                              <- have_committed    4     committed
      done

Two Phase Commit


Two-Phase Commit
Finite State Machines (FSM) in 2PC
[Figure: FSM for coordinator; FSM for participant]

There are a total of three states in which either a coordinator or
participant is blocked waiting for an incoming message. First, the
participant may be waiting in its INIT state for a vote request message
from the coordinator. If that message is not received after some time,
the participant will simply decide to locally abort the transaction and
thus send a vote abort message to the coordinator.

Two-Phase Commit
Actions taken by a participant P when residing in state
READY and having contacted another participant Q.

Likewise, the coordinator can be blocked in state WAIT, waiting for the
votes of each participant. If not all votes have been collected after a
certain period of time, the coordinator should vote for an abort as well
and subsequently send global abort to all participants. Finally, a
participant can be blocked in state READY, waiting for the global vote
as sent by the coordinator. If that message is not received within a

A better solution is to let a participant P contact another participant Q
to see if it can decide from Q's state what it should do.

If Q is in state COMMIT, then this is possible only if the coordinator sent
a global commit message to Q just before crashing. Apparently this
message has not yet been sent to P. Consequently, P may also decide
to locally commit. Likewise, if Q is in state ABORT, P can safely abort as
well.

Two-Phase Commit

...
Outline of the steps taken by the coordinator in a two-phase commit protocol.

Now suppose Q is in state INIT. This situation can occur when the
coordinator has sent a vote request to all participants but this message
has not reached Q. In other words, the coordinator has crashed while
multicasting the vote request. In this case, it is safe to abort the
transaction: both P and Q can make a transition to ABORT.
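
The cases described in these notes for a participant P that is blocked in READY and has asked another participant Q for its state can be written as a lookup in Python. This is a sketch only: the `decide_from_peer` function is hypothetical, and the case where Q is itself in READY is not covered by the notes here, so the sketch returns `None` (no decision) for it as an assumption.

```python
def decide_from_peer(q_state):
    """Action for participant P, blocked in READY, given Q's state.

    Returns "commit", "abort", or None when Q's state does not settle it.
    """
    actions = {
        "COMMIT": "commit",     # the coordinator must have sent global commit
        "ABORT": "abort",       # the transaction has been aborted
        "INIT": "abort",        # Q never saw the vote request: safe to abort
    }
    return actions.get(q_state)


decision = decide_from_peer("INIT")
```

The point of contacting Q is that P can sometimes finish without waiting for the coordinator to recover.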
Two-Phase Commit
Outline of the steps taken by the coordinator in a two-phase commit protocol.

The coordinator starts by sending a multicast (vote request) to all
participants in order to collect their votes. It subsequently records that
it is entering the WAIT state, after which it waits for incoming votes
from participants. If not all votes have been collected but no more
votes are received within a given time interval prescribed in advance,
the coordinator assumes that one or more participants have failed.
Consequently it must abort the transaction and multicast a global
abort message to all participants. If no failures occur, the coordinator
will eventually have collected all votes. If all participants as well as the
coordinator vote to commit, global commit is first logged and
subsequently sent to all processes. Otherwise, the coordinator multicasts a
global abort.

Two-Phase Commit
(a) The steps taken by a participant process in 2PC.

First the process waits for a vote request from the coordinator. Note
that this waiting can be done by a separate thread running in the
process's address space. If no message comes in, the transaction is
simply aborted. Apparently, the coordinator has failed.

Two-Phase Commit
(b) The steps for handling incoming decision requests.

After receiving the vote request, the participant may decide to vote for
committing the transaction, for which it records its decision in the local
log and then informs the coordinator by sending a vote commit
message. The participant must then wait for the global decision.
Assuming this decision comes on time, it is simply written to the local
log, after which it can be carried out. However, if it times out, it
executes a termination protocol by first multicasting a decision request
message to all other processes, after which it blocks waiting for a