Oracle Distributed Transactions: Written and Presented by Joel Goodman
• Email: [email protected]
• Blog: dbatrain.wordpress.com
• EMEA Curriculum and Certification Specialist
• Global DBA Certification Exam Development Team Leader
• Global DBA Curriculum Quality Review Team
• OCP DBA – 7.3 to 12c
• OCM DBA – 9i to 11g
• Member of Oak Table
• Oracle University 1997 to Present
• Oracle Support 1994 to 1996
• IT Training Mainframe Technology 1986 to 1993
• Software Development and Consultancy 1983 to 1986
• Application Development and Support 1976 to 1982
Mini-Lesson Objectives
• Connectivity Overview
• Distributed Database Characteristics
• Remote and Distributed Queries
• Remote and Distributed Transactions
• Distributed Transaction Branches and RAC
• Two-Phase In-Doubt Transactions
• The RECO Process In-Doubt Resolution
• Manual In-Doubt Resolution
• Data Dictionary Entries and Cleanup
• 2 Phase Commit Crash Recovery
• Distributed Systems Recovery
Distributed Database Architecture
Oracle Distributed Databases
• Local autonomy
• No reliance on a central site
• Continuous operation
• Location independence
• Fragmentation independence
• Replication independence
Oracle Distributed Database
Server-Server Connectivity
Remote and Distributed Systems Concepts
• Remote Queries
• Distributed Queries
• Remote Transactions
• Distributed Transactions
• Transaction Branches
• Single Phase Commit
• Two-Phase Commit
• Commit Point Site
• Global System Change Numbers
• In-Doubt Transaction Resolution
Remote Queries
Distributed Queries
Remote Transactions
Distributed Transactions
Distributed Transaction Locking
Distributed Transaction Branches and RAC
Single Phase Commit
• Used for:
– Single Database Commits and Rollbacks
– Remote Transactions
– Tightly coupled transactions with XA
• The local instance sends a commit request to a
single remote instance
• The remote instance either commits or rolls back
• Result of the request at the remote instance is
returned to the local instance
• DB Links use loosely coupled transactions and
require two-phase commit
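The distinction between the cases listed above can be sketched as a small classifier. This is only a toy model, not Oracle's internal logic: the Statement type, the node names and the labels returned are hypothetical, and it assumes the usual definitions (a remote transaction changes data on exactly one remote node, a distributed transaction changes data on two or more nodes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    node: str       # database the statement runs against (hypothetical names)
    modifies: bool  # True for DML, False for a query


def classify(statements, local_node="LOCAL"):
    """Label a transaction by the set of nodes on which it changes data."""
    updated = {s.node for s in statements if s.modifies}
    if not updated:
        return "read-only"
    if len(updated) == 1:
        return "local" if updated == {local_node} else "remote"
    return "distributed"


def commit_protocol(statements, local_node="LOCAL"):
    """Assumption: only transactions updating two or more nodes need 2PC."""
    kind = classify(statements, local_node)
    return "two-phase" if kind == "distributed" else "single-phase"


if __name__ == "__main__":
    txn = [Statement("LOCAL", True), Statement("HQ", True), Statement("HQ", False)]
    print(classify(txn), commit_protocol(txn))
```

A transaction that only queries a remote node while updating the local one is labelled "local" here, because the model looks only at where data is changed.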
Two-Phase Commit
• Each database instance involved in a two-phase
commit operation performs one or more roles
based on its position in the Session Tree:
– Client - Issues transactional requests over DB Links
to one or more Servers
– Database Server - Receives transactional requests
over a DB Link
– Global Coordinator – Originating node for the
distributed transaction. Sends prepare, commit or
rollback statements to other adjacent nodes
– Local Coordinator – References nodes which are
non-adjacent to the Global Coordinator and
coordinates those nodes to prepare, commit or
rollback.
– Commit Point Site – Node which is first to commit
as part of the two-phase commit process
The Commit Point Site
Distributed Transaction Roles
[Figure: session tree. Nodes shown: Client or Middle Tier; Local Database and Global Coordinator; Remote Database A; Remote Database B and Local Coordinator; Remote Database C and Commit Point Site; Remote Database D; Remote Database E and Local Commit Point Site. Each database is labelled with a CPS value (1, 2, 5, 10, 10 and 20 appear in the figure).]
CPS=COMMIT_POINT_STRENGTH
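How a commit point site might be picked from a session tree like this one can be sketched as follows. This is a simplified model under stated assumptions, not Oracle's implementation: each coordinator compares its own COMMIT_POINT_STRENGTH with that of its directly adjacent nodes, the highest value wins, and if the winner is itself a coordinator the choice is repeated one level down. The Node type, the tie-breaking (first candidate wins) and the strengths used in the demo are hypothetical and are not read from the figure.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    strength: int                      # COMMIT_POINT_STRENGTH of this database
    children: list = field(default_factory=list)  # directly adjacent servers


def commit_point_site(coordinator):
    """Pick the commit point site below (and including) a coordinator."""
    candidates = [coordinator] + coordinator.children
    # max() keeps the first candidate on ties; that tie-break is an assumption.
    best = max(candidates, key=lambda n: n.strength)
    if best is coordinator or not best.children:
        return best
    # The winner is a Local Coordinator: repeat the choice among its nodes.
    return commit_point_site(best)


if __name__ == "__main__":
    tree = Node("LOCAL", 10, [
        Node("A", 5),
        Node("B", 20, [Node("D", 2), Node("E", 30)]),
        Node("C", 1),
    ])
    print(commit_point_site(tree).name)
```

With these made-up strengths the choice first lands on B, which is a Local Coordinator, and then moves down to E.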
Global System Change Numbers
Two-Phase Commit Phases
The Prepare Phase
• Steps in the Prepare phase:
• The Global Coordinator sends a prepare request to each instance except
the Commit Point Site, asking it to guarantee that a later commit or
rollback request will be honoured even if an instance failure occurs
in the meantime
• When asked to PREPARE a site does the following:
– Replies with “READ ONLY” if no updates were done on the node
– Locks all tables in the transaction for both read and write
– Flushes Redo to disk to guarantee ability to commit or rollback later
when asked even if the instance fails after this point
– Exchanges SCNs with all involved Instances to determine the Global
Commit SCN for the transaction
– If a site is a Local Coordinator then it recursively issues PREPARE
commands to its own adjacent nodes
• The Commit Point Site is not asked to prepare. If all sites prepare
then it will be asked to commit. If at least one site rejects the
prepare, then it will be asked to rollback
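The outcome of the prepare phase described above can be modelled in a few lines. This is a minimal sketch, assuming each non-commit-point site answers with one of three votes and that the global commit SCN is taken as the highest SCN reported by any involved instance. The Vote enum, the function name and the SCN values are hypothetical.

```python
from enum import Enum


class Vote(Enum):
    PREPARED = "prepared"
    READ_ONLY = "read only"
    ABORT = "abort"


def prepare_phase(votes, scns):
    """Decide the outcome once every non-commit-point site has replied.

    votes: node name -> Vote, for every site that was asked to prepare
    scns:  node name -> SCN reported during the exchange (all instances)
    Returns (decision, global_commit_scn, nodes_that_need_a_commit).
    """
    if any(v is Vote.ABORT for v in votes.values()):
        # At least one site rejected the prepare: everything rolls back.
        return "rollback", None, []
    # READ ONLY sites take no further part; only PREPARED sites are committed.
    to_commit = sorted(n for n, v in votes.items() if v is Vote.PREPARED)
    # Assumption: the commit SCN is the maximum SCN seen across all instances.
    return "commit", max(scns.values()), to_commit


if __name__ == "__main__":
    votes = {"A": Vote.PREPARED, "B": Vote.READ_ONLY}
    scns = {"A": 100, "B": 140, "CPS": 120}
    print(prepare_phase(votes, scns))
```

The commit point site appears in scns but not in votes, because it is not asked to prepare.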
The Commit Phase
• Steps in the Commit phase when committing:
• The Global Coordinator sends a commit request to the Commit
Point Site
• If the Commit Point Site is also a Local Coordinator then it must
recursively prepare all but one of its own adjacent nodes and then
issue a commit to its local Commit Point Site
• The Commit Point Site commits, releases locks and forgets about
the transaction removing it from the dictionary
• The Global Coordinator sends commit requests to adjacent
prepared nodes
• If any adjacent nodes are Local Coordinators then commits are
propagated to their adjacent nodes
• These nodes then commit, release locks and signal that they have
committed
• Note that “READ ONLY” sites are not asked to commit
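The ordering of commit requests listed above can be traced with a toy session tree. This sketch makes several simplifying assumptions that are labelled in the comments: the commit point site is a leaf directly adjacent to the Global Coordinator, and a site that answered READ ONLY has no prepared nodes beneath it. The Site type and all names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    vote: str = "prepared"             # "prepared" or "read only"
    children: list = field(default_factory=list)


def commit_phase(root, commit_point_name):
    """Return the (sender, receiver) commit requests in the order sent."""
    # The commit point site is asked to commit first.
    log = [(root.name, commit_point_name)]

    def propagate(coordinator):
        for child in coordinator.children:
            # READ ONLY sites are not asked to commit; the commit point
            # site has already committed. Assumption: a READ ONLY site
            # has no prepared nodes of its own.
            if child.name == commit_point_name or child.vote == "read only":
                continue
            log.append((coordinator.name, child.name))
            propagate(child)           # a Local Coordinator passes it on

    propagate(root)
    return log


if __name__ == "__main__":
    tree = Site("GC", children=[
        Site("A"),
        Site("B", children=[Site("D", vote="read only"), Site("E")]),
        Site("C"),
    ])
    for sender, receiver in commit_phase(tree, "C"):
        print(sender, "->", receiver)
```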
The Commit Phase
• Steps in the Commit phase when Rolling Back:
• The Global Coordinator sends a commit request to the
Commit Point Site
• If the Commit Point Site is also a Local Coordinator then it
must recursively prepare all but one of its own adjacent
nodes and then issue a commit to its local Commit Point
Site
• If any site is unable to commit for any reason, then it rolls
back, releases locks and propagates the rollback to the
Local Coordinator or Global Coordinator as appropriate
• Any Local Coordinator sends rollback requests to its adjacent
subordinate prepared nodes and back to the Global
Coordinator
• The Global Coordinator then rolls back and sends rollback
requests to all the adjacent nodes
The Forget Phase
• Steps in the Forget Phase:
• Any site that is the Global Coordinator updates the data
dictionary to remove reference to the transaction when all
have finished committing or rolling back
• Any site that is not a coordinator may forget the transaction
when it has either committed or rolled back
In-Doubt Transaction Failure
SQL> Commit;
ORA-02054: transaction 1.44.99 in-doubt
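When such errors are collected from application logs, the local transaction ID is the piece needed later to look the transaction up. A minimal sketch of extracting it, assuming the message text has the shape shown above (with or without a colon after the error code); the function name is hypothetical.

```python
import re

# Matches "ORA-02054", an optional colon, then the x.y.z transaction ID.
_IN_DOUBT = re.compile(
    r"ORA-02054:?\s+transaction\s+(\d+)\.(\d+)\.(\d+)\s+in-doubt"
)


def in_doubt_transaction_id(message):
    """Return the local transaction ID as 'x.y.z', or None if not found."""
    match = _IN_DOUBT.search(message)
    if match is None:
        return None
    return ".".join(match.groups())


if __name__ == "__main__":
    print(in_doubt_transaction_id("ORA-02054: transaction 1.44.99 in-doubt"))
```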
Resolving In-Doubt Transactions
RECO Process In-Doubt Resolution
Manual In-Doubt Resolution
• Manual In-Doubt resolution should only be done if:
– The in-doubt transaction has locks on critical data
or undo segments
– The cause of the machine, network, or software
failure cannot be resolved quickly
– One of the participating databases is lost forever
• Requires Knowledge of the Data and Current State
at each remaining database
• Uses Views:
– DBA_2PC_PENDING based on pending_trans$
– DBA_2PC_NEIGHBORS based on ps1$ and pss1$
– Also check V$GLOBAL_TRANSACTION
• May require manual removal from Data Dictionary
• Contact Oracle Support if you are uncertain
DBA_2PC_PENDING Data Dictionary View
• local_tran_id    Local transaction ID (format x.y.z)
• global_tran_id   Global transaction ID (format global_name.hhhhhhhh.local_txn_id)
• state            Collecting, Prepared, Committed, Forced Commit, Forced Abort
• mixed            YES means Mixed Outcome on different nodes
• advice           C (commit), R (rollback), or null for no advice
• tran_comment     Contains comments from application COMMIT command
• fail_time        Timestamp of when the row was inserted
• force_time       Timestamp of manual force decision or else null
• retry_time       Timestamp of when RECO last tried to recover the transaction
• os_user          Operating system account for the user
• os_terminal      Operating system terminal ID for the user
• host             Name of the host machine
• db_user          Oracle username at the commit point site
• commit#          Global commit number for committed transactions
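When comparing rows across nodes it can help to split global_tran_id into its parts. The sketch below assumes the format given above (global_name.hhhhhhhh.local_txn_id) and that the global database name may itself contain dots, so the value is split from the right. The function name and the sample value are hypothetical.

```python
import string


def split_global_tran_id(global_tran_id):
    """Split 'global_name.hhhhhhhh.x.y.z' into its three parts.

    Returns (global_name, hex_id, local_txn_id). The global name may
    contain dots, so the fixed-size fields are taken from the right.
    """
    parts = global_tran_id.split(".")
    if len(parts) < 5:
        raise ValueError(f"unexpected global_tran_id: {global_tran_id!r}")
    local_txn_id = ".".join(parts[-3:])   # x.y.z
    hex_id = parts[-4]                    # hhhhhhhh
    global_name = ".".join(parts[:-4])
    if len(hex_id) != 8 or any(c not in string.hexdigits for c in hex_id):
        raise ValueError(f"expected 8 hex digits, got {hex_id!r}")
    return global_name, hex_id, local_txn_id


if __name__ == "__main__":
    print(split_global_tran_id("SALES.EXAMPLE.COM.55d1c563.1.44.99"))
```

The same global_tran_id appears on every node taking part in the transaction, which is what makes it usable as the matching key between databases.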
DBA_2PC_PENDING STATES
• The STATE column of DBA_2PC_PENDING may
contain:
– Collecting: Applies to coordinators. The node is collecting
information from adjacent database servers before deciding
whether it can prepare.
– Prepared: The node has prepared and may or may not have
acknowledged this to its local coordinator but no COMMIT
has been received. The node continues holding any local
resource locks necessary for the transaction to commit
– Committed: The node has committed the transaction, but
the transaction is still pending at one or more sites
– Forced Commit: A pending transaction was manually force
committed at the local node by the DBA
– Forced Abort: A pending transaction was manually force
Rolled Back at the local node by the DBA
DBA_2PC_NEIGHBORS Data Dictionary View
V$GLOBAL_TRANSACTION View
Manual In-Doubt Resolution Procedure
SQL> COMMIT;
ORA-02054: transaction 1.44.99 in-doubt
Manual In-Doubt Resolution Procedure
• If STATE = COMMITTED then the local Database has committed
– Examine the global_tran_id and commit#
– Compare with the same on other nodes when accessible
– If RECO has already resolved the transaction, no matching rows will exist on those nodes
– If matching PREPARED rows are found, they may be committed
• If STATE = PREPARED then the local Database has not committed
– Examine the global_tran_id and commit#
– Examine DBA_2PC_NEIGHBORS for other Databases
– Compare with the same on other nodes when accessible
– If no other nodes are found prepared it is safe to COMMIT
– If other nodes are prepared they may all be committed or rolled back
Manual In-Doubt Resolution Advice
• Developers may provide advice to DBAs
• This appears in DBA_2PC_PENDING.ADVICE
• May contain “R”, “C” or nulls
• Application may set this with ALTER SESSION ADVISE COMMIT,
ALTER SESSION ADVISE ROLLBACK or ALTER SESSION ADVISE NOTHING
Force COMMIT
Force ROLLBACK
Data Dictionary Cleanup - Purging Mixed
Data Dictionary Cleanup - Purging Lost DB
Summary of Dictionary Cleanup Options
Pending State Column   State of Global Transaction   State of Local Transaction   Normal DBA Action   Alternative DBA Action
--------------------   ---------------------------   --------------------------   -----------------   ----------------------
collecting             rolled back                   rolled back                  none                purge_lost_db_entry
Practicing Manual Resolution With Crash Tests
• Use the 2PC Crash Test Facility:
• First Disable RECO
Recovery Strategy for Distributed Databases
• Complete Recovery
– All changes committed up to the moment of
failure are recovered
• Tablespace or Table Point in Time Recovery
– Committed changes up to a point before the
failure are recovered
– Time, LogSeq or SCN
• Database Point in Time Recovery
– Committed changes up to a point before the
failure are recovered
– Time, Cancel or SCN
Complete Database Recovery
Tablespace or Table Point In Time Recovery
Database Point In Time Recovery - DBPITR
Summary
• Connectivity Overview
• Distributed Database Characteristics
• Remote and Distributed Queries
• Remote and Distributed Transactions
• Distributed Transaction Branches and RAC
• Two-Phase In-Doubt Transactions
• The RECO Process In-Doubt Resolution
• Manual In-Doubt Resolution
• Data Dictionary Entries and Cleanup
• 2 Phase Commit Crash Recovery
• Distributed Systems Recovery