Synchronous Log Shipping Replication Takahiro Itagaki and Masao Fujii NTT Open Source Software Center PGCon 2008 Agenda Introduction - Compare with other replication solutions Details: How it works - Struggles in development Demo Future work: Where are we going?

Uploaded by

warwithin

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

79 views

61 Synchronous Log Shipping Replication

Uploaded by

warwithin

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 58

Synchronous Log Shipping Replication

Takahiro Itagaki and Masao Fujii
NTT Open Source Software Center
PGCon 2008

Agenda

• Introduction: What is this?

– Background
– Compare with other replication solutions
• Details: How it works
– Struggles in development
• Demo
• Future work: Where are we going?
• Conclusion

Copyright © 2008 NTT, Inc. All Rights Reserved. 2

What is this?
• Successor of warm-standby servers
– A replication system using WAL shipping.
• Uses the Point-in-Time Recovery mechanism
– However, no data is lost after failover, because log shipping is synchronous.

[Figure: the Active Server ships WAL to the Standby Server]

• Based on PostgreSQL 8.2 with a patch, plus several scripts
– Patch: adds two processes to postgres
– Scripts: management commands

Warm-Standby Servers (v8.2~)
Active Server (ACT) to Standby Server (SBY)

1. Commit
2. Flush WAL to disk
3. (Return)
4. archive_command: the WAL segment is sent after commits, and the standby redoes it

When the active server crashes (failover):
• The last segment is not available in the standby server if the active crashes before archiving it.
• We need to wait for the active's storage to be remounted on the standby server, or wait for the active to reboot.
Synchronous Log Shipping Servers
Active Server (ACT) to Standby Server (SBY)

1. Commit
2. Flush WAL to disk
3. Send WAL records
4. (Return)

• WAL entries are sent record by record, before the commit returns.
• Segments are formed from the records in the standby server.

When the active server crashes (failover):
• We can start the standby server after redoing the remaining segments; it has already received all transaction logs.
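
The difference between the two commit paths can be sketched as a toy model. This is only an illustration under stated assumptions, not the patch itself: the class names, `SEGMENT_RECORDS`, and `lost_on_crash` are hypothetical, a "segment" is just four records, and the local disk flush is not modeled.

```python
from dataclasses import dataclass, field

SEGMENT_RECORDS = 4  # hypothetical: records per WAL segment, tiny for illustration


@dataclass
class Standby:
    received: list = field(default_factory=list)

    def receive(self, records):
        self.received.extend(records)


@dataclass
class WarmStandbyActive:
    """Ships a WAL segment only once it is full (archive_command style)."""
    standby: Standby
    current_segment: list = field(default_factory=list)
    committed: list = field(default_factory=list)

    def commit(self, record):
        self.current_segment.append(record)  # WAL goes to the local segment
        self.committed.append(record)        # the commit returns to the client
        if len(self.current_segment) == SEGMENT_RECORDS:
            self.standby.receive(self.current_segment)  # archive the full segment
            self.current_segment = []


@dataclass
class SyncShippingActive:
    """Sends each WAL record to the standby before the commit returns."""
    standby: Standby
    committed: list = field(default_factory=list)

    def commit(self, record):
        self.standby.receive([record])  # ship the record first
        self.committed.append(record)   # only then return to the client


def lost_on_crash(active):
    """Committed records the standby does not have if the active dies now."""
    return [r for r in active.committed if r not in active.standby.received]


warm = WarmStandbyActive(Standby())
sync = SyncShippingActive(Standby())
for i in range(6):
    warm.commit(f"rec{i}")
    sync.commit(f"rec{i}")

print(lost_on_crash(warm))  # ['rec4', 'rec5'], still in the unarchived segment
print(lost_on_crash(sync))  # []
```

In the warm-standby model the two records in the unfinished segment are missing on the standby, which is the "last segment is not available" case; in the synchronous model nothing committed is missing.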
Background: Why a new solution?
• We have many projects migrating from Oracle, and we compete against Oracle with postgres.
– So, we want postgres to be SUPERIOR TO ORACLE!
• Our activity in PostgreSQL 8.3
– Performance stability
• Smoothed checkpoint
– Usability: easier tuning of server parameters
• Multiple autovacuum workers
• JIT bgwriter – automatic tuning of bgwriter
• Where are the alternatives to RAC?
– RAC: Oracle Real Application Clusters

Background: Alternatives to RAC
• Oracle RAC is a multi-purpose solution
– … but we don’t need all of the properties.
• In our use:
– No downtime <- Very Important
– No data loss <- Very Important
– Automatic failover <- Important
– Performance in updates <- Important
– Inexpensive hardware <- Important
– Performance scalability <- Not important
• Goal
– Minimizing system downtime
– Minimizing performance impact in update workloads
Compare with other replication solutions

Solution      | No data loss | No SQL restriction | Failover                | Performance scalability | Update performance | How to copy?
Log Shipping  | OK           | OK                 | Auto, Fast              | No                      | Good               | Log
warm-standby  | NG           | OK                 | Manual                  | No                      | Async              | Log
Slony-I       | NG           | OK                 | Manual                  | Good                    | Async              | Trigger
pgpool-II     | OK           | NG                 | Auto, Hard to re-attach | Good                    | Medium             | SQL
Shared Disks  | OK           | OK                 | Auto, Slow              | No                      | Good               | Disk

• Log Shipping is excellent except for performance scalability.
• Also, re-attaching a repaired server is simple.
– It is the same as the normal hot-backup procedure
• Copy the active server's data to the standby and just wait for WAL replay.
– No service stop during re-attaching


Compare downtime with shared disks
• Cold standby with shared disks is an alternative solution
– but it takes a long time to fail over under heavily-updated load.
– Log-shipping saves the time for mounting disks and recovery.

Shared disk system:
Crash! > 20 sec to umount and remount shared disks > 60 ~ 180 sec (*) to recover from the last checkpoint

Log-shipping system:
Crash! > 10 sec to detect server down > 5 sec to recover the last segment > Ok, the service is restarted!

(*) Measured in PostgreSQL 8.2. 8.3 would take less time because of less i/o during recovery.
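
The totals implied by the timeline can be added up as a quick check. This is only arithmetic on the labelled phases; the variable names are ours. The flattened figure shows the 10-second detection phase only on the log-shipping line, so it is not counted for the shared-disk case here, which if anything understates that case.

```python
# Phase durations in seconds, as labelled on the timeline.
log_shipping = {"detect server down": 10, "recover the last segment": 5}
shared_disk_best = {"umount and remount shared disks": 20,
                    "recover from the last checkpoint": 60}
shared_disk_worst = {"umount and remount shared disks": 20,
                     "recover from the last checkpoint": 180}

log_shipping_total = sum(log_shipping.values())
shared_disk_range = (sum(shared_disk_best.values()),
                     sum(shared_disk_worst.values()))

print(log_shipping_total)  # 15
print(shared_disk_range)   # (80, 200)
```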


Advantages and Disadvantages
• Advantages
– Synchronous
• No data loss on failover
– Log-based (Physically same structure)
• No functional restrictions in SQL
• Simple, Robust, and Easy to setup
– Shared-nothing
• No Single Point of Failure
• No need for expensive shared disks
– Automatic Fast Failover (within 15 seconds)
• “Automatic” is essential so that failover does not wait for human operations
– Little impact on update performance (less than 7%)
• Disadvantages
– No performance scalability (for now)
– Physical replication: cannot be used for upgrade purposes.


Where is it used?
• Interactive teleconference management package
– Commercial service in operation
– Manages conference booking and file transfer
– Log-shipping is an optional module for users requiring high availability

[Figure labels: Internet networks, Communicator]


How it works

System overview
• Based on PostgreSQL 8.2; the 8.3 port is in progress
• WALSender
– New child process of postmaster
– Reads WAL from walbuffers and sends WAL to WALReceiver
• WALReceiver
– New daemon to receive WAL
– Writes WAL to disk and communicates with startup process
• Using Heartbeat 2.1
– Open source high-availability software that manages the resources via a resource agent (RA)
– Heartbeat provides a virtual IP (VIP)

[Figure: Active and Standby nodes. Each node runs Heartbeat with an RA, and PostgreSQL with its DB and WAL; the VIP is shown on the Active side. On Active, postgres fills wal buffers and WALSender sends WAL to WALReceiver on Standby, next to the startup process.]

• In our replicator, there are two nodes, active and standby.

• In the active node, postgres is running in normal mode with a new child process, WALSender.

• In the standby node, postgres is running in continuous recovery mode with a new daemon, WALReceiver.

• In order to manage these resources, Heartbeat runs on both nodes.


WALSender
[Figure: sequence diagram on the Active node with lanes postgres, walbuffers, WALSender. Steps in order: Update (Insert into walbuffers), Commit (Flush WAL), Request, Read, Send / Recv, (Return), (Return).]

• An Update command triggers XLogInsert() and inserts WAL into walbuffers.

• A Commit command triggers XLogWrite() and flushes WAL to disk.

• We changed XLogWrite() to request WALSender to transfer WAL.

• WALSender reads WAL from walbuffers and transfers it.
• After the transfer finishes, the commit command returns.
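
The handshake between a backend and WALSender can be sketched with two threads. This is a minimal model under stated assumptions, not the actual patch: `xlog_insert`, `xlog_write`, and `wal_sender` are Python stand-ins named after the steps above, positions are record counts rather than real LSNs, and the network link is just a list.

```python
import queue
import threading

wal_buffers = []          # stand-in for shared walbuffers
wal_disk = []             # stand-in for the active server's WAL files
standby_inbox = []        # stand-in for the link to WALReceiver
requests = queue.Queue()  # send requests from the backend to WALSender


def xlog_insert(record):
    """An update inserts a WAL record into walbuffers; returns its position."""
    wal_buffers.append(record)
    return len(wal_buffers)


def xlog_write(upto):
    """A commit flushes WAL to disk, then asks WALSender to transfer it."""
    wal_disk.extend(wal_buffers[len(wal_disk):upto])
    done = threading.Event()
    requests.put((upto, done))
    done.wait()  # the commit returns only after the transfer finishes


def wal_sender():
    sent = 0
    while True:
        item = requests.get()
        if item is None:  # shutdown marker used only by this sketch
            break
        upto, done = item
        if upto > sent:
            # Read from walbuffers and send to the standby.
            standby_inbox.extend(wal_buffers[sent:upto])
            sent = upto
        done.set()


sender = threading.Thread(target=wal_sender)
sender.start()

lsn = xlog_insert("UPDATE t SET x = 1")
lsn = xlog_insert("COMMIT")
xlog_write(lsn)
shipped_at_commit_return = list(standby_inbox)

requests.put(None)
sender.join()
print(shipped_at_commit_return)  # ['UPDATE t SET x = 1', 'COMMIT']
```

The point of the sketch is the `done.wait()` call: the backend blocks inside the commit until WALSender reports that the records have been handed to the standby.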

WALReceiver
[Figure: sequence diagram on the Standby node with lanes WALReceiver, WAL Disk, startup. Steps in order: Recv / Send, Flush, Inform, Read, Replay.]

• WALReceiver receives WAL from WALSender and flushes it to disk.

• WALReceiver informs the startup process of the latest LSN.

• The startup process reads WAL up to the latest LSN and replays it.
• We changed ReadRecord() so that the startup process can communicate with WALReceiver and replay record by record.
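
The standby side can be sketched the same way. Again this is a toy model under assumptions: the class `StandbySketch` and its method names are hypothetical, a WAL record is a (key, value) pair, and the "LSN" is simply the number of records flushed so far.

```python
class StandbySketch:
    def __init__(self):
        self.wal_disk = []     # WAL flushed by WALReceiver
        self.latest_lsn = 0    # latest position WALReceiver has reported
        self.replayed_lsn = 0  # how far the startup process has replayed
        self.state = {}        # toy database: key -> value

    def wal_receiver_on_recv(self, records):
        """Flush received records to disk, then inform the startup process."""
        self.wal_disk.extend(records)
        self.latest_lsn = len(self.wal_disk)

    def startup_replay(self):
        """Replay record by record, never past the latest reported LSN."""
        while self.replayed_lsn < self.latest_lsn:
            key, value = self.wal_disk[self.replayed_lsn]
            self.state[key] = value
            self.replayed_lsn += 1


sby = StandbySketch()
sby.wal_receiver_on_recv([("a", 1), ("b", 2)])
sby.startup_replay()
sby.wal_receiver_on_recv([("a", 3)])
pending = sby.latest_lsn - sby.replayed_lsn  # records left to redo at failover
sby.startup_replay()
print(pending, sby.state)  # 1 {'a': 3, 'b': 2}
```

Because replay follows the reported LSN record by record, only the records that arrived since the last replay pass are pending if a failover happens at that moment.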

Why replay by each WAL record?
• Minimize downtime
• Shorter delay in read-only queries (at the standby)

                                  | Our replicator | Warm-Standby
Replay by each                    | WAL record     | WAL segment
Needed to be replayed at failover | a few records  | the latest one segment
Delay in read-only queries        | shorter        | longer