MySQL Replication
Replication Tutorial
Mats Kindahl
Senior Software Engineer
Replication Technology
Lars Thalmann
Development Manager
Replication/Backup
Tutorial Outline
Terminology and Basic Concepts
Basic Replication
Replication for scale-out
Replication for high-availability
The binary log
Statements and transactions
Cluster Replication
Terminology and Basic Concepts
MySQL Replication
Why?
  High availability
    Fail-over possible
  Scale-out
    Queries on many servers
  Off-site processing
    Do not disturb master
  Reporting
How?
  Snapshots
    Backup
    mysqlbinlog
    mysqldump
  Binary log
  Replication
  Point-in-time recovery
Terminology
Master server
  Changes data
  Keeps log of changes
Slave server
  Asks master for events
  Executes events
Binary log
  Logs every change
  Split into transactional groups
Terminology
Asynchronous replication
The slave keeps two bookkeeping files: master.info and the relay log (with relay-log.info).
● master.info contains:
  Read coordinates: master log name and master log pos
  Connection information:
  ● host, user, password, port
  ● SSL keys and certificates
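As an illustration, the on-disk master.info looks roughly like this (the exact layout varies between server versions; all values below are made up):

  15
  master-bin.000001
  409
  master_host
  slave_user
  slave_password
  3306
  60
  0

The first line gives the number of lines in the file; then follow the read coordinates and the connection parameters.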
The I/O thread reads events from the master's binary log and appends them to the relay log; the SQL thread reads the relay log and executes the events.
● relay-log.info contains:
  Group master coordinates:
  ● Master log name and master log pos
  Group relay log coordinates:
  ● Relay log name and relay log pos
Binlog files
File: master-bin.NNNNNN
  The actual contents of the binlog
File: master-bin.index
  An index file over the files above
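The index file itself is plain text with one line per binlog file, for example:

  ./master-bin.000001
  ./master-bin.000002
  ./master-bin.000003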
A binlog file
master-bin.000001 contains a sequence of binlog events:
  Format description
  CREATE TABLE friends (user INT, friend INT);
  INSERT INTO friends VALUES (12,15);
  ...
  Rotate
Each event has a coordinate: (log file, log pos)
Basic Replication
Scenario 1: Single slave
Keep master on-line while:
Doing backup
Generating reports
Adding new slaves
Scenario 1: Steps
1.Fix my.cnf file for master and slave
2.Add user and grants on master
3.Take backup of master
4.Bootstrap slave from backup
5.Configure slave
6.Start slave
Step 1: Fix my.cnf
On the master (the slave needs its own unique server-id):
  [mysqld]
  log-bin=master-bin
  server-id=1
Step 2: Add user and grants on master
master> CREATE USER 'slave_user'@'slave_host';
master> GRANT REPLICATION SLAVE
> ON *.* TO 'slave_user'@'slave_host'
> IDENTIFIED BY 'slave_password';
Step 3: Backup master
Physical backup (offline)
For example: tar
Logical backup (offline)
mysqldump
On-line backup
InnoDB hot backup
MySQL on-line backup
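For example, a logical backup can be taken with mysqldump; the --master-data option additionally writes a CHANGE MASTER TO statement with the master's current binlog file and position into the dump (file name illustrative):

  $ mysqldump --all-databases --master-data=1 > backup.sql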
Step 4: Bootstrap slave
Physical backup
Copy backup image to slave
Untar into database directory
Logical backup
mysql client
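For a logical backup, replaying the dump through the mysql client is enough (file name illustrative):

  slave$ mysql < backup.sql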
Step 5: Configure slave
Use CHANGE MASTER command
MASTER_PORT default is 3306
slave> CHANGE MASTER TO
> MASTER_HOST = 'master_host',
> MASTER_PORT = 3306,
> MASTER_USER = 'slave_user',
> MASTER_PASSWORD = 'slave_password';
Step 6: Start slave
slave> START SLAVE;
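To check that replication is running, verify that both slave threads are up (output abbreviated and illustrative):

  slave> SHOW SLAVE STATUS\G
  ...
  Slave_IO_Running: Yes
  Slave_SQL_Running: Yes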
Some suggestions
Start the binary log on the master
immediately following the backup.
1.Add user and grants on master
2.Shut down master
3.Edit my.cnf
4.Take backup of master
5.Restart master
Some suggestions, contd.
Bootstrap new slaves from an existing slave instead of from the master.
Scenario 2: Adding a new slave
Step 1: Stop slave
slave1> STOP SLAVE;
Step 2: Note position
Take a note of where slave stopped
We need this when starting the new slave
slave1> SHOW SLAVE STATUS;
...
Relay_Master_Log_File: master-bin.000001
...
Exec_Master_Log_Pos: 409
Step 3: Backup slave
Flush tables and lock database
FLUSH TABLES WITH READ LOCK
Take backup
tar zcf slavebackup.tar.gz …
Unlock database
UNLOCK TABLES
Step 4: Start slave
We can now start slave again since:
We have the master position of
the slave
We have a backup corresponding
to that position
slave1> START SLAVE;
Step 5: Bootstrap new slave
Fix my.cnf (use new server-id!)
Install backup
tar zxf slavebackup.tar.gz ...
Configure slave using saved position
slave2> CHANGE MASTER TO
> MASTER_HOST = 'master_host',
> MASTER_PORT = 3306,
> MASTER_USER = 'slave_user',
> MASTER_PASSWORD = 'slave_password',
> MASTER_LOG_POS = 409,
> MASTER_LOG_FILE = 'master-bin.000001';
Step 6: Start new slave
slave2> START SLAVE;
Scenario 2: Summary
Taking a snapshot of a slave
STOP SLAVE
FLUSH TABLES WITH READ LOCK
SHOW SLAVE STATUS
UNLOCK TABLES
Starting replication from anywhere
MASTER_LOG_FILE
MASTER_LOG_POS
Scenario 3: Point-in-time recovery
Binlog for point-in-time recovery
Say: time T
Backup needs to:
Save backup image
Save binlog files
Recovery needs to:
Restore backup image
Apply binlog files until T
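For example, after restoring the backup image, the saved binlog files can be applied up to T with mysqlbinlog and its --stop-datetime option (file names and timestamp illustrative):

  $ mysqlbinlog --stop-datetime='2009-04-10 11:55:00' \
        master-bin.000001 master-bin.000002 | mysql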
Recovery images
Backup image + binlog files = recovery image
Saved recovery images (e.g. RI-1, from the 2009-04-10 12:00:00 backup) can be archived
Active recovery image: the latest backup (2009-04-11 12:00:00) plus the binlog files written since
Scenario 4: Switch-over to standby
Step 2: Standby be ahead
Pick the “most knowledgeable”
slave as standby
Do this fandango:
Stop slave
Note master position M
Stop standby
Start standby until M
Wait for standby to reach M
Standby will now have stopped
Step 2: Standby be ahead
Commands for doing this
slave> STOP SLAVE;
slave> SHOW SLAVE STATUS;
Relay_Master_Log_File: master-bin.000032
Exec_Master_Log_Pos: 567823
standby> STOP SLAVE;
standby> START SLAVE UNTIL
> MASTER_LOG_FILE = 'master-bin.000032',
> MASTER_LOG_POS = 567823;
standby> SELECT
> MASTER_POS_WAIT('master-bin.000032',
> 567823);
Step 4: Master position
Note master position of where
standby stopped
Same as before
standby> SHOW SLAVE STATUS;
...
Relay_Master_Log_File: master-bin.000032
...
Exec_Master_Log_Pos: 7685436
Step 4: Standby position
Note of last binlogged event
No changes allowed on server!
standby> SHOW MASTER STATUS\G
***************** 1. row *****************
File: standby-bin.000047
Position: 7659403
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)
Step 5: Start slave until
We now have:
A binlog position on the master
A binlog position on the standby
Optional: bring standby on-line
UNLOCK TABLES
Run slave until master position
slave> START SLAVE UNTIL
> MASTER_LOG_FILE = 'master-bin.000032',
> MASTER_LOG_POS = 7685436;
Step 6: Start slave
Point the slave at the standby (CHANGE MASTER TO, using the standby position from step 4) and start it:
slave> START SLAVE;
Scenario 4: Summary
Forwarding replication events
  log-slave-updates
Standby has to be ahead of Slave
  ... and ways to ensure that
Synchronizing for switch-over
  SHOW MASTER STATUS
  START SLAVE UNTIL
  MASTER_POS_WAIT()
Replication for Scale-out
Clients send writes to the master; reads are spread over the slaves.
Scenario 5: Relay slave
1. Stop slave
2. Change default storage engine
3. Change engine of existing tables
4. Start slave
Step 2: Change engine
Change default engine on relay
SET GLOBAL
STORAGE_ENGINE = 'BLACKHOLE';
New tables will use BLACKHOLE
Step 3: Change engine
Change engine for existing tables
... should not be logged
So we turn off logging
SET SQL_LOG_BIN = 0;
ALTER TABLE table
ENGINE = BLACKHOLE;
SET SQL_LOG_BIN = 1;
Scenario 5: Summary
Use BLACKHOLE engine
Change default engine
SET GLOBAL STORAGE_ENGINE=engine
Change engine of existing tables
ALTER TABLE ENGINE=engine
Scenario 6: Filtering
There are:
  Master filtering
  Slave filtering
Step 2: Edit my.cnf
Slave 1:
  [mysqld]
  ...
  replicate-do-table=user
  replicate-do-table=friend
Slave 2:
  [mysqld]
  ...
  replicate-do-table=user
  replicate-do-table=message
Slave filtering
replicate-do-db, replicate-ignore-db
replicate-do-table
replicate-ignore-table
replicate-wild-do-table
replicate-wild-ignore-table
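The wild-card variants accept LIKE-style patterns (% and _). For example, to replicate only tables whose names start with user, in any database whose name starts with app (names illustrative):

  [mysqld]
  replicate-wild-do-table=app%.user%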
Replication for High-Availability
Keeping them servers up and running
Dual masters
High-availability: one master can fail
Not scale-out
Scenario 7: Dual masters
1.Configure masters as slaves
  server-id
  log-bin
  Add user and grants
2.For scale-out usage:
  log-slave-updates
3.Direct masters to each other
  CHANGE MASTER TO
  START SLAVE
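With illustrative host names master1 and master2, and the replication user from Scenario 1, step 3 amounts to running on master1:

  master1> CHANGE MASTER TO
        >   MASTER_HOST = 'master2',
        >   MASTER_USER = 'slave_user',
        >   MASTER_PASSWORD = 'slave_password';
  master1> START SLAVE;

and the mirror image on master2.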
Use log-slave-updates?
Necessary to forward events
Consider: recovery?
Consider: connecting a slave later?
Shared disk
Active/passive pair behind a virtual IP manager
Master and slave share the binlog on a shared store (DRBD, RAID)
On fail-over, binlog positions match
Circular replication?
Four masters replicate in a ring (server-id 1 → 2 → 3 → 4 → 1)
Not a recommended setup
Complicated to maintain
Circular replication: failure
If a server in the ring fails, where do we start the next one?
Different position on 2 and 3
Lag between 2 and 3
Lag between 3 and 4
Circular replication
1.Create replication progress table
2.For every transaction:
Figure out binlog position
Write it to table with transaction
Need to use special client code
3.On failure:
Fetch position from replication
progress table
Change to position and start slave
Step 1: Replication progress
Create replication progress table
Name: Replication_progress
Column: Server_id
Column: Log_file
Column: Log_pos
CREATE TABLE Replication_progress (
Server_id INT UNSIGNED,
Log_file CHAR(64),
Log_pos INT UNSIGNED,
PRIMARY KEY (Server_id)
) ENGINE=MYISAM;
Step 2: Transaction position
Set AUTOCOMMIT
SET AUTOCOMMIT=0
Unlock tables
UNLOCK TABLES
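A sketch of the bookkeeping for one transaction (how the binlog position is obtained is up to the special client code; the angle-bracket placeholders are not real syntax):

  SET AUTOCOMMIT=0;
  ... the transaction's own changes ...
  REPLACE INTO Replication_progress
    VALUES (@@server_id, '<log file>', <log pos>);
  COMMIT;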
Step 3: How to fail-over
Decide fail-over server
$Failover_id
Find position
($File, $Pos) =
`SELECT Log_file, Log_pos
FROM Replication_progress
WHERE Server_id = $Failover_id`
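The fetched coordinates are then used to repoint the next server in the ring, e.g. (host name illustrative):

  slave> CHANGE MASTER TO
       >   MASTER_HOST = 'failover_host',
       >   MASTER_LOG_FILE = $File,
       >   MASTER_LOG_POS = $Pos;
  slave> START SLAVE;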
6.0 feature
CHANGE MASTER TO
MASTER_LOG_FILE = ...,
MASTER_LOG_POS = ...,
IGNORE_SERVER_IDS = (3);
The Binary Log
Groups and events
A transaction forms one group of several events; an autocommitted statement forms a group of its own.
  CREATE TABLE tbl (a INT, b INT)
  BEGIN
  INSERT INTO tbl VALUES (1,2)
  INSERT INTO tbl2 VALUES (2,3)
  COMMIT
  Rotate: master-bin.000023
Statement logging
Statements use Query log event
Statements are logged verbatim
...with some exceptions
USE statement added
... with current database
mysqld.1> show binlog events from 106 limit 1\G
*************************** 1. row ***************************
Log_name: master-bin.000001
Pos: 106
Event_type: Query
Server_id: 1
End_log_pos: 200
Info: use `test`; CREATE TABLE tbl (a INT, b INT)
1 row in set (0.00 sec)
Statement logging
What about this statement?
  UPDATE db1.t1, db2.t2
  SET db1.t1.a = db2.t2.a
Logged with the current database
Statement cannot be executed if db1 or db2 is filtered out (but not both)
This situation has to be avoided:
  USE the right database
  Don't qualify tables with database names
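For example, for a single-database statement, prefer setting the current database over qualifying the table, so that filtering and logging agree:

  USE db1;
  UPDATE t1 SET a = 1;

rather than UPDATE db1.t1 SET a = 1 issued from some other database.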
Statement logging
Statement context events
User variables
RAND()
AUTO_INCREMENT
Context events are written before the statement
*************************** 1. row ***************************
Event_type: User var
Info: @`user`=_latin1 0x6D6174734073756E2E636F6D COLLATE latin1_swedish_ci
*************************** 2. row ***************************
Event_type: Query
Info: use `test`; INSERT INTO user VALUES (1,@user)
Unsafe statements
User-defined functions (UDFs)
Can do anything
Other unsafe constructions:
UUID()
FOUND_ROWS()
Two or more tables with
AUTO_INCREMENT
... and more
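For example, under statement-based replication UUID() is re-evaluated on the slave and yields a different value there (table name illustrative):

  INSERT INTO tickets VALUES (UUID(), 1);

With the MIXED format, such statements are switched to row format automatically.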
Statement logging
Statements are logged:
  after the statement is executed
  before the transaction is committed
Non-transactional changes
  Can be partially executed
  Can cause inconsistency
Row-based replication
Introduced in 5.1
Replicate actual row changes
Can handle “difficult” statements
UDFs, UUID(), ...
Automatic switching
Partially executed statements
Used for Cluster replication
A foundation for new development
Binlog formats
STATEMENT
Everything replicated as statement
Same as for 5.0
MIXED
Replicates in statement format by default
Switch to row format for unsafe statements
ROW
DML is replicated in row format
DDL is replicated in statement format
Using MIXED
Server variable
For a single session only:
  SET SESSION BINLOG_FORMAT=MIXED
For all sessions:
  SET GLOBAL BINLOG_FORMAT=MIXED
Configuration option (recommended):
  binlog-format=mixed
Row-based and filtering
Individual rows are filtered
Filtered based on actual database
(Statement-based filters on current database)
Master filters on table possible
  ... but not implemented
No problems with:
  UPDATE db1.t1, db2.t2
  SET db1.t1.a = db2.t2.a
Row-based as a foundation
Conflict detection and resolution (done)
Fine-grained filtering
  Master filter on table
Cluster replication (done)
Multi-channel replication
Transactional behavior
  Possibility to separate transactional and non-transactional changes in a statement
Horizontal partitioning
  Sending different rows to different slaves
Statements and Transactions
Transaction cache
Each session has a transaction cache: statements in a transaction are cached and flushed to the binlog as one group on commit.
BEGIN;
INSERT INTO trans VALUES (1),(2),(3);
INSERT INTO non_trans SELECT * FROM trans;
...
COMMIT/ROLLBACK;
CREATE TABLE log (uid CHAR(64), comment TEXT) ENGINE=MYISAM;
CREATE TRIGGER tr_user AFTER INSERT ON user FOR EACH ROW
  INSERT INTO log VALUES(NEW.uid, 'New user added');
Statement changes:
  Transactional table user
  Non-transactional table log
Is this statement transactional?
Shall it be written to the cache?
Mixing engines in statements
If treated as transactional:
  BEGIN;
  INSERT INTO innodb_tbl VALUES...
  INSERT INTO user VALUES ...
  ROLLBACK;
Master and slave inconsistent
If treated as non-transactional:
  BEGIN;
  INSERT INTO innodb_tbl VALUES...
  INSERT INTO user VALUES ...
  ROLLBACK;
Master and slave inconsistent
(Fixed in 5.1.31)
Non-transactional statements
Inside a transaction
  <5.1.31:
    If cache is not empty: cache
    Otherwise: write directly
  ≥5.1.31 (this is the fix):
    Always cached
Outside a transaction
  Never cached
Mixing engines in statements
Don't write this:
BEGIN;
INSERT INTO myisam_tbl VALUES...
INSERT INTO innodb_tbl VALUES...
...
COMMIT;
Write this:
INSERT INTO myisam_tbl VALUES...
BEGIN;
INSERT INTO innodb_tbl VALUES...
...
COMMIT;
Triggers and replication
Non-transactional trigger
Statement becomes non-transactional
Legacy from statement-based
5.0: statement can be transactional
Non-transactional “write-ahead”
Possible with row-based replication
Not implemented yet
Events and replication
CREATE, DROP, and ALTER
DDL: Replicated as statements
Event is disabled on slave
It should not execute on slave
Executed twice otherwise
Enabled with ALTER EVENT
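For example, after promoting the slave to master, a replicated event definition can be turned on with (event name illustrative):

  slave> ALTER EVENT ev ENABLE;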
Binlog events
Statement-based INSERT 1/2: Common event header
# at 235
#060420 20:16:02 server id 1 end_log_pos 351
# Position Timestamp Type Master ID
# 000000eb e2 cf 47 44 02 01 00 00 00
# Size Master Pos Flags
# 74 00 00 00 5f 01 00 00 10 00
Statement-based INSERT 2/2: Query event data
# 000000fe 02 00 00 00 00 00 00 00
# 04 00 00 1a 00 00 00 40 |................|
# 0000010e 00 00 ... |.............std|
# 0000011e 04 08 ... |.......test.INSE|
# 0000012e 52 54 ... |RT.INTO.t1.VALUE|
# 0000013e 53 20 ... |S...A...B......X|
# 0000014e 27 2c ... |...Y......X...X.|
# 0000015e 29 |.|
# Query thread_id=2 exec_time=0 error_code=0
SET TIMESTAMP=1145556962;
INSERT INTO t1 VALUES ('A','B'), ('X','Y'), ('X','X');
Row-based INSERT 1/2: Table map event
# at 235
#060420 20:07:01 server id 1 end_log_pos 275
# Position Timestamp Type Master ID
# 000000eb c5 cd 47 44 13 01 00 00 00
# Size Master Pos Flags
# 28 00 00 00 13 01 00 00 00 00
# 000000fe 0f 00 00 00 00 00 00 00
04 74 65 73 74 00 02 74 |.........test..t|
# 0000010e 31 00 02 fe fe |1....|
# Table_map: `test`.`t1` mapped to number 15
BINLOG 'xc1HRBMBAAAAKAAAABMBA...3QAAnQxAAL+/g==';
Row-based INSERT 2/2: Write event
BINLOG 'xc1HRBQBAAAALAAAAD...EBQvkBWAFZ+QFYAVg=';
Cluster Replication
MySQL Cluster Replication
Where to get the log events?
Applications connect to several MySQL Servers, which all access the same set of cluster data nodes (DB).
Transaction coordinators (TC) run on the data nodes, with row-level locking on the primary replica; a replication server attached to the cluster picks up the changes.
Data nodes are arranged in node groups (node group 1: DB 1, DB 2; node group 2: DB 3, DB 4).
MySQL Replication Architecture
MySQL 5.1 adds an injector interface between the storage engines and the binlog.
The NDB binlog injector writes a row-based log of the changes made in the cluster data nodes, which then flows through the usual binlog and relay log.
MySQL Cluster Replication
Behaves like ordinary MySQL Replication
Local: synchronous replication inside the cluster (two-phase commit)
Global: asynchronous replication between clusters
The End
Mats Kindahl
[email protected]
Lars Thalmann
[email protected]