A Closer Look at MySQL Cluster: An Architectural Overview

By Max Mether, Manager, Training Services, SkySQL Ab

© 2011 SkySQL Ab. SkySQL and the SkySQL logo are trademarks of SkySQL Ab. Oracle and MySQL are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Keywords: MySQL Cluster, MySQL, high availability, HA, clustering

Introduction

MySQL Cluster is a separate product from the standard MySQL database, despite the two being tightly linked. Due to its distributed nature, MySQL Cluster has a far more complicated architecture than a standard MySQL database. This white paper describes the architecture of the MySQL Cluster product and how process and data flow take place between the nodes of the cluster.

Key Features of MySQL Cluster

MySQL Cluster is distinguished from many other clustering products in that it offers five nines (99.999%) availability with a shared-nothing architecture, whereas most clustering products use a shared-disk architecture. A shared-nothing architecture allows the cluster to run on commodity hardware, greatly reducing the cost of deployment compared to clusters using a shared-disk architecture. It does, however, come with a cost of its own: a more complex way of ensuring high availability and synchronization of the nodes, seen mainly as increased network traffic.

MySQL Cluster also provides ACID transactions with row-level locking, using the READ COMMITTED isolation level. MySQL Cluster is, in general, an in-memory database; however, non-indexed data can be stored on disk, and the disks are also used for checkpointing to ensure durability across a system shutdown. MySQL Cluster provides unique hash indexes as well as ordered T-tree indexes. It also supports several online features, including a native online backup and the ability to perform certain ALTER TABLE operations with the tables unlocked (most storage engines require table locks for all ALTER TABLE operations).

The MySQL Cluster Architecture

The MySQL Cluster architecture can be divided into several layers. The first is the application layer, where the applications that communicate with the MySQL servers reside. From a MySQL server's perspective, these applications are normal MySQL clients, and the MySQL servers handle all communication with the lower layers of the cluster.

The second layer is the SQL layer, where the MySQL servers reside; these are called SQL or API nodes in the cluster. The number of MySQL servers is completely independent of the number of data nodes, which allows great flexibility in a cluster's configuration, particularly since the number of SQL nodes can be increased without shutting down the cluster. In addition to MySQL servers, other programs that communicate directly with the data nodes (such as restoration programs or programs to view the clustered tables) are also considered API nodes. MySQL Cluster offers a C++ API (the NDB API) to facilitate the creation of programs that communicate directly with the data nodes without passing through a MySQL server; all such programs are likewise considered API nodes.

Next is the data layer, where the data nodes reside. These nodes manage all the data, indexes and transactions in the cluster and are thus responsible for the availability and the durability of the data. Finally, there is a management layer, where the management server(s) reside.

The rest of this paper will focus on the data nodes.

Illustration 1: The MySQL Cluster Architecture
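The layers map directly onto the node types declared in the cluster configuration. The following is a minimal sketch of a config.ini for the management server, assuming placeholder hostnames and paths; a real deployment would tune many more parameters.

# Minimal config.ini sketch: 1 management node, one node group of
# 2 data nodes, and 2 SQL nodes. Hostnames and paths are placeholders.

[ndbd default]
NoOfReplicas=2              # each fragment kept as primary + secondary replica

[ndb_mgmd]
HostName=mgm-host           # management layer
DataDir=/var/lib/mysql-cluster

[ndbd]
HostName=data-host-1        # data layer
DataDir=/var/lib/mysql-cluster

[ndbd]
HostName=data-host-2        # data layer
DataDir=/var/lib/mysql-cluster

[mysqld]
HostName=sql-host-1         # SQL/API layer

[mysqld]
HostName=sql-host-2         # SQL/API layer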

Data partitioning

Let's take a closer look at the data managed by the data nodes. Each table in the cluster is partitioned based on the number of data nodes: if the cluster has two data nodes, each table is partitioned into two parts, and if the cluster has four data nodes, each table is partitioned into four parts. By default, the partitioning is done on the hash value of the primary key. The partitioning function can be changed if needed, but in most cases the default is good enough.

Each data node holds the so-called primary replica, or fragment, of one partition, so that the data is distributed evenly between all the data nodes. To ensure the availability (and redundancy) of the data, each node also holds a copy of another partition, called a secondary replica. The nodes work in pairs: the node holding the secondary replica of another node's primary partition reciprocates by giving its own primary partition as a secondary replica to that same partner node. These pairs are called node groups, and there are #nodes / 2 node groups in the cluster. This means that in a cluster with 2 data nodes, each node contains the whole database, but in a cluster with 4 nodes, each node contains only half of the data.

Illustration 2: Data partitioning with 2 data nodes

Illustration 3: Data partitioning with 4 data nodes
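A short Python sketch of this placement logic, under a simplified model: the real cluster uses its own internal hash of the primary key, not Python's hash(), and its own node numbering, so the code below only illustrates the principle.

```python
# Sketch of default partition placement for a 4-data-node cluster.
NUM_DATA_NODES = 4

def partition_for(primary_key) -> int:
    """Hash the primary key into one of the table's partitions."""
    return hash(primary_key) % NUM_DATA_NODES

def replicas_for(partition: int) -> tuple[int, int]:
    """The primary replica lives on one node of a node group; the
    secondary replica lives on that node's partner in the same group."""
    primary = partition                      # simplified placement
    group = primary // 2                     # nodes work in pairs
    partner = group * 2 + (1 - primary % 2)  # the other node of the pair
    return primary, partner

for pk in ("alice", "bob", "carol"):
    part = partition_for(pk)
    pri, sec = replicas_for(part)
    print(f"{pk!r} -> partition {part}: "
          f"primary on node {pri}, secondary on node {sec}")
```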

Detecting node failure

Because of the shared-nothing architecture, all nodes in the cluster must always have the same view of who is connected to the cluster, so detecting failed nodes is extremely important. In MySQL Cluster this is handled either through a TCP connection close or through the heartbeat circle. The data nodes are organised in a logical circle in which each node sends heartbeats to the next node. If a node fails to send 3 consecutive heartbeats, the following node assumes that it has crashed or cannot communicate, and launches the network partitioning protocol. In parallel, all processing in the cluster is temporarily suspended while the remaining nodes determine whether they can continue. The unresponsive node is excluded from the cluster, and the remaining nodes form a new cluster without it. Note that in this new cluster, the partner of the unresponsive node now holds two primary fragments, since the fragment that was previously a secondary replica has been promoted to primary status. This also means that this node will handle twice the traffic it did before the crash, so all nodes must be tuned to handle twice the normal workload.

Illustration 4: Heartbeat circle between 4 data nodes
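A minimal Python sketch of the failure-detection rule, assuming a simplified per-interval counter; the real data node's timing and signalling are more involved, and it also reacts to TCP connection close.

```python
# Each data node runs a watcher for its predecessor in the logical
# circle and presumes it failed after 3 missed heartbeat intervals.
MAX_MISSED_HEARTBEATS = 3

class HeartbeatWatcher:
    def __init__(self, watched_node: int):
        self.watched_node = watched_node
        self.missed = 0

    def on_heartbeat(self) -> None:
        self.missed = 0                       # predecessor is alive

    def on_interval_elapsed(self) -> bool:
        """Called once per heartbeat interval; True means the watched
        node is presumed failed and the network partitioning protocol
        should be launched."""
        self.missed += 1
        if self.missed >= MAX_MISSED_HEARTBEATS:
            print(f"node {self.watched_node} presumed failed")
            return True
        return False
```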

Avoiding split-brain

The network partitioning protocol is fundamental to guaranteeing the availability of the cluster. The largest problem derives from the fact that, from another node's perspective, a network failure is indistinguishable from a node crash. This means that precautions must be in place so that a split-brain scenario cannot occur when the cause of the communication failure is a network failure rather than an actual node crash. When a network partitioning occurs, the network partitioning protocol is launched on both sides of the split, and it guarantees that, even in the event of an even split, only one running cluster will remain. Each set of nodes that can still communicate with each other follows this protocol:

1. Do we have at least one node from each node group? If no, we shut down.
2. Do we have all nodes from any of the node groups? If yes, we can continue as the cluster.
3. Otherwise, we ask the arbitrator to decide our fate.
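A Python sketch of that decision, run independently by each side of the split; node groups and reachable nodes are given as plain sets, and the arbitration step itself (described next) is left abstract.

```python
def partition_decision(reachable: set, node_groups: list) -> str:
    # Rule 1: without at least one node from every node group, part of
    # the data is missing on this side, so it must shut down.
    if any(not (group & reachable) for group in node_groups):
        return "shutdown"
    # Rule 2: holding a complete node group means the other side cannot
    # also hold one, so this side can safely continue as the cluster.
    if any(group <= reachable for group in node_groups):
        return "continue"
    # Even split: both sides pass rule 1 and fail rule 2, so the
    # arbitrator must break the tie to avoid split-brain.
    return "ask arbitrator"

groups = [{1, 2}, {3, 4}]                     # a 4-node cluster, 2 node groups
print(partition_decision({1, 2, 3}, groups))  # continue (whole group {1, 2})
print(partition_decision({1, 3}, groups))     # ask arbitrator (even split)
print(partition_decision({4}, groups))        # shutdown (no node from {1, 2})
```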

The arbitrator follows a simple rule: the first set of nodes to ask is given a positive answer, and all other sets are given a negative answer. If a set of nodes cannot contact the arbitrator, those nodes shut down automatically.

Ensuring durability

Since MySQL Cluster is generally an in-memory database, measures have to be taken so that a cluster shutdown is possible without incurring data loss. The data nodes use two procedures to maintain data on disk. The first is the REDO log. When a transaction takes place on a node, it is stored in a REDO log buffer, which is synchronously flushed to disk at even intervals. The disk-based REDO log contains only the transactions that were committed at the point in time the flush took place. The flushing takes place through a global checkpoint, or GCP.

Of course, such a REDO log would grow ad infinitum, so it needs to be bounded. This is done through another process called a local checkpoint, or LCP. During an LCP, the data nodes store a snapshot of all their data on disk, allowing all REDO log contents from before that point in time to be discarded. For safety reasons the data nodes keep two LCPs on disk, and thus need the REDO logs from the start of the first LCP through to the beginning of the third LCP for a recovery to be successful. The LCPs are written in a circular fashion (LCP 3 replaces LCP 1, and so on), as is the REDO log. The contents of the LCPs and the REDO log are used when a node recovers: it can rebuild its data up to its last GCP from its own disk and then transfer the remaining changes from its partner node before regaining primary status for any of its fragments. When the whole cluster is shut down gracefully, a last GCP is issued before the data nodes are stopped, so each data node can then recover its data completely from its own disk.
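A Python sketch of this bookkeeping on a single node, under simplifying assumptions (in-memory lists stand in for disk files, and LCP timing is collapsed to an instant):

```python
class DataNodeDisk:
    def __init__(self):
        self.redo_buffer = []  # committed transactions, not yet on disk
        self.redo_log = []     # flushed to disk by global checkpoints
        self.lcps = []         # (lcp_id, redo_position_at_lcp_start)

    def commit(self, txn) -> None:
        self.redo_buffer.append(txn)

    def global_checkpoint(self) -> None:
        """GCP: synchronously flush the REDO buffer to disk."""
        self.redo_log.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def local_checkpoint(self, lcp_id: int) -> None:
        """LCP: snapshot the data. Only the two newest LCPs are kept
        (written circularly), and REDO entries from before the start of
        the oldest kept LCP are discarded."""
        self.lcps.append((lcp_id, len(self.redo_log)))
        if len(self.lcps) > 2:
            self.lcps.pop(0)            # LCP 3 replaces LCP 1, etc.
            cutoff = self.lcps[0][1]    # REDO start of oldest kept LCP
            self.redo_log = self.redo_log[cutoff:]
            self.lcps = [(i, p - cutoff) for i, p in self.lcps]
```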

Two-phase commit protocol and transactions

MySQL Cluster provides synchronous replication between the data nodes, meaning that the replication is synchronous from the client's perspective. To achieve this, a two-phase commit protocol (2PC) is used, in which a transaction is committed in two phases: a prepare phase and a commit phase. During the prepare phase, the nodes perform the requested operations and make themselves ready to commit the transaction. During the commit phase, the transaction is committed and the changes can no longer be undone.

Illustration 5: Two-phase commit protocol

Let's look at the actual implementation of this in MySQL Cluster, using a simple primary key write as an example. First, the MySQL server (or whichever SQL node the request comes from) contacts a data node. Every data node has a transaction coordinator (TC) module, and the contacted data node becomes the coordinator, or manager, for that transaction. Given the primary key value, the TC calculates which partition the row resides in and forwards the operation to the data node holding the primary replica of that partition. On the primary node, the operation is performed and the required row is locked, whereupon the same operation is sent to the partner node holding the secondary replica. Once the secondary node has performed the operation, the TC is contacted again and the operation can now be committed. The commit phase follows the same sequence but in reverse order: the node with the secondary replica is contacted first, and then the node with the primary replica. Once the primary node has committed the operation, the TC acknowledges the operation as committed to the client. All in all, a total of 6 internal messages are sent between nodes, plus 2 messages between the client and the TC, for this simple primary key operation. If a transaction contains multiple operations, the internal steps are repeated for each operation. Operations other than primary key writes follow somewhat different workflows, but the principle remains the same.

Illustration 6: A primary key write in a cluster with 2 data nodes
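The message sequence can be made concrete with a short Python sketch; the node names and message labels are illustrative, not the real NDB signal names.

```python
# Message flow for a simple primary-key write: 2 messages between the
# client and the TC, plus 6 internal messages between nodes.
def primary_key_write() -> None:
    messages = [
        ("client",    "TC",        "write request"),  # client message 1
        # prepare phase: primary replica first, then secondary
        ("TC",        "primary",   "prepare"),        # internal 1
        ("primary",   "secondary", "prepare"),        # internal 2 (row locked on primary)
        ("secondary", "TC",        "prepared"),       # internal 3
        # commit phase: reverse order, secondary first
        ("TC",        "secondary", "commit"),         # internal 4
        ("secondary", "primary",   "commit"),         # internal 5
        ("primary",   "TC",        "committed"),      # internal 6
        ("TC",        "client",    "ack"),            # client message 2
    ]
    for sender, receiver, msg in messages:
        print(f"{sender:>9} -> {receiver:<9} {msg}")

primary_key_write()
```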

About SkySQL Ab

SkySQL Ab, the company behind the SkySQL Enterprise subscription, is the first choice in affordable MySQL database solutions for the enterprise and cloud. Founded by former executives, personnel, and investors of MySQL AB, SkySQL Ab is an open source software company committed to furthering the future development of MySQL database technologies, while delivering cost-effective database solutions and exceptional customer service. SkySQL Ab's customers include ATOS Worldline, Canal+, Deutsche Telekom, Easyflirt.com, FHE3, Lotte.com, Nordic Growth Market (NGM), Richemont and Virgin Mobile. SkySQL's worldwide headquarters is located in Helsinki, Finland; the company has operations in Asia, Europe and North America. For more information, please call 1 (877) 303-5799, visit www.skysql.com, or follow conversations at www.twitter.com/skysql_ab.

MySQL is a registered trademark of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. SkySQL is not affiliated with MySQL.
