MARIADB
Clustrix Overview
Robbie Mihalyi, VP Engineering – Clustrix
Matthew White, Director of Engineering
Peter Friedenbach, Performance Architect
Scaling
Application Scaling
● Scales Capacity
● Provides Fault Tolerance
● Dynamic and Flexible
[Diagram labels: DBMS Server, Load Balancer]
Clustrix Scaling
● Scales Reads & Writes
● Provides Fault Tolerance
● Dynamic and Flexible
[Diagram labels: Clustrix, Load Balancer, MaxScale]
SCALE-OUT RDBMS: CLUSTRIX
● Built from the ground up with a shared-nothing architecture
● All nodes are equal: any node can accept connections and serve reads and writes
● Fine-grained data distribution with data protection and built-in fault tolerance
● Distributed parallel query execution
● To the application, it looks like a single database system
● DDL and DML SQL statements are MySQL/MariaDB compatible
○ MySQL/MariaDB data types (including JSON support)
○ Referential integrity
○ Transactions – ACID compliant
○ Triggers and Stored Procedures
○ Complex Joins
● Administration Features and Capabilities
○ Online schema changes
○ Replication
○ Online backup / restore
● Multi-zone scale and availability
MARIADB
Clustrix Technology
Matthew White
Director of Engineering - Clustrix
MariaDB Corporation
Clustrix Technology
DATA DISTRIBUTION
Terminology - REPRESENTATION
BASE REPRESENTATION (Primary Key: ID)
ID   col1   col2   col3
1    16     36     JANUARY
2    17     35     FEBRUARY
3    18     34     MARCH
4    19     33     APRIL
5    20     32     MAY
K1 REPRESENTATION: Index (col2)
col2   ID
32     5
33     4
34     3
35     2
36     1
K2 REPRESENTATION: Index (col3, col1)
col3       col1   ID
APRIL      19     4
FEBRUARY   17     2
JANUARY    16     1
MARCH      18     3
MAY        20     5
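The three representations above can be reproduced in a few lines of Python. This is only an illustrative sketch: the `rows` list and the `build_index` helper are hypothetical names, not ClustrixDB internals. It shows each index as its own sorted copy of the key columns plus the primary key.

```python
# Illustrative sketch only; names are hypothetical, not ClustrixDB internals.
rows = [
    {"ID": 1, "col1": 16, "col2": 36, "col3": "JANUARY"},
    {"ID": 2, "col1": 17, "col2": 35, "col3": "FEBRUARY"},
    {"ID": 3, "col1": 18, "col2": 34, "col3": "MARCH"},
    {"ID": 4, "col1": 19, "col2": 33, "col3": "APRIL"},
    {"ID": 5, "col1": 20, "col2": 32, "col3": "MAY"},
]

def build_index(rows, key_columns):
    """Return an index representation: key columns plus the primary key, sorted by key."""
    entries = [tuple(r[c] for c in key_columns) + (r["ID"],) for r in rows]
    return sorted(entries)

base = sorted(rows, key=lambda r: r["ID"])   # base representation, ordered by primary key
k1 = build_index(rows, ["col2"])             # Index (col2)
k2 = build_index(rows, ["col3", "col1"])     # Index (col3, col1)
```

Each representation is a separate ordered structure, which is what lets it be sliced and distributed on its own key.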
Terminology - SLICES
Slicing
● Each representation is split into slices
● Keys are hashed, and the hash values are mapped to slices
● Representations are distributed independently
Independent Key Distribution
● Adapts to diverse access patterns
● Allows for broader range query evaluation
● Query plans scale with node count
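A minimal sketch of hash-based slicing, under stated assumptions: the slides do not name the hash function or the hash-to-slice mapping, so SHA-256 and a simple modulo are stand-ins here, and `slice_for` is a hypothetical helper.

```python
import hashlib

def slice_for(key, num_slices):
    """Map a key to a slice by hashing it (stable across runs, unlike built-in hash())."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slices

# Each representation hashes its own key, so the same row may land in
# different slices depending on the representation.
row = {"ID": 4, "col1": 19, "col2": 33, "col3": "APRIL"}
base_slice = slice_for(row["ID"], 5)                  # base representation, keyed on ID
k1_slice = slice_for(row["col2"], 5)                  # Index (col2)
k2_slice = slice_for((row["col3"], row["col1"]), 5)   # Index (col3, col1)
```

Because every representation is keyed and hashed separately, a lookup by `col2` can go straight to the slice of the `col2` index without touching the base representation's slices.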
Terminology - REPLICAS
Fault Tolerance
● Each slice has one or more copies (replicas)
○ A node never holds two replicas of the same slice
● Replicas are created online without blocking writes
● The default number of replicas is 2
● The replica count can equal the number of nodes
○ Great for read-heavy tables
● Support for Availability Zones
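The placement rule above (no node holds two replicas of the same slice) can be sketched as follows. The round-robin strategy and the `place_replicas` name are assumptions for illustration; the slides do not describe how ClustrixDB actually chooses nodes.

```python
def place_replicas(slice_ids, nodes, replicas=2):
    """Assign each slice's replicas to distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("cannot place more replicas than there are nodes")
    placement = {}
    for i, slice_id in enumerate(slice_ids):
        # Consecutive nodes (mod cluster size) are always distinct
        # as long as replicas <= len(nodes).
        placement[slice_id] = [nodes[(i + r) % len(nodes)] for r in range(replicas)]
    return placement

placement = place_replicas(["S1", "S2", "S3", "S4", "S5"], ["node1", "node2", "node3"])
```

Setting `replicas` equal to the number of nodes puts a copy of every slice on every node, which is the read-heavy configuration mentioned on the slide.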
[Diagram: ClustrixDB nodes holding slices S1 to S5 and their replicas]
Dynamic Data Distribution
● Tables auto-split into slices
● Every slice has a replica on another node
○ Slices are auto distributed, auto-protected
[Diagram labels: Database, Tables, Billions of Rows]
Adding Nodes – Flex Up
● Easy and simple Flex Up & Flex Down
○ A single, minimal interruption of service
● Data is automatically rebalanced across the cluster
○ Tables are online for reads & writes
● All servers handle writes + reads
○ Workload is spread across more servers after Flex Up
Loss of a Node – Automatic Fault Tolerance
● ClustrixDB detects the loss of a node
○ System automatically re-protects
○ Data is automatically redistributed
● Slices lost on the failed node are rapidly re-protected
○ Re-protection runs while tables remain available for reads & writes
● Automated self-healing
○ After re-protection, the cluster is fully protected and operational
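The re-protection steps can be sketched as a toy routine. This is an assumption-laden illustration, not ClustrixDB's algorithm: `reprotect` is a hypothetical name, and "copy to the least-loaded surviving node" is just one plausible policy.

```python
from collections import Counter

def reprotect(placement, failed_node, live_nodes, replicas=2):
    """Drop replicas on the failed node and re-create them on surviving nodes."""
    load = Counter()
    survivors = {}
    for slice_id, holders in placement.items():
        survivors[slice_id] = [n for n in holders if n != failed_node]
        load.update(survivors[slice_id])
    for holders in survivors.values():
        while len(holders) < replicas:
            # Candidates are live nodes that do not already hold this slice.
            candidates = [n for n in live_nodes if n not in holders]
            if not candidates:
                break  # too few nodes left to restore full protection
            target = min(candidates, key=lambda n: load[n])
            holders.append(target)
            load[target] += 1
    return survivors

placement = {
    "S1": ["node1", "node2"],
    "S2": ["node2", "node3"],
    "S3": ["node3", "node4"],
    "S4": ["node4", "node1"],
}
healed = reprotect(placement, "node2", ["node1", "node3", "node4"])
```

After the call, every slice again has two replicas on distinct surviving nodes, which corresponds to the "fully protected" state on the slide.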
Complexity Simplified
Q: How do you ensure data stays well distributed in a clustered environment?
• Initial Data: Distributes the data into even slices across nodes
• Failed Nodes: Re-protects slices to ensure proper replicas exist
• Data Growth: Splits large slices into smaller slices
• Flex-Up/Flex-Down: Moves slices to leverage new nodes and/or evacuate nodes
• Skewed Data: Redistributes the data to even it out across nodes
• Hot Slice Balancing: Finds hot slices and balances them across nodes
• Availability Zone Support: Rebalancer will distribute replicas across zones
…while the DB stays open for reads & writes
Patents 8,543,538 and 8,554,726
A: You let the REBALANCER handle it!
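As a rough illustration of the Flex-Up case, the sketch below moves replicas from the fullest node to the emptiest one until replica counts differ by at most one. The slides do not describe the Rebalancer's actual algorithm, so this greedy loop and the `rebalance` name are assumptions.

```python
def rebalance(placement, nodes):
    """Move replicas from the fullest node to the emptiest until counts differ by at most 1."""
    placement = {s: list(h) for s, h in placement.items()}  # work on a copy
    while True:
        load = {n: 0 for n in nodes}
        for holders in placement.values():
            for n in holders:
                load[n] += 1
        fullest = max(nodes, key=lambda n: load[n])
        emptiest = min(nodes, key=lambda n: load[n])
        if load[fullest] - load[emptiest] <= 1:
            return placement
        # Only move a slice the emptiest node does not already hold,
        # so a node never ends up with two replicas of one slice.
        movable = [s for s, h in placement.items() if fullest in h and emptiest not in h]
        if not movable:
            return placement  # no legal move left
        holders = placement[movable[0]]
        holders[holders.index(fullest)] = emptiest

placement = {
    "S1": ["node1", "node2"], "S2": ["node2", "node3"], "S3": ["node3", "node1"],
    "S4": ["node1", "node2"], "S5": ["node2", "node3"], "S6": ["node3", "node1"],
}
# Flex Up: node4 joins the cluster with no slices yet.
balanced = rebalance(placement, ["node1", "node2", "node3", "node4"])
```

In this toy run the twelve replicas end up spread three per node. A real rebalancer would also weigh slice size, access heat and availability zones, as the list above indicates.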
Clustrix Technology
QUERY PROCESSING
Query Processing Model
[Diagram labels: SQL-based Applications, HW or SW Load Balancer]
● Load balancer spreads DB connections to all nodes
● A session is established on any node
● Session controls query execution
○ Parse
○ Plan
○ Compile
UPDATE users
SET online = 1
WHERE id = 8797;
[Diagram labels: Session, VM (×3), Fragment; row shown as ID: 8797 |…| ONLINE:0]
Query Processing Model
● Load balancer spreads DB connections to all nodes
● A session is established on any node
● Session controls query execution
○ Parse SQL
○ Generate the execution plan
○ Compile into fragments
○ Start the execution of initial fragment
○ Send fragments
○ Coordinate transaction completion
UPDATE users
SET online = 1
WHERE id = 8797;
[Diagram labels: Session, GTM, VM (×3), Commit (×3); row shown twice as ID: 8797 |…| ONLINE:1]
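The flow for this single-row UPDATE can be sketched in miniature. Everything here is a simplification under stated assumptions: nodes are plain dictionaries, `execute_update` and `slice_of` are hypothetical names, and there is no real networking, locking or commit protocol (the GTM shown in the diagram is not modelled).

```python
def execute_update(nodes, placement, slice_of, row_id, changes):
    """Apply an update on every replica of the row's slice, then report where it committed."""
    slice_id = slice_of(row_id)        # look up which slice holds the row
    targets = placement[slice_id]      # nodes holding replicas of that slice
    for node_name in targets:          # "send the fragment" to each replica
        nodes[node_name][row_id].update(changes)
    return targets                     # all replicas updated: commit

nodes = {
    "node1": {8797: {"id": 8797, "online": 0}},
    "node2": {8797: {"id": 8797, "online": 0}},
    "node3": {},
}
placement = {"S1": ["node1", "node2"]}
committed_on = execute_update(nodes, placement, lambda row_id: "S1", 8797, {"online": 1})
```

The session node does the parsing and planning, but the write itself runs on the nodes that hold the row, which is why both replicas show ONLINE:1 before the commit.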
Parallel Query Processing
[Diagram labels: SQL-based Applications, HW or SW Load Balancer, Session, VM (×3)]
SELECT SUM(amount)
FROM donations;
● Load balancer spreads DB connections to all nodes
● A session is established on any node
● Session controls query execution
○ Parse SQL
○ Generate the execution plan
○ Compile into fragments
○ Lookup record(s) location
○ Send fragments
○ Coordinate transaction completion
● Aggregate in parallel on all nodes
Parallel Query Processing
● Load balancer spreads DB connections to all nodes
● A session is established on any node
● Session controls query execution
○ Parse SQL
○ Generate the execution plan
○ Compile into fragments
○ Lookup record(s) location
○ Send fragments
○ Coordinate transaction completion
● Aggregate in parallel on all nodes
[Diagram labels: AGGREGATE LOCALLY (×3), AGGREGATE RESULTS]
SELECT SUM(amount)
FROM donations;
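The "aggregate locally, then aggregate results" pattern for `SELECT SUM(amount) FROM donations` can be sketched like this. Threads stand in for cluster nodes, and the `node_data` contents are made-up sample values, not data from the presentation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-node portions of the donations table (amounts only).
node_data = {
    "node1": [10, 20, 30],
    "node2": [5, 15],
    "node3": [100],
}

def aggregate_locally(amounts):
    """Fragment run on each node: sum only the rows stored there."""
    return sum(amounts)

def parallel_sum(node_data):
    """Session side: fan out the fragment, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=len(node_data)) as pool:
        partials = list(pool.map(aggregate_locally, node_data.values()))
    return sum(partials)

total = parallel_sum(node_data)
```

Only one partial sum per node travels back to the session, so the amount of data moved stays small no matter how many rows each node scans.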
Clustrix Technology
ADMINISTRATION
Online Schema Change
● Allows reads & writes during ALTER TABLE operations
[Diagram labels: Table (×2), Queue (×3), MYTABLE, __building_MYTABLE, Atomic Flip, Reads & Writes]
ALTER TABLE mytable ADD (foo int);
○ Queue is created to track changes
○ Table is copied with new column
○ Queues are applied to new copy until empty
○ Atomic flip to new table
Online Schema Change
● Allows reads & writes during ALTER TABLE operations
[Diagram labels: Table (×2), MYTABLE, __retiring_MYTABLE, Atomic Flip, Reads & Writes]
ALTER TABLE mytable ADD (foo int);
○ Queue is created to track changes
○ Table is copied with new column
○ Queues are applied to new copy until empty
○ Atomic flip to new table
○ Original table is deleted
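The queue, copy, replay and flip steps can be sketched as a toy in-memory routine. This is an assumption-heavy illustration: tables are dictionaries, `online_add_column` is a hypothetical name, and the writes that would arrive concurrently in a real system are passed in as a list and processed sequentially.

```python
def online_add_column(tables, name, column, default, concurrent_writes):
    """Copy the table with a new column, replay queued writes, then flip."""
    queue = []                                    # 1. queue tracks changes
    building = {pk: {**row, column: default}      # 2. copy with the new column
                for pk, row in tables[name].items()}
    for pk, row in concurrent_writes:             # writes arriving during the copy
        tables[name][pk] = row                    #    still hit the original table
        queue.append((pk, row))                   #    and are recorded in the queue
    while queue:                                  # 3. apply the queue until empty
        pk, row = queue.pop(0)
        building[pk] = {**row, column: default}
    tables[name] = building                       # 4. flip: one reference swap
    # 5. the original table is no longer referenced and can be discarded

tables = {"mytable": {1: {"id": 1}, 2: {"id": 2}}}
online_add_column(tables, "mytable", "foo", None, [(3, {"id": 3})])
```

The flip is a single reference swap in this sketch, which mirrors why readers and writers see either the old table or the new one and never a half-altered state.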
Backup, Replication & Disaster Recovery
● Asynchronous multi-point replication
● Parallel backup
● Replicate to any cloud, any datacenter, anywhere
ClustrixGUI: Performance Monitoring
MARIADB
Clustrix Live Demo
Peter Friedenbach
Performance Architect
MariaDB Corporation
THANK YOU!

ClustrixDB: how distributed databases scale out


Editor's Notes

  • #28: Simple queries: fielded by any node, routed to the data node. Complex queries: split into query fragments, with the fragments processed in parallel.
  • #42: Clustrix supports MySQL replication both as master and as slave, so you can replicate both ways. Within a cluster, as we saw earlier, all data has multiple copies. For disaster recovery (when a whole region loses power), Clustrix has two options: Fast Parallel Backup, which is in addition to the slower mysqldump backup, and Fast Parallel Replication, which is asynchronous across two Clustrix clusters.