Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Flexible transactional scale for the connected world.
Challenges to Scaling MySQL:
Pros & Cons of Sharding
Dave A. Anselmi @AnselmiDave
Director of Product Management
Questions for Today
o Why do DBAs think sharding is the only solution?
o What are the long-term costs of sharding?
o What is a better alternative to sharding MySQL?
o How real is it? Is it too good to be true?
PROPRIETARY & CONFIDENTIAL 2
HITTING CAPACITY
The Typical Path to Sharding…
PROPRIETARY & CONFIDENTIAL 4
[Timeline diagram: as scale grows over time, the app gets too slow, users are lost, and you hit the wall. First on a LAMP stack (AWS, Azure, RAX, GCE, or a private cloud), prompting a migration to a bigger MySQL machine; then again, prompting read slaves and then sharding: more hardware and DBAs, and refactoring of the hardwired app. Result: more expensive, higher risk, lost revenue.]
• Ongoing Refactoring
• Ongoing Hardware
• Ongoing Data balancing
• Ongoing Shard Maintenance
→ Increasing Expenses
→ Increasing Maintenance
→ Increasing Risk
PROPRIETARY & CONFIDENTIAL 5
The Typical Path to Sharding… (repeat)
[Timeline diagram repeats: each time you hit the wall, you migrate, add read slaves, shard, and refactor again. Repeat: more expensive, higher risk, lost revenue.]
This is tradition
o Lots of people do it, including Facebook
o You can hang onto MySQL
o You do get write scale… with caveats
PROPRIETARY & CONFIDENTIAL 6
‘INEVITABILITY’ of SHARDING...
Other Scaling Solutions Can’t Scale-out Writes

Scale-Up: keep increasing the size of the (single) database server
• Pros: simple, no application changes needed; the best solution for capacity, if it can handle your workload
• Cons: expensive (at some point you’re paying 5x for 2x the performance); capacity limit (most clouds provide at most 64 vCPUs for a single server)

Read Slaves: add ‘Slave’ read-server(s) to the ‘Master’ database server
• Pros: simple to implement, with lots of automation available; read/write fan-out can be done at the proxy level
• Cons: best for read-heavy workloads, since it only adds read performance; data consistency issues can occur; critical apps still need to read from the master

Master-Master: add additional ‘Master’(s) to the ‘Master’ database server
• Pros: adds read + write scaling without needing to shard; depending on workload, scaling can approach linear
• Cons: adds write scaling at the cost of read slaves, adding even more latency; application changes are required to ensure data consistency / conflict resolution
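The "read/write fan-out at the proxy level" point above can be sketched as follows. This is an illustrative toy, not any particular proxy's API; `route`, `MASTER`, and `SLAVES` are invented names.

```python
# Toy read/write splitter of the kind a SQL-aware proxy performs.
# Connection strings are stand-ins; all names are illustrative.

MASTER = "master:3306"
SLAVES = ["slave1:3306", "slave2:3306"]
_rr = 0  # round-robin cursor over the read slaves

def route(sql: str) -> str:
    global _rr
    if sql.lstrip().upper().startswith("SELECT"):
        _rr = (_rr + 1) % len(SLAVES)    # reads fan out across slaves
        return SLAVES[_rr]
    return MASTER                        # all writes go to the master

assert route("SELECT * FROM users") in SLAVES
assert route("UPDATE users SET online=1") == MASTER
```

Note the caveat from the table still applies: a slave may lag behind the master, so reads that must see the latest write still have to go to the master.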
Sharding’s Two Options

Vertical Sharding: separating tables across separate database servers
• Pros: adds both write and read scaling, depending on well-chosen table distribution; much less difficult than ‘regular’ sharding, with much of the gains
• Cons: loses transactionality, referential integrity, and ACID across shards; consistent backups across all the shards are very hard to manage; data management (skew/hotness) is significant ongoing maintenance

Horizontal Sharding: partitioning tables across separate database servers
• Pros: adds both read and write scaling, depending on well-chosen keys and low skew; the most common way to scale out both reads and writes
• Cons: loses transactionality, referential integrity, and ACID across shards; consistent backups across all the shards are very hard to manage; data management (skew/hotness) is significant ongoing maintenance
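Horizontal sharding's "well-chosen keys" requirement can be made concrete with a minimal routing sketch. All names (`shard_for`, `NUM_SHARDS`) are illustrative, not from any real sharding framework:

```python
# Hypothetical sketch: routing rows to shards by hashing a shard key.
import hashlib

NUM_SHARDS = 4  # one logical shard per MySQL server

def shard_for(user_id: int) -> int:
    # Hash the key so sequential IDs spread evenly across shards
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Every query must now be routed by the application:
assert shard_for(8797) in range(NUM_SHARDS)
# A JOIN between a users row on one shard and an orders row on
# another can no longer be expressed as a single SQL statement.
```

Changing `NUM_SHARDS` later changes almost every key's mapping, which is why re-sharding forces large data moves.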
CHALLENGES of SHARDING...
PROPRIETARY & CONFIDENTIAL 11
What are the Challenges of Sharding?
cf. “SQL Databases Don’t Scale”, 2006
o Significant Application Changes Required
o ACID Transactions across all nodes
o Cross-node Referential Integrity
o Data Skew across the shard array
o Sharding Key management
o Rolling Reboots
o Complicated Infra
o Full redundancy required for HA
o Consistent Backups across the shard array
Sharding’s Challenges: Application Changes
o Significant Application Changes required
– Do you need cross-node transactions?
– Do you need consistent data?
o ACID Transactions across all nodes
– Sharded MySQL provides no cross-shard ACID transactions
– Avoiding cross-node transactions requires an array of cross-node replication, introducing latency and significant infra complexity
o Cross-node referential integrity
– Parent/child relationships between tables (including self-references) that cross nodes aren’t automatically maintained
– Cascade UPDATEs and DELETEs aren’t automatically maintained
PROPRIETARY & CONFIDENTIAL 12
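The cross-node ACID point above can be sketched with a toy two-shard transfer. The `FakeShard` objects stand in for independent MySQL servers (all names invented); a crash between the two independent commits leaves the data inconsistent:

```python
# Hypothetical sketch of why cross-shard writes lose atomicity.

class FakeShard:
    """Stands in for one MySQL server; its commits are independent."""
    def __init__(self, balances):
        self.balances = dict(balances)

shard_a = FakeShard({"alice": 100})
shard_b = FakeShard({"bob": 0})

def transfer(amount, fail_between=False):
    shard_a.balances["alice"] -= amount   # COMMIT on shard A
    if fail_between:
        raise RuntimeError("crash between the two commits")
    shard_b.balances["bob"] += amount     # COMMIT on shard B

try:
    transfer(10, fail_between=True)
except RuntimeError:
    pass

# Money has left shard A but never arrived on shard B:
assert shard_a.balances["alice"] == 90
assert shard_b.balances["bob"] == 0
```

Avoiding this requires either never crossing shards in one transaction, or building two-phase-commit machinery in the application.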
Sharding’s Challenges: Data Maintenance
o Data Skew across the shard array
– Each shard grows independently
– Splitting (or combining) shards requires topology and/or PK changes, updates to memcache, etc.
o Sharding Key management
– Careful creation of the key is critical; updates cause both app
changes and significant data re-distribution
– Shard-key:Shard lookups require fast KVS, often memcache
o Rolling Reboots
– Often needed to reset inter-shard communication and/or
consistency
PROPRIETARY & CONFIDENTIAL 13
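The shard-key:shard lookup mentioned above is typically a fast key-value store such as memcache; a plain dict is enough to illustrate the mapping the DBA must maintain (the `assign` helper and its round-robin placement are invented for this sketch):

```python
# Minimal stand-in for the shard-key -> shard lookup table.
# In production this mapping usually lives in a fast KVS (e.g. memcache).

shard_map = {}          # shard_key -> shard id
next_shard = 0

def assign(shard_key, num_shards=4):
    global next_shard
    if shard_key not in shard_map:
        shard_map[shard_key] = next_shard % num_shards  # toy placement policy
        next_shard += 1
    return shard_map[shard_key]

# Lookups must be stable, or queries hit the wrong server:
assert assign("user:8797") == assign("user:8797")
# Re-sharding means rewriting this map AND physically moving the
# rows it points at, which is the "significant data re-distribution".
```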
Sharding’s Challenges: Infra Maintenance
o Complicated Infra
– Typically each shard has 2 to 3 nodes
– Shard mapping (often memcache)
– Replication between nodes for consistency
o Full redundancy required for HA
– Each shard typically has at least 1 full copy (cf Pinterest:
Master/Master, DR, and Backup node)
o Consistent Backups across the shard array
– At best you can recover to a point-in-time on each shard, and then
use replication to roll-forward to match the other shards
– Very involved and time-consuming
PROPRIETARY & CONFIDENTIAL 14
Review: Questions for Today
o Why do DBAs think sharding is the only end-game?
– Sharding is the only well-known way to scale-out Writes
o What are the long-term costs of sharding?
– Significant app changes are required, or business flexibility suffers
– Very complicated infrastructure is needed
– Lots of redundant hardware and DBA + DevOps OPEX
o What is a better alternative to sharding MySQL?
o How real is it? Is it too good to be true?
PROPRIETARY & CONFIDENTIAL 15
o Shared-nothing architecture → scales linearly
o Clustered → built-in redundancy
o Elastic → scale out or scale in
o Cloud-native → works on any public or private cloud
The new RDBMS requirements
16
Load Balancer
Scaling-Out Writes + Reads
without Sharding
1. Single Logical Relational Database
2. Automatic Data Distribution
3. Automatic Query Fan-Out
4. Automatic Data Rebalancing
PROPRIETARY & CONFIDENTIAL 17
Introducing ClustrixDB
What does ClustrixDB do?
PROPRIETARY & CONFIDENTIAL 19
Scale Beyond MySQL without Replication or Sharding
✓ More Transactions
✓ More Connections
✓ More Reads
✓ More Writes
Applications
ClustrixDB
ClustrixDB: Write + Read Linear Scale-Out,
Fault-tolerant, MySQL-Compatible
PROPRIETARY & CONFIDENTIAL 20
ClustrixDB
ACID Compliant
Transactions & Joins
Optimized for OLTP
Built-In Fault Tolerance
Flex-Up and Flex-Down
Minimal DB Admin
Also runs GREAT in
the Data Center
Built to run
GREAT
in any Cloud
PROPRIETARY & CONFIDENTIAL 21
ClustrixDB Overview
o Fully consistent and ACID-compliant
database
– Supports transactions
– Supports joins
– Optimized for OLTP
– Also supports reporting SQL
o All nodes are equal (no “special” node)
o All servers are read/write
o All servers accept client connections
o Tables and indexes distributed across all
nodes
– Fully automatic distribution, rebalancing, and
re-protection
Fully Distributed & Consistent Cluster
[Diagram: SQL-based applications (custom: PHP, Java, Ruby, etc.; packaged: Magento, etc.) connect at high concurrency through a HW or SW load balancer to ClustrixDB on commodity/cloud servers, joined over a private network.]
PROPRIETARY & CONFIDENTIAL 22
ClustrixDB Design: Shared Nothing
Architecture
o Query compiler
– Distribute compiled partial query fragments to
the node containing the ranking replica
o Data map
– All nodes know where all replicas are, and
current ranking replicas
o Database engine
– All nodes can perform all database operations
(no leader, aggregator, leaf, data-only, etc.
nodes)
o Data: table slices
– All table slices (default: replicas = 2) auto-redistributed by the rebalancer
Each Node Contains
[Diagram: every ClustrixDB node contains the same components: a query compiler, a data map, a database engine, and data (table slices).]
Scaling-Out Writes + Reads without Sharding
1. Single Logical Relational Database
1. ACID transactions across all nodes in cluster
2. Ad-hoc queries and on-line schema changes
2. Automatic Data Distribution
1. As tables and indexes are created
2. 64-bit consistent hashing: slice location is predictable
3. Automatic Query Fan-Out
1. Compiled query fragments routed to node with data
2. Queries processed in parallel across all nodes in cluster
4. Automatic Data Rebalancing
1. Fixes hotspots and data imbalances
2. Reprotects to ensure HA/Fault Tolerance
PROPRIETARY & CONFIDENTIAL 23
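The "slice location is predictable" point can be illustrated with a toy model. This is an assumption-laden sketch, not ClustrixDB's actual algorithm: it shows only that when placement is a pure function of the key, every node computes the same answer and can route fragments without a central lookup. (Real consistent hashing additionally minimizes data movement when nodes change.)

```python
# Illustrative model of deterministic slice location via hashing.
# NODES, SLICES_PER_TABLE, and the functions are invented names.
import hashlib

NODES = ["node1", "node2", "node3"]
SLICES_PER_TABLE = 6

def hash64(key: str) -> int:
    # Reduce a strong hash to a 64-bit value
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % (1 << 64)

def slice_for(pk: str) -> int:
    return hash64(pk) % SLICES_PER_TABLE

def node_for(table: str, pk: str) -> str:
    # Every node can compute this, so any node can route the fragment
    return NODES[slice_for(table + ":" + pk) % len(NODES)]

# Deterministic: all nodes agree where row 8797 of `users` lives.
assert node_for("users", "8797") == node_for("users", "8797")
```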
PROPRIETARY & CONFIDENTIAL 24
ClustrixDB
Automatic Data Distribution
o Tables auto-split into slices
o Every slice has a replica on another node
– Slices are auto distributed, auto protected
No Manual Intervention Required
[Diagram: database tables of billions of rows are auto-split into slices S1–S5; each slice and its replica are placed on different nodes.]
PROPRIETARY & CONFIDENTIAL 25
Adding + Removing Nodes: Scaling Out + In
o Simple Flex Up (or Flex Down)
– A single, minimal interruption of service
o All servers handle writes and reads
– Workload is spread across more servers
after Flex Up
o Data is automatically rebalanced across
the cluster
– Tables are online for reads and writes
[Diagram: when a node is added, slices S1–S5 and their replicas are redistributed across all ClustrixDB nodes.]
PROPRIETARY & CONFIDENTIAL 26
Automatic Query Fan-Out
o Load balancer spreads DB connections
across all nodes
o A session is established on any node
o The session controls query execution
– Parse SQL
– Generate the execution plan
– Compile into binary fragments
– Lookup record(s) location
– Send fragments
– Complete the transaction
[Diagram: a SQL-based application sends “UPDATE users SET online = 1 WHERE id = 8797;” through the HW or SW load balancer; the session node parses, plans, and compiles the query, then sends the compiled fragment to the node holding the row (ID: 8797 | … | ONLINE: 0).]
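The fan-out steps listed above (parse, plan, compile, locate, send) can be sketched in miniature. All names here are illustrative stand-ins, not ClustrixDB APIs:

```python
# Hedged sketch of session-driven query fan-out.

def node_owning(row_id: int, num_nodes: int = 3) -> int:
    # Stand-in for the data-map lookup of the ranking replica
    return row_id % num_nodes

def execute(session_node: int, sql: str, row_id: int) -> dict:
    plan = f"plan({sql})"                # steps 1-3: parse, plan, compile
    target = node_owning(row_id)         # step 4: locate the row's node
    # step 5: the fragment runs on the owning node, not the session node
    return {"compiled_on": session_node, "executed_on": target, "plan": plan}

r = execute(session_node=0,
            sql="UPDATE users SET online = 1 WHERE id = 8797",
            row_id=8797)
assert r["executed_on"] == node_owning(8797)
```

The key property is that the session node and the executing node need not be the same machine, which is what lets writes scale with node count.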
PROPRIETARY & CONFIDENTIAL 27
Automatic Query Fan-Out (continued)
o The fragment executes on the node owning the row, the update is applied (ID: 8797 | … | ONLINE: 1), and acknowledgments flow back through the session node to the application
PROPRIETARY & CONFIDENTIAL 28
Automatic Data Rebalancing
o The Rebalancer automatically:
– Initial Data
• Distributes the data into even slices across nodes
– Data Growth
• Splits large slices into smaller slices
– Flex Up/Flex Down
• Moves slices to leverage new nodes and/or evacuate nodes
– Failed Nodes
• Re-protects slices to ensure proper replicas exist
– Skewed Data
• Re-distributes the data to even out across nodes
– Hot Slice Balancing
• Finds hot slices and balances them across nodes
Q: How do you ensure data stays well distributed in a clustered environment?
A: Let the ClustrixDB Rebalancer handle it automatically!
…while the DB stays open for business.
Patents 8,543,538; 8,554,726; 9,348,883
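Two of the rebalancer actions named above (splitting large slices, re-protecting after imbalance) can be modeled as a toy. The threshold and function names are invented; this only shows the shape of the decision, not ClustrixDB's implementation:

```python
# Toy model of one rebalancer action: splitting oversized slices.

MAX_SLICE_ROWS = 4  # invented threshold for this sketch

def split_large(slices):
    """One rebalancer pass: any slice over the limit is cut in half."""
    out = []
    for s in slices:
        if len(s) > MAX_SLICE_ROWS:      # "Data Growth" case
            mid = len(s) // 2
            out += [s[:mid], s[mid:]]
        else:
            out.append(s)
    return out

slices = [list(range(10)), [1, 2]]
assert [len(s) for s in split_large(slices)] == [5, 5, 2]
```

A real rebalancer runs such passes continuously, online, which is what "the DB stays open for business" refers to.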
QUESTIONS?
THANK YOU!
Supplementary Slides
PROPRIETARY & CONFIDENTIAL 32
Online Schema Change
o Allows reads & writes during
ALTER TABLE operations
– Add or remove column
– Rename column
– Change data type
– CREATE INDEX
o Process:
– Queue created to track changes
– Copy of table created
– Queues replayed until synchronized
– Atomic flip of table between transactions
[Diagram: during “ALTER TABLE mytable ADD (foo int);”, reads and writes continue against MYTABLE while changes accumulate in a queue; __building_MYTABLE is populated from the copy plus the replayed queue, then an atomic flip swaps the two tables.]
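The four-step process above (queue, copy, replay, flip) can be shown as a toy in-memory version. A real implementation works inside the storage engine; this sketch only demonstrates the ordering, and every name in it is invented:

```python
# Toy queue-and-copy online schema change: ALTER TABLE ... ADD (foo int)

old_table = [{"id": 1}, {"id": 2}]
change_queue = []                 # step 1: writes are tracked here

def write(row):                   # application writes keep flowing
    old_table.append(row)
    change_queue.append(row)

# step 2: copy the table with the new column added
new_table = [dict(r, foo=None) for r in old_table]

write({"id": 3})                  # a write arrives mid-copy

while change_queue:               # step 3: replay the queue until synced
    new_table.append(dict(change_queue.pop(0), foo=None))

table = new_table                 # step 4: atomic flip between transactions
assert len(table) == 3 and all("foo" in r for r in table)
```

No write was lost and no read was blocked, which is the point of the atomic flip.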
PROPRIETARY & CONFIDENTIAL 33
Online Schema Change (continued)
o Once the queue is replayed, the atomic flip swaps __building_MYTABLE in for MYTABLE between transactions; reads and writes continue throughout the ALTER TABLE
PERFORMANCE BENCHMARKS
SYSBENCH & YCSB
PROPRIETARY & CONFIDENTIAL 35
Sysbench OLTP 90:10 Mix
o 90% Reads + 10% Writes
– Very typical workload mix
o 1 TPS = 10 SQL statements
– 9 SELECTs + 1 UPDATE
– a.k.a. 10 operations/sec
o Shows scaling TPS by
adding servers:
– Oak4 = 4 (8-core) servers
– Oak16 = 16 (8-core) servers
– Oak28 = 28 (8-core) servers
To scale TPS while maintaining performance (i.e. 20 ms response), just add servers to ClustrixDB
800,000 SQL/sec
@ 20 ms
PROPRIETARY & CONFIDENTIAL 36
ClustrixDB vs. Aurora vs. MySQL RDS
o 90% Reads + 10% Writes
– Very typical workload mix
o 1 TPS = 10 SQL statements
– 9 SELECTs + 1 UPDATE
– a.k.a. 10 operations/sec
o ClustrixDB shows scaling TPS
by adding servers:
– Aws4 = 4 (8-core) servers
– Aws16 = 16 (8-core) servers
– Aws20 = 20 (8-core) servers
ClustrixDB scales TPS far past Aurora’s
largest instance (db.r3.8xlarge)
MySQL
(largest)
Aurora
(largest)
ClustrixDB
PROPRIETARY & CONFIDENTIAL 37
Sysbench 100% Reads:
2.2 Million QPS @ 3ms latency (64 servers)
o Achieved 2.2 Million
SQL/sec
– With <3 ms query response
– Using 64 ClustrixDB servers
• 8-core AWS instances
Demonstrates linear scalability
to 64 servers
2,200,000 SQL/sec
@ 3 ms
PROPRIETARY & CONFIDENTIAL 38
Yahoo! Cloud Service Benchmark (YCSB)
95:05 Mix
o 95% Reads + 5% Writes
– 1 op/sec = 1 SQL statement
o Designed to compare NoSQL
engines
– ClustrixDB runs equally fast
o Shows scaling TPS by
adding servers:
– 4 nodes = 4 (8-core) servers
– 16 nodes = 16 (8-core) servers
– 24 nodes = 24 (8-core) servers
To scale TPS while maintaining performance (i.e. 3 ms response), just add servers to ClustrixDB
> 875,000
SQL/sec
@ 3 ms

More Related Content

What's hot (20)

PDF
MariaDB on Docker
MariaDB plc
 
PPT
Scaling Up and Out your Virtualized SQL Servers
heraflux
 
PPTX
Running MariaDB in multiple data centers
MariaDB plc
 
PPTX
Right-Sizing your SQL Server Virtual Machine
heraflux
 
PPTX
Azure Databases with IaaS
Kellyn Pot'Vin-Gorman
 
PPTX
NoSQL A brief look at Apache Cassandra Distributed Database
Joe Alex
 
PDF
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
confluent
 
PDF
Webinar slides: Managing MySQL Replication for High Availability
Severalnines
 
PDF
Become a MySQL DBA: performing live database upgrades - webinar slides
Severalnines
 
PDF
Apache Cassandra Certification
Vskills
 
PDF
Client Drivers and Cassandra, the Right Way
DataStax Academy
 
PDF
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Dave Anselmi
 
PDF
Using all of the high availability options in MariaDB
MariaDB plc
 
PPTX
2015 GHC Presentation - High Availability and High Frequency Big Data Analytics
Esther Kundin
 
PPTX
How Alibaba Cloud scaled ApsaraDB with MariaDB MaxScale
MariaDB plc
 
PPTX
Kafka at Scale: Multi-Tier Architectures
Todd Palino
 
PPTX
MariaDB High Availability
MariaDB plc
 
PDF
Cassandra at eBay - Cassandra Summit 2013
Jay Patel
 
PDF
Best Practice for Achieving High Availability in MariaDB
MariaDB plc
 
PDF
Dev Ops without the Ops
Konstantin Gredeskoul
 
MariaDB on Docker
MariaDB plc
 
Scaling Up and Out your Virtualized SQL Servers
heraflux
 
Running MariaDB in multiple data centers
MariaDB plc
 
Right-Sizing your SQL Server Virtual Machine
heraflux
 
Azure Databases with IaaS
Kellyn Pot'Vin-Gorman
 
NoSQL A brief look at Apache Cassandra Distributed Database
Joe Alex
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
confluent
 
Webinar slides: Managing MySQL Replication for High Availability
Severalnines
 
Become a MySQL DBA: performing live database upgrades - webinar slides
Severalnines
 
Apache Cassandra Certification
Vskills
 
Client Drivers and Cassandra, the Right Way
DataStax Academy
 
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Dave Anselmi
 
Using all of the high availability options in MariaDB
MariaDB plc
 
2015 GHC Presentation - High Availability and High Frequency Big Data Analytics
Esther Kundin
 
How Alibaba Cloud scaled ApsaraDB with MariaDB MaxScale
MariaDB plc
 
Kafka at Scale: Multi-Tier Architectures
Todd Palino
 
MariaDB High Availability
MariaDB plc
 
Cassandra at eBay - Cassandra Summit 2013
Jay Patel
 
Best Practice for Achieving High Availability in MariaDB
MariaDB plc
 
Dev Ops without the Ops
Konstantin Gredeskoul
 

Similar to Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL? (20)

PPTX
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Clustrix
 
PPTX
Beyond Aurora. Scale-out SQL databases for AWS
Clustrix
 
PDF
Introduction to ClustrixDB
I Goo Lee
 
PPTX
Scaling SQL and NoSQL Databases in the Cloud
RightScale
 
PDF
Scaling Your Database In The Cloud
Cory Isaacson
 
PDF
Benchmark: Beyond Aurora. Scale-out SQL databases for AWS.
Clustrix
 
PDF
20141206 4 q14_dataconference_i_am_your_db
hyeongchae lee
 
PPTX
NOSQL DATABASES UNIT-3 FOR ENGINEERING STUDENTS
Abcd463572
 
PPTX
Distribution Models.pptxgdfgdfgdfgfdgdfg
zmulani8
 
PDF
MySQL Conference 2011 -- The Secret Sauce of Sharding -- Ryan Thiessen
ryanthiessen
 
PPTX
Database highload solutions
larest
 
PPTX
CodeFutures - Scaling Your Database in the Cloud
RightScale
 
PDF
What every developer should know about database scalability, PyCon 2010
jbellis
 
PPTX
Database highload solutions
larest
 
PDF
MongoDB Sharding
uzzal basak
 
PPT
Scaling Your Web Application
Ketan Deshmukh
 
PDF
Database Cloud Services Office Hours : Oracle sharding hyperscale globally d...
Tammy Bednar
 
PPT
Everything You Need to Know About Sharding
MongoDB
 
PDF
Scalability Considerations
Navid Malek
 
PDF
How We Scaled Freshdesk To Take 65M Requests/week
Freshdesk Inc.
 
Database Architecture & Scaling Strategies, in the Cloud & on the Rack
Clustrix
 
Beyond Aurora. Scale-out SQL databases for AWS
Clustrix
 
Introduction to ClustrixDB
I Goo Lee
 
Scaling SQL and NoSQL Databases in the Cloud
RightScale
 
Scaling Your Database In The Cloud
Cory Isaacson
 
Benchmark: Beyond Aurora. Scale-out SQL databases for AWS.
Clustrix
 
20141206 4 q14_dataconference_i_am_your_db
hyeongchae lee
 
NOSQL DATABASES UNIT-3 FOR ENGINEERING STUDENTS
Abcd463572
 
Distribution Models.pptxgdfgdfgdfgfdgdfg
zmulani8
 
MySQL Conference 2011 -- The Secret Sauce of Sharding -- Ryan Thiessen
ryanthiessen
 
Database highload solutions
larest
 
CodeFutures - Scaling Your Database in the Cloud
RightScale
 
What every developer should know about database scalability, PyCon 2010
jbellis
 
Database highload solutions
larest
 
MongoDB Sharding
uzzal basak
 
Scaling Your Web Application
Ketan Deshmukh
 
Database Cloud Services Office Hours : Oracle sharding hyperscale globally d...
Tammy Bednar
 
Everything You Need to Know About Sharding
MongoDB
 
Scalability Considerations
Navid Malek
 
How We Scaled Freshdesk To Take 65M Requests/week
Freshdesk Inc.
 
Ad

More from Clustrix (10)

PDF
Achieve new levels of performance for Magento e-commerce sites.
Clustrix
 
PDF
ClustrixDB 7.5 Announcement
Clustrix
 
PPTX
Moving an E-commerce Site to AWS. A Case Study
Clustrix
 
PPTX
Benchmark Showdown: Which Relational Database is the Fastest on AWS?
Clustrix
 
PPTX
Scaling Techniques to Increase Magento Capacity
Clustrix
 
PPTX
Supersizing Magento
Clustrix
 
PDF
Why Traditional Databases Fail so Miserably to Scale with E-Commerce Site Growth
Clustrix
 
PDF
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
Clustrix
 
PPTX
Clustrix Database Overview
Clustrix
 
PPTX
Clustrix Database Percona Ruby on Rails benchmark
Clustrix
 
Achieve new levels of performance for Magento e-commerce sites.
Clustrix
 
ClustrixDB 7.5 Announcement
Clustrix
 
Moving an E-commerce Site to AWS. A Case Study
Clustrix
 
Benchmark Showdown: Which Relational Database is the Fastest on AWS?
Clustrix
 
Scaling Techniques to Increase Magento Capacity
Clustrix
 
Supersizing Magento
Clustrix
 
Why Traditional Databases Fail so Miserably to Scale with E-Commerce Site Growth
Clustrix
 
E-Commerce Success is a Balancing Act. Ensure Success with ClustrixDB.
Clustrix
 
Clustrix Database Overview
Clustrix
 
Clustrix Database Percona Ruby on Rails benchmark
Clustrix
 
Ad

Recently uploaded (20)

PDF
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
PPTX
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
PDF
"Beyond English: Navigating the Challenges of Building a Ukrainian-language R...
Fwdays
 
PDF
"AI Transformation: Directions and Challenges", Pavlo Shaternik
Fwdays
 
PDF
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
DOCX
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
PDF
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
PPTX
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
PDF
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
PPTX
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
PDF
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
PPTX
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
PDF
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
PDF
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
PDF
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
PDF
CIFDAQ Market Insights for July 7th 2025
CIFDAQ
 
PDF
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PPTX
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
PDF
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
PPTX
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
"Beyond English: Navigating the Challenges of Building a Ukrainian-language R...
Fwdays
 
"AI Transformation: Directions and Challenges", Pavlo Shaternik
Fwdays
 
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
CIFDAQ Market Insights for July 7th 2025
CIFDAQ
 
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 

Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?

  • 1. Flexible transactional scale for the connected world. Challenges to Scaling MySQL: Pros & Cons of Sharding Dave A. Anselmi @AnselmiDave Director of Product Management
  • 2. Questions for Today o Why do DBAs think sharding is the only solution? o What are the long-term costs of sharding? o What is a better alternative to sharding MySQL? o How real is it? Is it too good to be true? PROPRIETARY & CONFIDENTIAL 2
  • 4. The Typical Path to Sharding… PROPRIETARY & CONFIDENTIAL 4 SCALE T I M E App Too Slow Lost Users Hit the wall App Too Slow Lost Users Hit the wall (again) LAMP stack AWS, Azure, RAX, GCE, etc Private cloud Migrate to bigger MySQL machine • Read slaves, then Sharding, etc: • Add more hardware & DBAs • Refactor Code /Hardwired App  More Expensive  Higher Risk  Lost Revenue
  • 5. • Ongoing Refactoring • Ongoing Hardware • Ongoing Data balancing • Ongoing Shard Maintenance  Increasing Expenses  Increasing Maintenance  Increasing Risk PROPRIETARY & CONFIDENTIAL 5 SCALE T I M E App Too Slow Lost Users Hit the wall App Too Slow Lost Users Hit the wall (again) LAMP stack AWS, Azure, RAX, GCE, etc Private cloud Migrate to bigger MySQL machine Repeat  The Typical Path to Sharding… • Read slaves, then Sharding, etc: • Add more hardware & DBAs • Refactor Code /Hardwired App  More Expensive  Higher Risk  Lost Revenue
  • 6. This is tradition o Lots of people do it, including Facebook o You can hang onto MySQL o You do get write scale… with caveats PROPRIETARY & CONFIDENTIAL 6
  • 8. Other Scaling Solutions Can’t Scale-out Writes Approach How Pro’s Con’s Scale-Up Keep increasing the size of the (single) database server • Simple, no application changes needed • Best solution for capacity, if it can handle your workload • Expensive. At some point, you’re paying 5x for 2x the performance • Capacity Limit. Most clouds provide up to 64 ‘vcpu’s at most for a single server Read Slaves Add a ‘Slave’ read- server(s) to ‘Master’ database server • Simple to implement, lots of automation available • Read/write fan-out can be done at the proxy level • Best for read-heavy workloads- only adds Read performance • Data consistency issues can occur • Critical apps still need to read from master Master-Master Add additional ‘Master’(s) to ‘Master’ database server • Adds read + write scaling without needing to shard • Depending on workload, scaling can approach linear • Adds write scaling at the cost of read-slaves, adding even more latency • Application changes required to ensure data consistency / conflict resolution
  • 9. Sharding’s Two Options Approach How Pro’s Con’s Vertical Sharding Separating tables across separate database servers • Adds both write and read scaling, depending on well-chosen table distribution • Much less difficult than ‘regular’ sharding, with much of the gains • Loses transactionality, referential integrity and ACID across shards • Consistent backups across all the shards are very hard to manage • Data management (skew/hotness) is ongoing significant maintenance Horizontal Sharding Partitioning tables across separate database servers • Adds both read and write scaling, depending on well-chosen keys and low skew • Most common way to scale-out both reads and writes • Loses transactionality, referential integrity and ACID across shards • Consistent backups across all the shards are very hard to manage • Data management (skew/hotness) is ongoing significant maintenance
  • 11. PROPRIETARY & CONFIDENTIAL 11 What are the Challenges of Sharding? c.f. “SQL Databases Don’t Scale”, 2006 Data Skew across the shard array ACID Transactions across all nodes Significant Application Changes Required Consistent Backups across the shard array Complicated Infra Full redundancy required for HA Rolling Reboots Sharding Key management Cross-node Referential Integrity
  • 12. Sharding’s Challenges: Application Changes o Significant Application Changes required – Do you need cross-node transactions? – Do you need consistent data? o ACID Transactions across all nodes – No RDBMS ensures cross-node ACID transactions – Avoiding cross-node transactions requires array of cross-node replication, introducing latency and significant infra complexity o Cross-node referential integrity – Parent/child relationships between tables (& self-referentiality) which cross nodes aren’t automatically maintained – Cascade UPDATEs and DELETEs aren’t automatically maintained PROPRIETARY & CONFIDENTIAL 12
  • 13. Sharding’s Challenges: Data Maintenance o Data Skew across the shard array – Each shard grows independently – As shards split (or are combined) requires topology and/or PK changes, updates to memcache, etc o Sharding Key management – Careful creation of the key is critical; updates cause both app changes and significant data re-distribution – Shard-key:Shard lookups require fast KVS, often memcache o Rolling Reboots – Often needed to reset inter-shard communication and/or consistency PROPRIETARY & CONFIDENTIAL 13
  • 14. Sharding’s Challenges: Infra Maintenance o Complicated Infra – Typically each shard has 2 to 3 nodes – Shard mapping (often memcache) – Replication between nodes for consistency o Full redundancy required for HA – Each shard typically has at least 1 full copy (cf Pinterest: Master/Master, DR, and Backup node) o Consistent Backups across the shard array – At best you can recover to a point-in-time on each shard, and then use replication to roll-forward to match the other shards – Very involved and time-consuming PROPRIETARY & CONFIDENTIAL 14
  • 15. Review: Questions for Today o Why do DBAs think sharding is the only end-game? – Sharding is the only well-known way to scale-out Writes o What are the long-term costs of sharding? – Significant app changes required, else Business flexibility suffers – Very complicated infrastructure is needed – Lots of redundant hardware and DBA + DevOps OPEX o What is a better alternative to sharding MySQL? o How real is it? Is it too good to be true? PROPRIETARY & CONFIDENTIAL 15
  • 16. o Shared-nothing architecture  scales linearly o Clustered  Built-in redundancy o Elastic  Scale out or scale in o Cloud-native  Works on any public or private cloud The new RDBMS requirements 16 Load Balancer
  • 17. Scaling-Out Writes + Reads without Sharding 1. Single Logical Relational Database 2. Automatic Data Distribution 3. Automatic Query Fan-Out 4. Automatic Data Rebalancing PROPRIETARY & CONFIDENTIAL 17
  • 19. What does ClustrixDB do? PROPRIETARY & CONFIDENTIAL 19 Scale Beyond MySQL without Replication or Sharding  More Transactions  More Connections  More Reads  More Writes Applications ClustrixDB
  • 20. ClustrixDB: Write + Read Linear Scale-Out, Fault-tolerant, MySQL-Compatible PROPRIETARY & CONFIDENTIAL 20 ClustrixDB ACID Compliant Transactions & Joins Optimized for OLTP Built-In Fault Tolerance Flex-Up and Flex-Down Minimal DB Admin Also runs GREAT in the Data Center Built to run GREAT in any Cloud
  • 21. PROPRIETARY & CONFIDENTIAL 21 ClustrixDB Overview o Fully consistent and ACID-compliant database – Supports transactions – Supports joins – Optimized for OLTP – Also supports reporting SQL o All nodes are equal (no “special” node) o All servers are read/write o All servers accept client connections o Tables and indexes distributed across all nodes – Fully automatic distribution, rebalancing, and re-protection Fully Distributed & Consistent Cluster PrivateNetwork ClustrixDB on commodity/cloud servers HW or SW Load Balancer High Concurrency Custom: PHP, Java, Ruby, etc Packaged: Magento, etc SQL-Based Applications
  • 22. PROPRIETARY & CONFIDENTIAL 22 ClustrixDB Design: Shared Nothing Architecture o Query compiler – Distribute compiled partial query fragments to the node containing the ranking replica o Data map – All nodes know where all replicas are, and current ranking replicas o Database engine – All nodes can perform all database operations (no leader, aggregator, leaf, data-only, etc. nodes) o Data: table slices – All table slices (default: replicas = 2) auto- redistributed by the rebalancer Each Node Contains ClustrixDB Compiler Map Engine Data Map Data Map Data Compiler Compiler Engine Engine Map
  • 23. Scaling-Out Writes + Reads without Sharding 1. Single Logical Relational Database 1. ACID transactions across all nodes in cluster 2. Ad-hoc queries and on-line schema changes 2. Automatic Data Distribution 1. As tables and indexes are created 2. 64bit consistent hashing: slice location is predictable 3. Automatic Query Fan-Out 1. Compiled query fragments routed to node with data 2. Queries processed in parallel across all nodes in cluster 4. Automatic Data Rebalancing 1. Fixes hotspots and data imbalances 2. Reprotects to ensure HA/Fault Tolerance PROPRIETARY & CONFIDENTIAL 23
  • 24. ClustrixDB Automatic Data Distribution o Tables are auto-split into slices o Every slice has a replica on another node – Slices are auto-distributed and auto-protected o No manual intervention required [Diagram: database tables of billions of rows split into slices S1–S5, each slice with a replica on a different node]
  • 25. Adding + Removing Nodes: Scaling Out + In o Easy, simple Flex-Up (or Flex-Down) – A single, minimal interruption of service o All servers handle writes and reads – Workload spreads across more servers after Flex-Up o Data is automatically rebalanced across the cluster – Tables stay online for reads and writes [Diagram: slices S1–S5 and their replicas redistributed across the ClustrixDB cluster as a node is added]
  • 26. Automatic Query Fan-Out o Load balancer spreads DB connections across all nodes o A session is established on any node o The session controls query execution: – Parse SQL – Generate the execution plan – Compile into binary fragments – Look up record locations – Send fragments – Complete the transaction [Diagram: the session parses, plans, and compiles UPDATE users SET online = 1 WHERE id = 8797; the fragment is routed to the node holding row ID 8797 (ONLINE: 0)]
  • 27. Automatic Query Fan-Out (continued) [Diagram: the fragment executes on the owning node (ONLINE: 1); acks flow back to the session, which completes the transaction]
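The session lifecycle on the two slides above can be sketched as follows. This is a hypothetical illustration of the parse/plan/compile/route/ack flow; the function names and the modulo-based slice lookup are assumptions, not the actual ClustrixDB protocol:

```python
data_map = {0: "node0", 1: "node1", 2: "node2", 3: "node3"}  # slice -> ranking replica


def send(node: str, fragment: str) -> str:
    """Stand-in for the network round trip to the owning node."""
    return f"ack from {node}"


def execute(sql: str, key: int) -> str:
    plan = f"plan({sql})"                    # parse SQL, generate execution plan
    fragment = f"compiled({plan})"           # compile into a binary fragment
    target = data_map[key % len(data_map)]   # look up the record's location
    return send(target, fragment)            # send fragment; ack completes the txn


result = execute("UPDATE users SET online = 1 WHERE id = 8797", key=8797)
```

The point of the design: because the fragment moves to the data (rather than the data to a coordinator), adding nodes adds both storage and execution capacity.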
  • 28. Automatic Data Rebalancing o The Rebalancer automatically handles: – Initial data: distributes the data into even slices across nodes – Data growth: splits large slices into smaller slices – Flex-Up/Flex-Down: moves slices to leverage new nodes and/or evacuate departing nodes – Failed nodes: re-protects slices to ensure the proper replicas exist – Skewed data: re-distributes the data to even it out across nodes – Hot slices: finds hot slices and balances them across nodes o Q: How do you ensure data stays well distributed in a clustered environment? A: Let the ClustrixDB Rebalancer handle it automatically, while the DB stays open for business (Patents 8,543,538; 8,554,726; 9,348,883)
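Two of the rebalancer's jobs above, split-on-growth and re-protect-on-failure, can be sketched in a few lines. All thresholds, names, and data structures here are assumptions for illustration, not ClustrixDB internals:

```python
MAX_SLICE_ROWS = 1000   # assumed split threshold
REPLICAS = 2            # default protection factor from the slides
nodes = ["node0", "node1", "node2"]

slices = {  # slice -> (row_count, nodes holding a replica)
    "t1_s1": (1500, ["node0", "node1"]),  # too big: should be split
    "t1_s2": (400, ["node2"]),            # lost a replica: should be re-protected
}


def rebalance(slices):
    out = {}
    for name, (rows, holders) in slices.items():
        if rows > MAX_SLICE_ROWS:          # data growth: split large slices
            out[name + "a"] = (rows // 2, list(holders))
            out[name + "b"] = (rows - rows // 2, list(holders))
        elif len(holders) < REPLICAS:      # failed node: add a replica elsewhere
            spare = next(n for n in nodes if n not in holders)
            out[name] = (rows, holders + [spare])
        else:
            out[name] = (rows, holders)
    return out


result = rebalance(slices)
```

A real rebalancer runs these decisions continuously and moves data online; the sketch only shows the bookkeeping that each pass performs.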
  • 32. Online Schema Change o Allows reads & writes during ALTER TABLE operations – Add or remove a column – Rename a column – Change a data type – CREATE INDEX o Process: – A queue is created to track changes – A copy of the table is created – The queue is replayed until synchronized – Atomic flip of the tables between transactions [Diagram: ALTER TABLE mytable ADD (foo int); writes to MYTABLE are queued and replayed into __building_MYTABLE ahead of the atomic flip]
  • 33. Online Schema Change (continued) [Diagram: after the atomic flip, __building_MYTABLE becomes MYTABLE; reads & writes continue throughout]
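The four-step process on these slides (queue, copy, replay, flip) can be sketched with plain data structures. This is an illustrative model under assumed names, not the actual ClustrixDB implementation:

```python
tables = {"mytable": [{"id": 1}]}   # live table, old schema
queue = []                          # change queue tracking writes during the copy


def write(row):
    """Writes keep landing on the live table during the ALTER; each one
    is also recorded in the change queue."""
    tables["mytable"].append(row)
    queue.append(row)


# 1-2. Create a copy with the new schema (ADD foo int, default NULL)
building = [dict(r, foo=None) for r in tables["mytable"]]

write({"id": 2})                    # a write arrives mid-copy

# 3. Replay queued changes until the copy is synchronized
while queue:
    building.append(dict(queue.pop(0), foo=None))

# 4. Atomic flip of the table between transactions
tables["mytable"] = building
```

The flip is just a metadata swap, which is why readers and writers see either the old table or the new one, never a half-migrated state.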
  • 35. Sysbench OLTP 90:10 Mix o 90% reads + 10% writes – A very typical workload mix o 1 TPS = 10 SQL statements – 9 SELECT + 1 UPDATE – i.e., 10 operations/sec o Shows TPS scaling by adding servers: – Oak4 = 4 (8-core) servers – Oak16 = 16 (8-core) servers – Oak28 = 28 (8-core) servers o To scale TPS while maintaining performance (i.e., 20 ms response), just add servers to ClustrixDB [Chart: 800,000 SQL/sec @ 20 ms]
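The arithmetic behind the headline number, using only figures from the slide (the per-server share assumes the 800,000 SQL/sec result is the 28-server Oak28 configuration):

```python
# 1 TPS = 10 SQL statements (9 SELECT + 1 UPDATE)
sql_per_txn = 9 + 1
sql_per_sec = 800_000

tps = sql_per_sec // sql_per_txn       # headline throughput in transactions/sec
per_server = sql_per_sec / 28          # rough per-server share on Oak28
```

So 800,000 SQL/sec at a 90:10 mix corresponds to 80,000 TPS, or roughly 28,500 SQL/sec per 8-core server.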
  • 36. ClustrixDB vs. Aurora vs. MySQL RDS o 90% reads + 10% writes – A very typical workload mix o 1 TPS = 10 SQL statements – 9 SELECT + 1 UPDATE – i.e., 10 operations/sec o ClustrixDB shows TPS scaling by adding servers: – Aws4 = 4 (8-core) servers – Aws16 = 16 (8-core) servers – Aws20 = 20 (8-core) servers o ClustrixDB scales TPS far past Aurora's largest instance (db.r3.8xlarge) [Chart: MySQL (largest), Aurora (largest), ClustrixDB]
  • 37. Sysbench 100% Reads: 2.2 Million QPS @ 3 ms Latency (64 Servers) o Achieved 2.2 million SQL/sec – With <3 ms query response – Using 64 ClustrixDB servers (8-core AWS instances) o Demonstrates linear scalability to 64 servers [Chart: 2,200,000 SQL/sec @ 3 ms]
  • 38. Yahoo! Cloud Serving Benchmark (YCSB) 95:05 Mix o 95% reads + 5% writes – 1 ops/sec = 1 SQL statement o Designed to compare NoSQL engines – ClustrixDB runs the same workload equally fast o Shows TPS scaling by adding servers: – 4 nodes = 4 (8-core) servers – 16 nodes = 16 (8-core) servers – 24 nodes = 24 (8-core) servers o To scale TPS while maintaining performance (i.e., 3 ms response), just add servers to ClustrixDB [Chart: >875,000 SQL/sec @ 3 ms]