Scaling MongoDB
Sharding into, and beyond, the Multi-Terabyte Range
MongoNYC 2013
Kenny Gorman
Founder, ObjectRocket
@objectrocket @kennygorman
Philosophy
Design
@scale
Philosophy
scalability is the ability of a system, network, or
process to handle a growing amount of work in a
capable manner or its ability to be enlarged to
accommodate that growth.
- André B. Bondi, 'Characteristics of scalability and their impact on
performance', Proceedings of the 2nd international workshop on Software and
performance
Philosophy
● Benefits of scaling horizontally
○ Adding compute in linear relation to storage
○ Use many smaller systems
○ Cost benefits
○ Start small, grow over time
○ Incremental approach
Philosophy
● Do I need to scale?
yes...
and...
no...
Philosophy
● Scaling vertically works only for a while
○ Ratios get out of whack
○ You are going to wish you did something else for a living
● Split workloads
○ Figure out what needs to be colocated with what.
● Then scale horizontally
○ Shard!
Design
● Scaling vertically vs horizontally
● MongoDB horizontal scalability: sharding
○ Sharding architecture
○ Sharding keys and collections
● Achieving your scaling goals
○ Tuning for writes
○ Tuning for reads
○ Lock scopes
Design - Architecture
[Diagram: three tiers. Client; Routing (mongos) and Metadata (config servers); Data (shards)]
Design - Shard keys and patterns
● Keys
○ Range based keys
■ Know your access patterns
○ Hash based keys
■ More generic/easy option
○ Use the profiler and explain() to identify your queries and the key
pattern that fits them.
○ Local and Scatter Gather
■ May not get every query to be local
PLAN AHEAD
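The difference between a local (targeted) query and a scatter-gather query can be sketched in plain JavaScript. This is a toy model, not mongos code: the chunk table, the shard names, the `user_id` shard key, and the `targetShards` helper are all hypothetical, and a range-based key is assumed.

```javascript
// Toy routing table for a range-sharded collection (hypothetical data).
// Each chunk covers the half-open range [min, max) on the shard key "user_id".
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard001" },
  { min: 1000, max: 2000, shard: "shard002" },
  { min: 2000, max: Infinity, shard: "shard003" },
];

// Return the shards a query has to visit.
// A query that includes the shard key can be routed to the chunk that
// contains the value; a query without the shard key goes to every shard.
function targetShards(query, shardKey = "user_id") {
  const allShards = [...new Set(chunks.map((c) => c.shard))];
  if (!(shardKey in query)) {
    return allShards; // scatter-gather
  }
  const value = query[shardKey];
  return chunks
    .filter((c) => value >= c.min && value < c.max)
    .map((c) => c.shard); // local: a single shard for an equality match
}

console.log(targetShards({ user_id: 1500 }));            // local
console.log(targetShards({ email: "user@example.com" })); // scatter-gather
```

If most of the queries found with the profiler look like the second call, the shard key probably does not match the access pattern.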
Design - Sharding Collections
● Shard Collection
● Based on a shard key
○ Range based
○ Hash based
● Chunks
● Chunk location
○ Balancer
○ Manual
{
    "_id" : "mydb.users",
    "lastmod" : ISODate("1970-01-16T20:33:39.757Z"),
    "dropped" : false,
    "key" : {
        "_id" : "hashed"
    },
    "unique" : false,
    "lastmodEpoch" : ObjectId("51a8d7a261c75a12f1c7f833")
}
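One reason a hash-based key is the more generic option is that it spreads monotonically increasing values (such as counters or timestamps) across shards, while a range-based key sends them all to the last range. The sketch below illustrates that effect only. The range boundaries are made up, and the FNV-1a function is a stand-in chosen for brevity; it is not the hash MongoDB uses for hashed shard keys.

```javascript
const SHARDS = 4;

// Stand-in 32-bit FNV-1a hash. NOT MongoDB's hashed-key function.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hypothetical range boundaries, chosen when keys 0..999 already existed:
// shard 0 = [.., 250), shard 1 = [250, 500), shard 2 = [500, 750), shard 3 = [750, ..)
const BOUNDARIES = [250, 500, 750];

function rangeShard(key) {
  return BOUNDARIES.filter((b) => key >= b).length;
}

function hashShard(key) {
  return fnv1a(String(key)) % SHARDS;
}

// Insert 1000 new, ever-increasing keys and count where they land.
const rangeCounts = new Array(SHARDS).fill(0);
const hashCounts = new Array(SHARDS).fill(0);
for (let key = 1000; key < 2000; key++) {
  rangeCounts[rangeShard(key)]++;
  hashCounts[hashShard(key)]++;
}

console.log("range-based:", rangeCounts); // every insert hits the last shard
console.log("hash-based: ", hashCounts);  // inserts are spread over all shards
```

The trade-off is that hashing gives up range locality, so range queries on the key become scatter-gather.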
Design - Chunks
[Diagram: collection test.foo split into 64 MB chunks spread across Shard 001 and Shard 002, with the Balancer moving chunks between shards]
Design - Chunks
{
    "_id" : "mydb.users-_id_-2315986245884394206",
    "lastmod" : { "t" : 89, "i" : 0 },
    "lastmodEpoch" : ObjectId("51a8d7a261c75a12f1c7f833"),
    "ns" : "mydb.users",
    "min" : { "_id" : NumberLong("-2315986245884394206") },
    "max" : { "_id" : NumberLong("-2237069208547820477") },
    "shard" : "d3b07384d113edec49eaa6238ad5ff00"
}
{
    "_id" : "mydb.users-_id_-2395340237016371204",
    "lastmod" : { "t" : 88, "i" : 0 },
    "lastmodEpoch" : ObjectId("51a8d7a261c75a12f1c7f833"),
    "ns" : "mydb.users",
    "min" : { "_id" : NumberLong("-2395340237016371204") },
    "max" : { "_id" : NumberLong("-2315986245884394206") },
    "shard" : "15e894ac57eddb32713e7eae90d13e41"
}
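Each chunk document describes a half-open range of hashed key values, [min, max), and the shard that owns it. A minimal sketch of that lookup, using the two ranges shown above with the NumberLong values written as BigInt literals; the `findChunk` helper is hypothetical and only models the range check.

```javascript
// The two chunk ranges from the documents above (NumberLong -> BigInt).
const chunks = [
  {
    min: -2315986245884394206n,
    max: -2237069208547820477n,
    shard: "d3b07384d113edec49eaa6238ad5ff00",
  },
  {
    min: -2395340237016371204n,
    max: -2315986245884394206n,
    shard: "15e894ac57eddb32713e7eae90d13e41",
  },
];

// A chunk owns hashed values v with min <= v < max.
// Returns null when no listed chunk covers the value.
function findChunk(hashedId) {
  return chunks.find((c) => hashedId >= c.min && hashedId < c.max) || null;
}

// The boundary value belongs to the chunk whose min it is, not the one
// whose max it is.
console.log(findChunk(-2315986245884394206n).shard);
console.log(findChunk(-2350000000000000000n).shard);
console.log(findChunk(0n)); // outside the two ranges listed here
```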
Design - takeaways
● Getting the proper shard key is critical. Once defined it's
a pain in the a^$ to change.
● Creating a shard key that achieves your goals can
sometimes be tricky. Take time to test this portion in
dev/sandbox environments.
● Use the profiler and explain() to check whether your shard keys
match your queries.
● Understanding the Balancer's effect on your workload is
critical. You probably need more I/O capacity than you
think.
@scale - balancing
● Balancing is hard
○ Visibility
○ Balancer process
@scale - balancing
// Count chunks per shard, optionally for one namespace n (e.g. "mydb.users").
// Assumes the shell is connected to a mongos and is using the config database.
var balance_check = function(n) {
    var pipeline = [
        { $group : { _id : { "_id" : "$ns", "shard" : "$shard" },
                     chunks : { $sum : 1 } } },
        { $sort : { "chunks" : 1 } }
    ];
    if ( n ) {
        // Keep only the requested namespace, between $group and $sort
        pipeline.splice(1, 0, { $match : { "_id._id" : n } });
    }
    var output = db.chunks.aggregate(pipeline);
    printjson(output);
};
https://gist.github.com/kgorman/5775530
@scale - balancing
mongos> balance_check("mydb.users")
{ "result" : [
    { "_id" : {
          "_id" : "mydb.users",
          "shard" : "884e49a58a63060782d767feed8e6c88"
      },
      "chunks" : 1 #<------ !!!!!!! OH NO
    },
    { "_id" : {
          "_id" : "mydb.users",
          "shard" : "15e894ac57eddb32713e7eae90d13e41"
      },
      "chunks" : 77
    },
    { "_id" : {
          "_id" : "mydb.users",
          "shard" : "1134604ead16f77309235aa3d327bb59"
      },
      "chunks" : 77
    },
    { "_id" : {
          "_id" : "mydb.users",
          "shard" : "d3b07384d113edec49eaa6238ad5ff00"
      },
      "chunks" : 78
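The imbalance in the balance_check("mydb.users") output can be turned into a simple check. The sketch below copies the per-shard chunk counts from that output; the `checkBalance` helper and its `maxSpread` threshold are hypothetical alerting choices, not MongoDB settings.

```javascript
// Per-shard chunk counts copied from the balance_check("mydb.users") output.
const result = [
  { _id: { _id: "mydb.users", shard: "884e49a58a63060782d767feed8e6c88" }, chunks: 1 },
  { _id: { _id: "mydb.users", shard: "15e894ac57eddb32713e7eae90d13e41" }, chunks: 77 },
  { _id: { _id: "mydb.users", shard: "1134604ead16f77309235aa3d327bb59" }, chunks: 77 },
  { _id: { _id: "mydb.users", shard: "d3b07384d113edec49eaa6238ad5ff00" }, chunks: 78 },
];

// maxSpread is an assumed alert threshold: the largest acceptable gap
// between the fullest and the emptiest shard, in chunks.
function checkBalance(rows, maxSpread = 8) {
  const counts = rows.map((r) => r.chunks);
  const min = Math.min(...counts);
  const max = Math.max(...counts);
  return {
    min,
    max,
    spread: max - min,
    balanced: max - min <= maxSpread,
    // Shards holding the fewest chunks, i.e. the ones lagging behind
    lagging: rows.filter((r) => r.chunks === min).map((r) => r._id.shard),
  };
}

console.log(checkBalance(result));
```

Feeding a summary like this into monitoring gives the visibility that the raw chunk list lacks.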
@scale - balancing
WTF
● Balancer Havoc!
@scale - balancing
● Balancer 'Fixes'
○ Pre-splitting
○ Windows
○ Micro-windows
○ Custom scripts
○ Don't use it
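Two of the fixes above, pre-splitting and balancer windows, come down to small calculations that can be sketched in plain JavaScript. Assumptions: hashed shard key values are treated as signed 64-bit integers (as the NumberLong chunk bounds suggest), and the helper names and the window times are hypothetical. This models only the arithmetic, not the commands that apply it to a cluster.

```javascript
// Pre-splitting: evenly spaced split points across the signed 64-bit
// range that hashed shard key values fall into.
const MIN_HASH = -(2n ** 63n);
const HASH_SPAN = 2n ** 64n;

function splitPoints(numChunks) {
  const points = [];
  for (let i = 1; i < numChunks; i++) {
    points.push(MIN_HASH + (HASH_SPAN * BigInt(i)) / BigInt(numChunks));
  }
  return points;
}

// Balancer window: true when "now" falls inside [start, stop),
// including windows that wrap past midnight. Times are "HH:MM" strings.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
}

function inWindow(now, start, stop) {
  const n = toMinutes(now);
  const s = toMinutes(start);
  const e = toMinutes(stop);
  return s <= e ? n >= s && n < e : n >= s || n < e;
}

console.log(splitPoints(4));                     // three boundaries for four chunks
console.log(inWindow("02:30", "23:00", "06:00")); // inside an overnight window
console.log(inWindow("12:00", "23:00", "06:00")); // outside it
```

Creating the chunks before loading data means the balancer has less to move later, and a window keeps what it does move away from peak traffic.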
@scale - Monitoring
● Monitor everything. But some key items:
○ Shard size
○ Balancer on/off
○ Response time
○ Balance of OPS across shards
○ Failed migration of chunks
○ Locks
○ I/O
● Get histograms!
○ Graphite
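A small example of why histograms matter more than averages for response time. The bucket boundaries and the sample values are made up for illustration, and the `histogram` helper is hypothetical; shipping the counts to a tool such as Graphite is left out.

```javascript
// Upper bounds (in ms) of each bucket; one extra bucket catches the rest.
// These boundaries are arbitrary example values.
const BOUNDS_MS = [1, 5, 10, 50, 100, 500];

function histogram(samplesMs) {
  const counts = new Array(BOUNDS_MS.length + 1).fill(0);
  for (const ms of samplesMs) {
    let i = BOUNDS_MS.findIndex((bound) => ms <= bound);
    if (i === -1) i = BOUNDS_MS.length; // overflow bucket
    counts[i]++;
  }
  return counts;
}

// Made-up response times: most requests are fast, two are very slow.
const samples = [2, 3, 3, 4, 4, 5, 6, 8, 450, 900];
const mean = samples.reduce((a, b) => a + b, 0) / samples.length;

console.log("mean ms:", mean);            // 138.5, which describes no real request
console.log("buckets:", histogram(samples)); // [0, 6, 2, 0, 0, 1, 1]
```

The mean suggests every request takes over 100 ms; the buckets show eight fast requests and a slow tail worth investigating.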
@scale - Capacity
● Don't fail to plan
● Disk space/size is critical
○ maxSize (the per-shard size cap set when adding a shard)
○ extending disk space
○ adding cpu or memory capacity
○ Slave 'tricks'
■ Shell game
● Compute resources
● You need disk I/O no matter what anyone says.
○ Size for balancer workloads too
http://blog.foursquare.com/2010/10/05/so-that-was-a-bummer/
1. We’re making changes to our operational procedures to prevent overloading, and to ensure that
future occurrences have safeguards so foursquare stays up.
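Planning disk capacity is mostly arithmetic, so a rough sketch may help. Every number here is an illustrative assumption, not a measurement, and the helper names are hypothetical. `maxSizeGb` stands for a per-shard size cap, and `headroomPct` for the wiggle room kept free on each shard.

```javascript
// Usable space per shard after reserving headroom (integer percent
// avoids floating-point surprises).
function usablePerShardGb(maxSizeGb, headroomPct) {
  return (maxSizeGb * (100 - headroomPct)) / 100;
}

// How many shards are needed to hold dataGb of data.
function shardsNeeded(dataGb, maxSizeGb, headroomPct = 20) {
  return Math.ceil(dataGb / usablePerShardGb(maxSizeGb, headroomPct));
}

// Whole months of growth left before the usable space is exhausted.
function monthsUntilFull(dataGb, growthGbPerMonth, shards, maxSizeGb, headroomPct = 20) {
  const usableGb = shards * usablePerShardGb(maxSizeGb, headroomPct);
  if (dataGb >= usableGb) return 0;
  return Math.floor((usableGb - dataGb) / growthGbPerMonth);
}

// Example: 3 TB of data, 500 GB cap per shard, 20% headroom, 100 GB/month growth.
console.log(shardsNeeded(3000, 500));            // 8
console.log(monthsUntilFull(3000, 100, 8, 500)); // 2
```

Two months of runway is not much: adding a shard also triggers balancer migrations, which need spare I/O of their own.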
@scale
● Things to watch for
○ Out of disk space or other resources
■ Don't wait!
○ Balancer havoc
○ No more I/O left
○ Shard Asymmetry
○ Scatter-gather queries
● Things to ensure you do
○ Use maxSize, leave yourself a bit of wiggle room
○ Leave profiler on!
○ Explain and profile your queries
Contact
@kennygorman
@objectrocket
kgorman@objectrocket.com
https://www.objectrocket.com
WE ARE HIRING!
https://www.objectrocket.com/careers/
