MongoDB Sharding Fundamentals
Antonios Giannopoulos
Database Administrator at ObjectRocket by Rackspace
  
Background
- 14 years in databases and system engineering
- NoSQL DBA @ ObjectRocket by Rackspace
- MongoDB Certified DBA
  
What is sharding?
- A mechanism for horizontal scaling
- Distributes the dataset over multiple servers (shards)
- Each shard is an independent database
- Together, all shards make up a single logical database
  
Why sharding?
- Increases cluster throughput: read/write scaling
- Reduces costs: many small servers vs. one big box
- Eliminates hardware and software hard limits
  
MongoDB Sharding
- Consists of three elements: shards, config servers, and mongos
- Shards: hold the cluster data, that is databases, collections, and documents (data nodes)
- Config servers: hold the cluster metadata and map the cluster architecture
- Mongos: serve all driver requests and route each request to a shard or shards (router nodes)
  
MongoDB Sharded Cluster (diagram)
Application / Driver Layer
Mongos01, Mongos02, … MongosN
ConfigSrv01, ConfigSrv02, ConfigSrv03
Shard01, Shard02, … ShardN
  
How does sharding work?
- Range partitioning per collection (chunks)
- The shard key (one or more fields) defines the chunks
- Chunks are "metadata" on the config servers
- Chunks can move, split, and merge
  
How does sharding work? Example
{ "name" : "Angelina", "surname" : "Jolie", "position" : "Windows Eng.", "phone" : "555-5555" }
{ "name" : "Emma", "surname" : "Stone", "position" : "Windows Eng.", "phone" : "555-5555" }
{ "name" : "Charlize", "surname" : "Theron", "position" : "Linux Eng.", "phone" : "555-5555" }
{ "name" : "Olivia", "surname" : "Wilde", "position" : "Linux Eng.", "phone" : "555-5555" }
{ "name" : "Jessica", "surname" : "Alba", "position" : "Sr Linux Eng.", "phone" : "555-5555" }
{ "name" : "Scarlett", "surname" : "Johansson", "position" : "Sr Windows Eng.", "phone" : "555-5555" }
{ "name" : "Megan", "surname" : "Fox", "position" : "Networks Eng.", "phone" : "555-5555" }
{ "name" : "Mila", "surname" : "Kunis", "position" : "Sr Networks Eng.", "phone" : "555-5555" }
{ "name" : "Natalie", "surname" : "Portman", "position" : "Database Eng", "phone" : "555-5555" }
{ "name" : "Anne", "surname" : "Hathaway", "position" : "Sr Database Eng", "phone" : "555-5555" }

- Collection "employees" for an IT company
- Shard key "position"
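Sharding the example collection on "position" comes down to two admin commands, enableSharding and shardCollection. The sketch below only builds the command documents so that it runs without a cluster. The database name "company" and the helper name are assumptions for illustration; sending the documents to a mongos (for instance through a driver's admin command call) is left out.

```python
def sharding_commands(database, collection, shard_key_field):
    """Build the two admin command documents that shard a collection.

    The command name must be the first key in each document; Python dicts
    preserve insertion order, so plain dicts are enough here.
    """
    namespace = f"{database}.{collection}"
    enable_sharding = {"enableSharding": database}
    shard_collection = {
        "shardCollection": namespace,
        "key": {shard_key_field: 1},  # 1 = range-based key on this field
    }
    return [enable_sharding, shard_collection]


# "company" is a hypothetical database name; "employees" and "position"
# come from the example slide.
commands = sharding_commands("company", "employees", "position")
for command in commands:
    print(command)
```

Building the documents separately from sending them keeps the shard key choice visible and easy to review before anything touches the cluster.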
  
How does sharding work? Example
{ "min" : { "position" : { "$minKey" : 1 } }, "max" : { "position" : "Database Eng" }, "shard" : "Shard01" }
{ "min" : { "position" : "Database Eng" }, "max" : { "position" : "Sr Database Eng" }, "shard" : "Shard01" }
{ "min" : { "position" : "Sr Database Eng" }, "max" : { "position" : "Windows Eng." }, "shard" : "Shard02" }
{ "min" : { "position" : "Windows Eng." }, "max" : { "position" : { "$maxKey" : 1 } }, "shard" : "Shard02" }

- Each chunk records a lower bound, an upper bound, and the shard (server) that holds it
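The chunk metadata above is enough to decide where a document lives. A minimal sketch of that lookup, assuming plain string comparison and treating each chunk's lower bound as inclusive and its upper bound as exclusive; the names SPLIT_POINTS, CHUNK_OWNERS, and route are hypothetical, not part of MongoDB:

```python
from bisect import bisect_right

# Split points taken from the four chunks above. Chunk 0 starts at $minKey,
# the last chunk ends at $maxKey, so only the inner bounds are listed.
SPLIT_POINTS = ["Database Eng", "Sr Database Eng", "Windows Eng."]
CHUNK_OWNERS = ["Shard01", "Shard01", "Shard02", "Shard02"]


def route(position):
    """Return the shard owning the chunk that contains this shard key value."""
    # bisect_right sends a value equal to a split point to the chunk on its
    # right, which makes the lower bound inclusive and the upper exclusive.
    chunk_index = bisect_right(SPLIT_POINTS, position)
    return CHUNK_OWNERS[chunk_index]


for position in ["Database Eng", "Linux Eng.", "Sr Linux Eng.", "Windows Eng."]:
    print(position, "->", route(position))
```

With the ten example employees, the first chunk stays empty, the second and third hold four documents each, and the last holds the two "Windows Eng." documents.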
  	
  	
  
	
  
Choose a shard key
- High cardinality
- Non-null values
- Immutable field(s)
- Not monotonically increasing fields
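The last point is easiest to see with numbers. With range partitioning, every new value of an ever-growing key is larger than all existing split points, so all inserts land in the top chunk. A small sketch, assuming an integer key and made-up split points; chunk_hits is a hypothetical helper:

```python
from bisect import bisect_right
from collections import Counter


def chunk_hits(split_points, keys):
    """Count how many keys fall into each range chunk (index 0 = lowest chunk)."""
    return Counter(bisect_right(split_points, key) for key in keys)


split_points = [1000, 2000, 3000]      # four chunks over an integer shard key
increasing_keys = range(3000, 3100)    # e.g. a counter or timestamp that only grows
hits = chunk_hits(split_points, increasing_keys)
print(dict(hits))                      # every insert hits chunk index 3
```

All 100 inserts go to a single chunk, and therefore to a single shard, while the other shards sit idle for writes.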
  
	
  
Choose a shard key
- Even read/write distribution
- Even data distribution
- Read targeting
- Read locality
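Read targeting can be illustrated with the chunk map from the employees example: a query that includes the shard key can be sent to one shard, while a query without it has to be sent to every shard. This is a simplified sketch for equality queries only; shards_for_query and the constants are hypothetical names:

```python
from bisect import bisect_right

# Chunk map from the employees example (shard key: "position").
SPLIT_POINTS = ["Database Eng", "Sr Database Eng", "Windows Eng."]
CHUNK_OWNERS = ["Shard01", "Shard01", "Shard02", "Shard02"]


def shards_for_query(query):
    """Return the set of shards a router would contact for an equality query."""
    if "position" in query:
        # Shard key present: the router can target the one owning shard.
        chunk_index = bisect_right(SPLIT_POINTS, query["position"])
        return {CHUNK_OWNERS[chunk_index]}
    # No shard key: the query is broadcast to all shards (scatter-gather).
    return set(CHUNK_OWNERS)


print(shards_for_query({"position": "Linux Eng."}))
print(shards_for_query({"surname": "Fox"}))
```

A shard key that appears in the most frequent queries keeps most reads on a single shard.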
  
Choose a shard key
- Hashed shard keys for randomness
- Compound shard keys for cardinality
- Unique indexes are good
- {_id: "hashed"} scales writes
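Why a hashed key scales writes can be sketched in a few lines: hashing turns an increasing sequence into values that spread almost evenly. The example uses MD5 from the Python standard library as a stand-in, not MongoDB's actual hash function, and a modulo over four buckets as a simplification; MongoDB range-partitions the hashed values into chunks instead.

```python
import hashlib
from collections import Counter


def hashed_bucket(value, buckets):
    """Map a key to one of `buckets` buckets via a 64-bit slice of its MD5 digest."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


# 10,000 monotonically increasing ids spread over four buckets.
counts = Counter(hashed_bucket(i, 4) for i in range(10_000))
for bucket in sorted(counts):
    print(bucket, counts[bucket])
```

The trade-off is that neighbouring key values end up in different chunks, so range queries on the hashed field can no longer be targeted.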
  
Limitations of sharding
- Unique indexes: just one…
- Initial collection size: avoid collections > 256 GB; the hard limit is a function of key size and chunk size, and for a 64 MB chunk and a 512 B key it is more than 1 TB
- Number of documents per chunk (250K)
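The dependence of the hard limit on key size and chunk size can be worked through with the rule of thumb documented in the MongoDB manual of that era: the initial split points must fit in one 16 MB BSON document, and each initial chunk is about half full. Treat the formula as an assumption to verify against the documentation for your version; the function name is hypothetical.

```python
MAX_BSON_BYTES = 16 * 1024 * 1024  # 16,777,216 bytes, the BSON document size limit


def max_initial_collection_mb(avg_key_bytes, chunk_size_mb):
    """Estimate the largest existing collection (in MB) that can be sharded."""
    max_splits = MAX_BSON_BYTES // avg_key_bytes   # split points that fit in 16 MB
    return max_splits * chunk_size_mb // 2         # chunks start about half full


size_mb = max_initial_collection_mb(avg_key_bytes=512, chunk_size_mb=64)
print(size_mb, "MB")  # 32,768 splits * 32 MB
```

For a 512 B key and 64 MB chunks this gives 1,048,576 MB, which is 1 TiB and therefore slightly more than 1 TB in decimal units. Halving the key size doubles the limit.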
  	
  
Limitations of sharding
- Shard key size < 512 bytes
- Multikey, text, and geo indexes are prohibited on the shard key
- Some operations won't run (for example group, db.eval(), $isolated, $snapshot, geoSearch)
  
"Sharding": other players
- Application-level sharding
- MySQL (MaxScale, Fabric, …)
- Postgres (pg_shard)
- Elasticsearch (document ID or routing)
- Cassandra (hash-based, ring topology)
  	
  
Contact
www.objectrocket.com
www.rackspace.co.uk/objectrocket/mongodb
antonios.giannopoulos@rackspace.co.uk

We are hiring! (DevOps, DBAs and more)
http://objectrocket.com/careers
  
Questions?
Thank you!!!
  
	
  
MongoDB Meetup
What's new in MongoDB 3.0
Tuesday, November 10, 7:00 pm
@ Harokopio University
  
	
  
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Introcomputerscienceand datascience.pptx
Introcomputerscienceand datascience.pptxIntrocomputerscienceand datascience.pptx
Introcomputerscienceand datascience.pptx
abdulrehmanbscsf22
 

MongoDB Sharding Fundamentals

  • 1. MongoDB Sharding Fundamentals
      Antonios Giannopoulos, Database Administrator at ObjectRocket by Rackspace
  • 2. Background
      - 14 years in databases and system engineering
      - NoSQL DBA @ ObjectRocket by Rackspace
      - MongoDB Certified DBA
  • 3. What is sharding?
      - A mechanism for horizontal scaling
      - Distributes the dataset over multiple servers (shards)
      - Each shard is an independent database
      - Together, all shards form a single logical database
  • 4. Why sharding?
      - Increases cluster throughput: read/write scaling
      - Reduces costs: many small servers vs. one big box
      - Eliminates hardware and software hard limits
  • 5. MongoDB Sharding
      - Consists of three elements: shards, config servers, and mongos
      - Shards: hold the cluster data, i.e. databases, collections, and documents (data nodes)
      - Config servers: hold the cluster metadata and map the cluster architecture
      - Mongos: serve all driver requests and route each request to a shard or shards (router nodes)
  • 6. [Diagram: MongoDB Sharded Cluster]
      - Application / Driver Layer
      - Routers: Mongos01, Mongos02, … MongosN
      - Config servers: ConfigSrv01, ConfigSrv02, ConfigSrv03
      - Shards: Shard01, Shard02, … ShardN
  • 7. How does sharding work?
      - Range partitioning per collection (chunks)
      - A shard key (one or more fields) defines the chunks
      - Chunks are "metadata" stored on the config servers
      - Chunks can move, split, and merge
  • 8. How does sharding work? Example
      { "name" : "Angelina", "surname" : "Jolie", "position" : "Windows Eng.", "phone" : "555-5555" }
      { "name" : "Emma", "surname" : "Stone", "position" : "Windows Eng.", "phone" : "555-5555" }
      { "name" : "Charlize", "surname" : "Theron", "position" : "Linux Eng.", "phone" : "555-5555" }
      { "name" : "Olivia", "surname" : "Wilde", "position" : "Linux Eng.", "phone" : "555-5555" }
      { "name" : "Jessica", "surname" : "Alba", "position" : "Sr Linux Eng.", "phone" : "555-5555" }
      { "name" : "Scarlett", "surname" : "Johansson", "position" : "Sr Windows Eng.", "phone" : "555-5555" }
      { "name" : "Megan", "surname" : "Fox", "position" : "Networks Eng.", "phone" : "555-5555" }
      { "name" : "Mila", "surname" : "Kunis", "position" : "Sr Networks Eng.", "phone" : "555-5555" }
      { "name" : "Natalie", "surname" : "Portman", "position" : "Database Eng", "phone" : "555-5555" }
      { "name" : "Anne", "surname" : "Hathaway", "position" : "Sr Database Eng", "phone" : "555-5555" }
      - Collection "employees" for an IT company
      - Shard key: "position"
  • 9. How does sharding work? Example
      { "min" : { "position" : { "$minKey" : 1 } }, "max" : { "position" : "Database Eng" }, "shard" : "Shard01" }
      { "min" : { "position" : "Database Eng" }, "max" : { "position" : "Sr Database Eng" }, "shard" : "Shard01" }
      { "min" : { "position" : "Sr Database Eng" }, "max" : { "position" : "Windows Eng." }, "shard" : "Shard02" }
      { "min" : { "position" : "Windows Eng." }, "max" : { "position" : { "$maxKey" : 1 } }, "shard" : "Shard02" }
      - Each chunk records a lower bound, an upper bound, and the shard (server) that owns it
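
The chunk table above can be read as a lookup structure. The following is a minimal Python sketch, not MongoDB's actual routing code: it assumes each chunk covers a half-open range (lower bound inclusive, upper bound exclusive) and that strings compare byte by byte, which matches plain ASCII comparison here. The names `MIN_KEY`, `MAX_KEY`, `CHUNKS`, and `owning_shard` are hypothetical and exist only for illustration.

```python
# Hypothetical stand-ins for MongoDB's $minKey / $maxKey sentinels.
MIN_KEY, MAX_KEY = object(), object()

# Chunk metadata copied from the slide: (lower bound, upper bound, shard).
CHUNKS = [
    (MIN_KEY, "Database Eng", "Shard01"),
    ("Database Eng", "Sr Database Eng", "Shard01"),
    ("Sr Database Eng", "Windows Eng.", "Shard02"),
    ("Windows Eng.", MAX_KEY, "Shard02"),
]


def owning_shard(position):
    """Return the shard whose chunk range contains this shard key value."""
    for lower, upper, shard in CHUNKS:
        # Assumed convention: lower bound inclusive, upper bound exclusive.
        above_lower = lower is MIN_KEY or lower <= position
        below_upper = upper is MAX_KEY or position < upper
        if above_lower and below_upper:
            return shard
    raise ValueError(f"no chunk covers {position!r}")


if __name__ == "__main__":
    for position in ("Database Eng", "Linux Eng.", "Sr Linux Eng.", "Windows Eng."):
        print(position, "->", owning_shard(position))
```

Under these assumptions, "Linux Eng." and "Networks Eng." documents land on Shard01, while every "Sr …" position and "Windows Eng." land on Shard02.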
  • 10. Choose a shard key
      - High cardinality
      - No null values
      - Immutable field(s)
      - Not monotonically increasing fields
  • 11. Choose a shard key
      - Even read/write distribution
      - Even data distribution
      - Read targeting
      - Read locality
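
Read targeting can be illustrated with a small Python sketch. It assumes a simplified router that only understands equality filters: a query that includes the shard key is sent to one shard, and any other query is broadcast to every shard. The split points and shard names come from the example chunks on slide 9; `SPLIT_POINTS`, `CHUNK_SHARDS`, and `target_shards` are hypothetical names, and a real router handles many more query shapes.

```python
from bisect import bisect_right

SHARD_KEY = "position"
# Boundaries between consecutive chunks, in sorted order.
SPLIT_POINTS = ["Database Eng", "Sr Database Eng", "Windows Eng."]
# Shard owning each chunk; there is one more chunk than split points.
CHUNK_SHARDS = ["Shard01", "Shard01", "Shard02", "Shard02"]


def target_shards(query):
    """Return the set of shards a simplified router would contact."""
    if SHARD_KEY in query:
        # Targeted read: the shard key value pins the query to one chunk.
        chunk_index = bisect_right(SPLIT_POINTS, query[SHARD_KEY])
        return {CHUNK_SHARDS[chunk_index]}
    # No shard key in the filter: scatter-gather across all shards.
    return set(CHUNK_SHARDS)


if __name__ == "__main__":
    print(target_shards({"position": "Linux Eng."}))
    print(target_shards({"surname": "Fox"}))
```

A filter on `position` touches a single shard, whereas a filter on `surname` alone has to visit both, which is why a shard key that appears in the most frequent queries tends to pay off.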
  • 12. Choose a shard key
      - Hashed shard keys for randomness
      - Compound shard keys for cardinality
      - Unique indexes are good
      - {_id: "hashed"} scales writes
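
Why a hashed key scales writes can be sketched in Python. The example assumes four chunks and a monotonically increasing integer `_id`, and it uses the first 8 bytes of an MD5 digest as a stand-in hash; that choice is an assumption for illustration and is not claimed to reproduce MongoDB's own hashed index values. The split points and function names are likewise hypothetical.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

# Hypothetical split points for four chunks over raw _id values...
RANGE_SPLITS = [250, 500, 750]
# ...and over an unsigned 64-bit hash space cut into four equal ranges.
HASH_SPLITS = [2**62, 2**63, 3 * 2**62]


def chunk_for(value, split_points):
    """Index of the chunk whose range contains value."""
    return bisect_right(split_points, value)


def stand_in_hash(value):
    """Illustrative 64-bit hash (an assumption, not MongoDB's function)."""
    digest = hashlib.md5(str(value).encode()).digest()
    return int.from_bytes(digest[:8], "big")


# The 100 most recent inserts under a monotonically increasing _id.
recent_ids = range(900, 1000)
range_hits = Counter(chunk_for(i, RANGE_SPLITS) for i in recent_ids)
hash_hits = Counter(chunk_for(stand_in_hash(i), HASH_SPLITS) for i in recent_ids)

if __name__ == "__main__":
    print("range-partitioned:", dict(range_hits))
    print("hash-partitioned: ", dict(hash_hits))
```

With range partitioning every recent insert falls into the last chunk, so one shard absorbs all new writes; hashing the same values spreads them over several chunks. The trade-off is that range queries on the original field can no longer be targeted to a single chunk.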
  • 13. Limitations of Sharding
      - Unique indexes: just one…
      - Initial collection size: avoid sharding collections > 256 GB; the hard limit is a function of key size and chunk size (for 64 MB chunks and 512 B keys it is more than 1 TB)
      - Number of documents per chunk (250K)
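
The "function of key and chunk size" can be made concrete with a short Python sketch. It assumes the rule of thumb given in MongoDB's documentation of that era for sharding an existing collection: the number of initial splits is capped at 16777216 bytes divided by the average shard key size, and each split accounts for half a chunk of data. Treat both the constant and the function name `max_initial_collection_size_mb` as assumptions to verify against the documentation for the version in use.

```python
# Assumed cap from the MongoDB documentation of that era (16 MB).
MAX_SPLIT_BYTES = 16 * 1024 * 1024


def max_initial_collection_size_mb(avg_key_size_bytes, chunk_size_mb=64):
    """Estimate the largest existing collection that can be sharded, in MB."""
    max_splits = MAX_SPLIT_BYTES // avg_key_size_bytes
    # Each split point is assumed to cover half a chunk of data.
    return max_splits * chunk_size_mb / 2


if __name__ == "__main__":
    for key_size in (512, 256, 128):
        size_mb = max_initial_collection_size_mb(key_size)
        print(f"{key_size} B keys, 64 MB chunks: {size_mb / 1024 / 1024:.0f} TiB")
```

For 512-byte keys and 64 MB chunks this gives 1,048,576 MB (1 TiB), in line with the slide's "more than 1 TB"; smaller keys or larger chunks raise the limit proportionally.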
  • 14. Limitations of Sharding
      - Shard key size < 512 bytes
      - Multikey, text, and geo indexes are prohibited as shard key indexes
      - Some operations won't run (for example group, db.eval(), $isolated, $snapshot, geoSearch)
  • 15. "Sharding": other players
      - Application-level sharding
      - MySQL (MaxScale, Fabric, …)
      - Postgres (pg_shard)
      - Elasticsearch (document ID or routing)
      - Cassandra (hash-based, ring topology)
  • 16. Contact
      - www.objectrocket.com
      - www.rackspace.co.uk/objectrocket/mongodb
      - [email protected]
      - We are hiring! (DevOps, DBAs and more): http://objectrocket.com/careers
  • 17. Questions?
      Thank you!!!
      MongoDB Meetup: What's new in MongoDB 3.0, Tuesday, November 10, 7:00 pm @ Harokopio University