Sr. Solutions Architect, MongoDB
Jake Angerman
#MongoDBWorld
Sharding Time Series Data
Let's Pretend We Are DevOps
What my friends think I do
What society thinks I do
What my Mom thinks I do
What my boss thinks I do
What I think I do
What I really do
DevOps
Sharding Overview
[Diagram: the Application and its Driver send queries to a tier of Query Routers, which route them to Shard 1, Shard 2, Shard 3, … Shard N; each shard is a replica set with one Primary and two Secondaries]
Why do we need to shard?
• Reaching a limit on some resource
– RAM (working set)
– Disk space
– Disk IO
– Client network latency on writes (tag aware sharding)
– CPU
Do we need to shard right now?
• Two schools of thought:
1. Shard at the outset to avoid technical debt later
2. Shard later to avoid complexity and overhead today
• Either way, shard before you need to!
– 256GB data size threshold published in documentation
– Chunk migrations can cause memory contention and disk IO
[Diagram: RAM split between the working set and free RAM]
Things seemed fine…
[Diagram: RAM split between the working set and chunk migration]
… then I waited too long to shard
> db.mdbw.stats()
{
"ns" : "test.mdbw",
"count" : 16000, // one hour's worth of documents
"size" : 65280000, // size of user data, padding included
"avgObjSize" : 4080,
"storageSize" : 93356032, // size of data extents, unused space included
"numExtents" : 11,
"nindexes" : 1,
"lastExtentSize" : 31354880,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 801248,
"indexSizes" : { "_id_" : 801248 },
"ok" : 1
}
collection stats
Storage model spreadsheet
sensors 16,000
years to keep data 6
docs per day 384,000
docs per year 140,160,000
docs total across all years 840,960,000
index size per hour 801,248 bytes
storage per hour 63 MB
storage per day 1.5 GB
storage per year 539 GB
storage across all years 3,235 GB
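The spreadsheet figures can be reproduced from the collection stats. The sketch below is one reading of the model, not the author's spreadsheet: it assumes each sensor writes one document per hour (16,000 * 24 = 384,000 docs per day), that storage per hour is the data size plus the index size from the stats output, and that MB and GB are binary units. The function and variable names are illustrative.

```javascript
// Inputs copied from the db.mdbw.stats() output and the spreadsheet.
const sensors = 16000;             // one document per sensor per hour (assumption)
const dataBytesPerHour = 65280000; // "size" in the stats output
const indexBytesPerHour = 801248;  // "totalIndexSize" in the stats output
const yearsToKeep = 6;

const MB = 1024 * 1024;
const GB = 1024 * MB;

// Derive document counts and storage totals for the retention period.
function storageModel() {
  const docsPerDay = sensors * 24;
  const docsPerYear = docsPerDay * 365;
  const bytesPerHour = dataBytesPerHour + indexBytesPerHour;
  const bytesPerDay = bytesPerHour * 24;
  const bytesPerYear = bytesPerDay * 365;
  return {
    docsPerDay,
    docsPerYear,
    docsTotal: docsPerYear * yearsToKeep,
    mbPerHour: bytesPerHour / MB,
    gbPerDay: bytesPerDay / GB,
    gbPerYear: bytesPerYear / GB,
    gbTotal: (bytesPerYear * yearsToKeep) / GB,
  };
}

const model = storageModel();
console.log(model);
```

Under these assumptions the rounded results agree with the spreadsheet rows (63 MB per hour, about 1.5 GB per day, 539 GB per year, 3,235 GB over six years).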
Why we need to shard now
539 GB in year one alone
[Chart: total storage (GB) by year, years 1 through 6, y-axis from 0 to 3,500]
16,000 sensors today… 47,000 tomorrow?
What will our sharded cluster look like?
• We need to model the application to answer this question
• Model should include:
– application write patterns (sensors)
– application read patterns (clients)
– analytic read patterns
– data storage requirements
• Two main collections
– summary data (fast query times)
– historical data (analysis of environmental conditions)
Option 1: Everything in one sharded cluster
[Diagram: Shard 1 (labeled Primary Shard), Shard 2, Shard 3, Shard 4, … Shard N; each shard is a replica set with one Primary and two Secondaries]
• Issue: prevent analytics jobs from affecting application performance
• Summary data is small (16,000 * N bytes) and accessed frequently
Option 2: Distinct replica set for summaries
[Diagram: a sharded cluster of Shard 1, Shard 2, Shard 3, … Shard N next to a separate replica set; every shard and the replica set has one Primary and two Secondaries]
• Pros: Operational separation between business functions
• Cons: application must write to two different databases
Application read patterns
• Web browsers, mobile phones, and in-car navigation devices
• Working set should be kept in RAM
• 5M subscribers * 1% active * 50 sensors/query * 1 device query/min = 41,667 reads/sec
• 41,667 reads/sec * 4080 bytes = 162 MB/sec
– and that's without any protocol overhead
• Gigabit Ethernet is ≈ 118 MB/sec
[Diagram: replica set (one Primary, two Secondaries) behind a single 1 Gbps link]
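The read-load arithmetic above can be checked with a short calculation. This is a minimal sketch using only the numbers on the slide; the variable names are illustrative, MB is taken as a binary megabyte, and the 118 MB/sec link figure is the slide's approximation rather than a measured value.

```javascript
// Figures from the slide.
const subscribers = 5000000;
const activePercent = 1;            // 1% of subscribers active
const sensorsPerQuery = 50;
const queriesPerDevicePerMinute = 1;
const docBytes = 4080;              // avgObjSize from the collection stats
const gigabitMBps = 118;            // approximate usable Gigabit Ethernet throughput

// Convert the per-minute polling pattern into reads/sec and MB/sec.
function readLoad() {
  const activeDevices = (subscribers * activePercent) / 100;
  const readsPerSec =
    (activeDevices * sensorsPerQuery * queriesPerDevicePerMinute) / 60;
  const mbPerSec = (readsPerSec * docBytes) / (1024 * 1024);
  return {
    readsPerSec,
    mbPerSec,
    linksNeeded: Math.ceil(mbPerSec / gigabitMBps),
  };
}

const load = readLoad();
console.log(load);
```

The result is roughly 41,667 reads/sec and 162 MB/sec, which is more than one 1 Gbps link can carry before any protocol overhead is counted.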
Application read patterns (continued)
• Options
– provision more bandwidth ($$$)
– tune application read pattern
– add a caching layer
– secondary reads from the replica set
[Diagram: replica set (one Primary, two Secondaries), each member on its own 1 Gbps link]
Secondary Reads from the Replica Set
• Stale data OK in this use case
• Caution: read preference of secondary could be disastrous in a 3-member replica set if a secondary fails!
• App servers with mixed read preferences of primary and secondary are operationally cumbersome
• Use nearest read preference to access all nodes
[Diagram: replica set (one Primary, two Secondaries), each member on its own 1 Gbps link]
db.collection.find().readPref('nearest')
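The caution about a read preference of secondary can be made concrete with the 162 MB/sec and 118 MB/sec figures from the read-pattern slide. The sketch below is a simplified model, not driver behavior: it assumes reads spread evenly over every eligible healthy member and does not model elections or latency-based selection. All names are illustrative.

```javascript
const totalReadMBps = 162; // estimated client read load from the slides
const linkLimitMBps = 118; // approximate Gigabit Ethernet throughput from the slides

// Healthy members allowed to serve reads under each mode (simplified).
function eligibleMembers(mode, members) {
  const healthy = members.filter((m) => m.healthy);
  if (mode === 'primary') return healthy.filter((m) => m.role === 'primary');
  if (mode === 'secondary') return healthy.filter((m) => m.role === 'secondary');
  return healthy; // 'nearest': any healthy member
}

// Read load per serving node, assuming an even spread (assumption).
function perNodeLoadMBps(mode, members) {
  const nodes = eligibleMembers(mode, members);
  return nodes.length === 0 ? Infinity : totalReadMBps / nodes.length;
}

const allHealthy = [
  { role: 'primary', healthy: true },
  { role: 'secondary', healthy: true },
  { role: 'secondary', healthy: true },
];
const oneSecondaryDown = [
  { role: 'primary', healthy: true },
  { role: 'secondary', healthy: true },
  { role: 'secondary', healthy: false },
];

for (const mode of ['secondary', 'nearest']) {
  for (const [label, members] of [
    ['all healthy', allHealthy],
    ['one secondary down', oneSecondaryDown],
  ]) {
    const load = perNodeLoadMBps(mode, members);
    const verdict = load > linkLimitMBps ? 'exceeds' : 'fits within';
    console.log(`${mode}, ${label}: ${load} MB/sec per node, ${verdict} one 1 Gbps link`);
  }
}
```

In this model, secondary reads put the full 162 MB/sec on the one surviving secondary after a failure, while nearest still has two nodes at 81 MB/sec each.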
Replica Set Tags
• App servers in different data centers use replica set tags plus read preference nearest
• db.collection.find().readPref('nearest', [ { datacenter: 'east' } ])
[Diagram: one Primary and two Secondaries, all in the east data center]
> rs.conf()
{"_id":"rs0",
"version":2,
"members":[
{"_id":0,
"host":"node0.example.net:27017",
"tags":{"datacenter":"east"}
},
{"_id":1,
"host":"node1.example.net:27017",
"tags":{"datacenter":"east"}
},
{"_id":2,
"host":"node2.example.net:27017",
"tags":{"datacenter":"east"}
}
]
}
Replica Set Tags
• Enables geographic distribution
• Allows scaling within each data center
[Diagram: one replica set spread across the east, central, and west data centers; first with one member per data center, then with three members per data center, one of the nine being the Primary]
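How a tag set narrows the candidate members can be sketched in plain JavaScript. This is an illustration of the matching rule (a member matches when it carries every key/value pair in the tag set, and tag sets are tried in order), not driver code. The host names and the `selectByTagSets` helper are hypothetical.

```javascript
// Hypothetical members, one per data center.
const members = [
  { host: 'east-1.example.net:27017', tags: { datacenter: 'east' } },
  { host: 'central-1.example.net:27017', tags: { datacenter: 'central' } },
  { host: 'west-1.example.net:27017', tags: { datacenter: 'west' } },
];

// A member matches when it has every key/value pair in the tag set.
// An empty tag set {} therefore matches any member.
function matchesTagSet(memberTags, tagSet) {
  return Object.entries(tagSet).every(([key, value]) => memberTags[key] === value);
}

// Tag sets are tried in order; the first one matching any member wins.
function selectByTagSets(candidates, tagSets) {
  for (const tagSet of tagSets) {
    const matched = candidates.filter((m) => matchesTagSet(m.tags || {}, tagSet));
    if (matched.length > 0) return matched;
  }
  return [];
}

const eastOnly = selectByTagSets(members, [{ datacenter: 'east' }]);
const withFallback = selectByTagSets(members, [{ datacenter: 'south' }, {}]);
console.log(eastOnly.map((m) => m.host));
console.log(withFallback.map((m) => m.host));
```

Listing an empty tag set last is a common way to fall back to any member when the preferred data center has none available.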
Analytic read patterns
• How does an analyst look at the data on the sharded cluster?
• 1 Year of data = 539 GB
[Chart: Server RAM (y-axis, 0 to 300) against Number of machines (x-axis, 0 to 18); plotted points (machines, RAM): (3, 256), (3, 192), (5, 128), (9, 64), (17, 32)]
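The plotted points are consistent with dividing one year of data by the RAM per server and rounding up. The sketch below assumes that reading: RAM is in GB, the whole 539 GB should fit in memory across the machines, and nothing is reserved for the operating system or indexes beyond what the 539 GB already includes.

```javascript
const yearOneGB = 539;                      // one year of data, from the storage model
const ramOptionsGB = [256, 192, 128, 64, 32];

// Machines needed to hold the data set entirely in RAM (assumption).
function machinesNeeded(dataGB, ramGB) {
  return Math.ceil(dataGB / ramGB);
}

const plan = ramOptionsGB.map((ramGB) => ({
  ramGB,
  machines: machinesNeeded(yearOneGB, ramGB),
}));
console.log(plan);
```

Smaller servers mean more shards to operate, so the choice trades hardware cost per machine against cluster size.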
Application write patterns
• 16,000 sensors every minute = 267 writes/sec
• Could we handle 16,000 writes in one second?
– 16,000 writes * 4080 bytes = 62 MB
• Load test the app!
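The gap between 267 writes/sec and a 16,000-write burst depends on whether sensors check in at the same moment. The sketch below compares the two cases. The staggering rule (sensor index modulo 60 picks the second within the minute) is a hypothetical scheme for illustration, not something the slides specify.

```javascript
const sensorCount = 16000;
const windowSeconds = 60;
const docBytes = 4080;

// Writes landing in each second when every sensor gets a fixed offset
// within the minute (hypothetical scheme: index modulo window length).
function staggeredWritesPerSecond() {
  const perSecond = new Array(windowSeconds).fill(0);
  for (let sensor = 0; sensor < sensorCount; sensor++) {
    perSecond[sensor % windowSeconds]++;
  }
  return perSecond;
}

const perSecond = staggeredWritesPerSecond();
const peakStaggered = Math.max(...perSecond);
const peakSynchronized = sensorCount; // every sensor in the same second
const burstMB = (peakSynchronized * docBytes) / (1024 * 1024);

console.log(`staggered peak: ${peakStaggered} writes/sec`);
console.log(`synchronized peak: ${peakSynchronized} writes/sec, ${burstMB.toFixed(1)} MB`);
```

Staggering keeps the peak near the 267 writes/sec average, while synchronized check-ins produce the 62 MB burst that the load test needs to cover.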
Modeling the Application - summary
• We modeled:
– application write patterns (sensors)
– application read patterns (clients)
– analytic read patterns
– data storage requirements
– the network, a little bit
Shard Key
Shard Key characteristics
• A good shard key has:
– sufficient cardinality
– distributed writes
– targeted reads ("query isolation")
• Shard key should be in every query if possible
– scatter gather otherwise
• Choosing a good shard key is important!
– affects performance and scalability
– changing it later is expensive
Hashed shard key
• Pros:
– Evenly distributed writes
• Cons:
– Random data (and index) updates can be IO intensive
– Range-based queries turn into scatter gather
[Diagram: mongos routing to Shard 1, Shard 2, Shard 3, … Shard N]
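Why a range query on a hashed key becomes scatter gather can be seen with a toy model. The sketch below uses FNV-1a as a stand-in hash and a modulo to pick a shard. Both are simplifications chosen for illustration: MongoDB uses its own hash function and assigns ranges of hash values to chunks. The four-shard count is also an assumption.

```javascript
const shardCount = 4; // assumed cluster size for the illustration

// 32-bit FNV-1a, used here only as a stand-in hash function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Simplified placement: hash the key, then take it modulo the shard count.
function shardForHashedKey(key) {
  return fnv1a(String(key)) % shardCount;
}

// Which shards a query over the key range [firstKey, lastKey] must visit.
function shardsTouched(firstKey, lastKey) {
  const shards = new Set();
  for (let key = firstKey; key <= lastKey; key++) {
    shards.add(shardForHashedKey(key));
  }
  return shards;
}

const touched = shardsTouched(0, 3);
console.log(`4 adjacent keys land on ${touched.size} shards`);
```

Adjacent key values end up on different shards, which spreads writes evenly but forces a range query to ask every shard.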
Low cardinality shard key
• Induces "jumbo chunks"
• Examples: sensor ID
[Diagram: mongos routing to Shard 1, Shard 2, Shard 3, … Shard N; a single chunk [ a, b ) sits on one shard]
Ascending shard key
• Monotonically increasing shard key values cause "hot spots" on inserts
• Examples: timestamps, _id
[Diagram: mongos routing to Shard 1, Shard 2, Shard 3, … Shard N; the chunk [ ISODate(…), $maxKey ) sits on one shard]
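The hot spot follows directly from how range chunks are looked up. In the sketch below, the split points and insert timestamps are hypothetical dates chosen for illustration; the lookup itself is the ordinary rule that a key belongs to the chunk whose range contains it.

```javascript
// Hypothetical split points (UTC milliseconds). They define four chunks:
// [minKey, p0), [p0, p1), [p1, p2), [p2, maxKey).
const splitPoints = [
  Date.UTC(2014, 0, 1),
  Date.UTC(2014, 2, 1),
  Date.UTC(2014, 4, 1),
];

// Index of the chunk whose range contains the key.
function chunkIndexFor(key) {
  let index = 0;
  for (const point of splitPoints) {
    if (key >= point) index++;
  }
  return index;
}

// Insert 1,000 documents with ever-increasing timestamps, one per minute.
const insertsPerChunk = new Array(splitPoints.length + 1).fill(0);
const firstInsert = Date.UTC(2014, 5, 24);
for (let minute = 0; minute < 1000; minute++) {
  insertsPerChunk[chunkIndexFor(firstInsert + minute * 60000)]++;
}
console.log(insertsPerChunk);
```

Every new timestamp is greater than the last split point, so all inserts land in the final chunk and the shard that owns it takes the whole write load.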
Choosing a shard key for time series data
• Consider compound shard key:
{arbitrary value, incrementing value}
• Best of both worlds – local hot spotting, targeted reads
[Diagram: mongos routing to Shard 1, Shard 2, Shard 3, … Shard N, with chunk ranges grouped by arbitrary value:]
[ {V1, ISODate(A)}, {V1, ISODate(B)} ),
[ {V1, ISODate(B)}, {V1, ISODate(C)} ),
[ {V1, ISODate(C)}, {V1, ISODate(D)} ),
…
[ {V2, ISODate(A)}, {V2, ISODate(B)} ),
[ {V2, ISODate(B)}, {V2, ISODate(C)} ),
[ {V2, ISODate(C)}, {V2, ISODate(D)} ),
…
[ {V3, ISODate(A)}, {V3, ISODate(B)} ),
[ {V3, ISODate(B)}, {V3, ISODate(C)} ),
[ {V3, ISODate(C)}, {V3, ISODate(D)} ),
…
[ {V4, ISODate(A)}, {V4, ISODate(B)} ),
[ {V4, ISODate(B)}, {V4, ISODate(C)} ),
[ {V4, ISODate(C)}, {V4, ISODate(D)} ),
…
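The two properties claimed for the compound key, distributed writes and targeted reads, can be checked against a small model of the chunk ranges. In this sketch the date boundaries are hypothetical integers, and placing all of V1's chunks on Shard 1, V2's on Shard 2, and so on is an assumption for illustration; a real balancer decides placement.

```javascript
const prefixes = ['V1', 'V2', 'V3', 'V4'];
const dateBounds = [140301, 140308, 140315, 140322]; // hypothetical A, B, C, D

// Compound keys compare on the prefix first, then on the date.
function compareKeys(a, b) {
  if (a.prefix !== b.prefix) return a.prefix < b.prefix ? -1 : 1;
  if (a.date !== b.date) return a.date < b.date ? -1 : 1;
  return 0;
}

// Build chunks [ {Vn, A}, {Vn, B} ), [ {Vn, B}, {Vn, C} ), ... per prefix.
const chunks = [];
prefixes.forEach((prefix, i) => {
  for (let j = 0; j + 1 < dateBounds.length; j++) {
    chunks.push({
      min: { prefix, date: dateBounds[j] },
      max: { prefix, date: dateBounds[j + 1] },
      shard: `Shard ${i + 1}`, // assumed placement
    });
  }
});

// Shard that receives a write for the given key, or null if no chunk covers it.
function shardFor(key) {
  const chunk = chunks.find(
    (c) => compareKeys(c.min, key) <= 0 && compareKeys(key, c.max) < 0
  );
  return chunk ? chunk.shard : null;
}

// Shards a query for one prefix over [fromDate, toDate) has to visit.
function shardsForRange(prefix, fromDate, toDate) {
  const lo = { prefix, date: fromDate };
  const hi = { prefix, date: toDate };
  const hits = chunks.filter(
    (c) => compareKeys(c.min, hi) < 0 && compareKeys(lo, c.max) < 0
  );
  return [...new Set(hits.map((c) => c.shard))];
}

const writeTargets = prefixes.map((prefix) => shardFor({ prefix, date: 140312 }));
const readTargets = shardsForRange('V2', 140301, 140322);
console.log(writeTargets, readTargets);
```

Writes arriving at the same time for different prefixes go to different shards, while a date-range query for a single prefix stays on one shard.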
What is our shard key?
• Let's choose: linkID, date
– example: { linkID: 9000006, date: 140312 }
– example: { _id: "900006:140312" }
– this application's _id is in this form already, yay!
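Building the combined _id and a range filter over it can be sketched as follows. The helper names are hypothetical, and the sketch assumes that both parts have a fixed width, so that string order on _id matches date order within one link. The filter is a plain object using the standard $gte and $lt query operators; nothing here talks to a database.

```javascript
// Join the two shard key parts in the "linkID:date" form shown above.
function makeId(linkID, date) {
  return `${linkID}:${date}`;
}

// Split a combined _id back into its parts.
function parseId(id) {
  const [linkID, date] = id.split(':');
  return { linkID: Number(linkID), date: Number(date) };
}

// Query filter for one link over [fromDate, toDate). Because the filter
// constrains the shard key, mongos can target it instead of broadcasting.
function rangeFilter(linkID, fromDate, toDate) {
  return {
    _id: { $gte: makeId(linkID, fromDate), $lt: makeId(linkID, toDate) },
  };
}

const id = makeId(900006, 140312);
const parts = parseId(id);
const filter = rangeFilter(900006, 140301, 140401);
console.log(id, parts, JSON.stringify(filter));
```

A filter of this shape could be passed to find() on the sharded collection, which keeps reads for one link confined to the chunks that hold it.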
Summary
• Model the read/write patterns and storage
• Choose an appropriate shard key
• DevOps influenced the application
– write recent summary data to separate database
– replica set tags for summary database
– avoid synchronous sensor checkins
– consider changing client polling frequency
– consider throttling REST API access to app servers
Which DevOps person are you?
Thank You
Sharding Experimentation
$ mongo --nodb
> cluster = new ShardingTest({"shards": 1, "chunksize": 1})
$ mongo --nodb
> // now connect to mongos on 30999
> db = (new Mongo("localhost:30999")).getDB("test")
I decided to shard from the outset
• Sensor summary documents can all fit in RAM
– 16,000 sensors * N bytes
• Velocity of sensor events is only 267 writes/sec
• Volume of sensor events is what dictates sharding
{ _id:<linkID>,
update:ISODate("2013-10-10T23:06:37.000Z"),
last10:{
avgSpeed:<int>,
avgTime:<int>
},
lastHour:{
avgSpeed:<int>,
avgTime:<int>
},
speeds:[52,49,45,51,...],
times:[237,224,246,233,...],
pavement:"WetSpots",
status:"WetConditions",
weather:"LightRain"
}
> this_is_for_replica_sets_not_sharding = {
_id : "mySet",
members : [
{_id : 0, host : "A", priority : 3},
{_id : 1, host : "B", priority : 2},
{_id : 2, host : "C"},
{_id : 3, host : "D", hidden : true, priority : 0},
{_id : 4, host : "E", hidden : true, priority : 0, slaveDelay : 3600}
]
}
> rs.initiate(this_is_for_replica_sets_not_sharding)
Configuring Sharding
I'm off to my private island in New Zealand
Replica Set Diagram
> conf = {
_id : "mySet",
members : [
{_id : 0, host : "A", priority : 3},
{_id : 1, host : "B", priority : 2},
{_id : 2, host : "C"},
{_id : 3, host : "D", hidden : true, priority : 0},
{_id : 4, host : "E", hidden : true, priority : 0, slaveDelay : 3600}
]
}
> rs.initiate(conf)
Configuration Options
Primary DC
Tag Aware Sharding
• Control where data is written to, and read from
• Each member can have one or more tags
– tags: {dc: "ny"}
– tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}
• Replica set defines rules for write concerns
• Rules can change without changing app code
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
Ad


MongoDB for Time Series Data Part 3: Sharding

  • 1. Sharding Time Series Data • Jake Angerman, Sr. Solutions Architect, MongoDB • #MongoDBWorld
  • 2. Let's Pretend We Are DevOps What my friends think I do What society thinks I do What my Mom thinks I do What my boss thinks I do What I think I do What I really do DevOps
  • 3. Sharding Overview • [diagram: Application → Driver → Query Routers → Shards 1 through N, each shard a replica set of one Primary and two Secondaries]
  • 4. Why do we need to shard? • Reaching a limit on some resource – RAM (working set) – Disk space – Disk IO – Client network latency on writes (tag aware sharding) – CPU
  • 5. Do we need to shard right now? • Two schools of thought: 1. Shard at the outset to avoid technical debt later 2. Shard later to avoid complexity and overhead today • Either way, shard before you need to! – 256GB data size threshold published in documentation – Chunk migrations can cause memory contention and disk IO • [diagram: RAM holding the working set plus free RAM, "Things seemed fine…"; then RAM holding the working set plus a chunk migration, "… then I waited too long to shard"]
  • 6. Collection stats • > db.mdbw.stats() { "ns" : "test.mdbw", "count" : 16000, /* one hour's worth of documents */ "size" : 65280000, /* size of user data, padding included */ "avgObjSize" : 4080, "storageSize" : 93356032, /* size of data extents, unused space included */ "numExtents" : 11, "nindexes" : 1, "lastExtentSize" : 31354880, "paddingFactor" : 1, "systemFlags" : 1, "userFlags" : 1, "totalIndexSize" : 801248, "indexSizes" : { "_id_" : 801248 }, "ok" : 1 }
  • 7. Storage model spreadsheet • sensors: 16,000 • years to keep data: 6 • docs per day: 384,000 • docs per year: 140,160,000 • docs total across all years: 840,960,000 • index size per hour of documents: 801,248 bytes • storage per hour: 63 MB • storage per day: 1.5 GB • storage per year: 539 GB • storage across all years: 3,235 GB
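
The spreadsheet figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the speaker's actual spreadsheet: it assumes one document per sensor per hour (the stats slide calls 16,000 documents "one hour's worth"), assumes the 801,248-byte index figure covers that same hour, and assumes MB and GB mean binary units (2^20 and 2^30 bytes). All variable names are made up for illustration.

```javascript
// Inputs copied from the slides; the formulas are an assumed reconstruction.
const sensors = 16000;               // one document per sensor per hour (assumption)
const yearsToKeep = 6;
const dataBytesPerHour = 65280000;   // "size" from db.mdbw.stats()
const indexBytesPerHour = 801248;    // "totalIndexSize", assumed to cover the same hour

const MiB = 1024 ** 2;
const GiB = 1024 ** 3;

// Document counts
const docsPerDay = sensors * 24;
const docsPerYear = docsPerDay * 365;
const docsTotal = docsPerYear * yearsToKeep;

// Storage: data plus index, scaled up from one hour
const bytesPerHour = dataBytesPerHour + indexBytesPerHour;
const mbPerHour = bytesPerHour / MiB;
const gbPerDay = (bytesPerHour * 24) / GiB;
const gbPerYear = (bytesPerHour * 24 * 365) / GiB;
const gbTotal = gbPerYear * yearsToKeep;

console.log({
  docsPerDay,
  docsPerYear,
  docsTotal,
  mbPerHour: Math.round(mbPerHour),
  gbPerDay: gbPerDay.toFixed(1),
  gbPerYear: Math.round(gbPerYear),
  gbTotal: Math.round(gbTotal),
});
```

Under these assumptions the rounded results match the slide (63 MB, 1.5 GB, 539 GB, 3,235 GB), which suggests the per-hour figure includes the index.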
  • 8. Why we need to shard now • 539 GB in year one alone • 16,000 sensors today… 47,000 tomorrow? • [chart: total storage (GB) by year, years 1 to 6, vertical axis 0 to 3,500 GB]
  • 9. What will our sharded cluster look like? • We need to model the application to answer this question • Model should include: – application write patterns (sensors) – application read patterns (clients) – analytic read patterns – data storage requirements • Two main collections – summary data (fast query times) – historical data (analysis of environmental conditions)
  • 10. Option 1: Everything in one sharded cluster • Issue: prevent analytics jobs from affecting application performance • Summary data is small (16,000 * N bytes) and accessed frequently • [diagram: Shards 1 through N, each a replica set of one Primary and two Secondaries; Shard 1 is labeled as the primary shard]
  • 11. Option 2: Distinct replica set for summaries • Pros: operational separation between business functions • Cons: application must write to two different databases • [diagram: a separate replica set (one Primary, two Secondaries) alongside Shards 1 through N]
  • 12. Application read patterns • Web browsers, mobile phones, and in-car navigation devices • Working set should be kept in RAM • 5M subscribers * 1% active * 50 sensors/query * 1 device query/min = 41,667 reads/sec • 41,667 reads/sec * 4080 bytes = 162 MB/sec – and that's without any protocol overhead • Gigabit Ethernet is ≈ 118 MB/sec • [diagram: replica set (one Primary, two Secondaries) behind a single 1 Gbps link]
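
The read-load estimate on this slide can be checked the same way. A rough sketch, assuming MB means 2^20 bytes and taking the slide's 118 MB/sec figure for Gigabit Ethernet at face value; the variable names are hypothetical and protocol overhead is ignored, as on the slide.

```javascript
// Inputs from the slide
const subscribers = 5000000;
const activePercent = 1;              // 1% of subscribers active
const sensorsPerQuery = 50;
const queriesPerDevicePerMinute = 1;
const docBytes = 4080;                // avgObjSize from the collection stats
const gigabitMBPerSec = 118;          // slide's approximate usable throughput

// Each query reads one summary document per sensor
const readsPerMinute =
  (subscribers * activePercent / 100) * sensorsPerQuery * queriesPerDevicePerMinute;
const readsPerSec = readsPerMinute / 60;
const mbPerSec = (readsPerSec * docBytes) / (1024 ** 2);

// Lower bound on the number of 1 Gbps links needed to carry the reads
const linksNeeded = Math.ceil(mbPerSec / gigabitMBPerSec);

console.log(Math.round(readsPerSec), Math.round(mbPerSec), linksNeeded);
```

The result exceeds what a single gigabit link can carry, which is the motivation for the options on the next slide.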
  • 13. Application read patterns (continued) • Options – provision more bandwidth ($$$) – tune application read pattern – add a caching layer – secondary reads from the replica set • [diagram: replica set with a 1 Gbps link to each of its three members]
  • 14. Secondary Reads from the Replica Set • Stale data OK in this use case • Caution: read preference of secondary could be disastrous in a 3-member replica set if a secondary fails! • App servers with mixed read preferences of primary and secondary are operationally cumbersome • Use nearest read preference to access all nodes • db.collection.find().readPref('nearest') • [diagram: replica set with a 1 Gbps link to each of its three members]
  • 15. Replica Set Tags • App servers in different data centers use replica set tags plus read preference nearest • db.collection.find().readPref('nearest', [ {'datacenter': 'east'} ]) • > rs.conf() {"_id":"rs0", "version":2, "members":[ {"_id":0, "host":"node0.example.net:27017", "tags":{"datacenter":"east"} }, {"_id":1, "host":"node1.example.net:27017", "tags":{"datacenter":"east"} }, {"_id":2, "host":"node2.example.net:27017", "tags":{"datacenter":"east"} } ] } • [diagram: east data center with one Primary and two Secondaries]
  • 16. Replica Set Tags • Enables geographic distribution • [diagram: three members (one Primary, two Secondaries) spread across the east, central, and west data centers]
  • 17. Replica Set Tags • Enables geographic distribution • Allows scaling within each data center • [diagram: nine members (one Primary, eight Secondaries) spread across the east, central, and west data centers]
  • 18. Analytic read patterns • How does an analyst look at the data on the sharded cluster? • 1 Year of data = 539 GB • [chart: server RAM (GB) vs. number of machines; data points (machines, RAM in GB): (3, 256), (3, 192), (5, 128), (9, 64), (17, 32)]
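
The chart's data points appear consistent with dividing one year of data by the RAM per server and rounding up, that is, sizing the cluster so the whole year fits in memory. That reading is an assumption, not something the slide states; the sketch below simply checks it against the five points.

```javascript
// One year of data, from the storage model
const yearOfDataGB = 539;

// RAM sizes per server that appear on the chart
const ramOptionsGB = [256, 192, 128, 64, 32];

// Assumed rule: enough machines that total RAM >= one year of data
const machinesNeeded = ramOptionsGB.map((ramGB) => ({
  ramGB,
  machines: Math.ceil(yearOfDataGB / ramGB),
}));

console.log(machinesNeeded);
```

Real sizing would also leave room for indexes, the operating system, and growth, so these counts are best read as a floor.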
  • 19. Application write patterns • 16,000 sensors every minute = 267 writes/sec • Could we handle 16,000 writes in one second? – 16,000 writes * 4080 bytes = 62 MB • Load test the app!
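
The two write figures on this slide follow from the sensor count and the average document size. A small sketch, assuming MB means 2^20 bytes; the names are illustrative only.

```javascript
const sensorCount = 16000;
const avgDocBytes = 4080;   // avgObjSize from the collection stats

// Steady state: every sensor checks in once per minute
const writesPerSec = sensorCount / 60;

// Worst case: every sensor checks in within the same second
const burstMB = (sensorCount * avgDocBytes) / (1024 ** 2);

console.log(Math.round(writesPerSec), Math.round(burstMB));
```

The gap between the two numbers is why the summary slide recommends avoiding synchronous sensor check-ins.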
  • 20. Modeling the Application - summary • We modeled: – application write patterns (sensors) – application read patterns (clients) – analytic read patterns – data storage requirements – the network, a little bit
  • 22. Shard Key characteristics • A good shard key has: – sufficient cardinality – distributed writes – targeted reads ("query isolation") • Shard key should be in every query if possible – scatter gather otherwise • Choosing a good shard key is important! – affects performance and scalability – changing it later is expensive
  • 23. Hashed shard key • Pros: – Evenly distributed writes • Cons: – Random data (and index) updates can be IO intensive – Range-based queries turn into scatter gather • [diagram: mongos in front of Shards 1 through N]
  • 24. Low cardinality shard key • Induces "jumbo chunks" • Examples: sensor ID • [diagram: mongos in front of Shards 1 through N, with chunk range [ a, b )]
  • 25. Ascending shard key • Monotonically increasing shard key values cause "hot spots" on inserts • Examples: timestamps, _id • [diagram: mongos in front of Shards 1 through N, with chunk range [ ISODate(…), $maxKey )]
  • 26. Choosing a shard key for time series data • Consider compound shard key: {arbitrary value, incrementing value} • Best of both worlds – local hot spotting, targeted reads • [diagram: mongos in front of Shards 1 through N, with chunk ranges grouped by the arbitrary value] – [ {V1, ISODate(A)}, {V1, ISODate(B)} ), [ {V1, ISODate(B)}, {V1, ISODate(C)} ), [ {V1, ISODate(C)}, {V1, ISODate(D)} ), … – [ {V4, ISODate(A)}, {V4, ISODate(B)} ), [ {V4, ISODate(B)}, {V4, ISODate(C)} ), [ {V4, ISODate(C)}, {V4, ISODate(D)} ), … – [ {V2, ISODate(A)}, {V2, ISODate(B)} ), [ {V2, ISODate(B)}, {V2, ISODate(C)} ), [ {V2, ISODate(C)}, {V2, ISODate(D)} ), … – [ {V3, ISODate(A)}, {V3, ISODate(B)} ), [ {V3, ISODate(B)}, {V3, ISODate(C)} ), [ {V3, ISODate(C)}, {V3, ISODate(D)} ), …
  • 27. What is our shard key? • Let's choose: linkID, date – example: { linkID: 9000006, date: 140312 } – example: { _id: "900006:140312" } – this application's _id is in this form already, yay!
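
To see why an _id of the form "linkID:date" behaves like the compound key {arbitrary value, incrementing value}, consider how such keys sort. The sketch below is plain JavaScript, not mongo shell code: makeId is a hypothetical helper, the sample link IDs and dates are invented, and JavaScript's default string sort stands in for the key ordering a range-sharded collection would use (an assumption that holds for ASCII keys like these).

```javascript
// Hypothetical helper: build the "<linkID>:<date>" _id form shown on the slide.
function makeId(linkID, yymmdd) {
  return `${linkID}:${yymmdd}`;
}

// Invented sample keys, deliberately out of order
const ids = [
  makeId(900006, 140313),
  makeId(900001, 140312),
  makeId(900006, 140312),
  makeId(900001, 140313),
];

// Sorting groups keys by link first, then by ascending date within a link
const sorted = [...ids].sort();

// A query for one link touches a single contiguous run of keys
const link900006 = sorted.filter((id) => id.startsWith("900006:"));
const start = sorted.indexOf(link900006[0]);
const isContiguous =
  JSON.stringify(sorted.slice(start, start + link900006.length)) ===
  JSON.stringify(link900006);

console.log(sorted, link900006, isContiguous);
```

Inserts for different links land in different key ranges (distributed writes), while each link's data stays together (targeted reads). In the mongo shell, sharding on this key would presumably be a call such as sh.shardCollection("test.mdbw", { _id: 1 }) after sh.enableSharding("test"); the namespace is taken from the stats slide, and the deck does not show the command it actually used.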
  • 28. Summary • Model the read/write patterns and storage • Choose an appropriate shard key • DevOps influenced the application – write recent summary data to separate database – replica set tags for summary database – avoid synchronous sensor check-ins – consider changing client polling frequency – consider throttling REST API access to app servers
  • 30. Thank You • Jake Angerman, Sr. Solutions Architect, MongoDB • #MongoDBWorld
  • 31. Sharding Experimentation • First shell: $ mongo --nodb > cluster = new ShardingTest({"shards": 1, "chunksize": 1}) • Second shell: $ mongo --nodb > // now connect to mongos on 30999 > db = (new Mongo("localhost:30999")).getDB("test")
  • 32. I decided to shard from the outset • Sensor summary documents can all fit in RAM – 16,000 sensors * N bytes • Velocity of sensor events is only 267 writes/sec • Volume of sensor events is what dictates sharding • { _id:<linkID>, update:ISODate("2013-10-10T23:06:37.000Z"), last10:{ avgSpeed:<int>, avgTime:<int> }, lastHour:{ avgSpeed:<int>, avgTime:<int> }, speeds:[52,49,45,51,...], times:[237,224,246,233,...], pavement:"WetSpots", status:"WetConditions", weather:"LightRain" }
  • 33. Configuring Sharding • > this_is_for_replica_sets_not_sharding = { _id : "mySet", members : [ {_id : 0, host : "A", priority : 3}, {_id : 1, host : "B", priority : 2}, {_id : 2, host : "C"}, {_id : 3, host : "D", hidden : true, priority : 0}, {_id : 4, host : "E", hidden : true, priority : 0, slaveDelay : 3600} ] } > rs.initiate(this_is_for_replica_sets_not_sharding)
  • 34. I'm off to my private island in New Zealand
  • 36. Configuration Options • > conf = { _id : "mySet", members : [ {_id : 0, host : "A", priority : 3}, {_id : 1, host : "B", priority : 2}, {_id : 2, host : "C"}, {_id : 3, host : "D", hidden : true, priority : 0}, {_id : 4, host : "E", hidden : true, priority : 0, slaveDelay : 3600} ] } > rs.initiate(conf)
  • 38. Configuration Options: Primary DC • > conf = { _id : "mySet", members : [ {_id : 0, host : "A", priority : 3}, {_id : 1, host : "B", priority : 2}, {_id : 2, host : "C"}, {_id : 3, host : "D", hidden : true, priority : 0}, {_id : 4, host : "E", hidden : true, priority : 0, slaveDelay : 3600} ] } > rs.initiate(conf)
  • 39. Tag Aware Sharding • Control where data is written to, and read from • Each member can have one or more tags – tags: {dc: "ny"} – tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"} • Replica set defines rules for write concerns • Rules can change without changing app code

Editor's Notes

  • #35: Priority: floating point number between 0..1000; highest member that is up to date wins; up to date == within 10 seconds of primary; if a higher priority member catches up, it will force election and win. Slave Delay: lags behind master by configurable time delay; automatically hidden from clients; protects against operator errors (fat fingering, application corrupts data).
  • #39: Initialize -> Election Primary + data replication from primary to secondary