June-21-2019
MongoDB HA, what can go wrong?
About Me
{"name": "Igor Donchovski",
"live_in": "Skopje",
"email": "donchovski@pythian.com",
"current_role": "Lead database consultant",
"education": [{"type": "College", "name": "FEIT", "graduated": "2008", "university": "UKIM"},
{"type": "Master", "name": "FINKI", "graduated": "2013", "university": "UKIM"}],
"work": [{"role": "Web developer", "start": "2007", "end": "2012", "company": "Gord Systems"},
{"role": "DBA", "start": "2012", "end": "2014", "company": "NOVP"},
{"role": "Database consultant", "start": "2014", "end": "2016", "company": "Pythian"},
{"role": "Lead database consultant", "start": "2016", "company": "Pythian"}],
"certificates": [{"name": "C100DBA", "year": "2016", "description": "MongoDB certified DBA"}],
"social": [{"network": "LinkedIn", "link": "www.linkedin.com/in/igorle"},
{"network": "Twitter", "link": "https://twitter.com/igorle", "handle": "@igorle"}],
"interests": ["Hiking", "Biking", "Traveling"],
"hobbies": ["Painting", "Photography", "Cooking"],
"proud_of": ["Volunteering", "Helping the Community"]}
© 2019 Pythian. Confidential
Overview
• What is a replica set, how replication works
• Replication concept
• Replica set features, deployment architectures
• Hidden nodes, Arbiter nodes, Priority 0 nodes
• Production failures
• Monitoring replica set
• Q&A
Time
Replication
Replica Set
• Group of mongod processes that maintain the same data set
• Redundancy and high availability
• Increased read capacity (scaling reads)
• Automatic failover
# Members    # Nodes Required to Elect New Primary    Fault Tolerance
3            2                                        1
4            3                                        1
5            3                                        2
6            4                                        2
7            4                                        3
[Diagram: three-member replica set, each member with priority:1 votes:1]
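The numbers in the table above follow from simple arithmetic: a majority is more than half of the voting members, and the fault tolerance is whatever is left over. A minimal sketch in JavaScript; the function names are illustrative and not part of MongoDB.

```javascript
// Votes needed to elect a Primary among `votingMembers` voters.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can be lost while an election is still possible.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// Prints one line per row of the table: members, majority, fault tolerance.
for (const n of [3, 4, 5, 6, 7]) {
  console.log(n, majority(n), faultTolerance(n));
}
```

Going from 3 to 4 members raises the required majority from 2 to 3 without raising the fault tolerance, which is why an even number of voters buys nothing.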
Replication Concept
1. Write operations go to the Primary node
2. All changes are recorded in the operations log (oplog)
3. Asynchronous replication to the Secondaries
4. Secondaries copy the Primary's oplog
5. A Secondary can use another Secondary as its sync source*
*settings.chainingAllowed (true by default)
Replica Set Oplog
• Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases
• Idempotent
• Default oplog size (for Unix and Windows systems)
Storage Engine    Default Oplog Size       Lower Bound    Upper Bound
In-memory         5% of physical memory    50MB           50GB
WiredTiger        5% of free disk space    990MB          50GB
MMAPv1            5% of free disk space    990MB          50GB
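The default sizing rule for the disk-based engines can be written as a clamp: take 5% of free disk space, but never go below 990MB or above 50GB. A rough sketch, assuming MB and GB mean powers of 1024; `defaultOplogBytes` is an illustrative name, not a MongoDB function.

```javascript
const MB = 1024 * 1024;
const GB = 1024 * MB;

// Default oplog size for WiredTiger or MMAPv1 per the table:
// 5% of free disk space, clamped to the range [990 MB, 50 GB].
function defaultOplogBytes(freeDiskBytes) {
  const fivePercent = freeDiskBytes / 20;
  return Math.min(Math.max(fivePercent, 990 * MB), 50 * GB);
}

console.log(defaultOplogBytes(10 * GB) / MB);   // 5% is only 512 MB, so the 990 MB floor applies
console.log(defaultOplogBytes(200 * GB) / GB);  // 5% of 200 GB is 10 GB
console.log(defaultOplogBytes(2000 * GB) / GB); // 5% is 100 GB, so the 50 GB cap applies
```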
Configuration
Configuration Options
• Up to 50 members per replica set (at most 7 voting members)
• Arbiter node
• Priority 0 node
• Hidden node
• Delayed node
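The two limits can be checked against a configuration document before it is applied. The sketch below assumes a document shaped like the output of rs.conf(), where a member that omits votes counts as one vote; `checkMemberLimits` and the host names are hypothetical.

```javascript
// Checks the 50-member and 7-voter limits against a replica set config document.
// `config` mirrors the shape of rs.conf(): { members: [{ votes, ... }] }.
function checkMemberLimits(config) {
  const problems = [];
  const voting = config.members.filter(
    (m) => (m.votes === undefined ? 1 : m.votes) > 0
  );
  if (config.members.length > 50) problems.push("more than 50 members");
  if (voting.length > 7) problems.push("more than 7 voting members");
  return problems;
}

// Hypothetical 9-member set in which every member votes by default.
const nineVoters = {
  members: Array.from({ length: 9 }, (_, i) => ({ _id: i, host: `db${i}:27017` })),
};

console.log(checkMemberLimits(nineVoters)); // [ 'more than 7 voting members' ]
```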
Arbiter Node
• Does not hold a copy of the data
• Votes in elections
[Diagram: replica set with an Arbiter]
Priority 0 Node
Priority: a floating-point (i.e. decimal) number between 0 and 1000
• Cannot become primary, cannot trigger election
• Visible to application (accepts reads; like any Secondary, it does not accept writes)
• Votes in elections
[Diagram: Secondary with priority : 0]
Hidden Node
• Not visible to application
• Never becomes primary, but can vote in elections
• Use cases
○ Reporting
○ Backups
[Diagram: Secondary with hidden : true, priority : 0]
Delayed Node
• Must be priority 0 member
• Should be hidden member (not mandatory)
• Mainly used for backups (historical snapshot of data)
• Recovery in case of human error
[Diagram: Secondary with slaveDelay : 3600, priority : 0, hidden : true]
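As an illustration, the three settings for a delayed member can be applied to a configuration document as follows. This is a sketch that works on a plain object shaped like the output of rs.conf(); `makeDelayedMember`, the set name and the host names are hypothetical. The field name slaveDelay follows the slide; MongoDB 5.0 and later call it secondaryDelaySecs.

```javascript
// Returns a copy of `config` with member `index` turned into a hidden, delayed member.
function makeDelayedMember(config, index, delaySeconds) {
  const members = config.members.map((m, i) =>
    i === index
      ? { ...m, priority: 0, hidden: true, slaveDelay: delaySeconds }
      : m
  );
  return { ...config, members };
}

// Hypothetical three-member config, shaped like the output of rs.conf().
const cfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db0.example.net:27017" },
    { _id: 1, host: "db1.example.net:27017" },
    { _id: 2, host: "db2.example.net:27017" },
  ],
};

const delayed = makeDelayedMember(cfg, 2, 3600);
console.log(delayed.members[2]);
// In the mongo shell the result would be applied with rs.reconfig(delayed).
```

Note that in a three-member set this leaves only two members that can become Primary.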
Everyone on the same page?
Failures
Small Oplog Size
1. Primary/Secondary node down
○ Node failure
○ Planned maintenance
2. Automatic failover
…… (several hours later)
3. New Primary's oplog rolls over and overwrites the entries the failed node still needs
4. Failed node needs a full resync
MongoDB >= 3.6: db.adminCommand({replSetResizeOplog: 1, size: 32000}) (size in megabytes)
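Whether a node can catch up after an outage comes down to the oplog window: the oplog size divided by the rate at which the Primary writes to it. A back-of-the-envelope sketch; the function names and the workload figure of 2,000 MB per hour are assumptions for illustration.

```javascript
// Estimated oplog window in hours: how long a member can be offline
// before the entries it still needs are overwritten.
function oplogWindowHours(oplogSizeMB, oplogWriteMBPerHour) {
  return oplogSizeMB / oplogWriteMBPerHour;
}

// True if a member that is offline for `downtimeHours` can still catch up from the oplog.
function canCatchUp(oplogSizeMB, oplogWriteMBPerHour, downtimeHours) {
  return downtimeHours < oplogWindowHours(oplogSizeMB, oplogWriteMBPerHour);
}

// Hypothetical workload: 2,000 MB of oplog written per hour.
console.log(oplogWindowHours(990, 2000));  // about half an hour with the 990 MB minimum
console.log(canCatchUp(990, 2000, 4));     // false: a 4-hour outage forces a resync
console.log(canCatchUp(32000, 2000, 4));   // true: 32000 MB gives a 16-hour window
```

On a live member, rs.printReplicationInfo() in the mongo shell reports the actual time span currently covered by the oplog.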
Arbiter Nodes
● Votes in election
● Does not hold copy of data
● If 2 nodes are down, no majority to elect new Primary
● Fault tolerance is still 1 node
● 4 data nodes + 1 Arbiter makes more sense
[Diagram: heartbeats between members]
Priority 0 Nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Secondaries can serve reads
● Read preference
○ primary (default)
○ primaryPreferred
○ secondary
○ secondaryPreferred
○ nearest
Priority 0 Nodes
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary
• Application cannot send writes
• Database is read only*
*depends on read preference setting
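This failure can be caught ahead of time by counting how many members are actually electable. The sketch below works on a plain object shaped like the output of rs.conf(), assuming that an omitted priority means 1 and that arbiters never become Primary; `electableMembers` and the host names are hypothetical.

```javascript
// Members that could be elected Primary: data-bearing and priority > 0.
function electableMembers(config) {
  return config.members.filter(
    (m) => !m.arbiterOnly && (m.priority === undefined ? 1 : m.priority) > 0
  );
}

// Hypothetical config matching the failure above: both Secondaries have priority 0.
const risky = {
  members: [
    { _id: 0, host: "db0:27017", priority: 1 },
    { _id: 1, host: "db1:27017", priority: 0 },
    { _id: 2, host: "db2:27017", priority: 0, hidden: true },
  ],
};

const electable = electableMembers(risky);
console.log(electable.length); // 1: if db0 fails, nobody can take over
if (electable.length < 2) {
  console.log("WARNING: fewer than two electable members");
}
```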
Hidden Nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Secondaries cannot serve reads
● Read preference
○ primary
Hidden Nodes
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary (priority:0)
• Application cannot send writes or reads
• Downtime
Hardware
• Primary node fails
• Secondary elected as new Primary
• Working set does not fit in memory
• Performance degradation
• Application stalls
[Diagram: one node with 64GB RAM, 16 CPU; two nodes with 32GB RAM, 8 CPU each]
Hardware
• Dataset grows
• No disk space left on Secondary
• mongod process fails
• Only 2 nodes left in the replica set
• Zero tolerance for failures
[Diagram: two nodes with Disk: 300GB; one node with Disk: 200GB]
Network
● Heartbeat lost
● Primary steps down
● New Primary election
● Application timeout*
● Rollback
Best Practice: Test Primary step down for your application
*Retryable writes since MongoDB 3.6
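One way to prepare an application for a step down is to connect with the full member list and with retryable writes enabled. The sketch below only builds such a connection string; `replicaSetUri`, the set name and the host names are hypothetical, while replicaSet and retryWrites are standard connection string options.

```javascript
// Builds a replica set connection string with retryable writes enabled.
// Listing several hosts lets the driver find the new Primary after a step down.
function replicaSetUri(hosts, replicaSetName) {
  const params = new URLSearchParams({
    replicaSet: replicaSetName,
    retryWrites: "true",
  });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

const uri = replicaSetUri(
  ["db0.example.net:27017", "db1.example.net:27017", "db2.example.net:27017"],
  "rs0"
);
console.log(uri);
```

In a test environment, running rs.stepDown() on the Primary from the mongo shell is a simple way to check how the application behaves during an election.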
Cloud Deployment
• All replica set members deployed in single Availability Zone
• Availability Zone #1 goes down
• Downtime
[Diagram: Cloud, Region #1, Availability Zone #1]
Cloud Deployment
● Availability Zone #1 goes down
○ New Primary elected from AZ #2
● Availability Zone #2 goes down
○ Database is read only
[Diagram: Cloud, Region #1, AZ#1 and AZ#2]
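The two-zone outcome can be reproduced by checking, for each zone, whether the surviving voters still hold a majority of the full set. A minimal sketch; `survivesZoneLoss` and the member layouts are assumptions chosen to match the scenario above (one member in AZ1, two in AZ2).

```javascript
// For each zone, reports whether the remaining voting members still form a
// majority of the full set if that zone goes down.
// `members` is a list of { host, zone, votes }.
function survivesZoneLoss(members) {
  const totalVotes = members.reduce((sum, m) => sum + m.votes, 0);
  const needed = Math.floor(totalVotes / 2) + 1;
  const zones = [...new Set(members.map((m) => m.zone))];
  const result = {};
  for (const zone of zones) {
    const remaining = members
      .filter((m) => m.zone !== zone)
      .reduce((sum, m) => sum + m.votes, 0);
    result[zone] = remaining >= needed;
  }
  return result;
}

// Hypothetical layouts: one member in AZ1 and two in AZ2, versus one member per zone.
const twoZones = [
  { host: "db0", zone: "AZ1", votes: 1 },
  { host: "db1", zone: "AZ2", votes: 1 },
  { host: "db2", zone: "AZ2", votes: 1 },
];
const threeZones = [
  { host: "db0", zone: "AZ1", votes: 1 },
  { host: "db1", zone: "AZ2", votes: 1 },
  { host: "db2", zone: "AZ3", votes: 1 },
];

console.log(survivesZoneLoss(twoZones));   // { AZ1: true, AZ2: false }
console.log(survivesZoneLoss(threeZones)); // { AZ1: true, AZ2: true, AZ3: true }
```

With only two zones, one of them always holds the majority, so losing that zone leaves the set without a Primary; spreading voters over three zones removes that single point of failure.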
Cloud Deployment
• Region #1 goes down
• Downtime
[Diagram: Cloud, Region #1, AZ#1, AZ#2 and AZ#3]
Virtualization
● VM2 goes down
○ Primary node has majority on VM1
● VM1 goes down
○ Database is read only
[Diagram: VMware, VM1 and VM2 on one physical server]
Version Upgrades
● Replica set major version upgrade (3.6 to 4.0)
● Driver v3.6 not compatible with DB v4.0
● Compatibility changes
● Application cannot send requests
● Downtime
● Rollback to previous DB version
[Diagram: two members labeled MongoDB: 3.6.4]
Version Upgrades
● Replica set major version upgrade
● Promote new version as Primary
● Confirm application works
● Forget to upgrade Secondaries
● Start using new features
● New Primary elected
● Application errors
[Diagram: two members on MongoDB: 3.6, one on MongoDB: 4.0]
Version Upgrades
● Minor version upgrade
● Promote new version as Primary
● Confirm application works
● Forget to upgrade Secondaries
● Bug fixes in minor release
● New Primary elected
● Application errors
[Diagram: two members on MongoDB: 3.6.4, one on MongoDB: 3.6.8]
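A half-finished upgrade is easy to spot if the version of every member is collected in one place, for example by running db.version() on each node. The sketch below groups a hypothetical inventory by version; `versionsInUse` and the host names are illustrative.

```javascript
// Groups members by MongoDB version so that stragglers stand out.
// `members` is a hypothetical inventory: [{ host, version }].
function versionsInUse(members) {
  const byVersion = {};
  for (const m of members) {
    (byVersion[m.version] = byVersion[m.version] || []).push(m.host);
  }
  return byVersion;
}

const inventory = [
  { host: "db0", version: "3.6.8" },
  { host: "db1", version: "3.6.4" },
  { host: "db2", version: "3.6.4" },
];

const found = versionsInUse(inventory);
console.log(found); // { '3.6.8': [ 'db0' ], '3.6.4': [ 'db1', 'db2' ] }
console.log(
  Object.keys(found).length === 1 ? "uniform" : "mixed versions: finish the upgrade"
);
```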
Version Upgrades
[Diagram: cluster in which twelve node labels read MongoDB: 3.6.8 and three still read MongoDB: 3.6.3]
DDL Operation
● Adding index on a collection
● Connect to the Primary node
○ db.people.createIndex( { zipcode: 1 }, { background: true } )
DDL Operation
● Stop one Secondary
● Restart it as a standalone on a different port
[Diagram: Secondary restarted with --port=27777]
DDL Operation
● Add the index
● Rejoin the replica set
● Promote Secondary as Primary
● Forget the other nodes
[Diagram: Secondary on --port=27777 running db.people.createIndex({zipcode:1})]
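The "forgot the other nodes" mistake shows up as an index that exists on some members only. A small sketch of a consistency check, assuming the index names of every member have already been collected (for example with db.people.getIndexes() on each node); `membersMissingIndex` and the host names are hypothetical, while zipcode_1 is the default name MongoDB gives to an index on { zipcode: 1 }.

```javascript
// Returns the hosts that lack an index with the given name.
// `indexesByHost` maps each host to the index names found on it.
function membersMissingIndex(indexesByHost, indexName) {
  return Object.keys(indexesByHost).filter(
    (host) => !indexesByHost[host].includes(indexName)
  );
}

// Hypothetical state after the failure above: only the promoted node has the index.
const indexesByHost = {
  "db0:27017": ["_id_"],
  "db1:27017": ["_id_", "zipcode_1"],
  "db2:27017": ["_id_"],
};

console.log(membersMissingIndex(indexesByHost, "zipcode_1")); // [ 'db0:27017', 'db2:27017' ]
```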
Backups
● Pick one Secondary
● db.fsyncLock()
● Take snapshot
● db.fsyncUnlock()
● Unlock fails
● Secondary starts lagging
● Primary overwrites oplog
● Secondary needs initial sync
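A backup script can reduce the risk of a forgotten lock by always attempting the unlock and by failing loudly when it does not succeed. The sketch below assumes a `db` object that exposes fsyncLock() and fsyncUnlock() as the mongo shell's db object does; `snapshotWithLock` and the stub `fakeDb` are hypothetical and only illustrate the control flow.

```javascript
// Runs `takeSnapshot` while the member is locked and always attempts to unlock,
// even if the snapshot throws.
function snapshotWithLock(db, takeSnapshot) {
  db.fsyncLock();
  try {
    takeSnapshot();
  } finally {
    const res = db.fsyncUnlock();
    if (!res || res.ok !== 1) {
      // Surface the failure so that it can be alerted on instead of going unnoticed.
      throw new Error("fsyncUnlock failed; member is still locked");
    }
  }
}

// Stub standing in for a real connection, for illustration only.
const fakeDb = {
  locked: false,
  fsyncLock() { this.locked = true; return { ok: 1 }; },
  fsyncUnlock() { this.locked = false; return { ok: 1 }; },
};

snapshotWithLock(fakeDb, () => console.log("locked during snapshot:", fakeDb.locked));
console.log("locked afterwards:", fakeDb.locked); // false
```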
Sharded Clusters
Monitoring Replica Set
• Replica set has no Primary
• Number of unhealthy members is above threshold
• Replication lag is above threshold
• Replica set elected new Primary
• Host of any type has restarted
• Host of type Secondary is recovering
• Host of any type is down
• Host of any type has experienced Rollback
• Network issues between members of the replica set or cluster
• Monitoring backup status
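Several of these checks can be derived from the output of rs.status(). The sketch below computes replication lag per Secondary from a plain object shaped like that output (members with name, stateStr and optimeDate); `replicationLagSeconds`, the sample snapshot and the 60-second threshold are assumptions for illustration.

```javascript
// Computes replication lag in seconds for each Secondary, relative to the Primary.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // no Primary: this alone should raise an alert
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      lag[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  }
  return lag;
}

// Hypothetical status snapshot.
const status = {
  members: [
    { name: "db0:27017", stateStr: "PRIMARY", optimeDate: new Date("2019-06-21T10:00:30Z") },
    { name: "db1:27017", stateStr: "SECONDARY", optimeDate: new Date("2019-06-21T10:00:28Z") },
    { name: "db2:27017", stateStr: "SECONDARY", optimeDate: new Date("2019-06-21T09:58:30Z") },
  ],
};

const LAG_THRESHOLD_SECONDS = 60; // assumed alert threshold
const lag = replicationLagSeconds(status);
for (const [name, seconds] of Object.entries(lag)) {
  console.log(name, seconds, seconds > LAG_THRESHOLD_SECONDS ? "ALERT" : "ok");
}
```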
Summary
• Replica set with odd number of voting members
• Hidden or Delayed member for dedicated functions (reporting, backups …)
• Have more than one eligible Primary in the replica set
• Use multi-AZ for Cloud deployments
• Don’t deploy more than one mongod process per node/host
• Run all replica set members on the same hardware
• Run all replica set members on the same MongoDB version
• Monitor your replica set status and nodes
• Monitor replication lag and Oplog size
Questions?
We’re Hiring!
https://www.pythian.com/careers/
© 2019 Pythian. Confidential
Ad

More Related Content

What's hot (20)

How To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own DatasourceHow To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own Datasource
MongoDB
 
MongoDB and Spark
MongoDB and SparkMongoDB and Spark
MongoDB and Spark
Norberto Leite
 
Webinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage EngineWebinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage Engine
MongoDB
 
2014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-22014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-2
MongoDB
 
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB WorldNoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
Ajay Gupte
 
Webinar: Architecting Secure and Compliant Applications with MongoDB
Webinar: Architecting Secure and Compliant Applications with MongoDBWebinar: Architecting Secure and Compliant Applications with MongoDB
Webinar: Architecting Secure and Compliant Applications with MongoDB
MongoDB
 
Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best Practices
MongoDB
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
MongoDB
 
Challenges with MongoDB
Challenges with MongoDBChallenges with MongoDB
Challenges with MongoDB
Stone Gao
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
Norberto Leite
 
Sharding
ShardingSharding
Sharding
MongoDB
 
MongoDB - External Authentication
MongoDB - External AuthenticationMongoDB - External Authentication
MongoDB - External Authentication
Jason Terpko
 
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
MongoDB
 
MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals MongoDB Sharding Fundamentals
MongoDB Sharding Fundamentals
Antonios Giannopoulos
 
Sharding
ShardingSharding
Sharding
MongoDB
 
How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...
Antonios Giannopoulos
 
High Performance Applications with MongoDB
High Performance Applications with MongoDBHigh Performance Applications with MongoDB
High Performance Applications with MongoDB
MongoDB
 
Mongo db dhruba
Mongo db dhrubaMongo db dhruba
Mongo db dhruba
Dhrubaji Mandal ♛
 
MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big ComputeMongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB
 
Mongo db 3.4 Overview
Mongo db 3.4 OverviewMongo db 3.4 Overview
Mongo db 3.4 Overview
Norberto Leite
 
How To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own DatasourceHow To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own Datasource
MongoDB
 
Webinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage EngineWebinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage Engine
MongoDB
 
2014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-22014 05-07-fr - add dev series - session 6 - deploying your application-2
2014 05-07-fr - add dev series - session 6 - deploying your application-2
MongoDB
 
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB WorldNoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
NoSQL Analytics: JSON Data Analysis and Acceleration in MongoDB World
Ajay Gupte
 
Webinar: Architecting Secure and Compliant Applications with MongoDB
Webinar: Architecting Secure and Compliant Applications with MongoDBWebinar: Architecting Secure and Compliant Applications with MongoDB
Webinar: Architecting Secure and Compliant Applications with MongoDB
MongoDB
 
Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Migrating to MongoDB: Best Practices
MongoDB
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
MongoDB
 
Challenges with MongoDB
Challenges with MongoDBChallenges with MongoDB
Challenges with MongoDB
Stone Gao
 
Sharding
ShardingSharding
Sharding
MongoDB
 
MongoDB - External Authentication
MongoDB - External AuthenticationMongoDB - External Authentication
MongoDB - External Authentication
Jason Terpko
 
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
How Thermo Fisher is Reducing Data Analysis Times from Days to Minutes with M...
MongoDB
 
Sharding
ShardingSharding
Sharding
MongoDB
 
How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...
Antonios Giannopoulos
 
High Performance Applications with MongoDB
High Performance Applications with MongoDBHigh Performance Applications with MongoDB
High Performance Applications with MongoDB
MongoDB
 
MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big ComputeMongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB
 

Similar to MongoDB HA - what can go wrong (20)

MongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation FrameworkMongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB
 
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Flink Forward
 
Sidiq Permana - Building For The Next Billion Users
Sidiq Permana - Building For The Next Billion UsersSidiq Permana - Building For The Next Billion Users
Sidiq Permana - Building For The Next Billion Users
Dicoding
 
MongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to Atlas
MongoDB
 
Angular v2 et plus : le futur du développement d'applications en entreprise
Angular v2 et plus : le futur du développement d'applications en entrepriseAngular v2 et plus : le futur du développement d'applications en entreprise
Angular v2 et plus : le futur du développement d'applications en entreprise
LINAGORA
 
Scalable Application Development @ Picnic
Scalable Application Development @ PicnicScalable Application Development @ Picnic
Scalable Application Development @ Picnic
Sander Mak (@Sander_Mak)
 
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Harry McLaren
 
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Fabrice Bernhard
 
Industrialiser spark
Industrialiser sparkIndustrialiser spark
Industrialiser spark
Lucien Fregosi
 
Deploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud PlatformDeploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud Platform
MariaDB plc
 
Open Social Summit Korea Overview
Open Social Summit Korea OverviewOpen Social Summit Korea Overview
Open Social Summit Korea Overview
Chris Schalk
 
Implementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsImplementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source tools
All Things Open
 
Android best practices 2015
Android best practices 2015Android best practices 2015
Android best practices 2015
Sean Katz
 
Conquering Data Migration from Oracle to Postgres
Conquering Data Migration from Oracle to PostgresConquering Data Migration from Oracle to Postgres
Conquering Data Migration from Oracle to Postgres
EDB
 
IRJET- Industry Production Manager using Raspberry Pi
IRJET-  	  Industry Production Manager using Raspberry PiIRJET-  	  Industry Production Manager using Raspberry Pi
IRJET- Industry Production Manager using Raspberry Pi
IRJET Journal
 
From Java Code to Java Heap: Understanding the Memory Usage of Your App - Ch...
From Java Code to Java Heap: Understanding the Memory Usage of Your App  - Ch...From Java Code to Java Heap: Understanding the Memory Usage of Your App  - Ch...
From Java Code to Java Heap: Understanding the Memory Usage of Your App - Ch...
jaxLondonConference
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
VMware Tanzu
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
Carlos Andrés García
 
Implementing OpenChain ISO/IEC 5230 at endjin
Implementing OpenChain ISO/IEC 5230 at endjinImplementing OpenChain ISO/IEC 5230 at endjin
Implementing OpenChain ISO/IEC 5230 at endjin
HowardvanRooijen1
 
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDays Riga
 
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation FrameworkMongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB World 2019: Unleash the Power of the MongoDB Aggregation Framework
MongoDB
 
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Building A Self Service Streaming Platform at Pinterest - Steven Bairos-Novak...
Flink Forward
 
Sidiq Permana - Building For The Next Billion Users
Sidiq Permana - Building For The Next Billion UsersSidiq Permana - Building For The Next Billion Users
Sidiq Permana - Building For The Next Billion Users
Dicoding
 
MongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to Atlas
MongoDB
 
Angular v2 et plus : le futur du développement d'applications en entreprise
Angular v2 et plus : le futur du développement d'applications en entrepriseAngular v2 et plus : le futur du développement d'applications en entreprise
Angular v2 et plus : le futur du développement d'applications en entreprise
LINAGORA
 
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Splunk Phantom, the Endpoint Data Model & Splunk Security Essentials App!
Harry McLaren
 
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Modernisation of legacy PHP applications using Symfony2 - PHP Northeast Confe...
Fabrice Bernhard
 
Deploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud PlatformDeploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud Platform
MariaDB plc
 
Open Social Summit Korea Overview
Open Social Summit Korea OverviewOpen Social Summit Korea Overview
Open Social Summit Korea Overview
Chris Schalk
 
Implementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsImplementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source tools
All Things Open
 
Android best practices 2015
Android best practices 2015Android best practices 2015
Android best practices 2015
Sean Katz
 
Conquering Data Migration from Oracle to Postgres
Conquering Data Migration from Oracle to PostgresConquering Data Migration from Oracle to Postgres
Conquering Data Migration from Oracle to Postgres
EDB
 
IRJET- Industry Production Manager using Raspberry Pi
IRJET-  	  Industry Production Manager using Raspberry PiIRJET-  	  Industry Production Manager using Raspberry Pi
IRJET- Industry Production Manager using Raspberry Pi
IRJET Journal
 
From Java Code to Java Heap: Understanding the Memory Usage of Your App - Ch...
From Java Code to Java Heap: Understanding the Memory Usage of Your App  - Ch...From Java Code to Java Heap: Understanding the Memory Usage of Your App  - Ch...
From Java Code to Java Heap: Understanding the Memory Usage of Your App - Ch...
jaxLondonConference
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
VMware Tanzu
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
Carlos Andrés García
 
Implementing OpenChain ISO/IEC 5230 at endjin
Implementing OpenChain ISO/IEC 5230 at endjinImplementing OpenChain ISO/IEC 5230 at endjin
Implementing OpenChain ISO/IEC 5230 at endjin
HowardvanRooijen1
 
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...
DevOpsDays Riga
 
Ad

Recently uploaded (20)

Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptxLidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
Lidar for Autonomous Driving, LiDAR Mapping for Driverless Cars.pptx
RishavKumar530754
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
Mathematical foundation machine learning.pdf
Mathematical foundation machine learning.pdfMathematical foundation machine learning.pdf
Mathematical foundation machine learning.pdf
TalhaShahid49
 
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdfMAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
ssuser562df4
 
Level 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical SafetyLevel 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical Safety
JoseAlbertoCariasDel
 
new ppt artificial intelligence historyyy
new ppt artificial intelligence historyyynew ppt artificial intelligence historyyy
new ppt artificial intelligence historyyy
PianoPianist
 
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Journal of Soft Computing in Civil Engineering
 
Fort night presentation new0903 pdf.pdf.
Fort night presentation new0903 pdf.pdf.Fort night presentation new0903 pdf.pdf.
Fort night presentation new0903 pdf.pdf.
anuragmk56
 
ELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdfELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdf
Shiju Jacob
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E..."Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
MongoDB HA - what can go wrong

  • 2. About Me
      {"name": "Igor Donchovski",
       "live_in": "Skopje",
       "email": "donchovski@pythian.com",
       "current_role": "Lead database consultant",
       "education": [{"type": "College", "name": "FEIT", "graduated": "2008", "university": "UKIM"},
                     {"type": "Master", "name": "FINKI", "graduated": "2013", "university": "UKIM"}],
       "work": [{"role": "Web developer", "start": "2007", "end": "2012", "company": "Gord Systems"},
                {"role": "DBA", "start": "2012", "end": "2014", "company": "NOVP"},
                {"role": "Database consultant", "start": "2014", "end": "2016", "company": "Pythian"},
                {"role": "Lead database consultant", "start": "2016", "company": "Pythian"}],
       "certificates": [{"name": "C100DBA", "year": "2016", "description": "MongoDB certified DBA"}],
       "social": [{"network": "LinkedIn", "link": "www.linkedin.com/in/igorle"},
                  {"network": "Twitter", "link": "https://twitter.com/igorle", "handle": "@igorle"}],
       "interests": ["Hiking", "Biking", "Traveling"],
       "hobbies": ["Painting", "Photography", "Cooking"],
       "proud_of": ["Volunteering", "Helping the Community"]}
  • 3. Overview
      • What a replica set is and how replication works
      • Replication concept
      • Replica set features, deployment architectures
      • Hidden nodes, Arbiter nodes, Priority 0 nodes
      • Production failures
      • Monitoring a replica set
      • Q&A
  • 4. Replication
  • 5. Replica Set
      • Group of mongod processes that maintain the same data set
      • Redundancy and high availability
      • Increased read capacity (scaling reads)
      • Automatic failover

      # Members   # Nodes Required to Elect New Primary   Fault Tolerance
      3           2                                       1
      4           3                                       1
      5           3                                       2
      6           4                                       2
      7           4                                       3

      [Diagram: three members, each with priority:1 votes:1]
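The table on the Replica Set slide follows from simple majority arithmetic. The sketch below is illustrative only; the function names `majority` and `faultTolerance` are hypothetical and are not part of MongoDB.

```javascript
// Votes needed to elect a Primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Voting members that can be lost while a majority can still be formed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// Reproduce the rows of the table for 3 to 7 members.
for (const n of [3, 4, 5, 6, 7]) {
  console.log(n + " members: majority " + majority(n) +
              ", fault tolerance " + faultTolerance(n));
}
```

Going from 3 to 4 members (or from 5 to 6) raises the required majority without raising the fault tolerance, which is why an odd number of voting members is usually preferred.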
  • 6–10. Replication Concept (one slide, built up step by step)
      1. Write operations go to the Primary node
      2. All changes are recorded in the operations log (oplog)
      3. Replication to the Secondaries is asynchronous
      4. Secondaries copy the Primary's oplog
      5. A Secondary can use another Secondary as its sync source*
      *settings.chainingAllowed (true by default)
  • 11. Replica Set Oplog
      • Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases
      • Idempotent
      • Default oplog size for Unix and Windows systems:

      Storage Engine   Default Oplog Size       Lower Bound   Upper Bound
      In-memory        5% of physical memory    50MB          50GB
      WiredTiger       5% of free disk space    990MB         50GB
      MMAPv1           5% of free disk space    990MB         50GB
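The default oplog sizing on the Replica Set Oplog slide amounts to "5% of a base quantity, clamped to a lower and upper bound". A minimal sketch of that rule follows; `defaultOplogSizeBytes` is a hypothetical helper, and treating 1MB as 1024 × 1024 bytes is an assumption made here for illustration.

```javascript
const MB = 1024 * 1024; // assumption: binary megabytes
const GB = 1024 * MB;

// engine: "inMemory", "wiredTiger" or "mmapv1" (storage engine names).
// Returns the default oplog size in bytes according to the table.
function defaultOplogSizeBytes(engine, freeDiskBytes, physicalMemoryBytes) {
  const inMemory = engine === "inMemory";
  const base = 0.05 * (inMemory ? physicalMemoryBytes : freeDiskBytes);
  const lowerBound = (inMemory ? 50 : 990) * MB;
  const upperBound = 50 * GB;
  return Math.min(upperBound, Math.max(lowerBound, base));
}

// A WiredTiger node with 10GB free disk falls back to the 990MB lower bound.
console.log(defaultOplogSizeBytes("wiredTiger", 10 * GB, 0) / MB);
// A WiredTiger node with 2000GB free disk is capped at the 50GB upper bound.
console.log(defaultOplogSizeBytes("wiredTiger", 2000 * GB, 0) / GB);
```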
  • 12. Configuration
  • 13. Configuration Options
      • Up to 50 members per replica set (at most 7 voting members)
      • Arbiter node
      • Priority 0 node
      • Hidden node
      • Delayed node
  • 14. Arbiter Node
      • Does not hold a copy of the data
      • Votes in elections
      [Diagram labels: Arbiter, hidden : true]
  • 15. Priority 0 Node
      Priority: floating point (i.e. decimal) number between 0 and 1000
      • Cannot become Primary, cannot trigger an election
      • Visible to the application (accepts reads; writes always go to the Primary)
      • Votes in elections
      [Diagram: Secondary with priority : 0]
  • 16. Hidden Node
      • Not visible to the application
      • Never becomes Primary, but can vote in elections
      • Use cases
          ○ Reporting
          ○ Backups
      [Diagram: Secondary with hidden : true, priority : 0]
  • 17. Delayed Node
      • Must be a priority 0 member
      • Should be a hidden member (not mandatory)
      • Mainly used for backups (historical snapshot of data)
      • Recovery in case of human error
      [Diagram: Secondary with slaveDelay : 3600, priority : 0, hidden : true]
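The member types above can be combined in one replica set configuration document. The sketch below is not taken from the talk: the host names and the set name `rs0` are hypothetical, and `validateConfig` is an illustrative helper that only checks the rules stated on the slides (50 members, 7 voting members, hidden and delayed members at priority 0). The member fields `priority`, `votes`, `hidden`, `arbiterOnly` and `slaveDelay` are the configuration field names used in the MongoDB versions the deck covers.

```javascript
// Hypothetical replica set configuration: two electable data nodes,
// one arbiter, one hidden node and one hidden delayed node.
const exampleConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.net:27017" },
    { _id: 1, host: "db2.example.net:27017" },
    { _id: 2, host: "arb1.example.net:27017", arbiterOnly: true },
    { _id: 3, host: "db3.example.net:27017", priority: 0, hidden: true },
    { _id: 4, host: "db4.example.net:27017", priority: 0, hidden: true, slaveDelay: 3600 },
  ],
};

// Return the given value, or the fallback when the field is absent.
function valueOr(value, fallback) {
  return value === undefined ? fallback : value;
}

// Collect violations of the rules listed on the configuration slides.
function validateConfig(config) {
  const problems = [];
  const voting = config.members.filter((m) => valueOr(m.votes, 1) > 0);
  if (config.members.length > 50) problems.push("more than 50 members");
  if (voting.length > 7) problems.push("more than 7 voting members");
  for (const m of config.members) {
    const priority = valueOr(m.priority, 1);
    if (m.hidden && priority !== 0) {
      problems.push(m.host + ": hidden member must have priority 0");
    }
    if (valueOr(m.slaveDelay, 0) > 0 && priority !== 0) {
      problems.push(m.host + ": delayed member must have priority 0");
    }
  }
  return problems;
}

console.log(validateConfig(exampleConfig)); // no problems expected for this config
```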
  • 18. Everyone on the same page?
  • 19. Failures
  • 20. Small Oplog Size
      1. Primary/Secondary node down
          ○ Node failure
          ○ Planned maintenance
      2. Automatic failover
         … (several hours later)
      3. The oplog on the new Primary rolls over, overwriting entries the failed node has not yet applied
      4. Failed node needs a resync
      MongoDB >= 3.6 (size in MB): db.adminCommand({replSetResizeOplog: 1, size: 32000})
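Whether a node survives an outage without a resync depends on the oplog window: roughly the oplog size divided by the rate at which the oplog is written. A rough sketch under that assumption follows; the write rate of 500MB per hour and both function names are hypothetical, and real workloads are rarely this uniform.

```javascript
// Approximate number of hours of operations the oplog can hold,
// assuming a constant oplog write rate.
function oplogWindowHours(oplogSizeMB, oplogWriteMBPerHour) {
  return oplogSizeMB / oplogWriteMBPerHour;
}

// A node that was down longer than the oplog window cannot catch up
// from the oplog and needs a full resync.
function needsResync(downtimeHours, oplogSizeMB, oplogWriteMBPerHour) {
  return downtimeHours > oplogWindowHours(oplogSizeMB, oplogWriteMBPerHour);
}

// 990MB oplog (the default lower bound), node down for 4 hours.
console.log(needsResync(4, 990, 500));
// Same outage after resizing the oplog to 32000MB.
console.log(needsResync(4, 32000, 500));
```

With these numbers the 990MB oplog covers just under two hours, so the four-hour outage forces a resync, while the 32000MB oplog covers 64 hours.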
  • 21. Arbiter Nodes
      • Votes in elections
      • Does not hold a copy of the data
      • If 2 nodes are down, there is no majority to elect a new Primary
      • Fault tolerance is still 1 node
      • 4 data nodes + 1 Arbiter makes more sense
      [Diagram label: Heartbeat]
  • 22. Priority 0 Nodes
      • Application driver sends writes to the Primary
      • Reads go to the Primary by default
      • Secondaries can serve reads
      • Read preference
          ○ primary (default)
          ○ primaryPreferred
          ○ secondary
          ○ secondaryPreferred
          ○ nearest
  • 23. Priority 0 Nodes
      • Primary node fails
      • Replica set starts an election for a new Primary
      • Zero nodes eligible for Primary
      • Application cannot send writes
      • Database is read only*
      *depends on the read preference setting
  • 24. Hidden Nodes
      • Application driver sends writes to the Primary
      • Reads go to the Primary by default
      • Secondaries cannot serve reads
      • Read preference
          ○ primary
  • 25. Hidden Nodes
      • Primary node fails
      • Replica set starts an election for a new Primary
      • Zero nodes eligible for Primary (priority:0)
      • Application cannot send writes or reads
      • Downtime
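The priority 0 and hidden node failures above share one cause: after the Primary is lost, no remaining member is eligible to be elected. A small sketch of that check follows; the member lists and `eligiblePrimaries` are hypothetical illustrations, not a MongoDB API, and the sketch ignores whether the survivors still hold a voting majority.

```javascript
// Members that could be elected Primary if failedHost goes down:
// data-bearing members (not arbiters) with a priority above 0.
function eligiblePrimaries(members, failedHost) {
  return members
    .filter((m) => m.host !== failedHost)
    .filter((m) => !m.arbiterOnly)
    .filter((m) => (m.priority === undefined ? 1 : m.priority) > 0)
    .map((m) => m.host);
}

// Hypothetical set from the failure slides: one Primary, two priority 0 nodes.
const riskySet = [
  { host: "db1.example.net:27017", priority: 1 },
  { host: "db2.example.net:27017", priority: 0 },
  { host: "db3.example.net:27017", priority: 0, hidden: true },
];

// Hypothetical safer set: two members can become Primary.
const saferSet = [
  { host: "db1.example.net:27017", priority: 1 },
  { host: "db2.example.net:27017", priority: 1 },
  { host: "db3.example.net:27017", priority: 0, hidden: true },
];

console.log(eligiblePrimaries(riskySet, "db1.example.net:27017"));
console.log(eligiblePrimaries(saferSet, "db1.example.net:27017"));
```

An empty result for the first set means the replica set stays without a Primary until the failed node returns or the configuration is changed.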
  • 26. Hardware
      • Primary node fails
      • Secondary elected as new Primary
      • Working set does not fit in memory
      • Performance degradation
      • Application stalls
      [Diagram labels: 64GB RAM, 16 CPU; 32GB RAM, 8 CPU; 32GB RAM, 8 CPU]
  • 27. Hardware
      • Dataset grows
      • No disk space on Secondary
      • mongod process fails
      • Replica set is left with 2 nodes
      • Zero tolerance for failures
      [Diagram labels: Disk: 300GB; Disk: 300GB; Disk: 200GB]
  • 28. Network
      • Heartbeat lost
      • Primary steps down
      • New Primary election
      • Application timeout*
      • Rollback
      Best practice: test Primary step down for your application
      *Retryable writes since MongoDB 3.6
  • 29. Cloud Deployment
      • All replica set members deployed in a single Availability Zone
      • Availability Zone #1 goes down
      • Downtime
      [Diagram: Cloud, Region #1, Availability Zone #1]
  • 30. Cloud Deployment
      • Availability Zone #1 goes down
          ○ New Primary elected from AZ #2
      • Availability Zone #2 goes down
          ○ Database is read only
      [Diagram: Cloud, Region #1, AZ#1, AZ#2]
  • 31. Cloud Deployment
      • Region #1 goes down
      • Downtime
      [Diagram: Cloud, Region #1, AZ#1, AZ#2, AZ#3]
  • 32. Virtualization
      • VM2 goes down
          ○ Primary node has majority on VM1
      • VM1 goes down
          ○ Database is read only
      [Diagram: VMware, VM1, VM2, Physical Server]
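The cloud and virtualization cases come down to the same question: if one failure domain (Availability Zone, VM host) disappears, do the remaining voting members still form a majority? The sketch below assumes every member has one vote; the placements and the function name `hasMajorityWithout` are hypothetical.

```javascript
// True if the members outside the failed zone still hold a majority
// of all votes (each member is assumed to have exactly one vote).
function hasMajorityWithout(members, failedZone) {
  const needed = Math.floor(members.length / 2) + 1;
  const remaining = members.filter((m) => m.zone !== failedZone).length;
  return remaining >= needed;
}

// Two zones: losing the zone that holds two of the three members
// leaves the set without a Primary (read only).
const twoZones = [
  { host: "db1", zone: "AZ1" },
  { host: "db2", zone: "AZ2" },
  { host: "db3", zone: "AZ2" },
];

// Three zones, one member each: any single zone can be lost.
const threeZones = [
  { host: "db1", zone: "AZ1" },
  { host: "db2", zone: "AZ2" },
  { host: "db3", zone: "AZ3" },
];

console.log(hasMajorityWithout(twoZones, "AZ1"));
console.log(hasMajorityWithout(twoZones, "AZ2"));
console.log(["AZ1", "AZ2", "AZ3"].every((z) => hasMajorityWithout(threeZones, z)));
```

A three-zone layout still does not protect against the loss of the whole region, which is the case shown on the last Cloud Deployment slide.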
  • 33. Version Upgrades
      • Replica set major version upgrade (3.6 to 4.0)
      • Driver v3.6 not compatible with DB v4.0
      • Compatibility changes
      • Application cannot send requests
      • Downtime
      • Rollback to previous DB version
      [Diagram labels: MongoDB: 3.6.4; MongoDB: 3.6.4]
  • 34. Version Upgrades
      • Replica set major version upgrade
      • Promote new version as Primary
      • Confirm application works
      • Forget to upgrade Secondaries
      • Start using new features
      • New Primary elected
      • Application errors
      [Diagram labels: MongoDB: 3.6; MongoDB: 3.6; MongoDB: 4.0]
  • 35. Version Upgrades
      • Minor version upgrade
      • Promote new version as Primary
      • Confirm application works
      • Forget to upgrade Secondaries
      • Bug fixes in minor release
      • New Primary elected
      • Application errors
      [Diagram labels: MongoDB: 3.6.4; MongoDB: 3.6.4; MongoDB: 3.6.8]
  • 36. Version Upgrades
      [Diagram: members labelled MongoDB: 3.6.8, with a few members still labelled MongoDB: 3.6.3]
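A forgotten Secondary is easy to spot if the versions of all members are compared after every upgrade. The sketch below works on a plain list of host and version pairs; how that list is collected is left open (for example by running `db.version()` on each member), and the hosts and both helper names are hypothetical.

```javascript
// Group hosts by the MongoDB version they report.
function hostsByVersion(members) {
  const groups = {};
  for (const m of members) {
    if (!groups[m.version]) groups[m.version] = [];
    groups[m.version].push(m.host);
  }
  return groups;
}

// True only when every member runs exactly the same version.
function sameVersionEverywhere(members) {
  return Object.keys(hostsByVersion(members)).length === 1;
}

// Hypothetical state after a minor upgrade that skipped two Secondaries.
const afterUpgrade = [
  { host: "db1.example.net:27017", version: "3.6.8" },
  { host: "db2.example.net:27017", version: "3.6.4" },
  { host: "db3.example.net:27017", version: "3.6.4" },
];

console.log(sameVersionEverywhere(afterUpgrade));
console.log(hostsByVersion(afterUpgrade));
```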
  • 37. DDL Operation
      • Adding an index on a collection
      • Connect to the Primary node
          ○ db.people.createIndex( { zipcode: 1 }, { background: true } )
  • 38. DDL Operation
      • Stop one Secondary
      • Restart it on a different port
      [Diagram: Secondary --port=27777]
  • 39. DDL Operation
      • Add the index: db.people.createIndex({zipcode:1})
      • Rejoin the replica set
      • Promote the Secondary to Primary
      • Forget the other nodes
      [Diagram: Secondary --port=27777]
  • 40. Backups
      • Pick one Secondary
      • db.fsyncLock()
      • Take snapshot
      • db.fsyncUnlock()
      • Unlock fails
      • Secondary starts lagging
      • Primary overwrites oplog
      • Secondary needs initial sync
  • 42. Sharded Clusters
  • 43. Sharded Clusters
  • 44. Monitoring Replica Set
      • Replica set has no Primary
      • Number of unhealthy members is above threshold
      • Replication lag is above threshold
      • Replica set elected new Primary
      • Host of any type has restarted
      • Host of type Secondary is recovering
      • Host of any type is down
      • Host of any type has experienced Rollback
      • Network issues between members of the replica set or cluster
      • Monitoring backup status
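Several of the alerts above can be derived from the document returned by `rs.status()`. The sketch below runs against a hand-written sample instead of a live server; it uses the `members[].name`, `health`, `stateStr` and `optimeDate` fields of that document, while `checkReplicaSet`, the sample hosts and the 60 second lag threshold are hypothetical choices for illustration.

```javascript
// Derive a list of alert messages from an rs.status()-shaped document.
function checkReplicaSet(status, maxLagSeconds) {
  const alerts = [];
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) alerts.push("replica set has no Primary");
  for (const m of status.members) {
    if (m.health !== 1) alerts.push(m.name + " is down or unreachable");
    if (m.stateStr === "RECOVERING") alerts.push(m.name + " is recovering");
    if (m.stateStr === "ROLLBACK") alerts.push(m.name + " is in rollback");
    if (primary && m.stateStr === "SECONDARY") {
      // Lag: how far the Secondary's last applied operation trails the Primary's.
      const lagSeconds = (primary.optimeDate - m.optimeDate) / 1000;
      if (lagSeconds > maxLagSeconds) {
        alerts.push(m.name + " lags by " + lagSeconds + "s");
      }
    }
  }
  return alerts;
}

// Hand-written sample: one lagging Secondary and one unreachable member.
const sampleStatus = {
  set: "rs0",
  members: [
    { name: "db1:27017", health: 1, stateStr: "PRIMARY",
      optimeDate: new Date("2019-06-21T10:00:00Z") },
    { name: "db2:27017", health: 1, stateStr: "SECONDARY",
      optimeDate: new Date("2019-06-21T09:58:00Z") },
    { name: "db3:27017", health: 0, stateStr: "(not reachable/healthy)",
      optimeDate: new Date(0) },
  ],
};

console.log(checkReplicaSet(sampleStatus, 60));
```

In a monitoring job the same function could be fed the live `rs.status()` result on a schedule, with the messages forwarded to whatever alerting system is in use.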
  • 45. Summary
      • Use an odd number of voting members in the replica set
      • Use a Hidden or Delayed member for dedicated functions (reporting, backups …)
      • Have more than one eligible Primary in the replica set
      • Use multi-AZ for Cloud deployments
      • Don’t deploy more than one mongod process per node/host
      • Run all replica set members on the same hardware
      • Run all replica set members on the same MongoDB version
      • Monitor your replica set status and nodes
      • Monitor replication lag and oplog size