SlideShare a Scribd company logo
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Maximizing MongoDB
Performance on AWS
Approaches to running your MongoDB deployment in AWS
• Manage everything yourself.
Things to consider:
• Storage/Disk
• Networking
• Instance Size (Memory &
CPU)
• Rely on a “MongoDB” service
with best practices built in.
Storage
Storage Recommendations
• Databases require fast storage
• Amazon EBS: persistent block level
storage to associate with Amazon
EC2 instances
• Instance store: storage volumes that
directly attach to Amazon EC2
hardware for temporary storage
Storage Recommendations
• We recommend Amazon EBS for
Amazon EC2 for MongoDB
• Configured IOPS (PIOPS) for IOPS
guarantee
• Amazon EBS-optimized instance types
(C4, M4, D2, etc) provide dedicated
throughput between Amazon EC2 &
Amazon EBS
• Improves performance by minimizing
contention from other traffic to your
instances with regards to Amazon EBS
I/O traffic
Storage Recommendations
• We recommend dedicating 3 separate
volumes for MongoDB, each with their
own IOPS due to differing workloads.
For example:
• /data (1000 IOPS)
• /journal (250 IOPS)
• /log (100 IOPS)
Storage Recommendations
• Consider throughput for storage
performance
• AWS is fairly generous
• Of this writing, the maximums for AWS
throughput are:
• Max. throughput/Volume is 320 MiB/s
• Max. throughput/Instance is 800 MiB/s
• For better throughput, shard
Storage Recommendations
• RAID 10 provides data reliability by
mirroring data on secondary drives
(RAID 1) and stripes data across
drives (RAID 0)
• Ensure that your total throughput of
the combined RAIDed volumes does
not exceed the maximum instance
throughput
Networking
Networking Recommendations
• Amazon EC2 Enhanced Networking
can provide significantly improved
performance & consistency
Networking Recommendations
• Configure Amazon VPCs for
MongoDB. Amazon Virtual Private
Cloud allows you to provision a
private, isolated section in AWS where
you can define your own IP address,
subnets, route tables, and gateways.
• Use Managed NAT Gateway service
Instance Size
Instance Size Recommendations
• Err on the side of going larger and
scaling down as needed.
• MongoDB working set should fit in
memory
• M4, I2, and R3 Amazon EC2 instance
types tend to be most successful and
widely deployed in customer
deployments
Instance Size Recommendations
• One mongod process per instance to
avoid processes competing for system
resources
Additional Recommendations
Resilience
Resilience
Approaches to running your MongoDB deployment in AWS
• Manage everything yourself.
Things to consider:
• Storage/Disk
• Networking
• Instance Size (Memory &
CPU)
• Rely on a “MongoDB” service
with best practices built in.
Operations Burden
PATCHES
UPGRADES
SECURITY
BACKUPS
RECOVERY
99.999% UPTIME
UPSCALE
DOWNSCALE
PERFORMANCE
UAT
STAGING
MONITORING
ALERTS
PROVISION
CONFIGURE
INSTALL
Automated Available On-Demand
Secure Highly Available Automated Backups
Elastically Scalable
Database as a Service for MongoDB
Maximizing MongoDB Performance on AWS
Questions ?

More Related Content

Viewers also liked (20)

PPTX
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...
MongoDB
 
PPTX
How Auto Trader enables the UK's largest digital automotive marketplace
MongoDB
 
PDF
MongoDB Europe 2016 - Welcome
MongoDB
 
PDF
Big Data Spain 2016: Keynote
MongoDB
 
PDF
MongoDB World 2016: Poster Sessions eBook
MongoDB
 
PPTX
Seminario Web MongoDB-Paradigma: Cree aplicaciones más escalables utilizando ...
MongoDB
 
PDF
Webinar: 10-Step Guide to Creating a Single View of your Business
MongoDB
 
PPTX
Back to Basics: My First MongoDB Application
MongoDB
 
PPTX
Back to Basics Webinar 3: Introduction to Replica Sets
MongoDB
 
PPTX
Back to Basics 2017: Introduction to Sharding
MongoDB
 
PPTX
Back to Basics Webinar 1: Introduction to NoSQL
MongoDB
 
PDF
Webinar: Working with Graph Data in MongoDB
MongoDB
 
PPTX
Webinar: Transitioning from SQL to MongoDB
MongoDB
 
PDF
Creating a Modern Data Architecture for Digital Transformation
MongoDB
 
PDF
MongoDB and Hadoop: Driving Business Insights
MongoDB
 
PPTX
MongoDB et Hadoop
MongoDB
 
PPTX
Running MongoDB on AWS
MongoDB
 
PPTX
Running MongoDB 3.0 on AWS
MongoDB
 
PPTX
Big Data and NoSQL for Database and BI Pros
Andrew Brust
 
PPTX
Introduction tomongodb
Lee Theobald
 
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...
MongoDB
 
How Auto Trader enables the UK's largest digital automotive marketplace
MongoDB
 
MongoDB Europe 2016 - Welcome
MongoDB
 
Big Data Spain 2016: Keynote
MongoDB
 
MongoDB World 2016: Poster Sessions eBook
MongoDB
 
Seminario Web MongoDB-Paradigma: Cree aplicaciones más escalables utilizando ...
MongoDB
 
Webinar: 10-Step Guide to Creating a Single View of your Business
MongoDB
 
Back to Basics: My First MongoDB Application
MongoDB
 
Back to Basics Webinar 3: Introduction to Replica Sets
MongoDB
 
Back to Basics 2017: Introduction to Sharding
MongoDB
 
Back to Basics Webinar 1: Introduction to NoSQL
MongoDB
 
Webinar: Working with Graph Data in MongoDB
MongoDB
 
Webinar: Transitioning from SQL to MongoDB
MongoDB
 
Creating a Modern Data Architecture for Digital Transformation
MongoDB
 
MongoDB and Hadoop: Driving Business Insights
MongoDB
 
MongoDB et Hadoop
MongoDB
 
Running MongoDB on AWS
MongoDB
 
Running MongoDB 3.0 on AWS
MongoDB
 
Big Data and NoSQL for Database and BI Pros
Andrew Brust
 
Introduction tomongodb
Lee Theobald
 

Similar to Maximizing MongoDB Performance on AWS (9)

PDF
Amazon Web Services - Relational Database Service Meetup
cyrilkhairallah
 
PPTX
Diveinto AWS
Abhishek Amralkar
 
PPTX
Hadoop AWS infrastructure cost evaluation
mattlieber
 
PPTX
ECS19 Anil Erduran and Ryan Pothecary - SQL Server On AWS RDS and Andamazone EC2
European Collaboration Summit
 
PPTX
Auto scaling websites in the cloud
David Veksler
 
PDF
Best Practices for Running Microsoft SQL Server on AWS
Gianluca Hotz
 
PPTX
Accelerate SQL Server Migration to the AWS Cloud
Datavail
 
PPTX
SQL Server in the AWS Cloud
DBInsight Pty Ltd
 
PPTX
Running Oracle EBS in the cloud (OAUG Collaborate 18 edition)
Andrejs Prokopjevs
 
Amazon Web Services - Relational Database Service Meetup
cyrilkhairallah
 
Diveinto AWS
Abhishek Amralkar
 
Hadoop AWS infrastructure cost evaluation
mattlieber
 
ECS19 Anil Erduran and Ryan Pothecary - SQL Server On AWS RDS and Andamazone EC2
European Collaboration Summit
 
Auto scaling websites in the cloud
David Veksler
 
Best Practices for Running Microsoft SQL Server on AWS
Gianluca Hotz
 
Accelerate SQL Server Migration to the AWS Cloud
Datavail
 
SQL Server in the AWS Cloud
DBInsight Pty Ltd
 
Running Oracle EBS in the cloud (OAUG Collaborate 18 edition)
Andrejs Prokopjevs
 
Ad

More from MongoDB (20)

PDF
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
PDF
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
PDF
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
PDF
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
PDF
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
PDF
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
PDF
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
PDF
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
PDF
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
PDF
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
PDF
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
PDF
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
PDF
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
PDF
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB
 
Ad

Recently uploaded (20)

PDF
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
PDF
SIZING YOUR AIR CONDITIONER---A PRACTICAL GUIDE.pdf
Muhammad Rizwan Akram
 
PDF
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
PDF
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
PDF
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
PPTX
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
PDF
Peak of Data & AI Encore AI-Enhanced Workflows for the Real World
Safe Software
 
PPTX
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
PPTX
Digital Circuits, important subject in CS
contactparinay1
 
PPTX
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
DOCX
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
PPTX
Mastering ODC + Okta Configuration - Chennai OSUG
HathiMaryA
 
PDF
ICONIQ State of AI Report 2025 - The Builder's Playbook
Razin Mustafiz
 
PDF
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
PDF
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
PPT
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
PDF
Staying Human in a Machine- Accelerated World
Catalin Jora
 
PDF
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
PDF
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
PPTX
From Sci-Fi to Reality: Exploring AI Evolution
Svetlana Meissner
 
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
SIZING YOUR AIR CONDITIONER---A PRACTICAL GUIDE.pdf
Muhammad Rizwan Akram
 
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
Peak of Data & AI Encore AI-Enhanced Workflows for the Real World
Safe Software
 
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
Digital Circuits, important subject in CS
contactparinay1
 
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
Mastering ODC + Okta Configuration - Chennai OSUG
HathiMaryA
 
ICONIQ State of AI Report 2025 - The Builder's Playbook
Razin Mustafiz
 
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
Staying Human in a Machine- Accelerated World
Catalin Jora
 
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
From Sci-Fi to Reality: Exploring AI Evolution
Svetlana Meissner
 

Maximizing MongoDB Performance on AWS

Editor's Notes

  • #7: The reason for separating your deployment storage across 3 volumes is that database journal files and log files are sequential in nature, and as such, have different access patterns compared to data files. Separating data files from journal and/or log files, particularly with a write intensive workload, will provide an increase in performance by reducing I/O contention. Depending on your workload, and if you are experiencing high I/O wait times, you may be able to benefit from separate disks for your data files, journal, and log files.
  • #8:  MongoDB can provide greater throughput for your deployment by using sharding to spread the load across instances, each holding a subset of your database. For example, if you distribute your data across three shards on independent instances, then your maximum throughput across all three will be 2400MiB/s.
  • #9: We recommend using a redundant array of independent disks (RAID) to improve performance and durability of a MongoDB deployment. There are many levels of RAID, and each has its own advantages and disadvantages. The two key concepts underlying RAID are mirroring (RAID 1) where the same data is written to several disks and striping (RAID 0) where several disks are broken into stripes or bins with the data being copied across these. By using the proper RAID design, data durability and/or increased I/O performance is possible -- sometimes with one being sacrificed for the other. Fortunately, there is a RAID option that doesn’t sacrifice data reliability or increased I/O performance. RAID 10 (sometime called RAID 1+0) combines the features of RAID 1 and RAID 0. RAID 1 provides data reliability by mirroring data on secondary drives, whereas RAID 0 helps to increase I/O performance by striping data across drives. “In most cases, RAID 10 provides better throughput and latency than all other RAID levels, except RAID 0 (which wins in throughput). Thus, RAID 10 is the preferable RAID level for I/O-intensive applications such as database, email, and web servers, as well as for any other use requiring high disk performance.” As noted earlier, you also need to ensure that your total throughput of the combined RAIDed volumes does not exceed the maximum instance throughput. If there are more PIOPS than this maximum limit provisioned for your combined volumes the additional IOPS will be wasted.
  • #11: If your instance type supports the Enhanced Networking feature, we strongly recommend that you enable it. There are unfortunately a few caveats as Amazon EC2 provides enhanced networking capabilities via single root I/O virtualization (SR-IOV) which is only available on C3, C4, D2, I2, M4, and R3 instances and only supported when you are using Amazon VPC (Virtual Private Cloud).
  • #12: Amazon Virtual Private Cloud allows you to provision a private, isolated section in AWS where you can define your own IP address, subnets, route tables, and gateways. Using VPC private subnets when deploying MongoDB servers is recommended. By using Network Address Translation (NAT), your private subnet can access the Internet, but no one on the Internet can access your MongoDB servers. AWS provides a Managed NAT (Network Address Translation) Gateway service and we recommend using this. It enables the mapping of your private IP addresses in your VPC private subnet to a public address with traffic leaving AWS, and it then maps any public IP addresses back to your VPC subnet private addresses for traffic entering AWS. It is also possible to configure a site-to-site VPN connection to access your MongoDB deployment.
  • #14: The working set is the portion of data and related indexes that your clients access most frequently. In cases where your data set is larger than memory, many random disk I/Os will happen which will affect performance as the necessary data is pulled from disk into memory.  Based on our experience helping to implement and support MongoDB deployments on AWS, we have found that the M4, I2, and R3 Amazon ECW instance types tend to be the most successful and widely used in customer deployments.
  • #15: For example, if you are using the WiredTiger storage engine and running two mongod processes on the same instance, you would need to calculate the appropriate cache size needed for each mongod process by evaluating the portion of total RAM each process should use and then split the default cache size between each. If you improperly size the WiredTiger cache and the cache does not have enough space to load additional data, pages will be evicted from the cache to free up space, resulting in unnecessary I/O and performance degradation.