MongoDB As A Service
Conclusion
Resources
Introduction
With several hundred thousand production deployments and customers in more than 50% of Fortune 100 companies, MongoDB is the industry’s fastest growing database. An increasing number of organizations are using MongoDB Enterprise Advanced to deliver a Database-as-a-Service (DBaaS), standardizing the way in which internal business units and project teams consume MongoDB, thereby improving:

• Business Agility. Making it simple to rapidly spin up new development environments that can be quickly migrated to production deployments when the project goes live
• Operational Efficiency. Re-using standard infrastructure, processes, tools, and best practices across multiple projects

Organizations such as a top investment bank, The Royal Bank of Scotland, and the US Department of Veterans Affairs use MongoDB as their Database-as-a-Service (DBaaS) platform. Building upon the success these and others have had, this whitepaper provides the top 10 considerations IT groups need to make in building their own MongoDB-as-a-Service, whether delivered from private clouds running in internal data centers or from any of the leading public cloud platforms.

This paper is focused on those organizations building their own private MongoDB-as-a-Service, but MongoDB also offers MongoDB Atlas as a ready-built service. Some of the considerations covered in the paper, such as identifying common workloads for capacity planning purposes and scaling with sharding, also apply to MongoDB Atlas.
[Table column headings: Current, Projected (12 months)]
six months, the IT group can capture current and anticipated database usage, architecture design, and operational policies. This will ensure the IT group designs a shared service delivery infrastructure that will meet both the short and medium term needs of its internal customers. The process will also identify candidates for an initial pilot of the service before it is made generally available to project teams across the organization.

Key stakeholders for consultation in this stage include the following for each project:

• Business owners
• Architects
• Developers
• DBAs
• Operations staff
• Network and storage engineers
• Corporate security and compliance representatives

The first stage is to document current and projected MongoDB usage for each project. Key statistics to capture are shown in Table 1.

Architecture Design

A profile of the existing or planned infrastructure will help size platform requirements and cluster configurations. Key data to capture is shown in Table 2.

Operational Policies

The final stage of the discovery process is to capture requirements that dictate how the application is run in production, including:

• Performance and availability SLAs (Service Level Agreements)
• Provisioning, upgrade, and change control processes
• Data archive, backup, and restore policies
• Database management and monitoring
• Security requirements (e.g., access control, encryption, and auditing)

While not exhaustive, the checklists above will help to profile MongoDB usage and inform a design that meets the immediate needs of internal customers. It is also important to remember that with its loosely-coupled, flexible architecture the IT group is not locked in to a rigid MongoDB design. It can be rapidly adapted and re-provisioned to meet new application requirements as they evolve in the future.
[Table column headings: Current, Projected (12 months). Row label: Number of Shards]
Storage

MongoDB does not require shared storage (e.g. Storage Area Networks), and is instead optimized for locally attached storage. Data access patterns in MongoDB do not have sequential properties, and as a result applications may experience substantial performance gains by using SSDs, especially where workloads require random updates to very large working sets. While data files benefit from SSDs, MongoDB’s journal files do exhibit high sequential write profiles and are therefore good candidates for fast local hard disk drives.

When planning storage provision, it is important to consider storage engine options. The WiredTiger and Encrypted storage engines provide several compression options, making them up to 80% more storage-efficient than the MMAPv1 storage engine.

Most MongoDB deployments should use RAID-10 storage. RAID-5 and RAID-6 do not provide sufficient performance. RAID-0 provides good write performance, but limited read performance and insufficient fault tolerance.

• Avoid over-subscription by isolating the MongoDB workload from others that share the same physical SAN and networking infrastructure.
• Without proper redundancy, SANs can present a single point of failure. If all members of a MongoDB replica set are co-located on the same SAN, ensure mechanisms exist for fast SAN recovery.

Operating System

MongoDB Enterprise Advanced is certified for multiple operating systems:

• Linux distributions: Red Hat Enterprise Linux, CentOS, Ubuntu, SuSE, and Amazon Linux
• Windows 7/Windows Server 2008 R2 or later
• macOS

In choosing an operating system, enterprise mandates must be considered first. Where the enterprise supports multiple options, Linux is preferred.
Step 3: Virtualization Strategy

While not a prerequisite, building an infrastructure to deliver MongoDB-as-a-Service enables the IT group to utilize virtualization technologies. In efforts to drive up system utilization and enhance operational efficiency by eliminating “one application per server”, most enterprises have already standardized on a certified set of virtualization technologies. MongoDB Enterprise Advanced is supported on all mainstream virtualized public and private cloud infrastructure, including:

• Hypervisor virtualization such as Xen, KVM, VMware vCloud Suite and vSphere platform, Oracle VirtualBox, OpenVZ and Microsoft Hyper-V
• Container virtualization, such as Linux Containers (LXC) and Docker
• Private and public cloud platforms based on the virtualization technologies described above, including OpenStack, Cloud Foundry and OpenShift, and public cloud offerings such as AWS, Google Compute Engine, Rackspace, and Microsoft Azure
• Non-virtualized public cloud offerings such as IBM’s SoftLayer

With multiple VM (Virtual Machine) images or containers running MongoDB on a single physical host, consideration should be given to ensuring adequate resources are allocated to each instance. Avoid over-provisioning resources such as RAM. Most importantly, ensure that multiple members of a replica set are not sharing the same underlying hardware, as this will create a single point of failure.

Key Takeaways

MongoDB supports all mainstream virtualization platforms. As we will see below, the choice of virtualization technology can impact the strategy for database multi-tenancy within a single physical MongoDB cluster.

Step 4: Enabling Multi-Tenant Services

There are multiple approaches to building a multi-tenant MongoDB service on top of the virtualization technologies discussed in Step 3. The appropriate choice will depend on specific requirements for security, workload isolation, and performance. The following section focuses on the two latter criteria, while security is discussed in Step 5.

Hypervisor-Based Virtual Machines

Each physical server is partitioned into multiple VMs running a full operating system image and MongoDB (mongod) process. System resources such as CPU, RAM, and disk IO can be dedicated to each VM, preventing one VM from impacting the performance of others.

While this approach does not allow the density of instances seen with lighter-weight container-based virtualization, it does enable stronger isolation between each instance. It is also a well tested, mature approach used by technologies such as VMware vSphere and services such as AWS EC2.

A key consideration in deploying enterprise hypervisor technologies is to avoid over-provisioning at any level of CPU cores, RAM, network, or storage. These technologies assume that most hypervisor client systems will rarely use their allotted resources. That assumption is invalid for an operational database such as MongoDB. In particular, memory ballooning should be avoided or disabled, as it will conflict directly with MongoDB’s approach to using RAM.

Containers

Using Linux’s LXC containers and cgroups, a single physical host and Linux kernel can be partitioned into multiple isolated user-level containers, each running a single MongoDB process, assigned with unique user credentials for access control. As with VMs, system resources can be dedicated to each container to prevent oversubscription by competing workloads.
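The two placement rules from Step 3 (do not over-provision RAM on a host, and never put two members of the same replica set on the same hardware) lend themselves to an automated check before instances are provisioned. The following is a minimal sketch, assuming a placement plan is available as a simple list; the `Instance` type, the function name, and the host labels are hypothetical and not part of any MongoDB tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str         # hypothetical label for a VM or container running mongod
    replica_set: str  # replica set the instance belongs to
    host: str         # physical host the VM or container runs on
    ram_gb: float     # RAM dedicated to the instance


def placement_problems(instances, host_ram_gb):
    """Return a list of problems found in a placement plan (empty if none)."""
    problems = []
    ram_by_host = defaultdict(float)
    members = defaultdict(list)
    for inst in instances:
        ram_by_host[inst.host] += inst.ram_gb
        members[(inst.host, inst.replica_set)].append(inst.name)

    # Rule 1: the RAM dedicated to instances must fit in the physical host.
    for host, used in sorted(ram_by_host.items()):
        capacity = host_ram_gb[host]
        if used > capacity:
            problems.append(
                f"{host}: {used:g} GB RAM allocated, only {capacity:g} GB available"
            )

    # Rule 2: a replica set must not have two members on the same host.
    for (host, rs), names in sorted(members.items()):
        if len(names) > 1:
            problems.append(
                f"{host}: replica set {rs} has {len(names)} members on one host"
            )
    return problems
```

A provisioning workflow could run a check like this against each proposed plan and refuse to proceed while the returned list is non-empty.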
Figure 1: MongoDB Multi-Tenancy with Virtualization – Containers vs. VMs

• Pack more instances per physical host as there is less system overhead. Containers use one operating system image shared between all containers, rather than each VM carrying its own operating system
• Faster to instantiate an LXC or Docker container than it is to boot a guest operating system in a VM

It is common to run containers within VMs (e.g., when using Docker on Amazon EC2), providing a double level of virtualization.

Process Separation

An alternative approach is to run a MongoDB process for each tenant in a single operating system image. This allows for a high density of tenants, but with limited isolation there can be contention for system resources between processes. Linux cgroups can be used to constrain the use of RAM, CPU cores, and disk and network IO by each mongod process.
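To make the cgroup constraint concrete, the sketch below divides a host's RAM and CPU cores evenly between tenant mongod processes and expresses each share as values for the cgroup v2 control files `memory.max` (bytes) and `cpu.max` (quota and period in microseconds). It only computes the values; writing them into a cgroup hierarchy is left out. The function names, the even split, and the 4 GB operating system reserve are assumptions made for illustration, and IO limits are omitted because they depend on device numbers.

```python
def cgroup_v2_limits(ram_gb, cpu_cores, period_us=100_000):
    """Translate a per-tenant budget into cgroup v2 control-file values."""
    return {
        # memory.max takes a byte count
        "memory.max": str(int(ram_gb * 1024**3)),
        # cpu.max takes "<quota> <period>" in microseconds
        "cpu.max": f"{int(cpu_cores * period_us)} {period_us}",
    }


def split_host(host_ram_gb, host_cores, tenants, os_reserve_gb=4):
    """Share a host evenly between tenants, keeping RAM back for the OS."""
    usable_gb = host_ram_gb - os_reserve_gb
    if usable_gb <= 0 or not tenants:
        raise ValueError("nothing to allocate")
    per_ram = usable_gb / len(tenants)
    per_cpu = host_cores / len(tenants)
    return {tenant: cgroup_v2_limits(per_ram, per_cpu) for tenant in tenants}
```

Because the shares always sum to less than the physical capacity, this style of allocation cannot oversubscribe the host, at the cost of leaving capacity idle when some tenants are quiet.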
Figure 2: MongoDB Multi-Tenancy with Logical vs. Process Separation

MongoDB 3.6 enables operations teams to more easily inspect, monitor, and control each user session running in the database (across all logical databases). They can view, group, and search user sessions across every node in the cluster, and respond to performance issues in real time. For example, if a user or developer error is causing runaway queries, administrators now have the fine-grained operational oversight to view and terminate that session by removing all associated session state across a sharded cluster in a single operation.

There is no one-size-fits-all approach; differing technologies can be combined to manage applications at different stages of their lifecycle and to accommodate specific SLAs and usage patterns. MongoDB is sufficiently flexible to support all of the approaches discussed above.
Step 5: Enforcing Security Isolation between Multiple Tenants

MongoDB Enterprise Advanced features extensive capabilities to enforce security isolation between tenants. Security is a dimension of service design that should be defined early, though it may be implemented progressively as the enterprise services mature. Details vary by organization and must go hand-in-hand with multi-tenant access to the cluster.

• Within a multi-tenant environment, landlord developers and administrators in the IT team can be assigned permissions across multiple physical clusters and databases, while tenant developers and administrators in individual project teams can be granted a more limited set of actions across the logical databases or individual collections used by their application. This functionality enables a clear separation of duties and control.

For simplicity in account provisioning and maintenance, predefined roles can be delegated across entire teams, ensuring the enforcement of consistent policies across specific functions within the organization.
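As an illustration of the landlord and tenant split, the sketch below builds two role definitions in the shape accepted by MongoDB's `createRole` database command. The role names (`landlord_admin`, `<tenant>_app`), the choice of actions, and the selection of built-in roles for the landlord are assumptions chosen for the example, not a recommended policy. The functions only construct the command documents; sending them to a server through a driver is not shown.

```python
def tenant_role(tenant, db, collections):
    """Role limited to CRUD actions on the tenant's own collections."""
    return {
        "createRole": f"{tenant}_app",  # hypothetical naming convention
        "privileges": [
            {
                "resource": {"db": db, "collection": coll},
                "actions": ["find", "insert", "update", "remove"],
            }
            for coll in collections
        ],
        "roles": [],  # inherits nothing
    }


def landlord_role():
    """Role for IT administrators, built from MongoDB built-in roles."""
    return {
        "createRole": "landlord_admin",  # hypothetical role name
        "privileges": [],
        "roles": [
            {"role": "clusterAdmin", "db": "admin"},
            {"role": "userAdminAnyDatabase", "db": "admin"},
            {"role": "dbAdminAnyDatabase", "db": "admin"},
        ],
    }
```

Generating role documents from a template in this way keeps the privileges consistent across project teams, which is the point of delegating predefined roles.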
Encryption
MongoDB data can be encrypted on the network and on
disk. Support for SSL allows clients to connect to
MongoDB over an encrypted channel. MongoDB supports
FIPS 140-2 encryption when run in FIPS Mode with a
FIPS validated Cryptographic module.
Key Takeaways
Definition of security policies should start at the outset of
the project, based on corporate compliance and privacy
directives.
Figure 4: Active/Active Data Centers - Tolerates Failures of Servers, Racks & Data Center, plus Network Partitions

elected to primary and the client connections failover to that new primary.

The number of replicas in a MongoDB replica set is configurable, with a larger number of replica members providing increased data durability and protection against database downtime (e.g. in case of multiple machine failures, rack failures, data center failures, or network partitions). In MongoDB 3.0 and higher, replica sets can contain up to 50 members. Replica set members can be deployed in a single data center or across multiple data centers in active-standby or active-active modes, providing geographic resilience in the event of regional disasters. In addition, MongoDB provides advanced options to control data center awareness.

Read the MongoDB and Multi-Data Center Deployments whitepaper to learn more about replication and geographic awareness.

Deploying Replica Sets in a Shared MongoDB Service

Depending on the SLAs, multiple applications can be hosted on a single replica set, with workload isolation enforced by the appropriate multi-tenancy strategy discussed in Step 4.

The IT team then has the flexibility to separate the most performance or availability-sensitive applications to their own dedicated replica sets within the resource pool, while still maintaining centralized control and management of the service.

As a best practice, replica set members should at the very least run on separate physical servers, preferably in separate racks and, for highest resilience, across regionally-separated data centers.

The number of replica set members should also be carefully considered, ideally using a quantitative model of empirically-based probabilities of the various failure levels of different infrastructure components (i.e. VM, physical server, rack, data center, and region). At a minimum, three members should be deployed in each replica set, though in less critical applications it is possible to use two replica set members and an arbiter (note that in this model, the replica set would be unable to serve writes if configured with a majority write concern in the event of a failure of either of the replica set members).

Database Scaling with MongoDB Automatic Sharding

While performance-intensive applications can be moved to their own dedicated replica sets, as the workload continues to grow users should consider scaling out (sharding) MongoDB if any of the following conditions are anticipated:

• RAM Limitation. The size of the system’s active working set plus indexes is expected to exceed the capacity of the maximum amount of RAM in the system.
• Disk I/O Limitation. The system will have a large amount of write activity, and the operating system will not be able to write data fast enough to meet demand, or I/O bandwidth will limit how fast the writes can be flushed to disk.
• Storage Limitation. The data set will grow to exceed the storage capacity of a single node in the system.

Applications that meet these criteria, or that are likely to do so in the future, should be designed for scaling out in advance rather than waiting until they run out of capacity.

MongoDB provides horizontal scale out using a technique called sharding, allowing MongoDB deployments to scale beyond the hardware limitations of a single server. Sharding distributes data across multiple physical partitions called shards, and is transparent to applications. Shards can be located within a single data center or distributed across multiple data centers. As illustrated in Figure 5, each shard is deployed in a replica set, to provide both scalability and high availability to the MongoDB service.

Figure 5: Sharding and replica sets – automatic sharding provides horizontal scalability; replica sets prevent downtime

MongoDB automatically balances the data in the cluster as the data grows or the size of the cluster increases or decreases. For more on sharding see the Sharding Introduction.

Deploying Shards in a Shared MongoDB Service

While sharding is automatic and transparent to the application, careful consideration needs to be given to selecting a shard key as this controls how the database is partitioned and distributed across the hardware cluster. Shard key selection can have a significant impact on the performance of the database. The choice of shard key is application-dependent, based on the database schema and the way in which the application queries and writes data.

Unless MongoDB is servicing a single application accessed by multiple tenants (i.e. Software-as-a-Service, or SaaS) it is not appropriate to provision all applications to a single sharded cluster. Instead, each application requiring the additional scaling that sharding brings should be deployed to its own sharded cluster within the shared MongoDB resource pool. This approach ensures that each application is scaled according to its workload patterns.
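The three scale-out conditions described in this section (RAM, disk I/O, and storage limitation) can be checked against the projected figures gathered during capacity planning. The sketch below is one way to do that, assuming projections and single-node capacities are available as plain numbers; the type names, field names, and units are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Projection:
    working_set_gb: float   # active working set plus indexes
    peak_write_mb_s: float  # projected peak write throughput
    data_size_gb: float     # projected total data set size


@dataclass
class NodeCapacity:
    ram_gb: float      # maximum RAM of a single node
    write_mb_s: float  # sustained write throughput of a single node
    storage_gb: float  # storage capacity of a single node


def sharding_triggers(projection, node):
    """Return which single-node limits the projected workload would exceed."""
    reasons = []
    if projection.working_set_gb > node.ram_gb:
        reasons.append("RAM")
    if projection.peak_write_mb_s > node.write_mb_s:
        reasons.append("disk I/O")
    if projection.data_size_gb > node.storage_gb:
        reasons.append("storage")
    return reasons
```

An empty result suggests the application can stay on a replica set for the projected period; any entry is a signal to design for sharding in advance.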
Review the documentation to learn more about shard key selection.

Key Takeaways

Failure to meet SLAs will not only result in the MongoDB service failing to gain traction within the organization, it can also result in damage to the corporate brand, lost customers, and even regulatory penalties.

• All production applications should use MongoDB’s replica sets to avoid downtime that can result from system failures.
• Busier or more critical apps can be provisioned to their own dedicated replica sets to achieve higher performance.
• When an application needs to scale beyond the capacity of a single replica set primary, the database can be re-provisioned onto a sharded cluster.

Even though you may have some application databases co-located on the same physical hardware and others distributed to dedicated replica sets and sharded clusters, you can still manage the overall MongoDB resource pool as a single, shared service. This is discussed in the following section.

Step 7: Managing the Service: Provisioning, Monitoring, and Disaster Recovery

Ops Manager is the simplest way to run MongoDB, making it easy for operations teams to deploy, monitor, back up, and scale MongoDB. Ops Manager was created by the engineers who develop the database and is available as part of MongoDB Enterprise Advanced. Many of the capabilities of Ops Manager are also available with MongoDB Cloud Manager, hosted in the cloud. Today, Cloud Manager supports thousands of deployments, including systems from one to hundreds of servers.

Ops Manager and Cloud Manager incorporate best practices to help keep managed databases healthy and optimized. They ensure operational continuity by converting complex manual tasks into reliable, automated procedures with the click of a button or via an API call:

• Deploy. Any topology, at any scale
• Upgrade. In minutes, with no downtime
• Scale. Add capacity, without taking the application offline
• Scheduled Backups. Customize to meet recovery goals
• Point-in-time Recovery. Restore to any point in time, because disasters aren't scheduled
• Performance Alerts. Monitor 100+ system metrics and get custom alerts before the system degrades

Ops Manager roles can be assigned to IT group administrators across the entire shared environment, and delegated to individual project teams to provide access to just the resources they have provisioned. From MongoDB 3.6, multiple Projects (each managing multiple MongoDB clusters) can be placed under a single organization, allowing operations teams to centrally view and administer all Projects under the organization hierarchy.

Deployments and Upgrades

It must be simple for project teams to request allocation of resources from the MongoDB resource pool, and for those resources to then be provisioned and managed. Ops Manager reliably orchestrates the tasks that administrators have traditionally performed manually – provisioning a new cluster, upgrades, restoring systems to a point in time, and many other operational tasks.

Ops Manager provides the ability to create pre-provisioned server pools. The Ops Manager agent can be installed across a fleet of servers (physical hardware, VMs, AWS instances, etc.) by a configuration management tool such as Chef, Puppet, or Ansible. The server pool can then be exposed to internal teams, ready for provisioning servers into their local groups, either by the programmatic Ops Manager API or the Ops Manager GUI. When users request an instance, Ops Manager will remove the server from the pool, and then provision and configure it into the local group. It can return the server to the pool when it is no longer required, all without sysadmin intervention. Administrators can track when servers are provisioned
from the pool, and receive alerts when available server resources are running low. Pre-provisioned server pools allow administrators to create true, on-demand database resources for private cloud environments.

Building upon server pools, Ops Manager offers certified integration with Cloud Foundry. BOSH, the Cloud Foundry configuration management tool, can install the Ops Manager agent onto the server configuration requested by the user, and then use the Ops Manager API to build the desired MongoDB configuration. Once the deployment has reached goal state, Cloud Foundry will notify the user of the URL of their MongoDB deployment. From this point, users can log in to Ops Manager to monitor, back up, and automate upgrades of their deployment.

Figure 6: Ops Manager self-service portal: simple, intuitive, and powerful. Deploy and upgrade clusters with a single click.

configuration management tools, or manually by an administrator.

• The administrator creates a new design goal for the system, either as a modification to an existing deployment (e.g., upgrade, oplog resize, new shard), or as a new system.
• The agents periodically check in with the Ops Manager central server and receive the new design instructions.
• Agents create and follow a plan for implementing the design. Using a sophisticated rules engine, agents continuously adjust their individual plans as conditions change. In the face of many failure scenarios – such as server failures and network partitions – agents will revise their plans to reach a safe state.
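The goal-state workflow in the list above can be illustrated with a toy reconciliation loop: compare the current state with the goal, derive a plan, apply one step per check-in, and recompute the plan each time so that it adapts if conditions change. This is purely an illustration of the general pattern, not a description of how the Ops Manager agents or their rules engine are implemented; the state model (host name mapped to MongoDB version) and every name in it are hypothetical.

```python
def plan_steps(current, goal):
    """Return ordered steps that move `current` toward `goal`.

    Both arguments map a host name to the MongoDB version it should run.
    """
    steps = []
    for host in sorted(goal):
        wanted = goal[host]
        running = current.get(host)
        if running is None:
            steps.append(("start", host, wanted))
        elif running != wanted:
            steps.append(("upgrade", host, wanted))
    for host in sorted(set(current) - set(goal)):
        steps.append(("stop", host, current[host]))
    return steps


def check_in(current, goal):
    """One check-in: recompute the plan and apply only its first step."""
    steps = plan_steps(current, goal)
    if not steps:
        return current  # goal state reached
    action, host, version = steps[0]
    new_state = dict(current)
    if action == "stop":
        del new_state[host]
    else:
        new_state[host] = version
    return new_state
```

Because the plan is rebuilt from the observed state at every check-in, a failure between steps simply produces a different plan on the next pass rather than leaving a half-applied script.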
In addition to initial deployment, Ops Manager and Cloud Manager make it possible to dynamically resize capacity by adding shards and replica set members. Other maintenance tasks such as upgrading MongoDB or resizing the oplog can be reduced from dozens or hundreds of manual steps to the click of a button, all with zero downtime.

Ops Manager and Cloud Manager allow administrators to set custom alerts when key metrics are out of range. Alerts can be configured for a range of parameters affecting individual hosts, replica sets, agents and backup. Alerts can be sent via SMS and email or integrated into existing incident management systems such as PagerDuty and HipChat to proactively warn of potential issues, before they escalate to costly outages.
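The idea of alerting when a metric is out of range can be sketched in a few lines. The example below is a generic threshold check, not the Ops Manager or Cloud Manager alerting API; the metric names and threshold values are hypothetical, and delivery to SMS, email, or an incident management system is left out.

```python
def out_of_range(metrics, thresholds):
    """Return alert messages for metrics outside their allowed range.

    `metrics` maps a metric name to its latest value.
    `thresholds` maps a metric name to an inclusive (low, high) range.
    """
    alerts = []
    for name, (low, high) in sorted(thresholds.items()):
        value = metrics.get(name)
        if value is None:
            continue  # metric not reported in this sample
        if value < low or value > high:
            alerts.append(f"{name}={value} outside [{low}, {high}]")
    return alerts
```

For example, `out_of_range({"replication_lag_s": 45}, {"replication_lag_s": (0, 30)})` reports the lag metric, while a sample that stays within every range produces an empty list.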
MongoDB engineers monitor user backups on a 24x365 basis, alerting operations teams if problems arise.

Chargeback

How cost accounting and chargeback is managed is largely dependent on specific organizational policies. There are, however, best practices to observe:

• If those project teams consuming the service do not bear proportionate costs, there is a risk of overuse and depletion of available resources. Provisioned capacity can be left idle by teams who have no motivation to return it to the service’s resource pool.
• Conversely, if the resources are overpriced, the consumers will make little if any use of them, instead favoring less expensive options, including local business unit resources or public cloud providers.

Accounting processes will typically begin with the underlying infrastructure layer (i.e., servers and storage) whose resources are consumed first. As services are built on the underlying infrastructure, the chosen virtualization technologies must supplement this with appropriate charges for software, support, and administration costs.

Key Takeaways

Cost accounting and chargeback policies are specific to each organization. Many public and private cloud infrastructures provide mechanisms for tracking and billing the use of underlying infrastructure resources.

Step 9: Define the Implementation Plan

With the variety of enterprise requirements for delivering MongoDB as a Service, there is no single “out of the box” template for an implementation plan. Using the considerations presented in this whitepaper, MongoDB consultants can apply best practices to collaborate with the IT group in defining a plan that accelerates implementation, while at the same time reducing risk.
Personnel Requirements

The IT group implementing the MongoDB service should seek participation from representatives drawn from all internal stakeholders. The primary service implementation work may be performed by operations-capable developers from within the organization’s own staff, or by a trusted Systems Integrator (SI). However, active participation and review throughout the development process should be provided by:

• MongoDB-as-a-Service project management
• Operations staff who will assume responsibility of the service
• Network and storage administrators
• Application developers who are the internal customers for the first phase of the service
• Corporate security and compliance representatives

Augmenting the Team: MongoDB Consulting Services

MongoDB Consulting Engineers should also be used as extensions to the project team, bringing expertise and best practices from other MongoDB-as-a-Service engagements. A range of fixed-term engagements are available to support you through design, testing, launch, and ongoing management of the service:

• The MongoDB Private Cloud Accelerator consulting package provides support from the experts to get your MongoDB private cloud up and running.
• The MongoDB Health Check provides an assessment of the service’s architecture design readiness and operational policies.
• The Operations Rapid Start package gives your operations and devops teams the skills and tools to run and manage MongoDB with confidence.
• Once launched, a MongoDB Dedicated Consulting Engineer provides ongoing advisory services to the IT team from a named, experienced engineer.

These consulting packages complement a range of services that can be provided for individual project teams during the development phase of their applications, including MongoDB schema design, sharding, and performance tuning.

Learn more about the full range of MongoDB consulting services.

Key Takeaways

Create a service implementation team with 360-degree involvement of MongoDB and enterprise stakeholders.

Step 10: Production-Grade DBaaS - Supported, Secure, and Automated

We are the MongoDB experts. Over 4,300 organizations rely on our commercial products, including startups and more than half of the Fortune 100. We offer software and services to make your life easier:

MongoDB Enterprise Advanced is the best way to run MongoDB in your data center. It's a finely-tuned package of advanced software, support, certifications, and other services designed for the way you do business.

MongoDB Atlas is a database as a service for MongoDB, letting you focus on apps instead of ops. With MongoDB Atlas, you only pay for what you use with a convenient hourly billing model. With the click of a button, you can scale up and down when you need to, with no downtime, full security, and high performance.

MongoDB Stitch is a backend as a service (BaaS), giving developers full access to MongoDB, declarative read/write controls, and integration with their choice of services.

MongoDB Cloud Manager is a cloud-based tool that helps you manage MongoDB on your own infrastructure. With automated provisioning, fine-grained monitoring, and continuous backups, you get a full management suite that reduces operational overhead, while maintaining full control over your databases.

MongoDB Professional helps you manage your deployment and keep it running smoothly. It includes
support from MongoDB engineers, as well as access to MongoDB Cloud Manager.

Development Support helps you get up and running quickly. It gives you a complete package of software and services for the early stages of your project.

MongoDB Consulting packages get you to production faster, help you tune performance in production, help you scale, and free you up to focus on your next release.

MongoDB Training helps you become a MongoDB expert, from design to operating mission-critical systems at scale. Whether you're a developer, DBA, or architect, we can make you better at MongoDB.

• One-click scale up, out, or down on demand. MongoDB Atlas can provision additional storage capacity as needed without manual intervention.
• Automated patching and single-click upgrades for new major versions of the database, enabling you to take advantage of the latest and greatest MongoDB features
• Live migration to move your self-managed MongoDB clusters into the Atlas service with minimal downtime

MongoDB Atlas can be used for everything from a quick Proof of Concept, to test/QA environments, to powering production applications. The user experience across MongoDB Atlas, Cloud Manager, and Ops Manager is consistent, ensuring that disruption is minimal if you decide to manage MongoDB yourself and migrate to your own
Take advantage of the free tier to get started; when you
need more bandwidth, the usage-based pricing model
ensures you only pay for what you consume. Learn more
and try it out for yourself.
Conclusion
Resources