Enabling Microservices
Containers & Orchestration Explained
June 2017
Table of Contents
Introduction
Orchestration
Docker Machine
Docker Swarm
Docker Compose
Kubernetes
Mesos
Choosing an Orchestration Framework
Security Considerations
We Can Help
Resources
Introduction
Want to try out MongoDB on your laptop? Execute a single command and you have a lightweight, self-contained sandbox; another command removes all traces when you're done.

Need an identical copy of your application stack in multiple environments? Build your own container image and let your development, test, operations, and support teams launch an identical clone of your environment.

Containers are revolutionizing the entire software lifecycle: from the earliest technical experiments and proofs of concept through development, test, deployment, and support.

Orchestration tools manage how multiple containers are created, upgraded, and made highly available. Orchestration also controls how containers are connected to build sophisticated applications from multiple, microservice containers.

The rich functionality, simple tools, and powerful APIs make containers and orchestration a favorite for DevOps teams, who integrate them into Continuous Integration (CI) and Continuous Delivery (CD) workflows.

This white paper introduces the concepts behind containers and orchestration, then explains the available technologies and how to use them with MongoDB.

What are Containers?

To illustrate the concepts associated with software containers, it is helpful to consider a similar example from the physical world – shipping containers.

Shipping containers are efficiently moved using different modes of transport – perhaps initially being carried by a truck to a port, then neatly stacked alongside thousands of other shipping containers on a huge container ship that carries them to the other side of the world. At no point in the journey do the contents of that container need to be repacked or modified in any way.

Shipping containers are ubiquitous, standardized, and available anywhere in the world, and they're extremely simple to use – just open them up, load in your cargo, and lock the doors shut.
The contents of each container are kept isolated from those of the others; the container full of Mentos can safely sit next to the container full of soda without any risk of a reaction. Once a spot on the container ship has been booked, you can be confident that there's room for all of your packed cargo for the whole trip – there's no way for a neighboring container to steal more than its share of space.

Software containers fulfill a similar role for your application. Packing the container involves defining what needs to be there for your application to work – operating system, libraries, configuration files, application binaries, and other parts of your technology stack. Once the container has been defined, that image is used to create containers that run in any environment, from the developer's laptop to your test/QA rig, to the production data center, on-premises or in the cloud, without any changes. This consistency can be very useful: for example, a support engineer can spin up a container to replicate an issue and be confident that it exactly matches what's running in the field.
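To show how this "packing list" is expressed in practice, here is a minimal, hypothetical Dockerfile; the base image, library, file names, and port are placeholders rather than anything prescribed by this paper:

# Base operating system layer
FROM ubuntu:16.04

# Libraries the application depends on
RUN apt-get update && apt-get install -y libssl-dev

# The application binary and its configuration file
COPY myapp /usr/local/bin/myapp
COPY myapp.conf /etc/myapp.conf

# The agreed port, and how the container starts
EXPOSE 8080
CMD ["/usr/local/bin/myapp", "--config", "/etc/myapp.conf"]

Building this file with docker build produces an image that runs unchanged on a laptop, a QA rig, or a production host.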
Containers are very efficient and many of them can run on the same machine, allowing full use of all available resources. Linux containers and cgroups are used to make sure that there's no cross-contamination between containers: data files, libraries, ports, namespaces, and memory contents are all kept isolated. They also enforce upper boundaries on how much system resource (memory, storage, CPU, network bandwidth, and disk I/O) a container can consume so that a critical application isn't squeezed out by noisy neighbors.
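With Docker, for example, these cgroup-backed limits are exposed as flags on docker run; the image name and the values below are arbitrary examples, not recommendations:

# Cap the container at 256 MB of RAM and half a CPU core
docker run -d --memory 256m --cpus 0.5 myapp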
Metaphors tend to fall apart at some point, and that's true with this one as well. There are exceptions, but shipping containers typically don't interact with each other – each has its job to fulfill (keep its contents together and safe during shipping) and it doesn't need help from any of its peers to achieve that. In contrast, it can be very powerful to have software containers interact with each other through well defined interfaces – e.g., one container provides a database service that an application running in another container can access through an agreed port. The modular container model is a great way to implement microservice architectures.

Containers Compared to Virtual Machines (VMs)

There are a number of similarities between virtual machines (VMs) and containers – in particular, they both allow you to create an image and spin up one or more instances, then safely work in isolation within each one. Containers, however, have a number of advantages which make them better suited to building and deploying applications.

Each instance of a VM must contain an entire operating system, all required libraries, and of course the actual application binaries. All of that software consumes several gigabytes of storage and memory. In contrast, each container holds its application and any dependencies, but the same Linux kernel and libraries can be shared between multiple containers running on the host. The fact that each container imposes minimal overhead on storage, RAM, and CPU means that many can run on the same host, and each takes just a couple of seconds to launch.

Running many containers allows each one to focus on a specific task; multiple containers then work in concert to implement sophisticated applications. In such microservice architectures, each container can use different versions of programming languages and libraries that can be upgraded independently.

Due to the isolation of capabilities within containers, the effort and risk associated with updating any given container is far lower than with a more monolithic architecture. This lends itself to Continuous Delivery – an approach that involves fast software development iterations and frequent, safe updates to the deployed application.

The tools and APIs provided with container technologies such as Docker are very powerful and more developer-focused than those available with VMs. These APIs allow the management of containers to be integrated into automated systems – such as Chef and Puppet – used by DevOps teams to cover the entire software development lifecycle. This has led to wide scale adoption by DevOps-oriented groups.

Virtual machines still have an essential role to play, as you'll very often be running your containers within VMs –
including when using the cloud services provided by Amazon, Google, or Microsoft.

How Containers Benefit Your Business

… providing the same capability – continue to provide service. With the addition of some automation (see the orchestration section of this paper), failed containers can be automatically recreated (rescheduled) either on the same or a different host, restoring full capacity and redundancy.
… be linked so that they communicate without opening up these resources to other systems.

Orchestration

Clearly, the process of deploying multiple containers to implement an application can be optimized through automation. This becomes more and more valuable as the number of containers and hosts grows. This type of automation is referred to as orchestration. Orchestration can include a number of features, including:

• Provisioning hosts

• Controlled exposure of network ports to systems outside of the cluster

Docker Swarm

Docker Swarm produces a single, virtual Docker host by clustering multiple Docker hosts together. It presents the same Docker API, allowing it to integrate with any tool that works with a single Docker host.

A common practice is for Docker Swarm to employ Docker Machine to create the hosts making up the swarm – especially early on in the development process.

Docker Swarm can grow with your needs as it allows for pluggable scheduler backends. You can start off with the default scheduler but swap in Mesos (see below) for large production deployments.
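As a sketch of that workflow – using the classic Swarm image and Docker Machine's --swarm flags, with a placeholder <TOKEN> and arbitrary host names – a small swarm might be created like this:

# Generate a discovery token for the new swarm
docker run --rm swarm create

# Create a Swarm manager and a worker, reusing the token returned above
docker-machine create -d virtualbox --swarm --swarm-master \
    --swarm-discovery token://<TOKEN> swarm-manager
docker-machine create -d virtualbox --swarm \
    --swarm-discovery token://<TOKEN> swarm-node-01

# Point the Docker client at the swarm and use the normal Docker API
eval $(docker-machine env --swarm swarm-manager)
docker info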
Kubernetes

Kubernetes is designed to work in multiple environments, including bare metal, on-premises VMs, and public clouds. Google Container Engine provides a tightly integrated platform which includes hosting of the Kubernetes and Docker software, as well as provisioning the host VMs and orchestrating the containers.

The key components making up Kubernetes are:

• A Cluster is a collection of one or more bare-metal servers or virtual machines (referred to as nodes) providing the resources used by Kubernetes to run one or more applications.

• Pods are groups of containers and volumes co-located on the same host. Containers in the same pod share the same network namespace and IP address (unique only within the cluster) and can communicate with each other using localhost. Pods are considered to be ephemeral, rather than durable entities, and are the basic scheduling unit. The structure, contents, and interfaces for a pod are defined using either a JSON or YAML configuration file.

• Volumes map ephemeral directories within a container to persistent storage which survives container restarts and rescheduling. Volumes also allow data to be shared amongst containers within a pod.

• Services act as basic load balancers and ambassadors for other containers, exposing them to the outside world. e.g., a service can provide a static, external IP address and port which it maps to another (internal to the cluster) port on multiple containers.

• Labels are tags assigned to entities such as containers that allow them to be managed or referenced as a group. One resource can reference one or more other resources by including their label(s) in its selector. e.g., containers in a cluster might have labels for environment, role, and location; a service could be set up to act as an interface to a subset of those containers by setting its selector to environment=production, role=web-server, location=new-york, as shown in the sketch after this list.

• A Replication Controller handles the scheduling of pods across the cluster. When configuring a Replication Controller, you specify the required number of pods, and of which type, which should exist at any point in time. If a pod fails then the Replication Controller creates a replacement; if a request is made to increase the size of the cluster then the Replication Controller starts the additional pods.
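To make the label/selector relationship concrete, here is a minimal, hypothetical Service definition using the labels from the example above (the service name and port are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: web-server-svc
spec:
  ports:
  - port: 80
  # Traffic is routed only to pods carrying all three of these labels
  selector:
    environment: production
    role: web-server
    location: new-york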
Mesos

Apache Mesos is designed to scale to tens of thousands of physical machines. Mesos is in production with a number of large enterprises such as Twitter, Airbnb, and Apple. An application running on top of Mesos is made up of one or more containers and is referred to as a framework. Mesos offers resources to each framework, and each framework must then decide which to accept. Mesos is less feature-rich than Kubernetes and may involve extra integration work – defining services or batch jobs for Mesos is programmatic while it is declarative for Kubernetes.

There is currently a project to run Kubernetes as a Mesos framework. Mesos provides the fine-grained resource allocation of Kubernetes pods across the nodes in a cluster. Kubernetes adds the higher-level functions such as load balancing, high availability through failover (rescheduling), and elastic scaling.

Mesos is particularly suited to environments where the application needs to be co-located with other services such as Hadoop, Kafka, and Spark. Mesos is also the foundation for a number of distributed systems such as:

• Apache Aurora – a highly scalable service scheduler for long-running services and cron jobs; it's used by Twitter. Aurora extends Mesos by adding rolling updates, service registration, and resource quotas.

• Chronos – a fault-tolerant service scheduler, to be used as a replacement for cron, to orchestrate scheduled jobs within Mesos.

• Marathon – a simple-to-use service scheduler; it builds upon Mesos and Chronos by ensuring that two Chronos instances are running. A sketch of a Marathon application definition follows this list.
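For comparison with the Kubernetes definitions shown later, a minimal Marathon application definition is a JSON document like the following (the application id, image, and sizing values are placeholders):

{
  "id": "web-server",
  "instances": 2,
  "cpus": 0.5,
  "mem": 256,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "myorg/web-server:1.0"
    }
  }
}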
Choosing an Orchestration Framework

Each orchestration platform has advantages relative to the others, and so users should evaluate which are best suited to their needs. Aspects to consider include:

• Does your enterprise have an existing DevOps framework that the orchestration must fit within, and what APIs does it require?

• How many hosts will be used? Mesos is proven to work over thousands of physical machines.

• Will the containers be run on bare metal, private VMs, or in the cloud? Kubernetes is widely used in cloud deployments.

• Are there requirements for automated high availability? A Kubernetes Replication Controller will automatically reschedule failed pods/containers; Mesos considers this the role of an application's framework code.

• Is grouping and load balancing required for services? Kubernetes provides this, but Mesos considers it a responsibility of the application's framework code.

• What skills do you have within your organization? Mesos typically requires custom coding to allow your application to run as a framework; Kubernetes is more declarative.

• Setting up the infrastructure to run containers is simple, but the same is not true for some of the orchestration frameworks – including Kubernetes and Mesos. Consider using hosted services such as Google Container Engine for Kubernetes, particularly for proofs of concept.

Security Considerations

While many of the concerns when using containers are common to bare metal deployments, containers provide an opportunity to improve levels of security if used properly. Because containers are so lightweight and easy to use, it's easy to deploy them for very specific purposes, and the container technology helps ensure that only the minimum required capabilities are exposed.

Within a container, the ability for malicious or buggy software to cause harm can be reduced by using resource isolation and rationing.

It's important to ensure that container images are regularly scanned for vulnerabilities, and that the images are digitally signed. There are now many projects that provide scripts and scanning tools that can check if images and packages are up to date and free of security defects. Note that updating the images has no impact on existing containers; fortunately, Kubernetes and Aurora have the ability to perform rolling updates of containers.
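As one concrete example of image signing, Docker's content trust feature can be enabled so that the client only pulls images whose publisher signatures verify. A minimal sketch (the image name is a placeholder):

# With content trust enabled, docker pull/run fails for unsigned images
export DOCKER_CONTENT_TRUST=1
docker pull myorg/myapp:1.0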
Considerations for MongoDB

Running MongoDB with containers and orchestration introduces some additional considerations:

• MongoDB database nodes are stateful. In the event that a container fails and is rescheduled, it's undesirable for the data to be lost (it could be recovered from other nodes in the replica set, but that takes time). To solve this, features such as the Volume abstraction in Kubernetes can be used to map what would otherwise be an ephemeral MongoDB data directory in the container to a persistent location where the data survives container failure and rescheduling.

• MongoDB database nodes within a replica set must communicate with each other – including after rescheduling. All of the nodes within a replica set must know the addresses of all of their peers, but when a container is rescheduled, it is likely to be restarted with a different IP address. For example, all containers within a Kubernetes pod share a single IP address, which changes when the pod is rescheduled. With Kubernetes, this can be handled by associating a Kubernetes Service with each MongoDB node, which uses the Kubernetes DNS service to provide a hostname for the service that remains constant through rescheduling; a sketch of the resulting connection string follows this list.

• Once each of the individual MongoDB nodes is running (each within its own container), the replica set must be initialized and each node added. This is likely to require some additional logic beyond that offered by off-the-shelf orchestration tools. Specifically, one MongoDB node within the intended replica set must be used to execute the rs.initiate and rs.add commands.
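Kubernetes gives each Service a stable DNS name of the form <service>.<namespace>.svc.cluster.local, so an in-cluster client could address a three-member replica set with a connection string along these lines (the service names, namespace, and replica set name are illustrative):

mongodb://mongo-svc-a.default.svc.cluster.local:27017,mongo-svc-b.default.svc.cluster.local:27017,mongo-svc-c.default.svc.cluster.local:27017/?replicaSet=my_replica_set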
• When using WiredTiger (the default storage engine from MongoDB 3.2), the WiredTiger cache size defaults to 60% of the host's RAM minus 1 GB. If using cgroups to constrain the amount of RAM used by a container, WiredTiger ignores that constraint when calculating its cache size. Override that behaviour by setting wiredTigerCacheSizeGB.
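For example, in a pod definition such as those shown later in this paper, the cache could be pinned by passing the flag to mongod (the 1 GB value is purely illustrative, not a recommendation):

command:
- mongod
- "--replSet"
- my_replica_set
# Pin the WiredTiger cache rather than letting it size itself from host RAM
- "--wiredTigerCacheSizeGB"
- "1"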
• If the orchestration framework provides automated rescheduling of containers (as Kubernetes does, for instance) then this can increase MongoDB's resiliency, as a failed replica set member can be automatically recreated, restoring full redundancy levels without human intervention.

• It should be noted that while the orchestration framework might monitor the state of the containers, it is unlikely to monitor the applications running within the containers, or back up their data. That means it's important to use a strong monitoring and backup solution such as MongoDB Cloud Manager, included with MongoDB Enterprise Advanced and MongoDB Professional. Consider creating your own image that contains both your preferred version of MongoDB and the MongoDB Automation Agent.

Implementing a MongoDB Replica Set using Docker and Kubernetes

Kubernetes 1.3 introduced Minikube to simplify the creation of a minimal, single-node Kubernetes cluster on your laptop to get developers up and running quickly – this could be a good option for initial experiments.
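A minimal sketch of that local loop (the YAML file name is a placeholder for a pod definition like those shown below):

# Start a local single-node cluster, then deploy and inspect a pod
minikube start
kubectl create -f mongo-pod.yaml
kubectl get pods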
This section starts by creating the entire MongoDB replica set on a single GCE cluster, meaning that all members of the replica set will be within a single availability zone. This clearly doesn't provide geographic redundancy. In reality, little has to be changed to run across multiple clusters, and those steps are described in the "Multiple Availability Zone MongoDB Replica Set" section.

Each member of the replica set will be run as its own pod with a service exposing an external IP address and port. This 'fixed' IP address is important as both external applications and the other replica set members can rely on it remaining constant in the event that a pod is rescheduled.

Figure 1 illustrates one of these pods and the associated Replication Controller and service:

• Starting at the core there is a single container named mongo-node1. mongo-node1 includes an image called mongo, which is a publicly available MongoDB container image hosted on Docker Hub. The container exposes port 27017 within the cluster.

• The Kubernetes volumes feature is used to map the /data/db directory within the container to the persistent storage element named mongo-persistent-storage1, which in turn is mapped to a disk named mongodb-disk1 created in the Google Cloud. This is where MongoDB would store its data so that it is persisted over container rescheduling.

• … port number in the container. The service identifies the correct pod using a selector that matches the pod's labels. That external IP address and port will be used both by an application and for communication between the replica set members. There are also local IP addresses for each container, but those change when containers are rescheduled.
Figure 1: Pod for a single replica set member
Figure 2: Pod for the second replica set member
# Service exposing the third replica set member (mongo-node3) on a fixed
# external address; similar Service definitions for the other members
# precede this excerpt.
---
apiVersion: v1
kind: Service
metadata:
  name: mongo-svc-c
  labels:
    name: mongo-svc-c
spec:
  type: LoadBalancer
  ports:
  - port: 27017
    targetPort: 27017
    protocol: TCP
    name: mongo-svc-c
  selector:
    name: mongo-node3
    instance: freddy
# Replication Controller for the first replica set member; it keeps exactly
# one mongo-node1 pod running and remounts the same persistent disk if the
# pod is rescheduled.
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: mongo-rc1
  labels:
    name: mongo-rc
spec:
  replicas: 1
  selector:
    name: mongo-node1
  template:
    metadata:
      labels:
        name: mongo-node1
        instance: rod
    spec:
      containers:
      - name: mongo-node1
        image: mongo
        command:
        - mongod
        - "--replSet"
        - my_replica_set
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage1
          mountPath: /data/db
      volumes:
      - name: mongo-persistent-storage1
        gcePersistentDisk:
          pdName: mongodb-disk1
          fsType: ext4
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: mongo-rc2
  labels:
    name: mongo-rc
spec:
  replicas: 1
  selector:
    name: mongo-node2
  template:
    metadata:
      labels:
        name: mongo-node2
        instance: jane
    spec:
      containers:
      - name: mongo-node2
        image: mongo
        command:
        - mongod
        - "--replSet"
        - my_replica_set
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage2
          mountPath: /data/db
      volumes:
      - name: mongo-persistent-storage2
        gcePersistentDisk:
          pdName: mongodb-disk2
          fsType: ext4
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: mongo-rc3
  labels:
    name: mongo-rc
spec:
  replicas: 1
  selector:
    name: mongo-node3
  template:
    metadata:
      labels:
        name: mongo-node3
        instance: freddy
    spec:
      containers:
      - name: mongo-node3
        image: mongo
        command:
        - mongod
        - "--replSet"
        - my_replica_set
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage3
          mountPath: /data/db
      volumes:
      - name: mongo-persistent-storage3
        gcePersistentDisk:
          pdName: mongodb-disk3
          fsType: ext4

Figure 3 shows the full target configuration.
The mongo client is then used to access any one of the services to initiate the replica set and add the remaining members:
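A representative sequence looks like the following, where the addresses are placeholders for the external IPs assigned to the three services:

# Connect to the first member through its service's external address
mongo --host 104.1.1.1 --port 27017

# From the mongo shell: initiate the replica set, then add the other members
> rs.initiate()
> rs.add("104.1.1.2:27017")
> rs.add("104.1.1.3:27017")
> rs.status()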
Multiple Availability Zone MongoDB Replica Set

Figure 4: Headless service to avoid co-location of MongoDB replica set members
If geographic redundancy is required, the three pods should be run in three different availability zones or regions.

Surprisingly little needs to change in order to create a similar replica set that is split between three zones – which requires three clusters. Each cluster requires its own Kubernetes YAML file that defines just the pod, Replication Controller, and service for one member of the replica set. It is then a simple matter to create a cluster, persistent storage, and MongoDB node for each zone, e.g.:

gcloud container clusters create "europe-1" \
  --zone "europe-west1-c" --num-nodes 1

gcloud compute disks create --size=200GB \
  --zone="europe-west1-c" mongodb-disk-europe

kubectl create -f mongo-europe.yaml
Figure 5: Replica set running over multiple availability zones
Kubernetes Enhancements for Stateful Services – StatefulSets

Running stateful services inside containers has historically been discouraged, but it is something that organizations have found increasingly attractive and necessary. As a result, the Kubernetes development community has been working to make it simpler to do.
MongoDB’s distributed design is key to ensure that SLA’s
Kubernetes 1.3 (July 2016) introduced an alpha version of
are always met. Comparethemarket.com’s deployment
PetSets, which were subsequently replaced by
consists of microservices deployed in AWS. Each
St
StatefulSets
atefulSets – beta in Kubernetes 1.5 (December 2016)
microservice, or logical grouping of related microservices, is
and 1.6 (March 2017). StatefulSets group together a set of
provisioned with its own MongoDB replica set running in
stateful pods.
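As a sketch only – written against the beta API that was current when this paper was published, with illustrative names and sizes – a three-member MongoDB StatefulSet might look like:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo          # governing headless service
  replicas: 3                 # pods are created as mongo-0, mongo-1, mongo-2
  template:
    metadata:
      labels:
        role: mongo
    spec:
      containers:
      - name: mongo
        image: mongo
        command:
        - mongod
        - "--replSet"
        - my_replica_set
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
  # Each pod receives its own persistent volume, which is reattached to the
  # replacement pod after any rescheduling
  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 100Gi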
The StatefulSet functionality provides stricter guarantees than those offered by a traditional Replication Controller – all targeted towards managing distributed, stateful applications. The most applicable guarantees for MongoDB are stable, unique network identities that persist across rescheduling, and stable, persistent storage that follows each pod.

Square Enix is one of the world's leading providers of gaming experiences, publishing such iconic titles as Tomb Raider and Final Fantasy. They have produced an internal multi-tenant database-as-a-service using MongoDB and Docker – find out more in this case study.

Comparethemarket.com is one of the UK's leading providers of price comparison services and uses MongoDB as the operational database behind its large microservice environment. Service uptime is critical, and MongoDB's distributed design is key to ensuring that SLAs are always met. Comparethemarket.com's deployment consists of microservices deployed in AWS. Each microservice, or logical grouping of related microservices, is provisioned with its own MongoDB replica set running in Docker containers, and deployed across multiple AWS Availability Zones to provide resiliency and high availability. MongoDB Ops Manager is used to provide the operational automation that is essential to launch new features quickly: deploying replica sets, providing continuous backups, and performing zero downtime upgrades.
We Can Help

… With automated provisioning, fine-grained monitoring, and continuous backups, you get a full management suite that reduces operational overhead, while maintaining full control over your databases.
Resources
New York • Palo Alto • Washington, D.C. • London • Dublin • Barcelona • Sydney • Tel Aviv
US 866-237-8815 • INTL +1-650-440-4474 • [email protected]
© 2017 MongoDB, Inc. All rights reserved.