VMware Press Ebook on Containers and Kubernetes
Accelerating Digital
Transformation with
Containers and
Kubernetes
An Introduction to
Cloud-Native Technology
Author
Steve Hoenisch
Warning and Disclaimer
Every effort has been made to make this book as complete and as accurate as possible, but
no warranty or fitness is implied. The information provided is on an “as is” basis. The authors,
VMware Press, VMware, and the publisher shall have neither liability nor responsibility to any
person or entity with respect to any loss or damages arising from the information contained
in this book.
The opinions expressed in this book belong to the author and are not necessarily those of
VMware.
VMware, Inc.
3401 Hillview Avenue
Palo Alto CA 94304
USA Tel 877-486-9273
Fax 650-427-5001
www.vmware.com.
Copyright © 2018 VMware, Inc. All rights reserved. This product is protected by U.S. and
international copyright and intellectual property laws. VMware products are covered by
one or more patents listed at http://www.vmware.com/go/patents. VMware is a registered
trademark or trademark of VMware, Inc. and its subsidiaries in the United States and/or
other jurisdictions. All other marks and names mentioned herein may be trademarks of their
respective companies.
Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Organization of this Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8
Point of Departure:
Cloud-Native Terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Driving Digital Transformation with Containers
and Kubernetes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
The Business Value of Digital Transformation. . . . . . . . . . . . . . . . . 13
Cloud-Native Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
12-Factor Apps: A Methodology for
Delivering Software as a Service. . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
The Business Value of Kubernetes . . . . . . . . . . . . . . . . . . . . . . . . . . 17
An Example Use Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Demystifying Kubernetes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Platform vs. Runtime Environment. . . . . . . . . . . . . . . . . . . . . . . . . . 19
Robust Open-Source Technology from a
Google Production System. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Defogging the Abstract Terminology of Kubernetes. . . . . . . . . . 20
A Concise Overview of Kubernetes . . . . . . . . . . . . . . . . . . . . . . . . . 21
Just Another Fad in the Hype Cycle?. . . . . . . . . . . . . . . . . . . . . . . . 24
Kubernetes in Production Environments. . . . . . . . . . . . . . . . . . . . . 24
A Rapidly Maturing Ecosystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Kubernetes Won’t Solve All Your Problems . . . . . . . . . . . . . . . . . . 25
Introduction to Cloud-Native Architectures
and Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Microservices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Deconstructing the Monolith and Other Use Cases. . . . . . . . . . . 29
Kubernetes for Cloud-Native and 12-Factor Applications. . . . . . 30
Profile of a DevOps Engineer: Responsibilities and Skills . . . . . . 32
Continuous Integration and Continuous Deployment . . . . . . . . . 34
Container Technology in the
Software-Defined Data Center. . . . . . . . . . . . . . . . . . . . . . . . 35
VMware vSphere and the SDDC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Abstract and Automate: Network Virtualization. . . . . . . . . . . . . . 36
Risk-Free Scale Out with Ease: Virtual Storage. . . . . . . . . . . . . . . 37
Put a Lid on It: Security for Containers . . . . . . . . . . . . . . . . . . . . . . 38
Linux Container Hosts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Securing Cloud Platforms with Lightwave . . . . . . . . . . . . . . . . . . . 44
Managing Container Images with Harbor. . . . . . . . . . . . . . . . . . . . 51
Microservices Meets Micro-segmentation: Delivering
Developer-Ready Infrastructure for Modern Application
Development with NSX. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Author and Contributors
Author and Editor
Steve Hoenisch is a technology evangelist,
writer, and editor who specializes in emerging
technology and cloud-native solutions. He’s
written numerous influential technical white
papers and magazine articles on digital
transformation, Kubernetes, containers, big data,
Hadoop, storage platforms, security, and
regulatory compliance. A former newspaper
editor with a master’s degree in linguistics, he
has published articles in XML Journal, The
Hartford Courant, and the Chicago Tribune. He
works in the Cloud-Native Apps business unit at
VMware.
Contributors
Ben Corrie has been a leading voice of technical
innovation in the container space at VMware for
three years. Ben initiated the research that led to
the vSphere Integrated Containers product. As an
architect on that product, his role now is to look
6 to 12 months ahead to help align VMware with
the container challenges to come.
Simone Morellato is currently a Director of
Technical Product Management at VMware
where he leads technical product management
and marketing efforts for the company’s Cloud-
Native Applications business unit. Simone has
more than 16 years of experience in storage,
networking and infrastructure for both
traditional and cloud applications. Before joining
VMware, Simone worked at Apcera, a container
management platform company acquired by Ericsson. He
has also held leadership, marketing and technical
presales roles at Cisco, Riverbed Technology,
Astute Networks and Andiamo Systems (later
acquired by Cisco).
But one-time innovation is often not enough. The digital era calls for
continuous innovation at an accelerated pace—and the kind of modern-
ized data centers and software development technologies that make it
possible.
The book then considers the kind of infrastructure, virtualization technologies, systems,
and security required by next-generation data centers.
The chapters that follow become increasingly technical as they use two
key products from VMware—VMware vSphere Integrated Containers
and VMware Pivotal Container Service—to explain the architecture of
cloud-native applications, the capabilities of Kubernetes, and the use
cases for container technology.
The final sections of the book turn to examples that demonstrate how to
exploit the power of containers and Kubernetes to solve technical
problems.
Point of Departure:
Cloud-Native Terminology
Container technology comes with its own lexicon. If you’re familiar with
the basic terminology around containers, Kubernetes, and cloud-native
applications, you can skip this section. For plain-language descriptions of
terminology in the cloud-native space, see the glossary at the end of the
book.
Containers
Container: A portable format, known as an image, for packaging an
application with instructions on how to run it as well as an environment
in which the image is executed. When the container image is executed, it
runs as a process on a computer or virtual machine with its own isolated,
self-described application, file system, and networking. A container is
more formally known as an application container. The use of containers
is increasing because they provide a portable, flexible, and predictable
way of packaging, distributing, modifying, testing, and running applica-
tions. Containers speed up software development and deployment.
• Orchestrated to optimize resource utilization.
• Segmented into microservices to ease modification, maintenance,
and scalability.
Platforms
The overarching business objective of using a container platform is to
accelerate the development and deployment of scalable, enterprise-grade
software that is easy to modify, extend, operate, and maintain. Three
types of platforms provide varying degrees of support for container tech-
nology:
• A platform for running individual container instances.
• Containers as a service.
• Platform as a service.
Digital Transformation
All the modernizing elements covered in this section—containers,
Kubernetes, microservices, container platforms, DevOps, and the CI/CD
pipeline—converge into a powerful recipe for digital transformation: You
can optimize the use of your computing resources and your software
development practices to extend your enterprise’s adaptability, produc-
tivity, innovation, competitive advantage, and global reach.
The ingredients for building effective applications are less clear than the
desired outcomes.
Cloud-Native Applications
The Cloud Native Computing Foundation, a project of The Linux
Foundation, defines cloud-native applications as follows:1
1. Containerized—Each part (applications, processes, etc.) is
packaged in its own container. This facilitates reproducibility, trans-
parency, and resource isolation.
1. This definition is from the FAQ of the Cloud Native Computing Foundation, https://www.cncf.io/about/faq/.
2. Dynamically orchestrated—Containers are actively scheduled and
managed to optimize resource utilization.
3. Microservices oriented—Applications are segmented into micros-
ervices. This segmentation significantly increases the overall agility
and maintainability of applications.
2. The twelve factors are paraphrased from the descriptions at the Twelve-Factor App web site.
services as resources that can be attached to or detached from a
deployment by modifying the configuration.
5. Treat build and run as separate stages. A deployment of the
codebase takes place in three separate stages: build, release, and
runtime. The build stage converts the codebase into an execut-
able—a build—and then the release stage combines the build with
the configuration to produce a release that’s ready for execution in
the runtime environment.
6. Run the app as stateless processes. The processes share nothing
with other processes, and data that must persist is stored in a data-
base running as a stateful supporting service.
7. Expose services by using port binding. Taking HTTP as an example,
the app exports HTTP as a service by binding to a port and listening
on the port for incoming requests (see the sketch after this list).
8. Scale out by adding concurrent processes. The app handles
workloads by assigning each type of work to a process type. A
web process, for example, handles HTTP requests, while a worker
process manages background tasks.
9. Ensure durability with disposability. Processes are disposable—they
can be started or stopped quickly to make sure that the application
can be changed or scaled easily.
10. Make development and production peers. The app is geared
toward continuous deployment by allowing developers to integrate
new code quickly and to deploy the app themselves in a produc-
tion environment. The production and development environments
should be as similar as possible.
11. Process logs as event streams. The app neither routes nor stores
the output stream from its logs but instead writes it as a stream of
data to standard output, where it is to be collected by the execu-
tion environment and routed to a tool or system, such as Hadoop,
for storage or analysis.
12. Run one-off management scripts and tasks, such as a database
migration, in an environment identical to that of the app’s long-run-
ning processes.
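Here is the sketch referenced in factor 7; it also illustrates factors 3 and 4, with configuration and backing services supplied through the environment. The app name and URLs are hypothetical:

    # Factors 3 and 4: config and backing services come from the environment,
    # so they can be attached or swapped without code changes (URLs are hypothetical)
    export DATABASE_URL="postgres://db.example.com:5432/rides"
    export CACHE_URL="redis://cache.example.com:6379"
    # Factor 7: the app exports HTTP by binding to a port and listening on it
    PORT=8080 ./taxi-web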
As recently hired developers work on the mobile app, the taxi company
modernizes its data center with commodity hardware and virtualization.
To maximize resource utilization of its small data center and to minimize
costs, the company plans to run its new app in Docker containers on vir-
tual machines. Kubernetes will orchestrate the containerized application.
After being rolled out and advertised in and on its cars, the app is an
instant success. To meet fluctuations in use of the app, the company
uses Kubernetes to dynamically scale the number of containers running
the app. For example, when metrics for the app hit a predefined thresh-
old indicating high usage, which typically happens during rush hour, the
company’s DevOps team uses the horizontal pod autoscaling feature
of Kubernetes to automatically increase the number of containers so
that the system can match demand. At 4 am, in contrast, the number of
containers is reduced to elastically match the low demand at that time,
conserving resources.
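The autoscaling behavior described above can be set up with a single command; a minimal sketch, assuming a Deployment named taxi-app (the name and thresholds are illustrative):

    # Let Kubernetes scale the app's pods between low- and high-demand bounds
    # based on observed CPU utilization
    kubectl autoscale deployment taxi-app --min=2 --max=20 --cpu-percent=70
    kubectl get hpa taxi-app   # inspect the current replica count versus the target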
The mobile app correlates ride requests with location. By mining the
data and combining it with its intimate historic knowledge of the city’s
patterns, the cab company can station cabs in the perfect locations for
hailing customers, capturing some ride requests that would otherwise go to the competition.
What’s more, because the company processes the app’s logs as event
streams, the company can do this dynamically during day and night,
shifting cars to hot spots.
Although you don’t need Kubernetes to use containers, you will likely
need Kubernetes if you want to robustly and repeatedly deploy and auto-
mate a containerized application in a production environment.
Robust Open-Source Technology from
a Google Production System
Kubernetes started out as a closed-source project at Google based on an
orchestration system called Borg. Google uses Borg to initiate, schedule,
restart, and monitor public-facing applications, such as Gmail and Google
Docs, as well as internal frameworks, such as MapReduce.3 Kubernetes
was heavily influenced by Borg and the lessons learned from running
Borg on a massive scale in a production environment. In 2015, Google
open-sourced Kubernetes. Shortly afterward, Google donated it as seed
technology to the Cloud Native Computing Foundation, a newly formed
open-source project hosted by the Linux Foundation. (VMware is a member
of the Linux Foundation and the Cloud Native Computing Foundation.)
Other terms, abbreviations, and acronyms taint the fringes of the Kuber-
netes platform as it bumps up against containers on the one hand and
the accompanying infrastructure on the other: runC, OCI, YAML, JSON,
IaaS, PaaS, and KaaS. There’s even the odd abbreviation of Kubernetes
itself: K8s.
Yet once you become familiar with the system, its relationship to contain-
ers, and the infrastructure at its edges, the meaning of the terms comes
into focus.
For definitions of more Kubernetes terms, see the glossary at the end of
the book.
Here, briefly, is how it works. A Kubernetes cluster contains a master node
and several worker nodes. When you deploy an application on the
cluster, the components of the application run on the worker nodes. The
master node manages the deployment.
Main Components
Kubernetes includes these components:
• The Kubernetes API
• The Kubernetes command-line interface, kubectl
• The Kubernetes control plane
In the Kubernetes object model, the concept of a Pod is the most basic
deployable building block. A Pod represents an instance of an app
running as a process on a Kubernetes cluster. Here’s where the Docker
runtime comes back into the equation—Docker is commonly used as the
runtime in a Kubernetes Pod.
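The desired state is declared in a YAML file. A minimal sketch of such a file, using the replication controller discussed below (the name, image, and replica count are illustrative):

    # Declare a desired state of three replicated web pods and submit it
    kubectl create -f - <<'EOF'
    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: web-rc
    spec:
      replicas: 3          # desired state: three pod replicas
      selector:
        app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: nginx:1.13
            ports:
            - containerPort: 80
    EOF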
When you submit this file to the Kubernetes master with the kubectl
command-line interface, the Kubernetes control plane implements the
instructions in the file by starting and scheduling applications so that the
cluster’s state matches your desired state. The Kubernetes master and the
control plane then maintain the desired state by orchestrating the clus-
ter’s nodes, which can be actual servers or virtual machines.
The core of the architecture is an API server that manages the state of
the system’s objects. The API server works with Kubernetes subcompo-
nents, or clients, that are built as composable microservices, such as the
replication controller specified in the YAML file. The replication controller
regulates the desired state of pod replicas when failures occur.
MANAGING CONTAINERIZED APPLICATIONS WITH KUBERNETES
Kubernetes orchestrates distributed, containerized applications to:
• Optimize utilization of computing resources.
• Provide policies for scheduling.
• Maintain desired state.
• Handle faults and failures with automation.
• Provide high availability.
• Monitor jobs in real-time.
• Manage an application’s configuration.
• Dynamically scale to meet changes in demand.
5. See Portworx, “2017 Annual Container Adoption Survey: Huge Growth in Containers,” April 12, 2017; ClusterHQ, “Container Market Adoption Survey 2016”; Sysdig, “The 2017 Docker Usage Report,” Apurva Dave, April 12, 2017; and Forbes, “2017 State of Cloud Adoption and Security,” Louis Columbus, April 23, 2017.
• Consolidate servers and reduce costs through efficient resource
utilization.
• Elegantly handle machine failure through self-healing and high
availability.
• Ease and expedite application deployment, logging, and
monitoring.
• Automate scalability for containers and containerized applications.
• Decouple applications from machines for portability and flexibility.
• Easily modify, update, extend, or redeploy applications without
affecting other workloads.
Microservices
The digital transformation is driving a shift toward new application archi-
tectures. Developing a new application or refactoring an existing one with
containers and microservices is often motivated by the following out-
comes:
• Extend an application’s capabilities more easily
• Add new features faster and more easily
• Improve maintainability
• Reduce vulnerabilities
• Make it perform faster or scale better
exploit the capacity of a vSphere software-defined data center to develop
and test a containerized application, which is a pre-existing advantage for
organizations with vSphere environments that want to begin building and
deploying cloud-native applications.
Kubernetes for Cloud-Native and
12-Factor Applications
Kubernetes makes containerized applications work in a manageable way
at scale. Recall the second part of the definition of cloud-native applica-
tions: They are dynamically orchestrated in such a way that containers are
actively scheduled and managed to optimize resource utilization. Kuber-
netes does exactly that. It orchestrates containers and their workloads to
optimize the utilization of the virtual machines and physical servers that
make up the nodes in a cluster.
Revisiting the 12 factors from the previous chapter details how Kuber-
netes streamlines application management. In general, Kubernetes can
deploy and run 12-factor apps.
DevOps
DevOps is a key practice driving the development and deployment of
cloud-native applications and 12-factor apps. When developers and
IT personnel collaborate on operations to release software early and
often, or daily or hourly for that matter, you can see DevOps at work in
practices, processes, and automation behind the building, testing, and
releasing of software.
2. Use a tool such as Atom to write code and unit tests in languages
like Python, Java, and Go.
3. Commit changes with Git or GitHub (see the sketch after this list).
4. Use tools such as Jenkins and Gerrit to continuously integrate
those changes.
5. Use a tool such as vRealize Automation to test code.
6. Create a software artifact by using JFrog Artifactory.
7. Perform continuous delivery by using Jenkins.
8. Manage configuration by using Chef, Ansible, or Puppet.
9. Monitor the application by using vRealize Operations and vRealize
Log Insight.
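Here is the sketch referenced in step 3: the developer-facing slice of steps 3 and 4, assuming a Jenkins job is configured to build on push (branch, file, and message names are illustrative):

    # Commit and push changes; a webhook then triggers the Jenkins integration job
    git add src/ tests/
    git commit -m "Add fare-estimate endpoint and unit tests"
    git push origin master   # Jenkins picks up the push and runs the build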
• Resource optimization
• Easier maintenance
• Automated operations
• Heightened security
Put a Lid on It: Security for Containers
Security poses an obstacle to container adoption. The increased attack
surface of containers and other factors can heighten risk. Running con-
tainerized applications on virtual machines, however, decreases the attack
surface of containers and lowers risk.
7. NIST Special Publication 800-190, Application Container Security Guide, by Murugiah Souppaya, Computer Security Division, Information Technology Laboratory; John Morello, Twistlock, Baton Rouge, Louisiana; and Karen Scarfone, Scarfone Cybersecurity, Clifton, Virginia. September 2017. This publication is available free of charge from https://doi.org/10.6028/NIST.SP.800-190
8. NISTIR 8176, Security Assurance Requirements for Linux Application Container Deployments, by Ramaswamy Chandramouli, Computer Security Division, Information Technology Laboratory. October 2017. This publication is available free of charge from https://doi.org/10.6028/NIST.IR.8176
not just one, to be able to provide a global summary of resource
usage for all running containers.
From this basic definition, you can see that for a data center, micro-
segmentation reduces the risk of attack, limits the damage from an
attack, and improves security. According to VMware NSX Micro-segmen-
tation Day 1, the micro-segmentation capabilities of VMware NSX can
implement the following security controls:10
• Distributed stateful firewalling, which can protect each application
running in the data center with application-level gateways that are
applied on a per-workload basis.
• Topology-agnostic segmentation, which protects each application
with a firewall independent of the underlying network topology.
• Centralized ubiquitous policy control of distributed services, which
controls access with a centralized management plane.
• Granular unit-level controls implemented by high-level policy
objects, which can create a security perimeter for each application
without relying on VLANs.
• Network-based isolation, which implements logical network
overlays through virtualization.
• Policy-driven unit-level service insertion and traffic steering, which
can help monitor network traffic.
9. For more information about what micro-segmentation is and what it isn’t, see Micro-segmentation for Dummies, by Lawrence Miller and Joshua Soto, published by John Wiley & Sons, Inc., 2015.
10. VMware NSX Micro-segmentation Day 1, by Wade Holmes, published by VMware Press, 2017.
Photon OS Overview
Project Photon OS™ is an open source Linux container host optimized for
cloud-native applications, cloud platforms, and VMware infrastructure.
Photon OS provides a secure runtime environment for running containers.
Security-Hardened Linux
The design of Photon OS prioritizes security. Photon OS secures itself
with its build process, compiler settings, root password rules, and PGP-
signed packages and repositories. A system administrator or DevOps
manager can enforce security with vulnerability scans, the pluggable
authentication modules, the Linux auditing service, and many other mea-
sures.
Life-Cycle Management
Photon OS seeks to reduce the burden and complexity of managing
clusters of Linux machines by including an efficient packaging model,
extensibility, and centralized administration in its fundamental design.
Here are some of Photon’s design elements that simplify life-cycle management (a sketch of the update workflow follows the list):
• Atomic updates with RPM-OSTree
• Incremental stateful updates (RPMs)
• Package repositories curated by VMware
• Extensible distribution: You can add and remove functionality incre-
mentally
• Signed repositories
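In practice, these design elements surface through Photon OS’s package manager, tdnf. A brief sketch (the package choice is illustrative):

    # Apply incremental, signed updates from the VMware-curated repositories
    tdnf update
    # Extend the distribution by adding functionality incrementally
    tdnf install -y docker
    systemctl enable --now docker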
KEY PHOTON OS PROPERTIES
Standards are a key requirement because they let you apply trusted
tools and protocols across disparate environments to reduce the risk of
security incidents and compliance problems. Flexibility enables you to
port your security policies and controls from one environment to another
as you move a server or application. As more workloads migrate to the
cloud, the cloud-platform independence of your identity service helps
you move from one cloud provider to another without having to redeploy
or reconfigure identity management systems.
LIGHTWAVE SERVICES
• Identity management and directory services
• Authentication and authorization
• Certificates
Implementing Cloud-Scale Security with
Lightwave
Lightwave meets the requirements of these use cases with its directory
service, Active Directory interoperability, Kerberos authentication, and
certificate services. Lightwave provides the following services:
• Directory services and identity management with LDAP and Active
Directory interoperability
• Authentication services with Kerberos, SRP, WS-Trust (SOAP),
SAML WebSSO (browser-based SSO), OAuth/OpenID Connect
(REST APIs), and other protocols
• Certificate services with a certificate authority and a certificate
store
The directory service uses the following secure data access mechanisms:
Replication
For replication, the directory service uses a state-based scheme for even-
tually consistent multi-node LDAP replication. Every directory node in a
Lightwave domain accepts write requests. On Lightwave, the replication
service includes a tool with a user interface and a single command to
add or remove a node, which simplifies topology management. In addi-
tion, backup and restore is supported on a per-node basis. Overall, this
approach to replication simplifies the life-cycle management of a domain.
Architecture
Figure 1: The architecture of the Lightwave directory service.
Authentication Services
The authentication services of Lightwave contain two main compo-
nents: A server that acts as a Kerberos 5 key distribution center and a
secure token service that supports the OIDC, WS-TRUST, and SAML 2.0
(WebSSO) standards for single sign-on.
The secure token service can issue Security Assertion Markup Language
(SAML) 2.0 tokens as well as OIDC tokens. Lightwave can be integrated
with Active Directory to issue secure tokens to principals defined in
Active Directory forests. With SAML 2.0, Lightwave gives a user SSO
access to different services by using the same credentials.
Lightwave works with OAuth 2.0 and OpenID Connect, a protocol for
authentication and authorization that lets you set up one SSO service for
different cloud services. After users enter their credentials to access one
service, they don’t need to do it again to access others.
Lightwave also works with the WS-Trust standard to issue and vali-
date security tokens and to broker trust relationships between parties
exchanging secure messages.
The secure token service supports the WS-Trust, SAML, OAuth2 and
OpenID Connect standards with the following capabilities:
• Browser-based single sign-on (SSO)
• Full multi-tenancy support
• External SAML Federation
• Just-in-time provisioning (JIT)
• IDP discovery and selection
• Multiple identity sources, including the native VMware directory,
Microsoft Active Directory, and OpenLDAP
• Full Kerberos support and full AD domain trust support for inte-
grating authentication with Microsoft Windows and Active
Directory
• Schema customization for OpenLDAP to support a broad range of
OpenLDAP deployments
• Two-factor smart card support with a common access card (CAC)
or with an RSA SecurID token
• REST management APIs
The certificate authority issues signed X.509 digital certificates and sup-
ports the PKIX standard. It can distribute CA roots and CRLs over HTTP
and LDAP in accordance with RFC 4387. It also supports CSR and key
generation as well as auto-enrollment and certificate revocation. Secured
and authenticated by Kerberos, the certificate authority validates certifi-
cate requests by analyzing key usage, extensions, SAN, and other factors.
Server policies can be used to automatically approve or reject certificates.
The certificate authority has a dual mode in which it can act as an enter-
prise root CA or as a subordinate or intermediate CA. The key lengths are
strong, ranging from 1 K to 16 K, and the hashing algorithms use SHA-1 or
SHA-2, the latter of which is the default. The key usage is encryption and
signing. The certificate file formats are PKCS12, PEM and JKS.
Figure 2: The graphical user interface of the Harbor portal.
Use Cases
• Enterprise-wide use of container technology.
• Secure container images for use in production.
• Extend Docker with management and security, including integra-
tion with identity management systems.
• Manage access to container images with Microsoft Active Directory,
LDAP, or Lightwave.
Component Architecture
Here is a diagram that depicts the architecture of Harbor.
Proxy: The components of Harbor, such as the registry, UI, and token
services, are all behind a reverse proxy. The proxy forwards requests from
browsers and Docker clients to the backend services.
Registry: The registry stores Docker images and processes Docker push
and pull commands. Because Harbor enforces access control to images,
the registry directs clients to a token service to obtain a valid token for
each pull or push request.
UI: The graphical user interface helps you manage images on the registry.
Webhook: A webhook is a mechanism configured in the registry to populate
changes in the status of images to the UI. Harbor uses the webhook to update
logs, initiate replications, and perform some other functions.
Token service: The token service issues a token for every docker push or
pull command.
Job services: The job service replicates and synchronizes images across
instances of Harbor.
Implementation
Each component of Harbor is wrapped as a Docker container. Naturally,
Harbor is deployed by Docker Compose. In the source code (https://
github.com/vmware/harbor), the Docker Compose template used to
deploy Harbor is located at /Deployer/docker-compose.yml. Opening this
template file reveals the six container components making up Harbor.
UI: Core services within the architecture. This container is the main part of
Project Harbor.
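Assuming the repository layout described above, bringing Harbor up is a short Docker Compose exercise; a sketch, with the configuration steps from the installation guide omitted:

    git clone https://github.com/vmware/harbor
    cd harbor/Deployer
    docker-compose up -d   # starts the six Harbor containers described above
    docker-compose ps      # verify that the components are running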
Component Interaction
The following examples of Docker commands illustrate the interaction
among Harbor’s components.
After the user enters the required credentials, the Docker client sends an
HTTP GET request to the address “192.168.1.10/v2/”. The different contain-
ers of Harbor will process it according to the following steps:
(a) First, this request is received by the proxy container listening on port
80. Nginx in the container forwards the request to the Registry container
at the backend.
(b) The Registry container has been configured for token-based authen-
tication, so it returns an error code 401, notifying the Docker client to
obtain a valid token from a specified URL. In Harbor, this URL points to
the token service of Core Services.
(c) When the Docker client receives this error code, it sends a request to
the token service URL, embedding username and password in the request
header according to basic authentication of HTTP specification.
(d) After this request is sent to the proxy container via port 80, Nginx
again forwards the request to the UI container according to pre-config-
ured rules. The token service within the UI container receives the request,
decodes it, and obtains the username and password.
(e) After getting the username and password, the token service checks
the database and authenticates the user against the data in the MySQL data-
base. When the token service is configured for LDAP/AD authentication,
it authenticates against the external LDAP/AD server. After a successful
authentication, the token service returns an HTTP code that indicates
success. The HTTP response body contains a token signed with a
private key.
At this point, the docker login process is complete. The Docker
client saves the encoded username/password from step (c) locally in a
hidden file.
The docker push process follows similar steps:
(a) First, the Docker client repeats a process similar to login by sending
the request to the registry, and then gets back the URL of the token
service;
(b) Subsequently, when contacting the token service, the Docker client
provides additional information to apply for a token of the push operation
on the image (library/hello-world);
(c) After receiving the request forwarded by Nginx, the token service
queries the database to look up the user’s role and permissions to push
the image. If the user has the proper permission, it encodes the information
of the push operation, signs it with a private key, and returns a
token to the Docker client;
(d) After the Docker client gets the token, it sends a push request to the
registry with a header containing the token. Once the Registry receives
the request, it decodes the token with the public key and validates its
content. The public key corresponds to the private key of the token ser-
vice. If the registry finds the token valid for pushing the image, the image
transferring process begins.
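From the client’s side, the entire token exchange above is hidden behind the familiar commands. The registry address and project match the example above:

    docker login 192.168.1.10                  # steps (a)-(e): trade credentials for a token
    docker tag hello-world:latest 192.168.1.10/library/hello-world:latest
    docker push 192.168.1.10/library/hello-world:latest   # steps (a)-(d) of the push flow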
For more information and a link to the release page, see Integrate with
Kubernetes at https://github.com/vmware/harbor/blob/master/docs/
kubernetes_deployment.md.
Project Hatchway
Project Hatchway, an open source storage project from VMware, provides
storage infrastructure options for containers in vSphere environments,
including hyper-converged infrastructure (HCI) with VMware vSAN.
Over the last six to nine months, the container ecosystem has woken
up to the challenges of running any of the leading container frameworks
in production. We have discussed this topic on the
VMware cloud-native blog recently in relation to Kubernetes and Docker.
Last month, with the introduction of Pivotal Cloud Foundry 1.10, both the
VMware NSX team and the Pivotal team shared some initial concepts
around developer-ready infrastructure. Both are good reads if you are
itching for some technical details.
update “live” for the customers of the business. You can refer to this as
infrastructure response. At some point down the road, for most of our
enterprise customers, there is some sort of audit or compliance check
that they must adhere to and pass or they will get fined. We will call that
audit readiness.
[Figure: A running BOSH deployment on vSphere. Numbered callouts mark the vSphere CPI, the BOSH stemcell VMs, the jobs of a BOSH release, and the networks that make up the deployment, matching the numbered list below.]
2. BOSH stemcell: A stemcell is a versioned base operating system
image built for each CPI that BOSH supports. It is commonly based
on Canonical’s Ubuntu distribution, but is also available in RHEL
and even Windows image ports. Typically, the stemcell is a hard-
ened base OS image with a BOSH agent predeployed. BOSH will
use this agent to install and manage the lifecycle of software on
that VM or instance.
3. BOSH release: A BOSH release is a versioned tarball containing the
complete source code and job definitions required to describe to
BOSH how that release of software should be deployed on a VM
or instance provisioned from a stemcell. An example is the Kubo
release, which includes all the packages and details required to
allow BOSH to deploy a fully functional Kubernetes cluster.
4. BOSH deployment manifest: BOSH needs to receive some declar-
ative information to actually deploy something. This is provided by
an operator via a manifest. A manifest defines one or more releases
and stemcells to be used in a deployment and provides some key
variables like IP stack info, instance count, and advanced configuration
of the given release(s) you want to deploy. This manifest is
typically written in YAML (a sketch follows this list).
5. BOSH deployment: BOSH needs some declarative information
before it can deploy anything. This is provided by an operator via a
deployment manifest and a cloud-config manifest. These manifests
are typically written in YAML.
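Here is the sketch promised above: a minimal deployment manifest for the kubo release, followed by the deploy command. Release names and versions follow Figure 8; the stemcell, zone, VM type, and job names are illustrative:

    cat > mykubo-deployment.yml <<'EOF'
    name: mykubo-deployment
    releases:
    - name: kubo-release
      version: 0.0.5
    - name: docker
      version: 28.0.1
    stemcells:
    - alias: default
      os: ubuntu-trusty      # illustrative; any stemcell built for the vSphere CPI
      version: latest
    update:
      canaries: 1
      max_in_flight: 1
      canary_watch_time: 30000
      update_watch_time: 30000
    instance_groups:
    - name: master
      instances: 1
      azs: [z1]              # zone names come from the operator's cloud-config
      jobs:
      - name: kubernetes-master    # illustrative job name from the release
        release: kubo-release
      vm_type: default
      stemcell: default
      networks:
      - name: default
    EOF
    bosh -d mykubo-deployment deploy mykubo-deployment.yml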
[Figure 8: From a single BOSH instance on vSphere, an operator uses deployment manifests that reference versioned kubo releases (for example, kubo-release 0.0.5 with Kubernetes 1.6.6 and kubo-release 0.0.6 with Kubernetes 1.7.1) to deploy two functionally identical Kubernetes clusters, mykubo-deployment-1 for Developer Team A and mykubo-deployment-2 for Developer Team B; the master, etcd, haproxy, and worker VMs of each deployment run BOSH agents that report to the Health Manager.]
Referencing Figure 8, we can see the key benefits that BOSH provides the
operator.
1. Repeatability: In a cloud native development environment, the
operator can generate two or more similar deployment manifests
to deploy two or more unique but functionally identical Kubernetes
deployments to meet the needs of multiple developer consumers.
2. Day 2 operations: BOSH lifecycle management makes it easy to
keep all of the Kubernetes deployments healthy.
• Patching: Because BOSH uses versioned releases, it is trivial for
an operator to upgrade the Kubernetes Kubo release and apply it
to all running deployments with little to no interruption of service.
BOSH will update each deployment as well as maintain its state by:
(1) detaching persistent disks, (2) rebuilding the affected VMs or
instances, and then (3) re-attaching persistent disks.
BOSH Architecture
BOSH is typically deployed as a single VM or instance. That VM/instance
has many components that perform vital roles in enabling BOSH to
manage deployments at scale:
• NATS: Provides a message bus via which the various services of
BOSH can interact.
• POSTGRESQL: BOSH writes all of its state into a database. Typ-
ically that database is internal to a single VM BOSH deployment
and provided by Postgres. This can be modified, however, to use
an external data source so that the BOSH VM can be rebuilt and
reconnect to the database to reload its persistent state.
• BLOBSTORE: Each stemcell and release uploaded to BOSH is
stored in a blobstore. Default deployments of BOSH use an internal
store (webdav), but, like the Postgresql database, this can also be
externalized.
• Director: The main API that the BOSH CLI will interface with to
allow an operator to create and manage BOSH deployments.
• Health Monitor: BOSH requires that each VM it deploys have an
agent that it can communicate with to assign and deploy jobs from
BOSH releases that are defined in a deployment manifest. It will
also maintain the health of each VM or instance it has deployed.
The agent will report vitals back to BOSH and in cases where ser-
vices in the VM are faulted, or the agent is unreachable, the Health
Monitor can use plugins to restart services and even rebuild the VM
or instance.
• CPI: The CPI is the IaaS-specific executable binary that BOSH uses
to interact with the defined IaaS in its deployment YAML.
• UAA: Provides user access and authentication that allows BOSH to
authenticate operators via SAML or LDAP backends.
• CREDHUB: Manages credentials like passwords, certificates, certif-
icate authorities, SSH keys, RSA keys, and arbitrary values (strings
and JSON blobs). BOSH will leverage credhub to create and store
key credentials for deployments, like public certificates and keys.
[Figure: The components of a single-VM BOSH deployment (NATS, PostgreSQL, blobstore, Director, Health Monitor, and CredHub) managing the master, worker, and haproxy VMs of a deployed Kubernetes cluster.]
Until now, there has been no reliable or convenient way to deliver a strong
level of operational capability to a consumer who may want to run Kuber-
netes in production on their own on-premises and public clouds. To solve
this problem, Google partnered with Pivotal (the leading contributor to
BOSH) to build Cloud Foundry Container Runtime. CFCR was formerly
known by the acronym Kubo, meaning Kubernetes on BOSH. BOSH is
an open source tool for the deployment, release engineering, lifecycle
management, and monitoring of distributed software systems. Google
and Pivotal saw BOSH as a tool with the potential to facilitate produc-
tion-grade Kubernetes operations.
Self-Care for the Kubernetes Control Plane
On its own, Kubernetes does a great job maintaining healthy running
workloads. However, it’s not so great at self-care of its control plane
components like its API, controller manager, etc., or its core kubelet pro-
cesses. BOSH provides health and monitoring capabilities to the complete
Kubernetes control plane to keep not only app workloads healthy, but
also Kubernetes itself healthy and running.
Container Platforms and
Services
The objective of this chapter is to help you understand container plat-
forms from VMware, their underlying technology, and their business value
so that you can make an informed decision about the best platform for
your organization, its use cases, and its goals.
For example, a prescriptive platform includes its own scheduler for con-
tainers and specifies how to use it to run containerized applications. The
main advantage of a prescriptive platform is that it places the platform’s
complexity in a layer of abstraction—all developers have to do is write
their code and generate an application artifact, and the platform handles
the rest. The disadvantage is that you have fewer options and less flexibil-
ity in how you deliver and deploy your app. A prescriptive platform also
imposes methods of using containers on you; as a result, you might be
unable to manage containers with standard APIs, such as the Docker API
or the Kubernetes API.
FACTORS IN SELECTING A CONTAINER PLATFORM
The platform that you select will depend on your unique situation and goals.
Here are some factors to consider:
• Use cases
• Application types and their workloads
• Software development methods and processes
• Operations
• Security and compliance
• Networking
• People and their skill sets
• Maturity of your organization’s container adoption
• Maximizing flexibility vs. minimizing complexity
• Business objectives
Although containers themselves are not new, barriers have hindered their
use for building and deploying enterprise applications. Until fairly recently,
containers lacked the tooling and ecosystem for enterprise-grade deploy-
ment, management, operations, security, and scalability. In addition, the
requirements of IT administrators often went unfulfilled: Infrastructure for
running containers has neglected networking, storage, monitoring, log-
ging, backup, disaster recovery, maintenance, and high availability.
11. Despite High Expectations for Digital Transformation Led by Cloud, Analytics, Robotic Process Automation, Cognitive & Mobile, IT & Other Business Services Areas See Low Capability to Execute, The Hackett Group, March 16, 2017. A version of the research is available for download, following registration, at http://www.thehackettgroup.com/research/2017/social-media/key17it/.
vSphere Integrated Containers offers the quickest and easiest way for
vSphere users to start using containers today without additional capital
or labor investment. Its tight integration with the entire VMware SDDC
environment, as well as its support for leading container technologies like
Docker, makes it a great solution for a seamless transition to container
adoption. You can tap the benefits of containers for enhanced developer
productivity, business agility, and fast time-to-market.
Before diving into the details of the system architecture, here’s a brief
review of the system’s design objectives. vSphere Integrated Containers
addresses the following commonly occurring objectives:
12. Introduction to Container Security, Docker white paper, Docker.com.
• Enable a universal platform for transitioning to modern develop-
ment practices.
• Enable the infrastructure to support the coexistence of both
traditional and modern application designs on common, existing
hardware and software.
• Improve developer agility, shorten time to market, and maximize
application resiliency
• Developers need an environment where they can build, test, and
run their applications using native container tools with minimal
involvement from IT.
• Support a standard framework for orchestrating the deployment of
cloud native applications and automating management of applica-
tion availability in operation.
• Provide integration with the enterprise-grade capabilities of
VMware infrastructure.
• Provide security and availability of applications running in
production.
• Increase visibility into container deployments using standard
VMware tools for better operability.
• Streamline development team access to tools and infrastructure
resources.
• Eliminate extensive approval processes for acquisition and manual
provisioning of infrastructure, which frequently results in develop-
ers pursuing alternative paths of less resistance such as rogue IT or
public offerings.
Architecture
vSphere Integrated Containers is a product designed to tightly integrate
container workflow, lifecycle and provisioning with the vSphere SDDC.
It provides a container management portal, an enterprise-class registry,
and a container runtime for vSphere fully integrated into a commercial
distribution.
Components
open source project by adding the functionalities that enterprises
require, such as security, auditing and identity management.
3. Admiral is a container management portal. It provides a GUI for
DevOps teams to provision and manage containers, and includes
the ability to obtain statistics and information about container
instances. It provides both Docker Compose support and a proprietary
templated application definition to combine different containers
into an application. It also supports scaling containers
in and out. Advanced capabilities, such as approval workflows, are
available when integrated with vRealize Automation.
4. Photon OS is a minimal Linux container host, optimized to run on
VMware platforms. It is used throughout vSphere Integrated Con-
tainers wherever a Linux guest kernel is required.
The core SDDC infrastructure subsystems, vSphere, NSX, and vSAN com-
plement vSphere Integrated Containers by extending trusted capabilities
such as:
• Distributed Resource Scheduler (DRS)
• vMotion
• High Availability (HA)
• Secure isolation, micro-segmentation, and RBAC
• SSO via PSC with extension to external identity sources such as
Active Directory/LDAP
• Granular monitoring and logging visibility via vCenter, vRealize
Operations, and VRNI
• vSAN, iSCSI, NFS shared storage
• Direct deployment to Distributed vSwitch and NSX Logical
Switches, and integration with NSX virtual network infrastructure
components
• Unified, full-stack monitoring and logging visibility
Deployment Options
Container Runtime
The vSphere Integrated Containers Engine is a container runtime for
vSphere. It enables the provisioning and management of VMs into
vSphere clusters using the Docker binary image format. It enables
vSphere admins to pre-allocate certain amounts of compute, network-
ing and storage and provide that to developers as a self-service portal
exposing a familiar Docker-compatible API. It allows developers who are
familiar with Docker to develop in containers and deploy them alongside
traditional VM-based workloads on vSphere clusters. VMs provisioned
using vSphere Integrated Containers take advantage of many of the ben-
efits of vSphere including DRS, clustering, vMotion, HA, distributed port
groups and shared storage.
The VMs created by vSphere Integrated Containers engine have all of the
characteristics of software containers:
• Ephemeral storage layer with optionally attached persistent “vol-
umes”
• Custom Linux guest designed to be “just a kernel” needs “images”
to be functional
• Automatically configured to various network topologies
If you consider a Venn diagram of “What vSphere Does” in one circle
and “What Docker Does” in another, the intersection is not insignificant.
vSphere Integrated Containers takes as much of vSphere as possible and
layers on whatever Docker capabilities are missing. The following sections
discuss the key concepts and components that make this possible.
A single ESXi host can have multiple VCHs, each with different
resources and different users. Similarly, a single VCH can expose the
entire capacity of a vSphere cluster of ESXi hosts. It all depends on your
own use case and requirements.
The vic-machine utility is a binary built for Windows, Linux, and Mac OS X
that manages the lifecycle of VCHs. It is designed to be used by vSphere
admins. It takes pre-existing compute, network, storage, and a vSphere
user as input and creates a VCH as output (see the sketch after the
following list). It has the following additional functions:
• Creates certificates for Docker client TLS authentication
• Checks that prerequisites have been met on the cluster (firewall,
licenses, etc.)
• Assists in configuring ESXi host firewalls
• Configures a running VCH for debugging
• Lists, reconfigures, upgrades, downgrades, and deletes VCHs.
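A hedged sketch of creating a VCH with vic-machine; the target, network, and names are illustrative:

    vic-machine-linux create \
      --target vcenter.example.com \
      --user 'administrator@vsphere.local' \
      --compute-resource cluster1 \
      --image-store datastore1 \
      --bridge-network vch1-bridge \
      --name vch1 \
      --no-tlsverify
    # The command prints a Docker endpoint; developers use it with a standard client:
    docker -H vch1.example.com:2376 --tls info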
When the VM powers on, it boots from the ISO, chroots into the container
filesystem on the attached disk, sets up any internal state such as envi-
ronment variables and then starts the container process.
You can look at the Docker Container Host as a container VM that deliv-
ers a particular use case. Instead of instantiating, as a container VM, a
Docker image that represents an application, you are instantiating, as a
container VM, a Docker image that represents a Docker host.
are able to provide local and LDAP-based authentication and authori-
zation to their teams and project-level content trust and notary services
for container images in their private registries. Manual and automated
container image vulnerability scanning is also included to avoid running
images with known vulnerabilities in your data centers.
Control Plane
The VCH VM acts as a proxy between the Docker client and the vSphere
SDK and all of the control plane operations of a VCH are initiated by the
vSphere user associated with it. As previously mentioned, the control
plane is extended into containerVMs via the Tether process. The majority
of control plane operations are VM creation, reconfiguration and deletion.
Compute
Networking
to allow vSphere networks to be directly exposed to the Docker client for
private container traffic. It is the use of distributed port groups that allows
for containerVMs to be provisioned across multiple hosts and vMotioned.
Networks created via the Docker client currently use IPAM segregation
rather than full micro-segmentation.
Storage
When images are pulled from a Docker registry, they are extracted onto
VMDK snapshots and indexed on a local datastore. Multiple contain-
erVMs can share the same base images because they are immutable and
mounted read-only.
Benefits of the Container VM Model
A container VM is strongly isolated by design and benefits from vSphere
enterprise features such as High Availability and vMotion. It is ideally
suited to long-running containers or services with the following require-
ments:
• Strong isolation - a container VM has its own kernel and has no
access to a shared filesystem or control plane
• High throughput - a container VM has its own guest buffer cache
and can connect directly to a virtual network
• High availability - a container VM can be configured so it can run
independent of the availability of the VCH and can benefit from
vSphere HA and vMotion
• Persistent data - a container VM can persist its data to a volume
disk that can be backed up completely independent of the VM
This means that it is not possible to deploy a container with access to the
control plane. It is also impossible to mount parts of the host’s filesystem
as shared read-write volumes into the container.
Modern developers need an environment where they can build and test
their apps using native container technology with minimal involvement
from IT. Today, they use their laptops or a VM with a Docker engine in it
as the main tools to build containerized applications. However, trying to
build an application that goes beyond a simple demo on a laptop or desk-
top can hit performance and memory constraints. And having developers
request a VM with the Docker engine in it from IT is time consuming
because all the burden of configuration management and network
configuration is left to the IT team.
For example, developers can use DCH to integrate VIC into their CI/CD
pipeline and use products like Jenkins to build applications on DCH and
then push them to production using VCH. This allows build and test jobs
to use vSphere infrastructure as completely ephemeral compute.
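For example, a build job might provision a disposable Docker host through the VCH and use it for an isolated image build. The dch-photon image name, endpoint, and port mapping below are assumptions for illustration; check the VIC documentation for your version:

    # Provision a Docker engine as a containerVM through the VCH endpoint
    docker -H vch1.example.com:2376 --tls run -d --name build-host \
      -p 12375:2375 vmware/dch-photon
    # Point the native Docker client at the freshly created engine and build
    docker -H vch1.example.com:12375 build -t registry.example.com/myapp:ci .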
DCH gives developers the Docker tools they need to build modern
applications or repackage existing ones, and it gives IT teams governance
and control over the infrastructure. vSphere administrators provision compute,
networking, and storage resources and provide them to developers as a
self-service portal that exposes the familiar Docker compatible API.
Developers and IT teams need not worry about patching, security, or
isolation of the Docker hosts. Those functions are completely automated by
the way DCHs are deployed as part of VIC.
Summary
VMware vSphere Integrated Containers is a comprehensive container
solution built on the industry-leading virtualization platform, VMware
vSphere. It enables customers to run both modern and traditional work-
loads in production on their existing SDDC infrastructure today with
enterprise-grade networking, storage, security, performance and visibil-
ity. It offers the quickest and easiest way for vSphere customers to start
using containers today without additional capital or labor investment.
VMware Pivotal Container Service
VMware Pivotal Container Service (PKS) provides a production-grade
Kubernetes-based container solution equipped with advanced net-
working, a private container registry, and full lifecycle management. The
solution radically simplifies the deployment and operation of Kuberne-
tes clusters so you can run and manage containers at scale on VMware
vSphere or in public clouds.
PKS exposes Kubernetes in its native form without adding any layers
of abstraction or proprietary extensions, which lets developers use
the native Kubernetes CLI that they are most familiar with. PKS can be
deployed and operationalized by using Pivotal Operations Manager, which
allows a common operating model to deploy PKS across multiple IaaS
abstractions like vSphere and Google Cloud Platform.
Architecture
PKS builds on Kubernetes, BOSH, VMware NSX-T, and Project Harbor to
form a highly available, production-grade container service. With built-in
intelligence and integration, PKS ties all these open source and commer-
cial modules together, delivering a simple-to-use solution with an efficient
Kubernetes deployment and management experience.
PKS Control Plane
A key component of PKS, the control plane is the self-service interface
responsible for the on-demand deployment and lifecycle management of
Kubernetes clusters. It provides an API interface for self-service consump-
tion of Kubernetes clusters. The API submits requests to BOSH, which
automates the creation, update, and deletion of Kubernetes clusters.
The command-line interface and API of BOSH support multiple use cases
through the lifecycle of Kubernetes. You can deploy multiple Kubernetes
clusters in minutes. Scaling Kubernetes clusters can also be done with
CLI or API calls. Patching and updating one or more Kubernetes clusters
are also made easier by PKS through the same mechanism, making sure
your clusters always keep pace with the latest security and maintenance
updates. If the clusters are no longer required, the user can quickly delete
them.
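A sketch of this lifecycle with the PKS command-line interface; the cluster name, hostname, plan, and node count are illustrative:

    pks create-cluster k8s-prod --external-hostname k8s-prod.example.com --plan small
    pks resize k8s-prod --num-nodes 5   # scale worker nodes on demand
    pks delete-cluster k8s-prod         # remove the cluster when no longer required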
Another powerful result of NSX integration with PKS is an assortment of
operational tools and troubleshooting utilities:
• Traceflow
• Port mirroring
• Port connection tool
• Spoofguard
• Syslog
• Port counters
• IPFIX
Such tools are the mainstay of a modern, virtualized network. And now
they have been ported to container networking on Kubernetes. Such
tools fulfill the requirements of production-level networking for con-
tainerized applications so you can, for example, debug communication
between pods and the microservices components of your containerized
applications.
A Boon to Operations
But there’s more. Because NSX provides secure networking for microser-
vices-based applications running on Kubernetes, developers can rapidly,
frequently, and confidently deploy software without having to write code
to guard against traditional infrastructure issues.
There are other results as well, all critical outcomes associated with
moving in the direction of cloud-native applications:
• Being able to modernize legacy applications more quickly and
efficiently.
With the private registry, users can scan container images for vulnera-
bilities to mitigate the risk of security breaches related to contaminated
container images.
Persistent Storage
PKS allows customers to deploy Kubernetes clusters for both stateless
and stateful applications. It supports the vSphere Cloud Provider storage
plugin, which is part of Kubernetes through Project Hatchway. This
allows PKS to support Kubernetes storage primitives such as Volumes,
Persistent Volumes (PV), Persistent Volume Claims (PVC), Storage Classes,
and Stateful Sets on vSphere storage, and it also brings enterprise-grade
storage features such as Storage Policy Based Management (SPBM) with
vSAN to Kubernetes-based applications.
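As a rough sketch of how these primitives fit together on vSphere, a
cluster administrator could define a storage class backed by the vSphere
Cloud Provider and a developer could then claim storage from it. The class
and claim names, disk format, and size below are illustrative:

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: thin-disk
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: thin
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: thin-disk
  resources:
    requests:
      storage: 2Gi
EOF

A pod that mounts the app-data claim then receives a VMDK-backed volume
whose placement can be governed by vSphere storage policies.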
Managing Operations by Integrating with
Other VMware Solutions
PKS can be integrated with other VMware products to offer a full-stack
Kubernetes service. Here are some of the VMware products with which
PKS can integrate:
VMware vRealize Log Insight™: Log Insight delivers highly scalable log
management with actionable dashboards, analytics, and broad third-
party extensibility, giving you deep operational visibility and faster
troubleshooting.
High Availability
PKS provides critical production-grade capabilities to ensure maximum
uptime for workloads running in your Kubernetes clusters. It continuously
monitors the health of all underlying VM instances, and recreates VMs
when there are failed or unresponsive nodes. It also manages the rolling
upgrade process for a fleet of Kubernetes clusters, allowing clusters to be
upgraded with no downtime for application workloads.
Multi-Cloud
PKS supports multi-cloud deployment through BOSH. With PKS, you
can deploy containerized applications with Kubernetes on-premises on
vSphere, or on public clouds such as Google Cloud Platform.
Secure Container Registry: Minimizes application breaches with enhanced
container security. Simplifies container image management and enhances
security through image replication, RBAC, AD/LDAP integration, notary
services, vulnerability scanning, and auditing.
Constant Compatibility with GKE: Enhances developer productivity by
letting developers access the most up-to-date Kubernetes features and
tools.
Supporting Microservices
Developers often turn to container technology to support microservices.
A microservices architecture breaks up the functions of an application
into a set of small, discrete, decentralized, goal-oriented processes, each
of which can be independently developed, tested, deployed, replaced,
and scaled.
Providing a Developer Sandbox with vSphere
Integrated Containers
vSphere Integrated Containers creates an enterprise container infrastruc-
ture within vSphere, enabling both traditional and containerized apps
to run side by side on a common infrastructure. Developers can initiate
Docker container hosts within a resource pool so they can spin containers
up and down on demand without having to file a ticket with IT.
Self-Service Provisioning
Developers can self-provision Docker container hosts. Although this tick-
etless environment gives developers the Docker tools they need to build
modern applications or repackage existing ones in containers, IT retains
governance and control over the infrastructure because vSphere Inte-
grated Containers leaves the management of the hosts to the vSphere
administrator.
Developers and DevOps need not worry about patching, security, isola-
tion, tenancy, availability, clustering, or capacity planning. Those functions
continue to be business as usual for the vSphere administrator. Instead,
developers and DevOps receive a container endpoint as a service. The
outcome is a win-win situation for both developers and administrators:
The vSphere administrator gets visibility into and control over the virtual
machines, while developers and DevOps can self-provision Docker con-
tainer hosts and work with them by using a Docker client.
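In practice, that endpoint behaves like a remote Docker host. A minimal
sketch, assuming a VCH endpoint at 192.168.100.144 with TLS enabled but
client verification left out for brevity:

docker -H 192.168.100.144:2376 --tls info
docker -H 192.168.100.144:2376 --tls run -d -p 80:80 nginx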
deploys infrastructure, services, data, and applications on demand. For a
traditional application, however, a common first step toward moderniza-
tion is repackaging part or all of it in a container.
The developer can then push the container image to the vSphere Inte-
grated Containers registry, tag it, and run it in the virtual container host.
At the same time, an administrator of VMware vCenter® can see the
container VM in the vSphere inventory. The developer or the administrator
can use the monitoring page in the vSphere Integrated Containers portal
to view statistics and logs. This unification is made possible in part by the
vSphere Integrated Containers management portal, which is integrated
with identity management to securely provision containers. The result
enables application development teams to repackage, test, and deploy
applications quickly and efficiently.
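A sketch of that workflow, assuming a registry at registry.corp.local, a
project named demo, and a placeholder for the virtual container host
endpoint (all of these names are illustrative):

docker login registry.corp.local
docker tag myapp:1.0 registry.corp.local/demo/myapp:1.0
docker push registry.corp.local/demo/myapp:1.0
docker -H <vch-endpoint>:2376 --tls run -d registry.corp.local/demo/myapp:1.0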
Benefits of Replatforming
Replatforming an application propels you toward several objectives
associated with accelerating application development and deployment
without having to deal with the complexity of re-architecting or refactor-
ing an application:
• Workload consolidation, especially if you are increasingly moving in
the direction of developing cloud-native applications.
• Simplified and improved integration with a continuous integration
and continuous deployment pipeline (CI/CD).
• Operational efficiency for managing the application with automation,
security, monitoring, logging, analytics, and lifecycle management.
Section Summary
VMware Pivotal Container Service delivers a highly available, production-grade
Kubernetes-based container service equipped with container networking,
security, and lifecycle management. Deployable both on-premises in vSphere
and in public clouds like Google Cloud Platform, VMware PKS is well suited
to replatforming applications that will benefit from containerization and
orchestration.
Here are some of the ideal use cases and workloads for PKS:
• Running modern data services such as Elasticsearch, Cassandra,
and Spark.
• Running ISV applications packaged in containers.
• Running microservices-based apps that require a custom stack.
Prerequisites
Using Photon OS within AWS EC2 requires the following resources:
• AWS account. Working with EC2 requires an Amazon account for
AWS with valid payment information. Keep in mind that, if you try
the examples in this document, you will be charged by Amazon.
See Setting Up with Amazon EC2.
• Amazon tools. The following examples also assume that you have
installed and configured the Amazon AWS CLI and the EC2 CLI and
AMI tools, including ec2-ami-tools.
See Installing the AWS Command Line Interface, Setting Up the Amazon
EC2 Command Line Interface Tools on Linux, and Configuring AWS Com-
mand-Line Interface. Also see Setting Up the AMI Tools. This article uses
an Ubuntu 14.04 workstation to generate the keys and certificates that
AWS requires.
Downloading the Photon OS Image for
Amazon
VMware packages Photon OS as a cloud-ready Amazon machine image
(AMI) that you can download for free from Bintray.
Download the Photon OS AMI now and save it on your workstation. For
instructions, see Downloading Photon OS.
Note: The AMI version of Photon is a virtual appliance with the informa-
tion and packages that Amazon needs to launch an instance of Photon
in the cloud. To build the AMI version, VMware starts with the minimal
version of Photon OS and adds the sudo and tar packages to it.
The examples in this article show how to generate SSH and RSA keys for
your Photon instance, upload the Photon OS .ami image to the Amazon
cloud, and configure it with cloud-init. In many of the examples, you must
replace information with your own paths, account details, or other infor-
mation from Amazon.
The first step is to generate SSH keys on, for instance, an Ubuntu
workstation:
ssh-keygen -f ~/.ssh/mykeypair
The command generates a public key in the file with a .pub extension
and a private key in a file with no extension. Keep the private key file and
remember the name of your key pair; the name is the file name of the two
files without an extension. You’ll need the name later to connect to the
Photon instance.
Change the mode bits of the public key pair file to protect its security. In
the command, include the path to the file if you need to.
chmod 600 mykeypair.pub
To import your public key pair file (but not your private key pair file),
connect to the EC2 console at https://console.aws.amazon.com/ec2/ and
select the region for the key pair. A key pair works in only one region, and
the instance of Photon that will be uploaded later must be in the same
region as the key pair. Select key pairs under Network & Security, and
then import the public key pair file that you generated earlier.
For more information, see Importing Your Own Key Pair to Amazon EC2.
When you bundle up an image for EC2, Amazon requires an RSA user
signing certificate. You create the certificate by using openssl to first
generate a private RSA key and then to generate the RSA certificate that
references the private RSA key. Amazon uses the pairing of the private
key and the user signing certificate for handshake verification.
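A sketch of those two openssl commands, using the file names that appear
in the later steps and following the approach in Amazon's AMI tools
documentation:

openssl genrsa 2048 > myprivatersakey.pem
openssl req -new -x509 -nodes -sha256 -days 365 -key myprivatersakey.pem -outform PEM -out certificate.pem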
Remember where you store your private key locally; you’ll need it again
later.
For more information, see the Create a Private Key and the Create the
User Signing Certificate sections of Setting Up the AMI Tools.
Third, upload to AWS the certificate value from the certificate.pem file
that you created in the previous command. Go to the Identity and Access
Management console at https://console.aws.amazon.com/iam/, navigate
to the name of your user, open the Security Credentials section, click
Manage Signing Certificates, and then click Upload Signing Certificate.
Open certificate.pem in a text editor, copy and paste the contents of the
file into the Certificate Body field, and then click Upload Signing
Certificate.
For more information, see the Upload the User Signing Certificate section
of Setting Up the AMI Tools.
The next prerequisite is to create a security group and set it to allow SSH,
HTTP, and HTTPS connections over ports 22, 80, and 443, respectively.
Connect to the EC2 command-line interface and run the following com-
mands:
aws ec2 create-security-group --group-name photon-sg --description "My Photon security group"
{
    "GroupId": "sg-d027efb4"
}
aws ec2 authorize-security-group-ingress --group-name photon-sg --protocol tcp --port 22 --cidr 0.0.0.0/0
By using 0.0.0.0/0 for SSH ingress on Port 22, you are opening the port
to all IP addresses–which is not a security best practice but a conve-
nience for the examples in this article. For a production instance or other
instances that are anything more than temporary machines, you should
authorize only a specific IP address or range of addresses. See Amazon’s
document on Authorizing Inbound Traffic for Linux Instances.
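To open the HTTP and HTTPS ports mentioned earlier, repeat the ingress
command for ports 80 and 443:

aws ec2 authorize-security-group-ingress --group-name photon-sg --protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name photon-sg --protocol tcp --port 443 --cidr 0.0.0.0/0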
Next, make a directory to store the image, and then extract the Photon
OS image from its archive by running the following tar command. (You
might have to change the file name to match the version you have.)
mkdir bundled
tar -zxvf ./photon-ami.tar.gz
The command uses the certificate path to your PEM-encoded RSA public
key certificate file; the path to your PEM-encoded RSA private key file;
your EC2 user account ID; the correct architecture for Photon OS; the
path to the Photon OS AMI image extracted from its tar file; and the bun-
dled directory from the previous step.
You must replace the values of the certificate path, the private key, and
the user account with your own values.
$ ec2-bundle-image --cert certificate.pem --privatekey myprivatersakey.pem --user <EC2 account id> --arch x86_64 --image photon-ami.raw --destination ./bundled/
Now upload the bundle to the Amazon S3 cloud. The following command
includes the path to the XML file containing the manifest for the Photon
OS machine created during the previous step, though you might have to
change the file name to match the version you have. The manifest file is
typically located in the same directory as the bundle.
The command also includes the name of the Amazon S3 bucket in which
the bundle is to be stored; your AWS access key ID; and your AWS secret
access key.
$ ec2-upload-bundle --manifest ./bundled/photon-ami.manifest.xml --bucket <bucket-name> --access-key <Account Access Key> --secret-key <Account Secret key>
Step 7: Register the Image
The final step in creating an AMI before you can launch it is to register it.
The following command includes a name for the AMI, its architecture, and
its virtualization type. The virtualization type for Photon OS is hvm.
$ ec2-register <bucket-name>/photon-ami.manifest.xml --name photon-ami --architecture x86_64 --virtualization-type hvm
Once the image is registered, you can launch as many new instances as you want.
Now things get a little tricky. In the following command, the user-data-file
option instructs cloud-init to import the cloud-config data in user-data.
txt.
The command also includes the ID of the AMI, which you can obtain by
running ec2-describe-images; the instance type of m3.medium, which is a
general purpose instance type; and the name of the key pair, which should be
replaced with your own–otherwise, you won’t be able to connect to the
instance.
Before you run the command, change directories to the directory con-
taining the mykeypair file and add the path to the user-data.txt.
$ ec2-run-instances <ami-ID> --instance-type m3.medium -g photon-sg --key mykeypair --user-data-file user-data.txt
Here are the contents of the user-data.txt file that cloud-init applies to the
machine the first time it boots up in the cloud.
#cloud-config
hostname: photon-on-01
groups:
  - cloud-admins
  - cloud-users
users:
  - default
  - name: photonadmin
    gecos: photon test admin user
    primary-group: cloud-admins
    groups: cloud-users
    lock-passwd: false
    passwd: vmware
  - name: photonuser
    gecos: photon test user
    primary-group: cloud-users
    groups: users
Now run the following command to check on the state of the instance
that you launched:
$ ec2-describe-instances
Finally, you can obtain the external IP address of the instance by running
the following query:
$ aws ec2 describe-instances --instance-ids <instance-id> --query 'Reservations[*].Instances[*].PublicIpAddress' --output=text
If need be, check the cloud-init output log file on EC2 at /var/log/cloud-
init-output.log to see how EC2 handled the settings in the cloud-init data
file.
For more information on using cloud-init user data on EC2, see Running
Commands on Your Linux Instance at Launch.
Connect to the instance over SSH by specifying the private key (.pem) file
and the user name for the Photon machine, which is root:
ssh -i ~/.ssh/mykeypair root@<public-ip-address-of-instance>
On the minimal version of Photon OS, the docker engine is enabled and
running by default, which you can see by running the following command:
systemctl status docker
Step 3: Start the Web Server
Note: Please make sure that the proper security policies have been
enabled on the Amazon AWS side to enable traffic to port 80 on the VM.
To pull Nginx from Docker Hub and start it, run the following command:
docker run -p 80:80 vmwarecna/nginx
The Nginx web server should be bound to the public DNS value for the
instance of Photon OS–that is, the same address with which you con-
nected over SSH.
On your local workstation, open a web browser and go to the public
address of the Photon OS instance running Docker. The following screen
should appear, showing that the web server is active.
When you’re done, halt the Docker container by typing Ctrl+c in the SSH
console where you are connected to EC2.
You can now run other containerized applications from the Docker Hub or
your own containerized application on Photon OS in the Amazon cloud.
To try this addition, you’ll have to run another instance with this new
cloud-init data source and then get the instance’s public IP address to
check that the Nginx web server is running.
Integrating Lightwave with Photon OS
Lightwave provides security services to Photon OS. You can use Light-
wave to join a Photon OS virtual machine to the Lightwave directory
service and then authenticate users with Kerberos.
(In the names of the packages, “afd” stands for authentication framework
daemon; “ic” stands for infrastructure controller, which is Lightwave’s
internal name for its domain controller. Several of the packages, such as
Jansson and Tomcat, are used by Lightwave for Java services or other
tooling.)
After deploying and promoting the Lightwave domain controllers, you can see them in the
GCE web interface.
A Virtual Container Host (VCH) not only allows you to easily segregate management
traffic from data traffic, but also Docker client traffic from intra-container traffic.
Moreover, since containers in VIC are deployed as virtual machines (VMs), vSphere
administrators can make vSphere networks directly available to containers.
The VIC network overview lightboard details the networking concepts for
vSphere Integrated Containers, while the recently updated documenta-
tion comes in handy to further explain these options:
• Client Network: The Client Network is used by a VCH to expose the
Docker API service; it is where developers must point their Docker
clients to manage and run containers.
• Public Network: The Public Network is used by a VCH to pull
images from registries. The most common use case is to pull
images from the public Docker hub. You can also create your own
private, secure local registry by using the VIC Registry (based on
Project Harbor).
• Management Network: The Management Network is used by a VCH
to securely communicate with vCenter and ESXi hosts.
• Bridge Network: The Bridge Network is a private network for con-
tainer communication. External access is granted by exposing ports
to containers and routing the traffic through the VCH endpoint VM.
With no extra configuration, VIC provides service discovery while
a built-in IPAM server provides the containerVMs with private IP
addresses from the subnet of the bridge network.
• Container Network: A Container Network is a user-defined network
that can be used to connect containerVMs directly to a routable
network. Container networks allow vSphere administrators to make
vSphere networks directly available to containers. Container net-
works are specific to VIC and have no equivalent in Docker.
For developers, one of the standout features is the ability for VIC to
expose containers directly on a network through the use of the container
network option: vic-machine create --container-network. You can connect
the containerVMs to any specific distributed port group or NSX logical
switch, giving them their dedicated connection to the network.
• No network bandwidth sharing: Every container gets its own net-
work interface and all the bandwidth it can provide is available to
the application. Traffic does not route through the VCH endpoint
VM via network address translation (NAT), and containers do not
share the public IP of the VCH.
• No NAT conflicts: There’s no need for port mapping anymore.
Every container gets its own IP address. The container services are
directly exposed on the network without NAT, so applications that
once could not run on containers can now run by using VIC.
• No Port conflicts: Since every container gets its own IP, you can
have multiple application containers that require an exclusive port
running on the same VCH. This provides better utilization of your
resources.
All of this is possible through the use of the Container Network option.
The container firewall trust level is managed when you create a VCH:
vic-machine create --container-network-firewall "PortGroup":[closed | open | outbound | published | peers]
In VIC version 1.2, the default trust level is set to Published. This means
that you now have to explicitly identify which ports will be exposed with
the -p option; example:
docker run -d -p 80 --network=external nginx
Specifying the exposed port improves security and gives you more
awareness of your environment and applications.
Now, if you still want to use the -P option (e.g. docker run -d -P nginx),
you need to change the container network firewall trust level to Open:
vic-machine create --container-network "PortGroup" --container-network-firewall "PortGroup":open
You can configure VCHs where no network traffic can come out of them,
no matter what the developers try to do:
vic-machine create --container-network "PortGroup" --container-network-firewall "PortGroup":closed
Or, you can configure VCHs where all traffic is permitted and you let the
developer decide at the application level which ports are exposed and
which are not:
vic-machine create --container-network "PortGroup" --container-network-firewall "PortGroup":open
Or, you can configure VCHs where only outbound connections are per-
mitted. This works well if you plan to host applications that consume but
do not provide services:
vic-machine create --container-network "PortGroup" --container-network-firewall "PortGroup":outbound
You can configure VCHs where only connections to published ports are
permitted, letting the developers or DevOps control which ports are open
for applications where you can’t change the Dockerfile. Think of all the
new COTS applications delivered as Docker images:
vic-machine create --container-network "PortGroup" --container-network-firewall "PortGroup":published
You can also configure VCHs where the containers can only communicate
with each other. This is ideal for a set of microservices that need to talk
with each other, but not with the external world. For example, a set of
Spark jobs that compute some data and save the result to disk:
vic-machine create --container-network "PortGroup" --container-network-firewall "PortGroup":peers
You should now have a better understanding of the benefits that the
different networking options of VMware vSphere Integrated Contain-
ers, together with the Container Network Firewall feature, provide over
traditional container host implementations, and how they make deploy-
ing containers on VIC even more secure. You should also know how to
segregate different types of network traffic, make containers routable by
exposing them directly on a network, and secure network connections by
using the five distinct trust levels of the container network firewall.
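Judging from the build output that follows, the Dockerfile contains
approximately the following; this is a reconstruction from the build steps,
not the original file:

# Dockerfile.example-1 (reconstructed from the build output below)
FROM alpine:3.6
RUN echo -e "#!/bin/sh\ndate\nsleep 2d\ndate" >/bin/our-application
RUN chmod 755 /bin/our-application
CMD /bin/our-application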
Building an image using this Dockerfile results in an image with several
layers:
$ docker build -f Dockerfile.example-1 -t demo:0.1 .
Sending build context to Docker daemon 7.168kB
Step 1/4 : FROM alpine:3.6
 ---> 76da55c8019d
Step 2/4 : RUN echo -e "#!/bin/sh\ndate\nsleep 2d\ndate" >/bin/our-application
 ---> Running in dfce6e80a2fb
 ---> 9295df9995e6
Removing intermediate container dfce6e80a2fb
Step 3/4 : RUN chmod 755 /bin/our-application
 ---> Running in cdc0e6d7ba27
 ---> 1d5559a943d4
Removing intermediate container cdc0e6d7ba27
Step 4/4 : CMD /bin/our-application
 ---> Running in d44e2734bef0
 ---> 31af83e49686
Removing intermediate container d44e2734bef0
Successfully built 31af83e49686
Successfully tagged demo:0.1
The alpine layer is there at the bottom, and our additional commands
have generated a few more layers that get stacked on top to be the
image we want. The final image that has all the files we need is referenced
by the ID 31af83e49686 or by the tag demo:0.1. Each of those layers
should be stored in a registry, and can be reused by future images.
As long as the container runs, the file demo-state will exist and have the
same contents. Stopping and starting the container has no effect on the
container layer, so demo-state will still exist.
If we stop and remove the container, the container layer will be removed
as well as our hold on the demo-state file. Running a new instance of the
container will have a new empty container layer.
There is a distinction here that you should note: The container layer is
ephemeral storage. It’s around for as long as the container, and is lost
when the container goes away. This is in contrast to requirements for data
that needs to remain after the container is removed.
Let's flip it around: if container images were able to change, how could
you be sure that running a specific image today and running it tomorrow
would have the same results? How could you debug an image on your
laptop and be sure you are seeing the same code that is having a problem
in QA? If an application has persisted state in its local image, how do
other instances of the application container get access to that data?
Replication
An example of this pattern is running a Cassandra database cluster, where
replication enables the dynamic addition or removal of nodes. If you’re
running Cassandra in containers and being good about bootstrapping
and removing nodes, then you could run a stable database cluster with
normal basic docker run. The persistence is handled by storing data in
the container layer. As long as enough containers are up, persistence is
maintained.
If you can design your application to be able to recreate any needed data,
you’re using this pattern.
A more efficient variant of this process would store each prime number
found and the last number tested in Apache Kafka. Given a consistent
initial range and the transaction log, you can quickly get back to a known
state without retesting each number for primality, and continue processing
from there.
We can make a persistent file system that lives on the Docker host
available inside the container. This is a pattern most of us are familiar with,
as it has been the way to handle data persistence since tape drives were
invented.
Bind Mount
This is simply mounting a host file system file or directory into the con-
tainer. This is not very different from mounting a CD-ROM onto a virtual
machine (VM). The host path may look like /srv/dir-to-mount, and inside
the container you may be able to access the directory at /mnt/dir-to-
mount.
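A minimal sketch using the paths from the paragraph above; the image and
command are illustrative:

docker run --rm -v /srv/dir-to-mount:/mnt/dir-to-mount alpine:3.6 ls /mnt/dir-to-mount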
Volumes
This is slightly different from a simple bind mount. Here, Docker creates
a directory that is the volume, and mounts it just like a bind mount. In
contrast to bind mounts, Docker manages the lifecycle of this volume.
By doing so, it provides the ability to use storage drivers that enable the
backing storage to exist outside of the host running the container.
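A minimal native Docker sketch, with the volume name and mount point as
illustrative placeholders:

docker volume create app-data
docker run -d --name app -v app-data:/var/lib/data alpine:3.6 sleep 2d

Because Docker owns the volume's lifecycle, the volume survives the
container and can be inspected with docker volume inspect app-data.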
In vSphere Integrated Containers, if you want to use volumes that are pri-
vate to the container, you can use the iSCSI or vSAN storage in vSphere.
If you have data that should be shared into more than one container, you
can use an NFS backed datastore from vSphere.
Volume stores can be specified when the container host is created, or
updated later by using vic-machine configure. Volume stores that have been
added can only be removed by removing the container host, but that usually
isn't a problem.
Here is an example showing the command that would create the con-
tainer host and enable it to present volumes with various backing stores:
vic-machine ... \
    --volume-store vsanDatastore/volumes/my-vch-data:backed-up-encrypted \
    --volume-store iSCSI-nvme/volumes/my-vch-logs:default \
    --volume-store vsphere-nfs-datastore/volumes/my-vch-library:nfs-datastore \
    --volume-store 'nfs://10.118.68.164/mnt/nfs-vol?uid=0\&gid=0:nfs-direct'
The first volume store is on a vSAN datastore and uses the label
backed-up-encrypted so that a client can type docker volume create --opt
VolumeStore=backed-up-encrypted myData to create a volume in that
store. The second uses cheaper storage backed by a FreeNAS server
mounted using iSCSI, and is used for storing log data. Note that it has the
label "default," which means that any volume created without a volume
store specified is created here. The third and fourth are for two types of
NFS exports: the first is an NFS datastore presented by vSphere, and
the other is a standard NFS host mounted directly (useful if you want to
share data between containers).
Note regarding an NFS gotcha: NFS mounts in containers can be tricky. If you
notice that you cannot read or write files to an NFS share in a container,
then you have probably hit this gotcha. Note that the final volume store
above has uid and gid arguments. There are two competing concerns. First,
Docker will generally run as uid and gid 0, that is, as root. You can change
that behavior by specifying a USER in the Dockerfile or on the command line;
see the Docker documentation on USER for details on how to set it. Second,
NFS applies permissions to the mounted file system based on uid and gid in
a variety of ways. You must ensure that the user of the running container
matches the uid and gid permissions on the files exported by NFS. Finally,
note that the syntax for native Docker NFS volumes and VIC NFS volumes
is different, so if you are trying to apply this to native Docker, you'll
want to start with Docker's own documentation.
Once you’ve installed the VCH, you’ll notice that there are now empty
folders created on the respective datastores ready for volume data:
vsanDatastore/volumes/my-vch-data/volumes
iSCSI-nvme/volumes/my-vch-logs/volumes
vsphere-nfs-datastore/volumes/my-vch-library/volumes
nfs://10.118.68.164/mnt/nfs-vol/volumes
Let’s go ahead and create volumes using the Docker client. Note the
implied use of the default volume store in the second example.
$ docker volume create --opt VolumeStore=backed-up-encrypted --opt Capacity=1G demo_data
$ docker volume create --opt Capacity=5G demo_logs
$ docker volume create --opt VolumeStore=nfs-datastore demo_nfs_datastore
$ docker volume create --opt VolumeStore=nfs-direct demo_nfs_direct
After volume creation, you’ll see the following files were
created in the backing datastores:
vsanDatastore/volumes/my-vch-data/volumes/demo_data/demo_data.vmdk
vsanDatastore/volumes/my-vch-data/volumes/demo_data/ImageMetadata/DockerMetaData
iSCSI-nvme/volumes/my-vch-logs/volumes/demo_logs/demo_logs.vmdk
iSCSI-nvme/volumes/my-vch-logs/volumes/demo_logs/ImageMetadata/DockerMetaData
vsphere-nfs-datastore/volumes/my-vch-library/volumes/demo_nfs_datastore/demo_nfs_datastore.vmdk
vsphere-nfs-datastore/volumes/my-vch-library/volumes/demo_nfs_datastore/ImageMetadata/DockerMetaData
nfs://10.118.68.164/mnt/nfs-vol/volumes/demo_nfs_direct
nfs://10.118.68.164/mnt/nfs-vol/volumes_metadata/demo_nfs_direct/DockerMetaData
# cat /data/some-data /logs/some-logs /library/some-lib /shared/some-shared
# exit
Right now, only native NFS volumes are allowed to share data between
more than one container. Here is an example of sharing some storage
between containers using native NFS.
Open two terminals. In the first run this command to start nginx:
$ docker run --name nginx -v demo_nfs_direct:/usr/share/nginx/html:ro -p 80:80 -d nginx
In the same terminal, query the nginx server you just created. Replace
192.168.100.159 with the IP address of the container running nginx:
$ curl 192.168.100.159
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...
</html>
As a final note, if you have a stateful process that can handle restart,
VMware HA will enable restarting the container on a new ESXi host if
the original ESXi host fails. If your process can’t implement a replay or
replication pattern to recover state on failure, then VMware Fault Toler-
ance enables transparent continuation of processing during an ESXi host
failure. In this case the container VM continues running on the new ESXi
host as though there were no failure of the original host. We’ll see if we
can make a blog entry demonstrating the Fault Tolerance feature.
And after causing an ESXi host failure, the container is moved to and
started on a different ESXi host:
Figure 20: The container is moved to and started on a new ESXi host.
So, there you have it: vSphere Integrated Containers can provide resilient
storage and cope with host failures. It’s not mandatory during develop-
ment, but definitely a boon in the production landscape.
The DCH image is distributed through Docker Hub and, as part of the VIC
product distribution, in the registry. All the official DCH images main-
tained by VMware are based on Project Photon OS, an open source Linux
operating system optimized for hosting containers and running cloud-na-
tive applications. The source, Dockerfiles and documentation are available
at github.com/vmware/vic-product.
DCH is well-suited for development use cases. Here are some examples:
• As part of a CI/CD pipeline, VIC can be used to enhance end-to-
end dev-build-push-deploy workflows. VIC with DCH can be used
as a (self-service) private cloud for CI/CD by enabling the easy
deployment and tear down of Docker hosts.
• VIC and DCH allow you to treat Docker Hosts as ephemeral com-
pute. This has the benefit of eliminating snowflakes (individually
managed Docker Hosts), which reduces Operating System OpEx
costs. For example, as part of a CI pipeline, you could instantiate
ephemeral Docker Hosts that exist only for the purpose of building
and pushing images, and only for the time it takes to complete that
task.
• An example of how VIC can be used to deploy Jenkins is given
here.
This section demonstrates how flexible this DCH abstraction is. The sec-
tion walks you through how a developer can leverage VIC 1.2 and DCH to
easily instantiate a Docker swarm using only well-known native Docker
commands. The section also shows how easy it is to create a complete
cluster of ephemeral compute using DCH.
  for w in $(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' worker${i}); do
    docker -H $w:2375 swarm join --token ${SWARM_TOKEN} ${SWARM_MASTER}:2377
  done
done
Let’s break this down to better understand what this script does.
User-defined Variables
## USER-DEFINED VARIABLES
# Number of swarm workers desired
NUM_WORKERS=3
# name of routable (external) network
# this needs to be defined on your VCH using the '--container-network' option
# use 'docker network ls' to list available external networks
CONTAINER_NET=routable
# Docker Container Host (DCH) image to use
# see https://hub.docker.com/r/vmware/dch-photon/tags/ for list of available Docker Engine versions
DCH_IMAGE="vmware/dch-photon:17.06"
As a user of the script, this is the only section you need to modify.
NUM_WORKERS – this is the number of worker nodes that will be added
to the swarm, in addition to the manager node.
CONTAINER_NET – this is the network to be used by our Docker
Container Hosts. Here we leverage the ability of vSphere Integrated Con-
tainers to connect containers directly to vSphere Port Groups rather than
through the Container Host. This will allow for easier interaction with our
swarm.
DCH_IMAGE – here you can specify a different version of the Docker
engine by modifying the tag (e.g. ‘vmware/dch-photon:1.13’). You can see
the list of available tags/versions here.
This is where we begin to see the DCH magic in action. Specifically, look
at the following command:
docker run -d -v registrycache:/var/lib/docker \
--net $CONTAINER_NET \
--name manager1 --hostname=manager1 \
$DCH_IMAGE
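The script's next step, which is elided above, initializes the swarm on the
new manager and captures the worker join token. A sketch of that step,
using the same variables that the worker loop below consumes:

SWARM_MASTER=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' manager1)
docker -H ${SWARM_MASTER}:2375 swarm init
SWARM_TOKEN=$(docker -H ${SWARM_MASTER}:2375 swarm join-token -q worker)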
  # create docker volumes for each worker to be used as image cache
  docker volume create --opt Capacity=10GB --name worker-vol${i}
  # run new worker container
  docker run -d -v worker-vol${i}:/var/lib/docker \
    --net $CONTAINER_NET \
    --name worker${i} --hostname=worker${i} \
    $DCH_IMAGE
  # wait for daemon to start
  sleep 10
done
This simple 'for' loop repeats NUM_WORKERS times. For each iteration, it:
• creates a volume to be used as the image cache for the worker
• instantiates a worker using the vmware/dch-photon:17.06 image
• joins the worker to the swarm using the join token (SWARM_TOKEN) we fetched in the previous step
Once again, because we are using DCH with VIC, we see that it requires
only a very simple docker run command to create and run our Docker
Hosts that will be used as the swarm worker nodes.
You can use the following commands (replace with your own endpoint IP
address and make sure the script is in the current directory):
export DOCKER_HOST=192.168.100.144:2375
./dch-swarm.sh
Upon completion, the script prints information about the newly created swarm.
We can also see our newly created containerVMs hosting the swarm
manager and the swarm workers in vSphere Client:
Figure 22: Container VMs holding the swarm manager and workers.
Finally, we can test the swarm by running the Docker Example Voting
App. We deploy the app against our swarm by using the docker stack
deploy command:
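A typical invocation, assuming the voting app's stack file has been
downloaded to the current directory; the file and stack names are
assumptions here:

docker stack deploy -c docker-stack.yml vote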
All of the required services should start. Testing the application in the
browser shows that it is indeed running and functional.
Conclusion
You should now have a better understanding of how the newly intro-
duced VIC DCH feature helps address developer use cases. We have
shown how easy it is to automate the deployment of Docker hosts using
DCH. VIC provides end-users and developers with a Docker dial tone and
a very flexible consumption model on top of vSphere, while DCH enables
a new level of self-service.
• The way that containers force developers to think about how state
is managed, in terms of persistence, scope, and integrity, is leading
to more flexible application architectures.
Few would argue with the contention that containers make software
provisioning easier. As such, with vSphere Integrated Containers, it's never
been easier to provision software to vSphere. This is just as true of Jenkins
as of any other application. However, with power comes responsibility, and
if we want to deploy Jenkins, there are a few critical factors we need to
consider:
• What are the software artifacts in the Jenkins image we want to
deploy?
• Do we trust the provenance of those artifacts?
• Do those software artifacts contain known vulnerabilities?
As a cloud admin, you can choose to build your own container images
using your own Dockerfiles, or you can start from a public image and
further modify it to your needs. As you’ll see from DockerHub, you can
choose from a Debian base or an Alpine base. The Alpine base is half
the download size and has less than half the packages, so that makes it
attractive, although there may be compliance considerations involved in
the decision.
At the time these commands were run, the vSphere Integrated Containers
registry identified that the zlib package contains two high-severity
vulnerabilities. (Your results might be different when you run the command.)
It provides a link to the CVE database, which describes the issue. It also
shows that there are updated versions of zlib in which the issue is fixed,
so we can use that information to create a Dockerfile that defines a new
image with updated packages.
cat Dockerfile
FROM jenkins/jenkins:lts-alpine
USER root
RUN apk update && apk upgrade
USER jenkins
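Building and tagging the patched image is then a single step; the tag is
illustrative:

docker build -t jenkins-patched:lts-alpine .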
As a cloud admin, you can choose to limit the vulnerability level images
can be deployed with. You can also restrict the images deployed to an
endpoint to only ones that have been signed by a service such as Notary.
Data Persistence
The Jenkins container is configured in such a way that all of the persistent
state is stored in one location: /var/jenkins_home. This means that you
can safely start, stop or even upgrade the Jenkins master container and
it will always come back up with all of the previous data, assuming you
specified a named volume.
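A sketch of such a deployment with a named volume; the volume name and
published ports are illustrative:

docker volume create jenkins_home
docker run -d -p 8080:8080 -p 50000:50000 -v jenkins_home:/var/jenkins_home jenkins/jenkins:lts-alpine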
A vSphere Integrated Containers container has exclusive access to its
own guest buffer cache, so there’s no resource competition from other
containers.
If the vSphere cluster has HA enabled and an ESXi host goes down, the
endpoint VM and the containers are automatically restarted on other hosts.
If you deploy the same container via the Management UI, you can create
a template that persists the configuration so that you can re-use it for
future deployments. Viewing the container once it’s deployed allows you
to see statistics and logs for the container.
If the underlying infrastructure is unable to match the platform's resource
demands, performance can be impaired. Developer-ready infrastructure from
VMware supplies three key services that dynamically optimize resources
for Pivotal Cloud Foundry:
• Scalability
• Availability
• Security
Running Application Instances
[Figure: a developer pushes a Docker image through the pipeline to run as an application instance (AI) behind the route svc1.foo.com.]
PCF does this by scheduling multiple AIs per application and running
them in multiple PCF availability zones. These zones, typically created
in sets of three, maintain an application’s uptime if a fault occurs in the
infrastructure backing a zone.
[Figure: the same application running as multiple application instances (AIs) behind the route svc1.foo.com.]
Delivering IaaS with VMware Solutions
Today, the infrastructure underlying Pivotal Cloud Foundry is typically
delivered through an Infrastructure as a Service (IaaS) solution like
VMware vSphere® and VMware vCenter®.
PCF interacts with the IaaS through BOSH, an open source tool for man-
aging the lifecycle of distributed systems. Pivotal Cloud Foundry deploys
BOSH as a virtual machine called the Ops Manager Director. BOSH
creates VM instances and assigns one or more jobs to each VM instance.
BOSH jobs provide a VM instance with desired service release com-
ponents of PCF; for example, a job called diego_cell contains all of the
release components required to stage and start containers in PCF. BOSH
will then ensure availability of all PCF services by deploying VM instances,
and the jobs assigned to them, across availability zones.
[Figure: Ops Manager deploys BOSH, which uses the CPI to create VM instances on the vSphere IaaS (vCenter, NSX) and assigns jobs such as Go Routers (GR) and Diego Cells (DC) to them.]
Scalability
While there are many ways in which Pivotal Cloud Foundry is designed to
scale, one of the most important is in scaling application instances. These
run on VM instances in PCF called Diego Cells.
Supporting Multiple PCF Deployments on vSphere
Infrastructure
[Figure: multiple PCF deployments on shared vSphere infrastructure, with Diego Cells (DC) and Go Routers (GR) distributed across vSphere clusters and resource pools (RP).]
Other issues can affect availability, and they can exist within both PCF
and the infrastructure hosting it. Platform operators must implement
effective monitoring of both layers; such monitoring is critical to
maintaining availability and scale.
vRealize Operations and VMware vRealize® Log Insight™ ingest PCF KPIs
and events as well as IaaS metrics and events from vCenter. Data such as
application instance growth patterns and the speeds at which infrastructure
resources are being consumed by developer tenants allows vRealize
Operations to track if and when the underlying infrastructure capacity
will be exceeded. vRealize Operations also visualizes the complete
deployment of PCF in a set of dashboards detailing PCF key performance
indicators, and can alert you when there are unhealthy KPIs. vRealize Log
Insight allows an operator to track and act upon significant log events that
may indicate PCF service and application failures. Together, vRealize
Operations and vRealize Log Insight provide a comprehensive and unified
view of long-term availability trends and help maximize Pivotal Cloud
Foundry deployment uptimes.
[Figure: vRealize Operations (vROps) ingests metrics from the PCF deployment and the underlying vSphere clusters.]
The key here is to secure control of network access points and also be
able to audit and report on all access as it occurs, both of which are
enabled by vRealize in conjunction with the VMware NSX network virtual-
ization and security platform.
PCF requires that an external load balancing service take in requests and
forward them to the GoRouters, which then route requests to the applica-
tion instances or AIs. The NSX Edge provides the load balancing services
that PCF requires as well as offering network address translation (NAT)
services and SSL cryptographic protocols for securing application traffic.
[Figure: what PCF needs from the IaaS: (1) scale, (2) availability, and (3) security. vRealize Operations and vRealize Log Insight ingest events and metrics from the PCF Firehose, while NSX provides load balancing (LB), SSL termination, and NAT.]
A leading financial group in Asia that serves more than four million
customers with over 280 branches in 18 markets is, like so many other
businesses, adapting to change. “The experience of telcos, transport, and
retailing shows that we’re changing the way we communicate, the way we
commute, and the way we consume. So why would banking be immune
or be safeguarded from any of this?” the CEO of this financial group said.
VMWARE FOOTPRINT
• Server virtualization technology
• vSphere Integrated Containers
• VMware Technical Account Manager Services
The Solution
Re-platforming applications from legacy Unix to Linux, packaging them in
containers, and leveraging VMware vSphere® Integrated Containers™ to increase
agility in scheduling batch jobs for business applications and improve
resource utilization across clusters and data centers.
On top of business agility, they were also able to reduce their
infrastructure costs. They can now distribute the applications and have
them share the same compute resources through the use of their batch
scheduler. Where batch jobs used to have dedicated pools of resources,
the load is now spread across the same pool of resources and scheduled
on demand.
• Batch workloads are now running in vSphere alongside legacy
workloads, and using all of the available capacity (50% overhead
from metro clustering).
• vSphere Distributed Resource Scheduler (DRS) provides elastic
resource management allowing them to schedule the excess metro
cluster capacity in an efficient manner.
• Since vSphere Integrated Containers is included in their vSphere
licensing, they can deploy this solution to all of their clusters
without incurring additional licensing costs.
• Batch applications no longer incur overhead when not in use
(before re-platforming, they would require dedicated infrastruc-
ture)
The solution also had a positive impact on their operational costs. The
immutable and ephemeral quality of the container VMs means that they
were able to drastically reduce operating system maintenance and other
day-2 operations costs. Because the container VMs are short-lived, they
were also able to eliminate some of the security costs associated with
agent licensing.
APPLICATIONS VIRTUALIZED
• Various business-critical batch applications required for risk calculations
(hsVaR) and grid computing.
• Build slaves for Jenkins software development process automation.
Looking Ahead
As the next step, this financial institution is planning to migrate the rest of
the batch applications that are already targeted to move to this platform.
More teams at the company are also experimenting with vSphere Inte-
grated Containers. One application team is now using vSphere Integrated
Containers to provision ephemeral Jenkins build slaves to optimize their
continuous integration pipeline. Other applications being investigated
include:
• Big data analytics: Spark compute with an object store as backend,
spinning up Spark container VMs.
• Security scanning: Fortify port scanning requires resources on
demand, which would fit well in the current model.
Glossary
This glossary presents definitions for terminology in the cloud-native
space. The definitions are not intended to be axiomatic, dictionary-style
definitions but rather plain-language descriptions of what a term means
and an explanation of why the technology associated with it matters. For
some of the terms, meaning varies by usage, situation, perspective, or
context.
A
ACID: ACID stands for Atomicity, Consistency, Isolation, and Durability—
properties of database transactions that, taken together, guarantee the
validity of data in the face of power failures or system errors.
AKS: Azure Container Service (AKS) is Microsoft’s managed Kubernetes
service that runs in Azure.
API server: In Kubernetes, the API server provides a frontend that han-
dles REST requests and processes data for API “objects,” such as pods,
services, and replication controllers.
B
build: With Docker, it is the process of building Docker images by using a
Dockerfile. In the context of the CI/CD pipeline, the build process gener-
ates an artifact, such as a set of binary files that contain an application.
C
Cassandra: A NoSQL database, Apache Cassandra manages structured
data distributed across commodity hardware. Common use cases include
recommendation and personalization engines, product catalogs, playlists,
fraud detection, and message analysis.
CNI: Container Network Interface. It is an open source project hosted by
the CNCF to provide a specification and libraries for configuring network
interfaces in Linux containers.
D
day one: Refers to deployment.
technologies and practices—such as containers, Kubernetes, microser-
vices, container platforms, DevOps, and the CI/CD pipeline—converge
into a powerful recipe for digital transformation.
E
elastic: A resource or service that can dynamically expand or contract to
meet fluctuations in demand.
F
fault tolerance: Fault tolerance is the property that lets a system continue
to function properly in the event of component failure.
Fluentd: A data collector for unified logging. Fluentd, which works with
cloud-native applications, is hosted by the CNCF.
H
Hadoop: Hadoop comprises the Hadoop Distributed File System (HDFS)
and MapReduce. HDFS is a scalable storage system built for Hadoop and
big data. MapReduce is a processing framework for data-intensive com-
putational analysis of files stored in a Hadoop Distributed File System.
Apache Hadoop is the free, open-source version of Hadoop that is
managed by the Apache Software Foundation. The open-source version
provides the foundation for several commercial distributions, including
Hortonworks, IBM Open Platform, and Cloudera. There are also Hadoop
platforms as a service. Microsoft offers HDInsight as part of its public
cloud, Azure. Amazon Elastic MapReduce, or EMR, delivers Hadoop as a
web service through AWS.
Helm tool. The charts help improve the portability of Kubernetes appli-
cations. A single chart can contain an entire web application, including
databases, caches, HTTP servers, and other resources.
I
image: With Docker, an image is the basis of a container. An image
specifies changes to the root file system and the corresponding execution
parameters that are to be used in the container runtime. An image
typically contains a union of layered file systems stacked on top of each
other. An image does not have state and it never changes.
J
Jaeger: A distributed tracing system released as open source software by
Uber Technologies, Jaeger can monitor microservice-based architectures.
Use cases include distributed transaction monitoring, root cause analysis,
service dependency analysis, and performance optimization. Jaeger is
hosted by the CNCF.
K
Kafka: Apache Kafka partitions data streams and spreads them over
a distributed cluster of machines to coordinate the ingestion of vast
amounts of data for analysis. More formally, Kafka is a distributed pub-
lish-subscribe messaging system. A key use of Kafka is to help Spark or
a similar application process streams of data. In such a use case, Kafka
aggregates the data stream—for example, log files from different serv-
ers—into “topics” and presents them to Spark Streaming, which analyzes
the data in real time.
L
LDAP: Lightweight Directory Access Protocol. It is a standard protocol for
storing and accessing directory service information, especially usernames
and passwords. Applications can connect to an LDAP server to verify
users and groups.
Active Directory interoperability, Kerberos authentication, and certificate
services. Lightwave empowers IT security managers to impose the proven
security policies and best practices of on-premises computing systems on
their cloud computing environment. More specifically, Lightwave includes
the following services:
• Directory services and identity management with LDAP and Active
Directory interoperability
• Authentication services with Kerberos, SRP, WS-Trust (SOAP),
SAML WebSSO (browser-based SSO), OAuth/OpenID Connect
(REST APIs), and other protocols
• Certificate services with a certificate authority and a certificate
store
linkerd: A service mesh that adds service discovery, routing, failure han-
dling, and visibility to cloud-native applications. linkerd is hosted by the
CNCF.
M
Memcached: As a system that caches data in the distributed memory of
a cluster of computers, Memcached accelerates the performance of web
applications by holding the results of recent database calls in random-ac-
cess memory (RAM).
Minikube: A tool that lets you run a single-node Kubernetes cluster inside
a virtual machine or locally on a personal computer.
N
namespace: In the context of a Linux computer, a namespace is a feature
of the kernel that isolates and virtualizes system resources. Processes that
are restricted to a namespace can interact only with other resources and
processes in the same namespace.
In Kubernetes, when many virtual clusters are backed by the same under-
lying physical cluster, the virtual clusters are called namespaces.
O
OCI stands for Open Container Initiative, an organization dedicated to
setting industry-wide container standards. OCI was formed under the
auspices of the Linux Foundation for the express purpose of creating
open industry standards around container formats and runtime. The OCI
contains two specifications: the Runtime Specification (runtime-spec)
and the Image Specification (image-spec). VMware is a member of OCI.
See https://www.opencontainers.org/.
OpenTracing: A vendor-neutral standard for distributed tracing. It is
hosted by the CNCF.
P
PaaS: See platform as a service.
private cloud: A fully virtualized data center that includes two key
capabilities that increase agility and are different from a virtualized data
center: self-service and automation.
Q
quality of service: It is often abbreviated QoS.
R
RabbitMQ: An open source message broker, RabbitMQ implements the
Advanced Message Queuing Protocol (AMQP) to give applications a common
intermediate platform through which they can connect and exchange data.
S
Spark: Apache Spark is an engine for large-scale data processing that can
be used interactively from the Python shell. Spark combines streaming,
SQL, and complex analytics by powering a stack of tools that can coexist
in the same application. Spark can access diverse data sources, including
not only the Hadoop Distributed File System (HDFS) but also Cassandra
and MongoDB. Data scientists like Spark because they get access to
Python's powerful numeric processing libraries.
Spring Cloud Data Flow: A toolkit for building data integration and real-
time data processing pipelines. The Spring Cloud Data Flow server uses
Spring Cloud Deployer to integrate pipelines with Pivotal Cloud Foundry,
Mesos, or Kubernetes. Spring Cloud Data Flow helps engineers develop
analytics pipelines by providing a distributed system that unifies inges-
tion, real-time analytics, batch processing, and data export.
T
tag: With Docker, a tag is a label that a user applies to a Docker image to
distinguish it from other images in a repository.
the cloud: Computing resources available over the Internet. See cloud
computing.
U
UID: It can stand for user identifier, user ID, or unique identifier, depending
on the context or the system. With Kubernetes, for example, a UID is a
string that uniquely identifies an object.
V
Vagrant: HashiCorp’s Vagrant turns a machine’s configuration into a dis-
tributable template to produce a predictable development environment
for applications.
W
workload: A workload is the computational or transactional burden of
a set of computing, networking, and storage tasks associated with an
application. Similar apps with the same technology and tools can have
radically different workloads under different circumstances or at
different times.
X
XML: Extensible Markup Language. It is a flexible but verbose format for
structuring and exchanging data. XML is often used in legacy applica-
tions, Java applications, and web applications for a variety of purposes,
such as structuring configuration files or exchanging data. Although XML
is sometimes used in cloud-native applications, JSON or YAML (which
see) are the preferred data formats.
Y
YARN: A sub-project of Apache Hadoop, YARN separates resource man-
agement from computational processing to expand interactional patterns
beyond MapReduce for data stored in HDFS. YARN allocates resources
for Hadoop applications such as MapReduce and Storm as they perform
computations. YARN, in effect, stands at the center of a Hadoop environ-
ment by providing a data operating system and pluggable architecture
for other applications.
Z
ZooKeeper: Apache ZooKeeper coordinates distributed applications mas-
querading as animals. It provides a registry for their names. It configures
and synchronizes them. It keeps them from running amok.
Numbers
12-factor app: A methodology for developing a software-as-a-service
(SaaS) application—that is, a web app—and typically deploying it on a
platform as a service or a containers as a service.