
Service Mesh Ultimate Guide 2021: Next Generation Microservices Communication

This document is a service mesh guide that provides an overview of service mesh technologies. It discusses the history of service mesh and how it has evolved to address the needs of developing and operating distributed microservice architectures across diverse infrastructure environments. The guide explores trends in multi-cloud, multi-cluster, and multi-tenant service mesh deployments. It also covers new patterns, implementations, features, use cases, and the future of service meshes.


The InfoQ eMag / Issue #101 / December 2021

IN THIS ISSUE

• The Service Mesh Pattern
• Service Mesh Features
• Service Mesh Architecture: Looking Under the Hood
• Use Cases
• Antipatterns
• Service Mesh Implementations and Products
• Service Mesh Comparisons: Which Service Mesh?
• Service Mesh Tutorials
• History of the Service Mesh
• Service Mesh Standards
• Exploring the (Possible) Future of Service Meshes
• FAQ
• Additional Resources
• Glossary

PRODUCTION EDITOR Ana Ciobotaru / COPY EDITOR Maureen Spencer / DESIGN Dragos Balasoiu & Ana Ciobotaru
Key Takeaways

• Learn about the emerging architecture trends in the adoption of service mesh technologies, especially the multi-cloud, multi-cluster, and multi-tenant models, how to deploy service mesh solutions in heterogeneous infrastructures (bare metal, VMs, and Kubernetes), and application/service connectivity from the edge computing layer to the mesh layer.

• Learn about some of the new patterns in the service mesh ecosystem, like Multi-Cluster Service Mesh, Media Service Mesh, and Chaos Mesh, as well as classic microservices anti-patterns like the “Death Star” architecture.

• Get an up-to-date summary of recent innovations in using service mesh for deployments, with rapid experimentation, chaos engineering, and canary deployments between Pods (K8s clusters) and VMs (non-K8s clusters).

• Explore innovations in the area of service mesh extensions, including enhanced identity management for securing microservices connectivity (including custom certificate authority plugins), adaptive routing features for higher availability and scalability of the services, and enhanced sidecar proxies.

• Learn about what's coming up in the operational aspects, such as configuring multi-cluster capabilities, connecting Kubernetes workloads to servers hosted on VM infrastructure, and the developer portals used to manage all the features and APIs in multi-cluster service mesh installations.
Srini Penchikala is a senior IT architect based out of Austin, Texas. He has over 25 years of experience in software architecture, design, and development, and has a current focus on cloud-native architectures, microservices and service mesh, cloud data pipelines, and continuous delivery. Penchikala wrote Big-Data Processing with Apache Spark and co-wrote Spring Roo in Action, from Manning. He is a frequent conference speaker, is a big-data trainer, and has published several articles on various technical websites.

In the last few years, service mesh technologies have come a long way. Service mesh plays a vital role in cloud native adoption by various organizations. By providing four main types of capabilities (connectivity, reliability, observability, and security), service mesh has become a core component of IT organizations’ technology and infrastructure modernization efforts. Service mesh enables Dev and Ops teams to implement these capabilities at the infrastructure level, so application teams don’t need to reinvent the wheel when it comes to cross-cutting non-functional requirements.

Since the publication of the first edition of this guide back in February of 2020, service mesh technologies have gone through significant innovations, and several new architecture trends, technology capabilities, and service mesh projects have emerged in the ever-evolving service mesh space.

Over the previous year, service mesh products have evolved well beyond Kubernetes-only solutions, in which apps not hosted on the Kubernetes platform couldn’t take advantage of the service mesh. Not all organizations have transitioned all their business and IT apps to the Kubernetes cloud platform. So, since the inception of service mesh, there has been a need for this technology to work in diverse IT infrastructure environments.

With the growing adoption of microservice architectures, application systems have become decoupled and distributed in terms of cloud providers, infrastructure (Kubernetes, VMs, bare metal servers), geographies, and even the types of workloads to be managed in service mesh integrated environments.

Let’s start off with some history of how service mesh came about.

Around 2016, the term “service mesh” appeared in the arenas of microservices, cloud computing, and DevOps. The Buoyant team used the term in 2016 to explain their product Linkerd. As with many concepts within computing, there is actually a long history of the associated pattern and technology.

The arrival of the service mesh has largely been due to a perfect storm within the IT landscape. Developers began building distributed systems using a multi-language (polyglot) approach, and needed dynamic service discovery. Operations began using ephemeral infrastructure, and wanted to gracefully handle the inevitable communication failures and enforce network policies. Platform teams began embracing container orchestration systems like Kubernetes, and wanted to dynamically route traffic in and around the system using modern API-driven network proxies, such as Envoy.

This guide aims to answer pertinent questions for software architects and technical leaders, such as: What is a service mesh? Do I need a service mesh? How do I evaluate the different service mesh offerings?
The Service Mesh Pattern

The service mesh pattern focuses on managing all service-to-service communication within a distributed software system.

Context

The context for the pattern is twofold. First, engineers have adopted the microservice architecture pattern, and are building their applications by composing multiple (ideally single-purpose and independently deployable) services together. Second, organizations have embraced cloud native platform technologies such as containers (e.g., Docker), orchestrators (e.g., Kubernetes), and gateways.

Intent

The problems that the service mesh pattern attempts to solve include:

• Eliminating the need to compile into individual services a language-specific communication library to handle service discovery, routing, and application-level (Layer 7) non-functional communication requirements.
• Externalizing service communication configuration, including network locations of external services, security credentials, and quality of service targets.
• Providing passive and active monitoring of other services.
• Decentralizing the enforcement of policy throughout a distributed system.
• Providing observability defaults and standardizing the collection of associated data.
  - Enabling request logging
  - Configuring distributed tracing
  - Collecting metrics

Structure

The service mesh pattern primarily focuses on handling what has traditionally been referred to as “east-west” remote procedure call (RPC)-based traffic: request/response type communication that originates internally within a datacenter and travels service-to-service. This is in contrast to an API gateway or edge proxy, which is designed to handle “north-south” traffic: communication that originates externally and ingresses to an endpoint or service within the datacenter.
Service Mesh Features

A service mesh implementation will typically offer one or more of the following features:

• Normalizes naming and adds logical routing (e.g., maps the code-level name “user-service” to the platform-specific location “AWS-us-east-1a/prod/users/v4”)
• Provides traffic shaping and traffic shifting
• Maintains load balancing, typically with configurable algorithms
• Provides service release control (e.g., canary releasing and traffic splitting)
• Offers per-request routing (e.g., traffic shadowing, fault injection, and debug re-routing)
• Adds baseline reliability, such as health checks, timeouts/deadlines, circuit breaking, and retry (budgets)
• Increases security, via transparent mutual Transport Layer Security (TLS) and policies such as Access Control Lists (ACLs)
• Provides additional observability and monitoring, such as top-line metrics (request volume, success rates, and latencies), support for distributed tracing, and the ability to “tap” and inspect real-time service-to-service communication
• Enables platform teams to configure “sane defaults” to protect the system from bad communication

Service mesh capabilities can be categorized into four areas as listed below:

• Connectivity
• Reliability
• Security
• Observability

Let’s look at what features service mesh technologies can offer in each of these areas.

Connectivity
• Traffic Control (Routing, Splitting)
• Gateway (Ingress, Egress)
• Service Discovery
• A/B Testing, Canary
• Service Timeouts, Retries

Reliability
• Circuit Breaker
• Fault Injection/Chaos Testing

Security
• Service-to-service authentication (mTLS)
• Certificate Management
• User Authentication (JWT)
• User Authorization (RBAC)
• Encryption

Observability
• Monitoring
• Telemetry, Instrumentation, Metrics
• Distributed Tracing
• Service Graph
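
To make the logical routing and traffic splitting features listed above more concrete, here is a minimal Python sketch of how a proxy might resolve a code-level service name to a weighted set of platform-specific locations. It is an illustration only, not how any particular mesh implements routing: the `ROUTES` table, the `Backend` type, the `resolve` function, the `v5` canary location, and the 90/10 weights are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    location: str  # platform-specific location of one service version
    weight: int    # relative share of traffic this version should receive


# Hypothetical routing table kept by the mesh. The v4 location comes from the
# example in the text; the v5 canary entry and the 90/10 weights are assumptions.
ROUTES = {
    "user-service": [
        Backend("AWS-us-east-1a/prod/users/v4", 90),
        Backend("AWS-us-east-1a/prod/users/v5", 10),
    ],
}


def resolve(logical_name, rng=random):
    """Map a code-level service name to one concrete location, chosen by weight."""
    backends = ROUTES[logical_name]
    weights = [b.weight for b in backends]
    return rng.choices(backends, weights=weights, k=1)[0].location


if __name__ == "__main__":
    rng = random.Random(7)  # seeded only to make the demo repeatable
    picks = [resolve("user-service", rng) for _ in range(1000)]
    canary = sum(p.endswith("/v5") for p in picks)
    print(f"canary share: {canary / len(picks):.1%}")
```

The calling service only ever refers to "user-service"; shifting more traffic to the canary is a change to the weights in the routing table, not to application code.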
Service Mesh Architecture: Looking Under the Hood

A service mesh consists of two high-level components: a data plane and a control plane. Matt Klein, creator of the Envoy Proxy, has written an excellent deep-dive into the topic of “service mesh data plane versus control plane.”

Broadly speaking, the data plane “does the work” and is responsible for “conditionally translating, forwarding, and observing every network packet that flows to and from a [network endpoint].”

In modern systems, the data plane is typically implemented as a proxy (such as Envoy, HAProxy, or MOSN), which is run out-of-process alongside each service as a “sidecar.” Linkerd uses a micro-proxy approach that’s optimized for the service mesh sidecar use cases.

A control plane “supervises the work,” and takes all the individual instances of the data plane (a set of isolated stateless sidecar proxies) and turns them into a distributed system.

The control plane doesn’t touch any packets/requests in the system, but instead, it allows a human operator to provide policy and configuration for all of the running data planes in the mesh. The control plane also enables the data plane telemetry to be collected and centralized, ready for consumption by an operator.

The control plane and data plane combined provide the best of both worlds: policies can be defined and managed centrally, while the same policies are enforced in a decentralized manner, locally in each pod on a Kubernetes cluster.

The policies can be related to security, routing, circuit breaking, or monitoring.

The diagram below is taken from the Istio architecture documentation, and although the technologies labeled are specific to Istio, the components are general to all service mesh implementations.

[Figure: Istio architecture, demonstrating how the control plane and proxy data plane interact (courtesy of the Istio documentation)]
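
The split of responsibilities described in this section can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the API of Istio, Envoy, or any other product: the `Policy`, `SidecarProxy`, and `ControlPlane` classes and their method names are hypothetical, and the deny-by-default behavior when no policy has been pushed is a choice made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    # Services permitted to call the protected service (illustrative policy type).
    allowed_callers: frozenset


class SidecarProxy:
    """Data plane: runs next to one service instance and handles its requests."""

    def __init__(self, service):
        self.service = service
        self.policy = None      # set by the control plane, never by the proxy itself
        self.request_count = 0  # local telemetry

    def apply(self, policy):
        self.policy = policy

    def handle(self, caller):
        # Enforcement happens locally, on the request path.
        self.request_count += 1
        if self.policy is None or caller not in self.policy.allowed_callers:
            return "denied"  # assumption: deny when no policy is present
        return "forwarded"


class ControlPlane:
    """Control plane: distributes policy and gathers telemetry, touches no requests."""

    def __init__(self):
        self.proxies = []

    def register(self, proxy):
        self.proxies.append(proxy)

    def push(self, service, policy):
        # One central definition, applied to every proxy instance of the service.
        for proxy in self.proxies:
            if proxy.service == service:
                proxy.apply(policy)

    def collect_telemetry(self):
        totals = {}
        for proxy in self.proxies:
            totals[proxy.service] = totals.get(proxy.service, 0) + proxy.request_count
        return totals
```

Note that `ControlPlane` never calls `handle`: requests flow only through the proxies, while policy flows down from, and telemetry flows up to, the control plane.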
Use Cases

There are a variety of use cases that a service mesh can enable or support.

Dynamic Service Discovery and Routing

A service mesh provides dynamic service discovery and traffic management, including traffic shadowing (duplicating) for testing, and traffic splitting for canary releasing and A/B type experimentation.

Proxies used within a service mesh are typically “application layer” aware (operating at Layer 7 in the OSI networking stack). This means that traffic routing decisions and the labeling of metrics can draw upon data in HTTP headers or other application layer protocol metadata.

Service-to-Service Communication Reliability

A service mesh supports the implementation and enforcement of cross-cutting reliability requirements, such as request retries, timeouts, rate limiting, and circuit breaking. A service mesh is often used to compensate for (or encapsulate) dealing with the eight fallacies of distributed computing. It should be noted that a service mesh can only offer wire-level reliability support (such as retrying an HTTP request), and ultimately the service should be responsible for any related business impact, such as avoiding multiple (non-idempotent) HTTP POST requests.

Observability of Traffic

As a service mesh is on the critical path for every request being handled within the system, it can also provide additional “observability,” such as distributed tracing of a request, frequency of HTTP error codes, and global and service-to-service latency. Although a much overused phrase in the enterprise space, service meshes are often proposed as a method to capture all of the data necessary to implement a “single pane of glass” view of traffic flows within the entire system.

Communication Security

A service mesh also supports the implementation and enforcement of cross-cutting security requirements, such as providing service identity (via X.509 certificates), enabling application-level service/network segmentation (e.g., “service A” can communicate with “service B,” but not “service C”), ensuring all communication is encrypted (via TLS), and ensuring the presence of valid user-level identity tokens or “passports.”

Antipatterns

It is often a sign of a maturing technology when antipatterns of usage emerge. Service meshes are no exception.

Too Many Traffic Management Layers (Turtles All the Way Down)

This antipattern occurs when developers do not coordinate with the platform or operations team, and duplicate in code the communication handling logic that is now being implemented via a service mesh. For example, an application may implement a retry policy within the code in addition to a wire-level retry policy provided by the service mesh configuration. This antipattern can lead to issues such as duplicated transactions.

Service Mesh Silver Bullet

There is no such thing as a “silver bullet” within IT, but vendors are sometimes tempted to anoint new technologies with this label. A service mesh will not solve all communication problems with microservices, container orchestrators like Kubernetes, or cloud networking. A service mesh aims to facilitate service-to-service (east-west) communication only, and there is a clear operational cost to deploying and running a service mesh.

Enterprise Service Bus (ESB) 2.0

During the pre-microservice service-oriented architecture (SOA) era, Enterprise Service Buses (ESBs) implemented a communication system between software components. Some fear that many of the mistakes from the ESB era will be repeated with the use of a service mesh.
The centralized control of communication offered via ESBs clearly had value. However, the development of the technologies was driven by vendors, which led to multiple problems, such as a lack of interoperability between ESBs, bespoke extension of industry standards (e.g., adding vendor-specific configuration to WS-* compliant schema), and high cost. ESB vendors also did nothing to discourage the integration and tight coupling of business logic into the communication bus.

Big Bang Deployment

There is a temptation within IT at large to believe that a big bang approach to deployment is the easiest approach to manage, but as research from Accelerate and the State of DevOps Report shows, this is not the case. As a complete rollout of a service mesh means that this technology is on the critical path for handling all end user requests, a big bang deployment is highly risky.

Death Star Architecture

When organizations adopt a microservices architecture and development teams start creating new microservices or leveraging existing services in their applications, service-to-service communication becomes a critical part of the architecture. Without a good governance model, this can lead to tight coupling between different services. It will also be difficult to pinpoint which service is having issues when the whole system is having problems in production.

Lacking a service communication strategy and governance model, the architecture becomes what’s called the “Death Star Architecture.”

For more information on this architecture anti-pattern, check out the articles Part1, Part2, and Part3 on cloud native architecture adoption.

Domain-Specific Service Meshes

Local implementation and over-optimization of service meshes can sometimes lead to too narrow a scope for the service mesh deployment. Developers may prefer service mesh instances specific to their own business domains, but this approach has more disadvantages than benefits.

We don’t want to implement too fine-grained a scope of service mesh, like a dedicated service mesh for each business or functional domain in the organization (e.g., Finance, HR, Accounting). This defeats the purpose of having a common service orchestration solution like service mesh for capabilities such as enterprise-level service discovery or cross-domain service routing.

Service Mesh Implementations and Products

The following is a non-exhaustive list of current service mesh implementations:

• Linkerd
• Istio
• Consul
• Kuma
• AWS App Mesh
• NGINX Service Mesh
• AspenMesh
• Kong
• Solo Gloo Mesh
• Tetrate Service Bridge
• Traefik Mesh (formerly Maesh)
• Meshery
• Open Service Mesh (CNCF sandbox project)

Also, other products like DataDog are starting to offer integrations with service mesh technologies like Linkerd, Istio, Consul Connect, and AWS App Mesh.
Service Mesh Comparisons: Which Service Mesh?

The service mesh space is extremely fast moving, and so any attempt to create a comparison is likely to quickly become out of date. However, several comparisons do exist. Care should be taken to understand the source’s bias (if any) and the date that the comparison was made.

• https://layer5.io/landscape
• https://kubedex.com/istio-vs-linkerd-vs-linkerd2-vs-consul/ (correct as of August 2021)
• https://platform9.com/blog/kubernetes-service-mesh-a-comparison-of-istio-linkerd-and-consul/ (up to date as of October 2019)
• https://servicemesh.es/ (last published August 2021)

InfoQ always recommends that service mesh adopters perform their own due diligence and experimentation on each offering.

Service Mesh Tutorials

For engineers or architects looking to experiment with multiple service meshes, the following tutorials, playgrounds, and tools are available:

• Layer 5 Meshery: a multi service mesh management plane
• Solo’s SuperGloo: a service mesh orchestration platform
• KataCoda Istio tutorial
• Consul service mesh tutorial
• Linkerd tutorial
• NGINX Service Mesh Tutorial

History of the Service Mesh

InfoQ has been tracking the topic that we now call service mesh since late 2013, when Airbnb released SmartStack, which offered an out-of-process service discovery mechanism (using HAProxy) for the emerging “microservices” style architecture. Many of the previously labeled “unicorn” organizations were working on similar technologies before this date. From the early 2000s, Google was developing its Stubby RPC framework that evolved into gRPC, and the Google Frontend (GFE) and Global Software Load Balancer (GSLB), traits of which can be seen in Istio. In the early 2010s, Twitter began work on the Scala-powered Finagle, from which the Linkerd service mesh emerged.

In late 2014, Netflix released an entire suite of JVM-based utilities including Prana, a “sidecar” process that allowed application services written in any language to communicate via HTTP to standalone instances of the libraries. In 2016, the NGINX team began talking about “The Fabric Model,” which was very similar to a service mesh, but required the use of their commercial NGINX Plus product for implementation. Also, Linkerd v0.2 was announced in February 2016, though the team didn't start calling it a service mesh until later.

Other highlights from the history of the service mesh include the releases of Istio in May 2017, Linkerd 2.0 in July 2018, Consul Connect and Gloo Mesh in November 2018, Service Mesh Interface (SMI) in May 2019, and Maesh (now called Traefik Mesh) and Kuma in September 2019.

Even service meshes that emerged outside of the unicorns, such as HashiCorp’s Consul, took inspiration from the aforementioned technology, often aiming to implement the CoreOS-coined concept of “GIFEE”: Google infrastructure for everyone else.

For a deep-dive into the history of how the modern service mesh concept evolved, Phil Calçado has written a comprehensive article, “Pattern: Service Mesh.”
Service Mesh Standards

Even though service mesh technologies have seen a major transformation year after year for the last few years, the standards on service mesh haven’t caught up with the innovations.

The main standard for using service mesh solutions is the Service Mesh Interface (SMI). The Service Mesh Interface is a specification for service meshes that run on Kubernetes. It doesn’t implement a service mesh itself but defines a common standard that can be implemented by a variety of service mesh providers.

The goal of the SMI API is to provide a common, portable set of service mesh APIs which a Kubernetes user can use in a provider-agnostic manner. In this way, people can define applications that use service mesh technology without tightly binding to any specific implementation.

SMI is basically a collection of Kubernetes Custom Resource Definitions (CRDs) and Extension API Servers. These APIs can be installed onto any Kubernetes cluster and manipulated using standard tools. To activate these APIs, an SMI provider is run in the Kubernetes cluster.

The SMI specification allows for both standardization for end users and innovation by providers of service mesh technology. SMI enables flexibility and interoperability, and covers the most common service mesh capabilities. Current specification components focus on the connectivity aspect of service mesh capabilities. The API specifications include the following:

• Traffic Access Control
• Traffic Metrics
• Traffic Specs
• Traffic Split

The current SMI ecosystem includes a wide range of service meshes, including Istio, Linkerd, Consul Connect, Gloo Mesh, and so on.

The SMI specification is licensed under the Apache License Version 2.0.

If you want to learn more about the SMI specification and its API details, check out the following links.

• Core Specification (current version: 0.6.0)
• Specification GitHub project
• How to Contribute

Service Mesh Benchmarks

Service Mesh Performance (SMP) is a standard for capturing the details of infrastructure capacity, service mesh configuration, and workload metadata. The SMP specification is used to capture the following details:

• Environment and infrastructure details
• Number and size of nodes, orchestrator
• Service mesh and its configuration
• Workload/application details
• Statistical analysis to characterize performance

William Morgan from the Linkerd team wrote about benchmarking the performance of Linkerd and Istio. There is also an article from 2019 about Istio best practices on benchmarking service mesh performance.

It’s important to keep in mind that, as with any other performance benchmark, you should not put too much weight on any of these external publications, especially those by the product vendors. You should design and execute your own performance testing in your server environment to validate which specific product fits the business and non-functional requirements of your application.
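
When running your own performance tests, the “statistical analysis” step usually comes down to summarizing request latencies measured with and without the mesh in place. The following Python sketch shows one simple way to do that using only the standard library. It is not part of the SMP specification: the `summarize_latencies` function, its output keys, and the choice of the p50/p95/p99 percentiles are assumptions made for illustration.

```python
import statistics


def summarize_latencies(samples_ms):
    """Summarize request latencies (in milliseconds) from one benchmark run."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two latency samples")
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "count": len(samples_ms),
        "mean_ms": statistics.fmean(samples_ms),
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
    }


if __name__ == "__main__":
    # Made-up numbers purely to show the call; real samples would come from
    # a load-testing tool run against your own environment.
    baseline = [12.0, 13.5, 12.2, 40.1, 12.9, 13.0, 12.4, 12.7]
    print(summarize_latencies(baseline))
```

Comparing the same summary for a baseline run and a run with sidecars injected, on the same hardware and workload, gives a first estimate of the latency a mesh adds at the tail as well as on average.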
Exploring the (Possible) Future of Service Meshes

Kasun Indrasiri has explored “The Potential for Using a Service Mesh for Event-Driven Messaging,” in which he discussed two main emerging architectural patterns for implementing messaging support within a service mesh: the protocol proxy sidecar, and the HTTP bridge sidecar. This is an active area of development within the service mesh community, with the work towards supporting Apache Kafka within Envoy attracting a fair amount of attention.

Christian Posta has previously written about attempts to standardize the usage of service meshes in “Towards a Unified, Standard API for Consolidating Service Meshes.” This article also discusses the Service Mesh Interface (SMI) that was recently announced by Microsoft and partners at KubeCon EU. The SMI defines a set of common and portable APIs that aims to provide developers with interoperability across different service mesh technologies including Istio, Linkerd, and Consul Connect.

The topic of integrating service meshes with the platform fabric can be further divided into two sub-topics.

First, there is work being conducted to reduce the networking overhead introduced by a service mesh data plane. This includes the Data Plane Development Kit (DPDK), which is a userspace application that “bypasses the heavy layers of the Linux kernel networking stack and talks directly to the network hardware,” and work by the Cilium team that utilizes the extended Berkeley Packet Filter (eBPF) functionality in the Linux kernel for “very efficient networking, policy enforcement, and load balancing functionality.” Another team is mapping the concept of a service mesh to L2/L3 payloads with Network Service Mesh, as an attempt to “re-imagine network function virtualization (NFV) in a cloud-native way.”

Second, there are multiple initiatives to integrate service meshes more tightly with public cloud platforms, as seen in the introduction of AWS App Mesh, GCP Traffic Director, and Azure Service Fabric Mesh.

The Buoyant team is leading the charge with developing effective human-centric control planes for service mesh technology. They have recently released Buoyant Cloud, a SaaS-based “team control plane” for platform teams operating Kubernetes. This product is discussed in more detail in the section below.

There have also been several innovations in the service mesh area since last year. Let’s look at some of these innovations.

Multi-cloud, multi-cluster, multi-tenant service meshes

In recent years, cloud adoption by different organizations has transformed from a single cloud solution (private or public) to a new infrastructure based on multi-cloud (private, public, and hybrid) supported by multiple different vendors (AWS, Google, Microsoft Azure, and so on). Also, the need for supporting diverse workloads (transactional, batch, and streaming) is critical to realize a unified cloud architecture.

These business and non-functional requirements in turn lead to the need for deploying service mesh solutions in heterogeneous infrastructures (bare metal, VMs, and Kubernetes). The service mesh architecture needs to transform accordingly to support these diverse workloads and infrastructures.

Technologies like Kuma support the multi-mesh control plane to make business applications work in multi-cluster and multi-cloud service mesh environments. These solutions abstract away the synchronization of service mesh policies across multiple zones and the service connectivity (and service discovery) across those zones.

Another emerging trend in multi-cluster service mesh
technologies is the need for application/service connectivity from the edge computing layer (IoT devices) to the mesh layer.

Media Service Mesh

Media Streaming Mesh, or Media Service Mesh, developed at Cisco Systems, is used for orchestrating real-time applications like multi-player gaming, multi-party video conferencing, or CCTV streaming using service mesh technologies on the Kubernetes cloud platform. These applications are moving more and more away from monolithic applications to microservices architectures. A service mesh can help the applications by providing capabilities like load balancing, encryption, and observability.

Chaos Mesh

Chaos Mesh, a CNCF hosted project, is an open-source, cloud-native chaos engineering platform for applications hosted on Kubernetes. Though not a direct service mesh implementation, Chaos Mesh enables chaos engineering experiments by orchestrating fault injection behavior into the applications. Fault injection is a key capability of service mesh technologies.

Chaos Mesh hides the underlying implementation details so that application developers can focus on the actual chaos experiments. Chaos Mesh can be used along with a service mesh. Check out this use case on how the team used Linkerd and Chaos Mesh to conduct chaos experiments for their project.

Service Mesh as a Service

Some service mesh vendors, like Buoyant, are offering managed service mesh or “service mesh as a service” solutions. Earlier this year, Buoyant announced the public beta release of a SaaS application called Buoyant Cloud that allows customer organizations to take advantage of a managed service mesh with on-demand support features for the Linkerd service mesh.

Some of the features offered by the Buoyant Cloud solution include the following:

• Automatic tracking of Linkerd data plane and control plane health
• Managing service mesh lifecycles and versions across pods, proxies, and clusters on the Kubernetes platform
• SRE-focused tools including service level objectives (SLOs), workload golden metric tracking, and change tracking

Network Service Mesh (NSM)

Network Service Mesh (NSM), another Cloud Native Computing Foundation sandbox project, provides a hybrid, multi-cloud IP service mesh. NSM enables capabilities such as network service connectivity, security, and observability, which are core features of a service mesh. NSM works with existing Container Network Interface (CNI) implementations.

Service Mesh Extensions

Service mesh extensions are another area that has been seeing a lot of innovation. Some of the developments in service mesh extensions include:

• enhanced identity management for securing microservices connectivity, including custom certificate authority plugins
• adaptive routing features for higher availability and scalability of the services
• enhanced sidecar proxies

Service Mesh Operations

Another important area of service mesh adoption is the operations side of the service mesh lifecycle. Operational aspects, such as configuring multi-cluster capabilities, connecting Kubernetes workloads to servers hosted on VM infrastructure, and the developer portals used to manage all the features and APIs in multi-cluster service mesh installations, are going to play a significant role in the overall deployment and support of service mesh solutions in production.
The InfoQ eMag / Issue #101 / December 2021
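To make the fault injection idea from the Chaos Mesh section more concrete, here is a minimal in-process sketch. It is not how Chaos Mesh or any service mesh proxy is implemented (those inject faults at the infrastructure level, outside the application); the names `FaultInjector` and `InjectedFault` and all parameters are hypothetical and exist only for illustration.

```python
import random
import time


class InjectedFault(Exception):
    """Raised in place of a real response when an abort fault is injected."""


class FaultInjector:
    """Hypothetical wrapper that injects aborts or delays around a call."""

    def __init__(self, abort_probability=0.0, delay_probability=0.0,
                 delay_seconds=0.0, rng=None, sleep=time.sleep):
        self.abort_probability = abort_probability
        self.delay_probability = delay_probability
        self.delay_seconds = delay_seconds
        # A seeded random.Random can be passed in for repeatable experiments.
        self.rng = rng or random.Random()
        # The sleep function is injectable so tests do not have to wait.
        self.sleep = sleep

    def call(self, func, *args, **kwargs):
        # Abort fault: fail the call before it reaches the target.
        if self.rng.random() < self.abort_probability:
            raise InjectedFault("injected abort")
        # Delay fault: add latency, then let the call proceed.
        if self.rng.random() < self.delay_probability:
            self.sleep(self.delay_seconds)
        return func(*args, **kwargs)
```

An experiment would wrap a call to a dependency, for example `FaultInjector(abort_probability=0.1).call(fetch_profile, user_id)` with a hypothetical `fetch_profile`, and then observe whether the caller's retries, timeouts, and fallbacks behave as intended.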
FAQ
Service Mesh Frequently Asked Questions.

What is a service mesh?
A service mesh is a technology that manages all service-to-service (east-west) traffic within a distributed (potentially microservice-based) software system. It provides both business-focused functional operations, such as routing, and nonfunctional support, for example, enforcing security policies, quality of service, and rate limiting. It is typically (although not exclusively) implemented using sidecar proxies through which all services communicate.

How does a service mesh differ from an API gateway?
For the service mesh definition, see above.

An API gateway, on the other hand, manages all ingress (north-south) traffic into a cluster, and provides additional support for cross-functional communication requirements. It acts as the single entry point into a system and enables multiple APIs or services to act cohesively and provide a uniform experience to the user.

If I am deploying microservices, do I need a service mesh?
Not necessarily. A service mesh adds operational complexity to the technology stack, and therefore is typically only deployed if the organization is having trouble scaling service-to-service communication, or has a specific use case to resolve.

Do I need a service mesh to implement service discovery with microservices?
No. A service mesh provides one way of implementing service discovery. Other solutions include language-specific libraries (such as Ribbon and Eureka, or Finagle).

Does a service mesh add overhead/latency to my service-to-service communication?
Yes, a service mesh adds at least two extra network hops when a service is communicating with another service (the first is from the proxy handling the source’s outbound connection, and the second is from the proxy handling the destination’s inbound connection). However, these additional network hops typically occur over the localhost or loopback network interface, and add only a small amount of latency (on the order of milliseconds). Experimenting with and understanding whether this is an issue for the target use case should be part of the analysis and evaluation of a service mesh.

Shouldn’t a service mesh be part of Kubernetes or the “cloud native platform” that applications are being deployed onto?
Potentially. There is an argument for maintaining separation of concerns within cloud native platform components (e.g. Kubernetes is responsible for providing container orchestration and a service mesh is responsible for service-to-service communication). However, work is underway to push service mesh-like functionality into modern Platform-as-a-Service (PaaS) offerings.

How do I implement, deploy, or roll out a service mesh?
The best approach would be to analyse the various service mesh products (see above), and follow the implementation guidelines specific to the chosen mesh. In general, it is best to work with all stakeholders and incrementally deploy any new technology into production.

Can I build my own service mesh?
Yes, but the more pertinent question is should you? Is building a service mesh a core competency of your organization? Could you be providing value to your customers in a more effective way? Are you also committed to maintaining your own mesh, patching it for security issues, and constantly updating it to take advantage of new technologies? With the range of open source and commercial service mesh offerings that are now available, it is most likely more effective to use an existing solution.

Which team owns the service mesh within a software delivery organization?
Typically the platform or operations team owns the service mesh, along with Kubernetes and the continuous delivery pipeline infrastructure. However, developers will be configuring the service mesh properties, and so both teams should work closely together. Many organizations are following the lead from the cloud vanguard such as Netflix, Spotify, and Google, and are creating internal platform teams that provide tooling and services to full cycle product-focused development teams.

Is Envoy a service mesh?
No. Envoy is a cloud native proxy that was originally designed and built by the Lyft team. Envoy is often used as the data plane with a service mesh. However, to be considered a service mesh, Envoy must be used in conjunction with a control plane. The control plane can be as simple as a centralized config file repository and metric collector, or as comprehensive and complex as Istio.

Can the words “Istio” and “service mesh” be used interchangeably?
No. Istio is a type of service mesh. Due to the popularity of Istio when the service mesh category was emerging, some sources were conflating Istio and service mesh. This issue of conflation is not unique to service mesh; the same challenge occurred with Docker and container technology.

Which service mesh should I use?
There is no single answer to this question. Engineers must understand their current requirements, and the skills, resources, and time available for their implementation team. The service mesh comparison links above will provide a good starting point for exploration, but we strongly recommend that organizations experiment with at least two meshes in order to understand which products, technologies, and workflows work best for them.

Can I use a service mesh outside of Kubernetes?
Yes. Many service meshes allow the installation and management of data plane proxies and the associated control plane on a variety of infrastructure. HashiCorp’s Consul is the most well known example of this, and Istio is also being used experimentally with Cloud Foundry.
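The overhead answer above (two extra proxy hops per service-to-service call) can be turned into a rough back-of-the-envelope estimate. The sketch below is only that: the function names and the example figures (1 ms per proxy traversal, a 94 ms baseline) are assumptions chosen for illustration, not measurements of any particular mesh.

```python
def added_latency_ms(calls_in_chain, per_proxy_ms, proxies_per_call=2):
    """Estimate the latency a sidecar mesh adds to one request.

    calls_in_chain:   number of sequential service-to-service calls
    per_proxy_ms:     assumed cost of one proxy traversal, in milliseconds
    proxies_per_call: source-side plus destination-side proxy (2 by default)
    """
    return calls_in_chain * proxies_per_call * per_proxy_ms


def overhead_fraction(baseline_ms, added_ms):
    """Share of the total request time that is spent in the mesh proxies."""
    return added_ms / (baseline_ms + added_ms)


# Assumed example: a request that fans through 3 sequential calls,
# each proxy traversal costing 1 ms, on top of a 94 ms baseline.
added = added_latency_ms(calls_in_chain=3, per_proxy_ms=1.0)
share = overhead_fraction(baseline_ms=94.0, added_ms=added)
```

Under these assumed numbers the mesh adds 6 ms, or about 6% of the total. The useful part is the shape of the calculation: overhead grows with the depth of the call chain, so deep call graphs deserve measurement with real traffic during an evaluation.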
Additional Resources

• InfoQ Service Mesh homepage
• The InfoQ eMag - Service Mesh: Past, Present, and Future
• The Service Mesh: What Every Software Engineer Needs to Know about the World’s Most Over-Hyped Technology
• Service Mesh Comparison
• Service Meshes
• Adoption of Cloud Native Architecture, Part 3: Service Orchestration and Service Mesh

Glossary

API gateway: Manages all ingress (north-south) traffic into a cluster, and provides additional support for cross-functional communication requirements. It acts as the single entry point into a system and enables multiple APIs or services to act cohesively and provide a uniform experience to the user.

Consul: A Go-based service mesh from HashiCorp.

Containerization: A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.

Control plane: Takes all the individual instances of the data plane (proxies) and turns them into a distributed system that can be visualized and controlled by an operator.

Circuit breaker: Handles faults or timeouts when connecting to a remote service. Helps to improve the stability and resiliency of an application.

Data plane: A proxy that conditionally translates, forwards, and observes every network packet that flows to and from a service network endpoint.

Docker: A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.

East-West traffic: Network traffic within a data center, network, or Kubernetes cluster. Traditional network diagrams were drawn with the service-to-service (intra-data center) traffic flowing horizontally across the page (east-west) in the diagrams.

Envoy Proxy: An open-source edge and service proxy, designed for cloud-native applications. Envoy is often used as the data plane within a service mesh implementation.

Ingress traffic: Network traffic that originates from outside the data center, network, or Kubernetes cluster.

Istio: A C++ (data plane) and Go (control plane) based service mesh that was originally created by Google and IBM in partnership with the Envoy team from Lyft.

Kubernetes: A CNCF-hosted container orchestration and scheduling framework that originated from Google.

Kuma: A Go-based service mesh from Kong.

Linkerd: A Rust (data plane) and Go (control plane) powered service mesh that was derived from an early JVM-based communication framework at Twitter.

Maesh: A Go-based service mesh from Containous, the maintainers of the Traefik API gateway.

MOSN: A Go-based proxy from the Ant Financial team that implements the (Envoy) xDS APIs.

North-South traffic: Network traffic entering (or ingressing) into a data center, network, or Kubernetes cluster. Traditional network diagrams were drawn with the ingress traffic entering the data center at the top of the page and flowing down (north to south) into the network.

Proxy: A software system that acts as an intermediary between endpoint components.

Segmentation: Dividing a network or cluster into multiple sub-networks.

Service mesh: Manages all service-to-service (east-west) traffic within a distributed (potentially microservice-based) software system. It provides both functional operations, such as routing, and nonfunctional support, for example, enforcing security policies, quality of service, and rate limiting.

Service Mesh Interface (SMI): A standard interface for service meshes deployed onto Kubernetes.

Service mesh policy: A specification of how a collection of services/endpoints are allowed to communicate with each other and other network endpoints.

Sidecar: A deployment pattern in which an additional process, service, or container is deployed alongside an existing service (think motorcycle sidecar).

Single pane of glass: A UI or management console that presents data from multiple sources in a unified display.

Traffic shaping: Modifying the flow of traffic across a network, for example, rate limiting or load shedding.

Traffic shifting: Migrating traffic from one location to another.

Traffic Split: Allows users to incrementally direct percentages of traffic between various services. Used by clients such as ingress controllers or service mesh sidecars to split the outgoing traffic to different destinations.
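The Traffic Split entry describes directing percentages of traffic to different destinations. A minimal sketch of that idea, assuming a simple weighted random choice, is shown below. The function `pick_backend` and the backend names `v1` and `v2` are hypothetical; real meshes express the split declaratively and enforce it in the proxies rather than in application code.

```python
import random


def pick_backend(weights, rng=random):
    """Choose one backend according to its share of the total weight.

    weights: dict mapping backend name -> non-negative weight,
             e.g. {"v1": 90, "v2": 10} for a 90/10 split.
    rng:     the random module or a random.Random instance
             (pass a seeded instance for repeatable results).
    """
    names = list(weights)
    values = [weights[name] for name in names]
    if any(value < 0 for value in values) or sum(values) <= 0:
        raise ValueError("weights must be non-negative with a positive total")
    # random.choices performs the weighted draw; k=1 returns a one-item list.
    return rng.choices(names, weights=values, k=1)[0]
```

Shifting traffic incrementally then amounts to changing the weights over time, for example from `{"v1": 90, "v2": 10}` to `{"v1": 50, "v2": 50}` and finally `{"v1": 0, "v2": 100}`, while watching error rates and latency between steps.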