Kubernetes Production Readiness and Best Practices Checklist

Here is our complete checklist to ensure your Kubernetes deployments are ready for prime time. In this checklist, we will cover the topics of availability, resource management, security, scalability and monitoring for Kubernetes. If you have already read the blog post, you can skip the section on availability.

Availability
Configured liveness and readiness probes?

A liveness probe is the Kubernetes equivalent of "have you tried turning it off and on again". Liveness probes detect containers that are not able to recover from failed states and restart them, making them a great tool for building auto-recovery into production Kubernetes deployments. You can create liveness probes based on exec command, HTTP or TCP checks.

Readiness probes detect whether a container is temporarily unable to receive traffic and mitigate these situations by stopping traffic flow to it. During deployment updates, readiness probes also ensure that new pods are ready to receive traffic before traffic is allowed to flow to them.
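As an illustration, here is a minimal pod spec sketch configuring both probe types with HTTP checks; the pod name, image and endpoint paths are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web               # hypothetical pod name
spec:
  containers:
  - name: web
    image: nginx:1.25     # example image
    livenessProbe:        # restart the container if this check keeps failing
      httpGet:
        path: /healthz    # hypothetical health endpoint
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:       # stop routing traffic while this check fails
      httpGet:
        path: /ready      # hypothetical readiness endpoint
        port: 80
      periodSeconds: 5
```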


Provisioned at least 3 master nodes?

Having the control plane replicated across 3 nodes is the minimum configuration required for a highly available Kubernetes cluster. etcd needs a majority of its members to form a quorum and continue functioning. With 3 master nodes, the cluster can survive the failure of 1 master node, since the remaining 2 still form a majority.

Here is a table outlining the fault tolerance of different cluster sizes:

Cluster size    Majority    Failure tolerance
1               1           0
2               2           0
3               2           1
4               3           1
5               3           2
6               4           2
7               4           3

Replicated master nodes in odd numbers?

As is apparent from this table, master nodes should always be replicated in odd numbers: an odd-sized cluster tolerates the same number of failures as the next-larger even-sized cluster, so the extra node adds no fault tolerance.

Isolated etcd replicas?

The etcd master component is responsible for storing and replicating cluster state, and as such has high resource requirements. A best practice, therefore, is to isolate the etcd replicas by placing them on dedicated nodes. This decouples the control plane components from the etcd members and ensures sufficient resource availability for etcd, making the cluster more robust and reliable.

It is recommended to run at least a 5-member etcd cluster in production.


Have a plan for regular etcd backups?

Since etcd stores cluster state, regularly backing up etcd data is a best practice. It is also a good idea to store etcd backups on a separate host. etcd clusters can be backed up by taking a snapshot with the etcdctl snapshot save command or by copying the member/snap/db file from an etcd data directory.

If you are running on public cloud provider storage volumes, it is relatively easy to create etcd backups by taking a snapshot of the storage volume.
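For example, a snapshot can be taken and verified with etcdctl; the endpoint and file name below are placeholders, and TLS/auth flags are omitted for brevity:

```shell
# Take a snapshot of the etcd keyspace (v3 API)
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 snapshot save backup.db

# Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status backup.db
```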

Distributed master nodes across zones?

Distributing master nodes across zones is also a high availability best practice, and ensures that master nodes survive outages of entire availability zones.

Using kops, master nodes can easily be distributed across zones with the --master-zones flag.

Distributed worker nodes across zones?

Worker nodes should also be distributed across availability zones. In kops, this is done with the --zones flag.
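Both flags can be combined in a single kops invocation. A sketch, assuming AWS zone names and a hypothetical cluster name; your state store and zones will differ:

```shell
kops create cluster \
  --name my-cluster.example.com \               # hypothetical cluster name
  --master-zones us-east-1a,us-east-1b,us-east-1c \
  --zones us-east-1a,us-east-1b,us-east-1c \
  --node-count 3
```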


Configured autoscaling for both master and worker nodes?

When using the cloud, a best practice is to place both master and worker nodes in autoscaling groups. Autoscaling groups automatically bring up a replacement node in the event of termination. Kops places both master and worker nodes into autoscaling groups by default.

Baked-in HA load balancing?

Once multiple master replicas have been deployed, the next obvious step is to load balance traffic to and from those replicas. You can do this by creating an L4 load balancer in front of all apiserver instances and updating the DNS name appropriately, or by using round-robin DNS to reach all apiserver instances directly. Check this document for more information.

Configured active-passive setup for scheduler and controller manager?

Unlike the other control plane components, the scheduler and controller manager actively modify cluster state, so only one replica of each should be active at a time. Once both components have been replicated across zones, they should be configured in an active-passive setup.

This can be done by passing the --leader-elect flag to kube-scheduler and kube-controller-manager.


Configured the correct number of pod replicas for high availability?

To ensure highly available Kubernetes workloads, pods should also be replicated using Kubernetes controllers like ReplicaSets, Deployments and StatefulSets.

Both Deployments and StatefulSets are central to the concept of high availability and ensure that the desired number of pods is always maintained. The number of replicas is usually dictated by application requirements.

Kubernetes recommends using Deployments over ReplicaSets for pod replication, since they are declarative and allow you to roll back to previous versions easily. However, if your use case requires custom update orchestration or no updates at all, you can still use ReplicaSets.
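A minimal Deployment sketch maintaining three replicas; the name, labels and image are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3            # desired number of pod replicas
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25   # example image
```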

Spinning up any naked pods?

Are all your pods part of a ReplicaSet or Deployment? Naked pods are not rescheduled in the event of node failure or shutdown. It is therefore best practice to always spin up pods as part of a ReplicaSet or Deployment.


Setup Federation for multiple clusters?

If you are provisioning multiple clusters for low latency, availability and scalability, setting up Kubernetes federation is a best practice. Federation allows you to keep resources across clusters in sync and auto-configures DNS servers and load balancers.

Federating clusters involves first setting up the federation control plane and then creating federation API resources.

Configured heartbeat and election timeout intervals for etcd members?

When configuring etcd clusters, it is important to correctly specify both the heartbeat interval and the election timeout. The heartbeat interval is the frequency with which the etcd leader notifies followers. The election timeout is how long a follower waits for a heartbeat before attempting to become leader itself.

The recommended heartbeat interval is the round-trip time between members. The election timeout should be at least 10 times the round-trip time between members.
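The etcd defaults (100 ms heartbeat, 1000 ms election timeout) already follow this 10x ratio. To adjust them for your measured round-trip times, pass the corresponding flags (values are in milliseconds):

```shell
etcd --heartbeat-interval=100 --election-timeout=1000
```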


Setup Ingress?

Ingress allows HTTP and HTTPS traffic from outside the cluster to reach services inside it. Ingress can also be used for load balancing, terminating SSL and giving services externally reachable URLs.

In order for ingress to work, your cluster needs an ingress controller. Kubernetes officially supports the GCE and nginx controllers as of now. Here is a list of other ingress controllers you might want to check out.

You can also create an external cloud load balancer in place of the ingress resource, by including type: LoadBalancer in the Service configuration file.
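A minimal Ingress sketch routing a hypothetical host to a backend Service; the API version shown is networking.k8s.io/v1, and older clusters use different field names:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
  - host: app.example.com        # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web            # hypothetical backend Service
            port:
              number: 80
```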


Resource Management

Configured resource requests and limits for containers?

Resource requests and limits help you manage resource consumption by individual containers. Resource requests tell the scheduler how much of a resource to set aside for a container; limits are the maximum amount of resources the container is allowed to consume.

Resource requests and limits can be set for CPU, memory and ephemeral storage resources. Setting them is a Kubernetes best practice and will help avoid containers getting throttled due to lack of resources or going berserk and hogging resources.

To check whether all containers inside a pod have resource requests and limits defined, use the following command:

kubectl describe pod -n <namespace_name> <pod_name>

This will display a list of all containers with the corresponding limits and requests for both CPU and memory resources.
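In the pod spec, requests and limits are set per container. A sketch with illustrative values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app               # hypothetical pod name
spec:
  containers:
  - name: app
    image: nginx:1.25     # example image
    resources:
      requests:           # what the scheduler sets aside for the container
        cpu: 250m         # a quarter of a core
        memory: 64Mi
      limits:             # hard caps on consumption
        cpu: 500m
        memory: 128Mi
```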


Specified resource requests and limits for local ephemeral storage?

Local ephemeral storage is a new type of resource introduced in Kubernetes 1.8. Containers use ephemeral storage for local scratch space. If you have configured local ephemeral storage, check that you have set requests and limits for this resource type for each container.

Here is how you can check whether requests and limits have been defined for local ephemeral storage for all containers:

kubectl describe pod -n <namespace_name> <pod_name>

Created separate namespaces for your teams?

Kubernetes namespaces are virtual partitions of your Kubernetes cluster. It is recommended best practice to create separate namespaces for individual teams, projects or customers; examples include dev, production and frontend. You can also create separate namespaces based on custom application or organizational requirements. Here is how you can display a list of all namespaces:

kubectl get namespaces

or

kubectl get namespaces --show-labels

You can also display a list of the pods running across all namespaces with kubectl get pods --all-namespaces


Configured default resource requests and limits for namespaces?

Default requests and limits specify the default memory and CPU values for all containers inside a namespace. When a container is created inside such a namespace without explicitly defined request and limit values, it automatically inherits the defaults. Configuring default values at the namespace level is a best practice to ensure that all containers created inside that namespace get assigned both request and limit values.

Here is how you check whether a namespace has been assigned default resource requests and limits:

kubectl describe namespace <namespace_name>
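Namespace defaults are set with a LimitRange object. A sketch with illustrative values and a hypothetical namespace name:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: dev               # hypothetical namespace
spec:
  limits:
  - type: Container
    defaultRequest:            # default requests for containers that omit them
      cpu: 250m
      memory: 128Mi
    default:                   # default limits for containers that omit them
      cpu: 500m
      memory: 256Mi
```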

Configured limit ranges for namespaces?

Limit ranges also work at the namespace level and allow us to specify the minimum and maximum CPU and memory resources that individual containers inside a namespace can consume.

Whenever a container is created inside a namespace with a limit range, its resource requests must be equal to or higher than the minimum values defined in the limit range, and its CPU and memory limits must be equal to or lower than the maximum values defined there.

Check whether limit ranges have been configured:

kubectl describe namespace <namespace_name>

Specified resource quotas for namespaces?

Resource quotas also work at the namespace level and provide another layer of control over cluster resource usage. They limit the total amount of CPU, memory and storage resources that can be consumed by all containers running in a namespace.

Consumption of storage resources by persistent volume claims can also be limited per storage class. Kubernetes administrators can define storage classes based on quality of service levels or backup policies.

Check whether resource quotas have been configured:

kubectl describe namespace <namespace_name>
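A ResourceQuota sketch capping aggregate compute usage in a namespace; the values and namespace name are illustrative:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev               # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"          # sum of CPU requests across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"            # sum of CPU limits across the namespace
    limits.memory: 16Gi
```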


Configured pod and API quotas for namespaces?

Pod quotas allow you to restrict the total number of pods that can run inside a namespace. API quotas let you set limits for other API objects like PersistentVolumeClaims, Services and ReplicaSets. Pod and API quotas are a good way to manage resource usage at the namespace level.

To check whether quotas have been configured:

kubectl describe namespace <namespace_name>
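Object counts use the same ResourceQuota mechanism. A sketch with illustrative counts; the `count/<resource>.<group>` syntax covers arbitrary API object types:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-quota
  namespace: dev                   # hypothetical namespace
spec:
  hard:
    pods: "20"                     # max pods in the namespace
    services: "10"
    persistentvolumeclaims: "5"
    count/replicasets.apps: "20"   # count quota for an API object type
```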

Ensured resource availability for etcd?

etcd clusters typically have a pretty large resource footprint, so it is best practice to run them on dedicated hardware to ensure they have access to enough resources. Resource starvation can lead to the cluster becoming unstable, which in turn means that no new pods can be scheduled.

Here is an etcd resource guide based on the number of nodes in the cluster and the clients being served. A typical etcd cluster needs 2-4 CPU cores, 8 GB of memory, 50 sequential IOPS and a 1 GbE network connection to run smoothly. For larger etcd clusters, check out this handy guide.


Configured etcd snapshot memory usage?

etcd snapshots are backups of the etcd cluster that can be used for disaster recovery. The --snapshot-count flag determines the number of changes that must be committed to etcd before a snapshot is taken. A higher --snapshot-count holds more entries in memory until the next snapshot, which can lead to higher memory usage. The default value for --snapshot-count in etcd v3.2 is 100,000.

Make sure to configure this number based on your unique cluster requirements. You can do this using:

etcd --snapshot-count=X

Attached labels to Kubernetes objects?

Labels allow Kubernetes objects to be queried and operated on in bulk. They can also be used to identify and organize Kubernetes objects into groups. As such, defining labels should figure right at the top of any Kubernetes best practices list. Here is a list of recommended Kubernetes labels that should be defined for every deployment.

Check whether pods have been labelled:

kubectl get pods --show-labels


Limited the number of pods that can run on a node?

You can control the number of pods that can be scheduled on a node using the kubelet's --max-pods flag. This helps avoid scenarios where rogue or misconfigured jobs create pods in such large numbers that they overwhelm system pods.

Reserved compute resources for system daemons?

Another best practice is to reserve resources for the system daemons that both the OS and Kubernetes itself need in order to run. All three resource types (CPU, memory and ephemeral storage) can be reserved for system daemons. Once reserved, these resources are deducted from node capacity and exposed as node allocatable resources. The following kubelet flags can be used to reserve resources for system daemons:

--kube-reserved: reserves resources for Kubernetes system daemons like the kubelet, container runtime and node problem detector.

--system-reserved: reserves resources for OS system daemons like sshd and udev.
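A kubelet invocation sketch reserving resources for both daemon classes; the amounts are illustrative and should be sized to your nodes:

```shell
kubelet --kube-reserved=cpu=500m,memory=1Gi,ephemeral-storage=1Gi \
        --system-reserved=cpu=500m,memory=1Gi
```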


Configured API request processing for the API server?

To manage CPU and memory consumption by the API server, make sure to configure the maximum number of requests that can be processed by the API server in parallel. This can be done using the --max-requests-inflight and --max-mutating-requests-inflight flags.

Processing a lot of API requests in parallel is very CPU intensive for the API server and can also lead to OOM (out of memory) events.

Configured out of resource handling?

Make sure you configure out of resource handling to prevent unused images and dead pods and containers from taking up unnecessary space on the node.

Out of resource handling specifies the kubelet's behavior when the node starts to run low on resources. In such cases, the kubelet will first try to reclaim resources by deleting dead pods (and their containers) and unused images. If it cannot reclaim sufficient resources, it will then start evicting pods.


You can influence when the kubelet kicks into action by configuring eviction thresholds for eviction signals. Thresholds can be configured for the nodefs.available, nodefs.inodesFree, imagefs.available and imagefs.inodesFree eviction signals via kubelet flags.

Following are some examples:

nodefs.available<10%
nodefs.inodesFree<5%
imagefs.available<15%
imagefs.inodesFree<20%

Doing this will ensure that unused images and dead containers and pods do not take up unnecessary disk space.

You should also consider specifying a threshold for the memory.available signal. This ensures that the kubelet kicks into action when free memory on the node falls below your desired level.

Another best practice is to pass the --eviction-minimum-reclaim flag to the kubelet. This prevents the kubelet from bouncing back and forth across the eviction threshold by reclaiming only a tiny amount of resources each time: once an eviction threshold is triggered, the kubelet will keep evicting pods until the minimum reclaim amount is reached.
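Putting these together, a kubelet invocation sketch combining hard eviction thresholds with minimum reclaim amounts; all values are illustrative:

```shell
kubelet --eviction-hard=memory.available<500Mi,nodefs.available<10%,imagefs.available<15% \
        --eviction-minimum-reclaim=memory.available=0Mi,nodefs.available=500Mi,imagefs.available=2Gi
```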


Using recommended settings for Persistent Volumes?

Persistent Volumes (PVs) represent a piece of storage in a Kubernetes cluster. PVs are similar to regular Kubernetes volumes with one difference: they have a lifecycle that is independent of any specific pod in the cluster. Regular Kubernetes volumes, on the other hand, do not persist data across pod restarts.

Persistent Volume Claims (PVCs) are requests for storage resources by a user. PVCs consume PV resources in the same way that a pod consumes node resources.

When creating PVs, the Kubernetes documentation recommends the following best practices:

• Always include Persistent Volume Claims in the config
• Never include PVs in the config
• Always create a default storage class
• Give the user the option of providing a storage class name
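Following those recommendations, a config would carry a PVC like the sketch below, leaving the storage class optional so the cluster default is used when it is omitted; the claim name and size are illustrative:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim             # hypothetical claim name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi            # illustrative size
  # storageClassName: fast     # optional: lets the user override the default class
```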

Enabled log rotation?

If you have node-level logging enabled, make sure you also enable log rotation to avoid logs consuming all available storage on the node. Enable the logrotate tool for clusters deployed by the kube-up.sh script on GCP, or Docker's log-opt for all other environments.


Prevented the kubelet from setting or modifying label keys?

If you are using labels and label selectors to target pods to specific nodes for security or regulatory purposes, make sure you also choose label keys that cannot be modified by the kubelet. This ensures that compromised nodes cannot use their kubelet credentials to label their own node object and attract pods.

Make sure you use the Node authorizer, enable the NodeRestriction admission plugin, and always prefix node labels with node-restriction.kubernetes.io/ as well as adding the same prefix to label selectors.


Security

Using the latest Kubernetes version?

Kubernetes regularly pushes out new versions with critical bug fixes and new security features. Make sure you have upgraded to the latest version to take advantage of these features.

Enabled RBAC (Role-Based Access Control)?

Kubernetes RBAC allows you to regulate access to your Kubernetes environment. Using RBAC, Kubernetes admins can define policies authorizing user access as well as the extent of that access to resources. Make sure to use all four high-level API objects, Role, ClusterRole, RoleBinding and ClusterRoleBinding, to secure your Kubernetes environment.

You can enable the RBAC API by starting the API server with --authorization-mode=RBAC

Following user access best practices?

Make sure you limit the scope of user permissions by chopping up your Kubernetes environment into separate namespaces. This isolates resources and helps contain the damage in case of security misconfigurations or malicious activity. You can do this by using Roles (which grant access to resources within a single namespace) as opposed to ClusterRoles (which grant access to resources cluster-wide).

Another best practice is to limit user permissions to the resources they actually need access to. For example, you can limit a Role to a specific namespace and a set of resources, e.g. pods. Avoid giving admin access as much as possible.
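A sketch of a namespace-scoped Role limited to pods, bound to a hypothetical user; the namespace, role and user names are illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev               # hypothetical namespace
rules:
- apiGroups: [""]              # "" means the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: User
  name: jane                   # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```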

Enabled audit logging?

Kubernetes audits are performed by the kube-apiserver and allow you to record a sequence of the activities that users or other system components perform. You should also monitor audit logs so you can react proactively to malicious activity or authorization failures.

You can define rules about which events to record and what data to log in an audit policy. Here is a minimal audit policy which will log all metadata related to requests:

# Log all requests at the Metadata level.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata

You can also implement logging at the Request level, which logs both metadata and the request body, or at the RequestResponse level, which logs the response body in addition to request metadata and request body.


Set up a bastion host?

Setting up a bastion host is another security best practice. A bastion host allows you to shrink your SSH footprint to only one host. Once a bastion host has been set up, it serves as the only entry point into the cluster; all other hosts are accessed via SSH from that host.

You can enable a bastion host in kops by passing the --bastion flag during cluster creation.

Enabled the AlwaysPullImages admission controller?

AlwaysPullImages is an admission controller which ensures that images are always pulled with the correct authorization and cannot be re-used by other pods without first providing credentials. This is very useful in a multi-tenant environment and ensures that images can only be used by users with the correct credentials.

You can enable AlwaysPullImages using the flag:

kube-apiserver --enable-admission-plugins=AlwaysPullImages


Besides these, you should also enable other recommended Kubernetes admission controllers, including NamespaceLifecycle, LimitRanger, ServiceAccount, DefaultStorageClass, DefaultTolerationSeconds, MutatingAdmissionWebhook, ValidatingAdmissionWebhook, Priority and ResourceQuota.

Here is how you can check which admission controllers can be enabled:

kube-apiserver -h | grep enable-admission-plugins

Defined a pod security policy and enabled it in the admission controller?

Pod security policies outline a set of security-sensitive conditions that pods must fulfil in order to be scheduled. PodSecurityPolicy is also a recommended admission controller. Here is an example of a restrictive pod security policy: it forces all users to run as unprivileged, disables privilege escalation, and enables other restrictive security settings.

You can then enable the admission controller using the flag:

--enable-admission-plugins=PodSecurityPolicy


Chosen a network plugin and configured network policies?

Network policies allow you to configure how the various components of your Kubernetes environment, including pods, containers, services and namespaces, communicate with each other and with outside network endpoints.

Here is a great list of network policy recipes to get you started. Network policies are enforced by a networking plugin, so make sure you have chosen one that supports them; Calico and Cilium are good examples.
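A common starting point is a default-deny policy, which blocks all ingress traffic to pods in a namespace unless another policy explicitly allows it; a sketch with a hypothetical namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: dev               # hypothetical namespace
spec:
  podSelector: {}              # selects all pods in the namespace
  policyTypes:
  - Ingress                    # no ingress rules listed, so all ingress is denied
```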

Implemented authentication for kubelet?

The kubelet allows anonymous authentication by default. You can avoid this by enabling RBAC or by disabling anonymous access to the kubelet's HTTPS endpoint, starting the kubelet with the flag --anonymous-auth=false

You can then enable X509 client certificate authentication by starting the kubelet with the --client-ca-file flag, and the API server with the --kubelet-client-certificate and --kubelet-client-key flags.

You can also enable API bearer tokens by ensuring the authentication.k8s.io/v1beta1 API group is enabled in the API server and starting the kubelet with the --authentication-token-webhook and --kubeconfig flags.


Configured Kubernetes secrets?

Sensitive information related to your Kubernetes environment, like passwords, tokens or keys, should always be stored in a Kubernetes Secret object. You can see a list of the secrets already created using:

kubectl get secrets

Enabled data encryption at rest?

Encrypting data at rest is another security best practice. In Kubernetes, data can be encrypted using one of these four providers: aescbc, secretbox, aesgcm or kms. Encryption can be enabled by passing the --encryption-provider-config flag to the kube-apiserver process.

To ensure all existing secrets are encrypted, rewrite them using:

kubectl get secrets --all-namespaces -o json | kubectl replace -f -

This applies server-side encryption to all secrets.
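The file passed via --encryption-provider-config looks roughly like the sketch below; the key shown is a placeholder, not a real key:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>   # placeholder, generate your own
  - identity: {}               # fallback provider for reading unencrypted data
```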


Disabled default service account?

All newly created pods and containers without a service account are automatically assigned the default service account. The default service account has a very wide range of permissions in the cluster and should therefore be disabled.

You can do this by setting automountServiceAccountToken: false on the service account.

Scanned containers for security vulnerabilities?

Another security best practice is to scan your container images for known security vulnerabilities. You can do this using open source tools like Anchore and Clair, which will help you identify common vulnerabilities and exposures (CVEs) and mitigate them.

Configured security context for pods, containers and volumes?

A security context specifies privilege and access control settings for pods and containers. A pod's security context is defined by including the securityContext field in the pod specification. Once a security context has been specified for a pod, it automatically propagates to all the containers that belong to the pod.

A best practice when setting the security context is to set both runAsNonRoot and readOnlyRootFilesystem to true and allowPrivilegeEscalation to false. This introduces more layers of defense into your container and Kubernetes environment and prevents privilege escalation.
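A pod spec sketch applying these settings, with runAsNonRoot at the pod level and the other two fields on the container; the names and image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app           # hypothetical pod name
spec:
  securityContext:
    runAsNonRoot: true         # refuse to start containers that run as root
  containers:
  - name: app
    image: nginx:1.25          # example image; may need extra config to run non-root
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
```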

Enabled Kubernetes logging?

Kubernetes logs will help you understand what is happening inside your cluster, as well as debug problems and monitor activity. Logs for containerized applications are usually written to the standard output and standard error streams.

A best practice when implementing Kubernetes logging is to configure a separate lifecycle and storage for logs from pods, containers and nodes. You can do this by implementing a cluster-level logging architecture.


Scalability

Configured the horizontal pod autoscaler?

The horizontal pod autoscaler (HPA) automatically scales the number of pods in a deployment or replica set based on CPU utilization or custom metrics. You can quickly create an HPA for a replica set with:

kubectl autoscale rs <rs_name> --min=3 --max=7 --cpu-percent=85

You can also list HPAs using:

kubectl get hpa

or

kubectl describe hpa
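The same autoscaler can be declared as an object; a sketch using the autoscaling/v1 API and a hypothetical target Deployment:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                  # hypothetical target workload
  minReplicas: 3
  maxReplicas: 7
  targetCPUUtilizationPercentage: 85
```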

Configured vertical pod autoscaler?

The vertical pod autoscaler (VPA) automatically sets resource requests and limits for containers and pods based on resource utilization metrics. The VPA can change resource limits and requests, and can do this for new pods as well as existing pods.

You can inspect VPAs using:

kubectl get vpa <vpa_name>

Configured cluster autoscaler?

The cluster autoscaler (CA) automatically scales cluster size based on two signals: whether there are any pending pods, and the utilization of nodes.

If the CA detects any pending pods during its periodic checks, it requests more nodes from the cloud provider. The CA will also downscale the cluster and remove idle nodes if they are underutilized.


Monitoring

Set up a monitoring pipeline?

There are a number of open source monitoring tools that you can use to monitor your Kubernetes clusters. Prometheus + Grafana is one of the most widely used monitoring toolsets among DevOps teams. You can read our tutorial for setting up a monitoring pipeline using Prometheus and Grafana here.

Selected a list of metrics to monitor?

Setting up a metrics pipeline also involves identifying the list of metrics you want to track.

In the context of resource management, the most useful metrics to track are usage and utilization metrics for CPU, memory and filesystem. These can be tracked at many different levels of abstraction, from clusters and namespaces down to pods and nodes.


Bonus

Run an end-to-end (e2e) test?

End-to-end tests are a great way to ensure that your Kubernetes environment will behave in a consistent and reliable manner when pushed into production. End-to-end tests also enable developers to identify bugs before pushing their application out to end users.

You can run an e2e test by installing kubetest:

go get -u k8s.io/test-infra/kubetest

Here is how you can build Kubernetes, bring up a cluster, run the tests, and tear everything down once the test is finished:

kubetest --build --up --test --down

Mapped external services?

Most applications connect to services residing outside the Kubernetes cluster. In such cases, it is a best practice to use the native Kubernetes Service abstraction to connect to these external services.

A basic setup involves creating a Service without pod selectors and an Endpoints object with the external IP. The Endpoints object receives traffic from the Service and forwards it to the external IP.

Mapping external services in this way makes it easier to organize and manage the external services used by the team or organization as a whole. You also won't need to use IP addresses directly in your code, and updates are easier to make, since they only involve changing the IP address in the Endpoints object, without any changes to application code.
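A sketch of such a mapping for a hypothetical external database; 192.0.2.10 is a documentation-reserved placeholder IP:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-db            # no selector, so no endpoints are created automatically
spec:
  ports:
  - port: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-db            # must match the Service name
subsets:
- addresses:
  - ip: 192.0.2.10             # the external service's IP
  ports:
  - port: 5432
```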

Installed the DNS add-on?

Another best practice is to provision a DNS server as a cluster add-on. A DNS server is the recommended method for service discovery in Kubernetes: it watches the Kubernetes API for new Services and creates a set of DNS entries for each. Kubernetes recommends installing CoreDNS as the DNS server.

The DNS add-on makes it easy for pods to connect to services by doing a DNS query for the service name, or for servicename.namespacename when reaching across namespaces.

AUTHOR

Hasham Haider
Fan of all things cloud, containers and micro-services!


Get in touch
replex.io | [email protected]

*The information provided within this eBook is for general informational purposes only. While we try to keep the
information up-to-date and correct, there are no representations or warranties, express or implied, about the
completeness, accuracy, reliability, suitability or availability with respect to the information, products, services, or
related graphics contained in this eBook for any purpose. Any use of this information is at your own risk.
