Kubernetes Production Readiness and Best Practices Checklist

Here is our complete checklist to ensure your Kubernetes deployments are ready for prime time. In this checklist, we will cover the topics of availability, resource management, security, scalability and monitoring for Kubernetes. If you have already read the blog post, you can skip the section on availability.

Availability
Configured liveness and readiness probes?

A liveness probe is the Kubernetes equivalent of "have you tried turning it off and on again". Liveness probes detect containers that are not able to recover from failed states and restart them, making them a great tool for building auto-recovery into production Kubernetes deployments. You can create liveness probes based on exec command, HTTP or TCP checks.

Readiness probes detect whether a container is temporarily unable to receive traffic and mitigate these situations by stopping traffic flow to it. During deployment updates, readiness probes also ensure that new pods are ready to receive traffic before traffic is allowed to flow to them.
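As an illustration, here is a minimal pod spec sketch configuring both probe types with HTTP checks; the pod name, image and endpoint paths are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web               # hypothetical pod name
spec:
  containers:
  - name: web
    image: nginx:1.25     # example image
    livenessProbe:        # restart the container if this check keeps failing
      httpGet:
        path: /healthz    # hypothetical health endpoint
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:       # stop routing traffic while this check fails
      httpGet:
        path: /ready      # hypothetical readiness endpoint
        port: 80
      periodSeconds: 5
```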


Provisioned at least 3 master nodes?

Having the control plane replicated across 3 nodes is the minimum configuration required for a highly available Kubernetes cluster. etcd needs a majority of its members to form a quorum and continue functioning. With 3 master nodes, the cluster can survive the failure of 1 master node, since the remaining 2 still form a majority.

Here is a table outlining the fault tolerance of different cluster sizes:

Cluster size    Majority    Failure tolerance
1               1           0
2               2           0
3               2           1
4               3           1
5               3           2
6               4           2
7               4           3

Replicated master nodes in odd numbers?

As is apparent from this table, master nodes should always be replicated in odd numbers: an odd-sized cluster tolerates the same number of failures as the next-larger even-sized cluster, so the extra node adds no fault tolerance.

Isolated etcd replicas?

The etcd master component is responsible for storing and replicating cluster state, and as such has high resource requirements. A best practice, therefore, is to isolate the etcd replicas by placing them on dedicated nodes. This decouples the control plane components from the etcd members and ensures sufficient resource availability for etcd, making the cluster more robust and reliable.

It is recommended to run at least a 5-member etcd cluster in production.


Have a plan for regular etcd backups?

Since etcd stores cluster state, regularly backing up etcd data is a best practice. It is also a good idea to store etcd backups on a separate host. etcd clusters can be backed up by taking a snapshot with the etcdctl snapshot save command or by copying the member/snap/db file from an etcd data directory.

If you are running on public cloud provider storage volumes, it is relatively easy to create etcd backups by taking a snapshot of the storage volume.
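For example, a snapshot can be taken and verified with etcdctl; the endpoint and file name below are placeholders, and TLS/auth flags are omitted for brevity:

```shell
# Take a snapshot of the etcd keyspace (v3 API)
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 snapshot save backup.db

# Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status backup.db
```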

Distributed master nodes across zones?

Distributing master nodes across zones is also a high availability best practice, and ensures that master nodes survive outages of entire availability zones.

Using kops, master nodes can easily be distributed across zones with the --master-zones flag.

Distributed worker nodes across zones?

Worker nodes should also be distributed across availability zones. In kops, this is done with the --zones flag.
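Both flags can be combined in a single kops invocation. A sketch, assuming AWS zone names and a hypothetical cluster name; your state store and zones will differ:

```shell
kops create cluster \
  --name my-cluster.example.com \               # hypothetical cluster name
  --master-zones us-east-1a,us-east-1b,us-east-1c \
  --zones us-east-1a,us-east-1b,us-east-1c \
  --node-count 3
```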


Configured autoscaling for both master and worker nodes?

When using the cloud, a best practice is to place both master and worker nodes in autoscaling groups. Autoscaling groups automatically bring up a replacement node in the event of termination. Kops places both master and worker nodes into autoscaling groups by default.

Baked-in HA load balancing?

Once multiple master replicas have been deployed, the next obvious step is to load balance traffic to and from those replicas. You can do this by creating an L4 load balancer in front of all apiserver instances and updating the DNS name appropriately, or by using round-robin DNS to reach all apiserver instances directly. Check this document for more information.

Configured active-passive setup for scheduler and controller manager?

Unlike the other control plane components, the scheduler and controller manager actively modify cluster state, so only one replica of each should be active at a time. Once both components have been replicated across zones, they should be configured in an active-passive setup.

This can be done by passing the --leader-elect flag to kube-scheduler and kube-controller-manager.


Configured the correct number of pod replicas for high availability?

To ensure highly available Kubernetes workloads, pods should also be replicated using Kubernetes controllers like ReplicaSets, Deployments and StatefulSets.

Both Deployments and StatefulSets are central to the concept of high availability and ensure that the desired number of pods is always maintained. The number of replicas is usually dictated by application requirements.

Kubernetes recommends using Deployments over ReplicaSets for pod replication, since they are declarative and allow you to roll back to previous versions easily. However, if your use case requires custom update orchestration or no updates at all, you can still use ReplicaSets.
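A minimal Deployment sketch maintaining three replicas; the name, labels and image are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3            # desired number of pod replicas
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25   # example image
```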

Spinning up any naked pods?

Are all your pods part of a ReplicaSet or Deployment? Naked pods are not rescheduled in the event of node failure or shutdown. It is therefore best practice to always spin up pods as part of a ReplicaSet or Deployment.


Setup Federation for multiple clusters?

If you are provisioning multiple clusters for low latency, availability and scalability, setting up Kubernetes federation is a best practice. Federation allows you to keep resources across clusters in sync and auto-configures DNS servers and load balancers.

Federating clusters involves first setting up the federation control plane and then creating federation API resources.

Configured heartbeat and election timeout intervals for etcd members?

When configuring etcd clusters, it is important to correctly specify both the heartbeat interval and the election timeout. The heartbeat interval is the frequency with which the etcd leader notifies followers. The election timeout is how long a follower waits for a heartbeat before attempting to become leader itself.

The recommended heartbeat interval is the round-trip time between members. The election timeout should be at least 10 times the round-trip time between members.
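The etcd defaults (100 ms heartbeat, 1000 ms election timeout) already follow this 10x ratio. To adjust them for your measured round-trip times, pass the corresponding flags (values are in milliseconds):

```shell
etcd --heartbeat-interval=100 --election-timeout=1000
```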


Setup Ingress?

Ingress allows HTTP and HTTPS traffic from outside the cluster to reach services inside it. Ingress can also be used for load balancing, terminating SSL and giving services externally reachable URLs.

In order for ingress to work, your cluster needs an ingress controller. Kubernetes officially supports the GCE and nginx controllers as of now. Here is a list of other ingress controllers you might want to check out.

You can also create an external cloud load balancer in place of the ingress resource, by including type: LoadBalancer in the Service configuration file.
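A minimal Ingress sketch routing a hypothetical host to a backend Service; the API version shown is networking.k8s.io/v1, and older clusters use different field names:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
  - host: app.example.com        # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web            # hypothetical backend Service
            port:
              number: 80
```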


Resource Management

Configured resource requests and limits for containers?

Resource requests and limits help you manage resource consumption by individual containers. Resource requests tell the scheduler how much of a resource to set aside for a container; limits are the maximum amount of resources the container is allowed to consume.

Resource requests and limits can be set for CPU, memory and ephemeral storage resources. Setting them is a Kubernetes best practice and will help avoid containers getting throttled due to lack of resources or going berserk and hogging resources.

To check whether all containers inside a pod have resource requests and limits defined, use the following command:

kubectl describe pod -n <namespace_name> <pod_name>

This will display a list of all containers with the corresponding limits and requests for both CPU and memory resources.
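In the pod spec, requests and limits are set per container. A sketch with illustrative values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app               # hypothetical pod name
spec:
  containers:
  - name: app
    image: nginx:1.25     # example image
    resources:
      requests:           # what the scheduler sets aside for the container
        cpu: 250m         # a quarter of a core
        memory: 64Mi
      limits:             # hard caps on consumption
        cpu: 500m
        memory: 128Mi
```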


Specified resource requests and limits for local ephemeral storage?

Local ephemeral storage is a new type of resource introduced in Kubernetes 1.8. Containers use ephemeral storage for local scratch space. If you have configured local ephemeral storage, check that you have set requests and limits for this resource type for each container.

Here is how you can check whether requests and limits have been defined for local ephemeral storage for all containers:

kubectl describe pod -n <namespace_name> <pod_name>

Created separate namespaces for your teams?

Kubernetes namespaces are virtual partitions of your Kubernetes cluster. It is recommended best practice to create separate namespaces for individual teams, projects or customers; examples include dev, production and frontend. You can also create separate namespaces based on custom application or organizational requirements. Here is how you can display a list of all namespaces:

kubectl get namespaces

or

kubectl get namespaces --show-labels

You can also display a list of the pods running across all namespaces with kubectl get pods --all-namespaces


Configured default resource requests and limits for namespaces?

Default requests and limits specify the default memory and CPU values for all containers inside a namespace. When a container is created inside such a namespace without explicitly defined request and limit values, it automatically inherits the defaults. Configuring default values at the namespace level is a best practice to ensure that all containers created inside that namespace get assigned both request and limit values.

Here is how you check whether a namespace has been assigned default resource requests and limits:

kubectl describe namespace <namespace_name>
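Namespace defaults are set with a LimitRange object. A sketch with illustrative values and a hypothetical namespace name:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: dev               # hypothetical namespace
spec:
  limits:
  - type: Container
    defaultRequest:            # default requests for containers that omit them
      cpu: 250m
      memory: 128Mi
    default:                   # default limits for containers that omit them
      cpu: 500m
      memory: 256Mi
```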

Configured limit ranges for namespaces?

Limit ranges also work at the namespace level and allow us to specify the minimum and maximum CPU and memory resources that individual containers inside a namespace can consume.

Whenever a container is created inside a namespace with a limit range, its resource requests must be equal to or higher than the minimum values defined in the limit range, and its CPU and memory limits must be equal to or lower than the maximum values defined there.

Check whether limit ranges have been configured:

kubectl describe namespace <namespace_name>

Specified resource quotas for namespaces?

Resource quotas also work at the namespace level and provide another layer of control over cluster resource usage. They limit the total amount of CPU, memory and storage resources that can be consumed by all containers running in a namespace.

Consumption of storage resources by persistent volume claims can also be limited per storage class. Kubernetes administrators can define storage classes based on quality of service levels or backup policies.

Check whether resource quotas have been configured:

kubectl describe namespace <namespace_name>
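A ResourceQuota sketch capping aggregate compute usage in a namespace; the values and namespace name are illustrative:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev               # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"          # sum of CPU requests across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"            # sum of CPU limits across the namespace
    limits.memory: 16Gi
```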


Configured pod and API quotas for namespaces?

Pod quotas allow you to restrict the total number of pods that can run inside a namespace. API quotas let you set limits for other API objects like PersistentVolumeClaims, Services and ReplicaSets. Pod and API quotas are a good way to manage resource usage at the namespace level.

To check whether quotas have been configured:

kubectl describe namespace <namespace_name>
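Object counts use the same ResourceQuota mechanism. A sketch with illustrative counts; the `count/<resource>.<group>` syntax covers arbitrary API object types:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-quota
  namespace: dev                   # hypothetical namespace
spec:
  hard:
    pods: "20"                     # max pods in the namespace
    services: "10"
    persistentvolumeclaims: "5"
    count/replicasets.apps: "20"   # count quota for an API object type
```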

Ensured resource availability for etcd?

etcd clusters typically have a pretty large resource footprint, so it is best practice to run them on dedicated hardware to ensure they have access to enough resources. Resource starvation can lead to the cluster becoming unstable, which in turn means that no new pods can be scheduled.

Here is an etcd resource guide based on the number of nodes in the cluster and the clients being served. A typical etcd cluster needs 2-4 CPU cores, 8 GB of memory, 50 sequential IOPS and a 1 GbE network connection to run smoothly. For larger etcd clusters, check out this handy guide.


Configured etcd snapshot memory usage?

etcd snapshots are backups of the etcd cluster that can be used for disaster recovery. The --snapshot-count flag determines the number of changes that must be committed to etcd before a snapshot is taken. A higher --snapshot-count holds more entries in memory until the next snapshot, which can lead to higher memory usage. The default value for --snapshot-count in etcd v3.2 is 100,000.

Make sure to configure this number based on your unique cluster requirements. You can do this using:

etcd --snapshot-count=X

Attached labels to Kubernetes objects?

Labels allow Kubernetes objects to be queried and operated on in bulk. They can also be used to identify and organize Kubernetes objects into groups. As such, defining labels should figure right at the top of any Kubernetes best practices list. Here is a list of recommended Kubernetes labels that should be defined for every deployment.

Check whether pods have been labelled:

kubectl get pods --show-labels


Limited the number of pods that can run on a node?

You can control the number of pods that can be scheduled on a node using the kubelet's --max-pods flag. This helps avoid scenarios where rogue or misconfigured jobs create pods in such large numbers that they overwhelm system pods.

Reserved compute resources for system daemons?

Another best practice is to reserve resources for the system daemons that both the OS and Kubernetes itself need in order to run. All three resource types (CPU, memory and ephemeral storage) can be reserved for system daemons. Once reserved, these resources are deducted from node capacity and exposed as node allocatable resources. The following kubelet flags can be used to reserve resources for system daemons:

--kube-reserved: reserves resources for Kubernetes system daemons like the kubelet, container runtime and node problem detector.

--system-reserved: reserves resources for OS system daemons like sshd and udev.
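A kubelet invocation sketch reserving resources for both daemon classes; the amounts are illustrative and should be sized to your nodes:

```shell
kubelet --kube-reserved=cpu=500m,memory=1Gi,ephemeral-storage=1Gi \
        --system-reserved=cpu=500m,memory=1Gi
```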


Configured API request processing for the API server?

To manage CPU and memory consumption by the API server, make sure to configure the maximum number of requests that can be processed by the API server in parallel. This can be done using the --max-requests-inflight and --max-mutating-requests-inflight flags.

Processing a lot of API requests in parallel is very CPU intensive for the API server and can also lead to OOM (out of memory) events.

Configured out of resource handling?

Make sure you configure out of resource handling to prevent unused images and dead pods and containers from taking up unnecessary space on the node.

Out of resource handling specifies the kubelet's behavior when the node starts to run low on resources. In such cases, the kubelet will first try to reclaim resources by deleting dead pods (and their containers) and unused images. If it cannot reclaim sufficient resources, it will then start evicting pods.


You can influence when the kubelet kicks into action by configuring eviction thresholds for eviction signals. Thresholds can be configured for the nodefs.available, nodefs.inodesFree, imagefs.available and imagefs.inodesFree eviction signals via kubelet flags.

Following are some examples:

nodefs.available<10%
nodefs.inodesFree<5%
imagefs.available<15%
imagefs.inodesFree<20%

Doing this will ensure that unused images and dead containers and pods do not take up unnecessary disk space.

You should also consider specifying a threshold for the memory.available signal. This ensures that the kubelet kicks into action when free memory on the node falls below your desired level.

Another best practice is to pass the --eviction-minimum-reclaim flag to the kubelet. This prevents the kubelet from bouncing back and forth across the eviction threshold by reclaiming only a tiny amount of resources each time: once an eviction threshold is triggered, the kubelet will keep evicting pods until the minimum reclaim amount is reached.
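Putting these together, a kubelet invocation sketch combining hard eviction thresholds with minimum reclaim amounts; all values are illustrative:

```shell
kubelet --eviction-hard=memory.available<500Mi,nodefs.available<10%,imagefs.available<15% \
        --eviction-minimum-reclaim=memory.available=0Mi,nodefs.available=500Mi,imagefs.available=2Gi
```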


Using recommended settings for Persistent Volumes?

Persistent Volumes (PVs) represent a piece of storage in a Kubernetes cluster. PVs are similar to regular Kubernetes volumes with one difference: they have a lifecycle that is independent of any specific pod in the cluster. Regular Kubernetes volumes, on the other hand, do not persist data across pod restarts.

Persistent Volume Claims (PVCs) are requests for storage resources by a user. PVCs consume PV resources in the same way that a pod consumes node resources.

When creating PVs, the Kubernetes documentation recommends the following best practices:

• Always include Persistent Volume Claims in the config
• Never include PVs in the config
• Always create a default storage class
• Give the user the option of providing a storage class name
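Following those recommendations, a config would carry a PVC like the sketch below, leaving the storage class optional so the cluster default is used when it is omitted; the claim name and size are illustrative:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim             # hypothetical claim name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi            # illustrative size
  # storageClassName: fast     # optional: lets the user override the default class
```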

Enabled log rotation?

If you have node-level logging enabled, make sure you also enable log rotation to avoid logs consuming all available storage on the node. Enable the logrotate tool for clusters deployed by the kube-up.sh script on GCP, or Docker's log-opt for all other environments.


Prevented the kubelet from setting or modifying label keys?

If you are using labels and label selectors to target pods to specific nodes for security or regulatory purposes, make sure you also choose label keys that cannot be modified by the kubelet. This ensures that compromised nodes cannot use their kubelet credentials to label their own node object and attract pods.

Make sure you use the Node authorizer, enable the NodeRestriction admission plugin, and always prefix node labels with node-restriction.kubernetes.io/ as well as adding the same prefix to label selectors.


Security

Using the latest Kubernetes version?

Kubernetes regularly pushes out new versions with critical bug fixes and new security features. Make sure you have upgraded to the latest version to take advantage of these features.

Enabled RBAC (Role-Based Access Control)?

Kubernetes RBAC allows you to regulate access to your Kubernetes environment. Using RBAC, Kubernetes admins can define policies authorizing user access as well as the extent of that access to resources. Make sure to use all four high-level API objects, Role, ClusterRole, RoleBinding and ClusterRoleBinding, to secure your Kubernetes environment.

You can enable the RBAC API by starting the API server with --authorization-mode=RBAC

Following user access best practices?

Make sure you limit the scope of user permissions by chopping up your Kubernetes environment into separate namespaces. This isolates resources and helps contain the damage in case of security misconfigurations or malicious activity. You can do this by using Roles (which grant access to resources within a single namespace) as opposed to ClusterRoles (which grant access to resources cluster-wide).

Another best practice is to limit user permissions to the resources they actually need access to. For example, you can limit a Role to a specific namespace and a set of resources, e.g. pods. Avoid giving admin access as much as possible.
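A sketch of a namespace-scoped Role limited to pods, bound to a hypothetical user; the namespace, role and user names are illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev               # hypothetical namespace
rules:
- apiGroups: [""]              # "" means the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: User
  name: jane                   # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```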

Enabled audit logging?

Kubernetes audits are performed by the kube-apiserver and allow you to record a sequence of the activities that users or other system components perform. You should also monitor audit logs so you can react proactively to malicious activity or authorization failures.

You can define rules about which events to record and what data to log in an audit policy. Here is a minimal audit policy which will log all metadata related to requests:

# Log all requests at the Metadata level.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata

You can also implement logging at the Request level, which logs both metadata and the request body, or at the RequestResponse level, which logs the response body in addition to request metadata and request body.


Set up a bastion host?

Setting up a bastion host is another security best practice. A bastion host allows you to shrink your SSH footprint to only one host. Once a bastion host has been set up, it serves as the only entry point into the cluster; all other hosts are accessed via SSH from that host.

You can enable a bastion host in kops by passing the --bastion flag during cluster creation.

Enabled the AlwaysPullImages admission controller?

AlwaysPullImages is an admission controller which ensures that images are always pulled with the correct authorization and cannot be re-used by other pods without first providing credentials. This is very useful in a multi-tenant environment and ensures that images can only be used by users with the correct credentials.

You can enable AlwaysPullImages using the flag:

kube-apiserver --enable-admission-plugins=AlwaysPullImages


Besides these, you should also enable other recommended Kubernetes admission controllers, including NamespaceLifecycle, LimitRanger, ServiceAccount, DefaultStorageClass, DefaultTolerationSeconds, MutatingAdmissionWebhook, ValidatingAdmissionWebhook, Priority and ResourceQuota.

Here is how you can check which admission controllers can be enabled:

kube-apiserver -h | grep enable-admission-plugins

Defined a pod security policy and enabled it in the admission controller?

Pod security policies outline a set of security-sensitive conditions that pods must fulfil in order to be scheduled. PodSecurityPolicy is also a recommended admission controller. Here is an example of a restrictive pod security policy: it forces all users to run as unprivileged, disables privilege escalation, and enables other restrictive security settings.

You can then enable the admission controller using the flag:

--enable-admission-plugins=PodSecurityPolicy


Chosen a network plugin and configured network policies?

Network policies allow you to configure how the various components of your Kubernetes environment, including pods, containers, services and namespaces, communicate with each other and with outside network endpoints.

Here is a great list of network policy recipes to get you started. Network policies are enforced by a networking plugin, so make sure you have chosen one that supports them; Calico and Cilium are good examples.
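A common starting point is a default-deny policy, which blocks all ingress traffic to pods in a namespace unless another policy explicitly allows it; a sketch with a hypothetical namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: dev               # hypothetical namespace
spec:
  podSelector: {}              # selects all pods in the namespace
  policyTypes:
  - Ingress                    # no ingress rules listed, so all ingress is denied
```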

Implemented authentication for kubelet?

The kubelet allows anonymous authentication by default. You can avoid this by enabling RBAC or by disabling anonymous access to the kubelet's HTTPS endpoint, starting the kubelet with the flag --anonymous-auth=false

You can then enable X509 client certificate authentication by starting the kubelet with the --client-ca-file flag, and the API server with the --kubelet-client-certificate and --kubelet-client-key flags.

You can also enable API bearer tokens by ensuring the authentication.k8s.io/v1beta1 API group is enabled in the API server and starting the kubelet with the --authentication-token-webhook and --kubeconfig flags.


Configured Kubernetes secrets?

Sensitive information related to your Kubernetes environment, like passwords, tokens or keys, should always be stored in a Kubernetes Secret object. You can see a list of the secrets already created using:

kubectl get secrets

Enabled data encryption at rest?

Encrypting data at rest is another security best practice. In Kubernetes, data can be encrypted using one of these four providers: aescbc, secretbox, aesgcm or kms. Encryption can be enabled by passing the --encryption-provider-config flag to the kube-apiserver process.

To ensure all existing secrets are encrypted, rewrite them using:

kubectl get secrets --all-namespaces -o json | kubectl replace -f -

This applies server-side encryption to all secrets.
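The file passed via --encryption-provider-config looks roughly like the sketch below; the key shown is a placeholder, not a real key:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>   # placeholder, generate your own
  - identity: {}               # fallback provider for reading unencrypted data
```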


Disabled default service account?

All newly created pods and containers without a service account are automatically assigned the default service account. The default service account has a very wide range of permissions in the cluster and should therefore be disabled.

You can do this by setting automountServiceAccountToken: false on the service account.

Scanned containers for security vulnerabilities?

Another security best practice is to scan your container images for known security vulnerabilities. You can do this using open source tools like Anchore and Clair, which will help you identify common vulnerabilities and exposures (CVEs) and mitigate them.

Configured security context for pods, containers and volumes?

A security context specifies privilege and access control settings for pods and containers. A pod's security context is defined by including the securityContext field in the pod specification. Once a security context has been specified for a pod, it automatically propagates to all the containers that belong to the pod.

A best practice when setting the security context is to set both runAsNonRoot and readOnlyRootFilesystem to true and allowPrivilegeEscalation to false. This introduces more layers of defense into your container and Kubernetes environment and prevents privilege escalation.
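A pod spec sketch applying these settings, with runAsNonRoot at the pod level and the other two fields on the container; the names and image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app           # hypothetical pod name
spec:
  securityContext:
    runAsNonRoot: true         # refuse to start containers that run as root
  containers:
  - name: app
    image: nginx:1.25          # example image; may need extra config to run non-root
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
```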

Enabled Kubernetes logging?

Kubernetes logs will help you understand what is happening inside your cluster, as well as debug problems and monitor activity. Logs for containerized applications are usually written to the standard output and standard error streams.

A best practice when implementing Kubernetes logging is to configure a separate lifecycle and storage for logs from pods, containers and nodes. You can do this by implementing a cluster-level logging architecture.


Scalability

Configured the horizontal pod autoscaler?

The horizontal pod autoscaler (HPA) automatically scales the number of pods in a deployment or replica set based on CPU utilization or custom metrics. You can quickly create an HPA for a replica set with:

kubectl autoscale rs <rs_name> --min=3 --max=7 --cpu-percent=85

You can also list HPAs using:

kubectl get hpa

or

kubectl describe hpa
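The same autoscaler can be declared as an object; a sketch using the autoscaling/v1 API and a hypothetical target Deployment:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                  # hypothetical target workload
  minReplicas: 3
  maxReplicas: 7
  targetCPUUtilizationPercentage: 85
```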

Configured vertical pod autoscaler?

The vertical pod autoscaler (VPA) automatically sets resource requests and limits for containers and pods based on resource utilization metrics. The VPA can change resource limits and requests, and can do this for new pods as well as existing pods.

You can inspect VPAs using:

kubectl get vpa <vpa_name>

Configured cluster autoscaler?

The cluster autoscaler (CA) automatically scales cluster size based on two signals: whether there are any pending pods, and the utilization of nodes.

If the CA detects any pending pods during its periodic checks, it requests more nodes from the cloud provider. The CA will also downscale the cluster and remove idle nodes if they are underutilized.


Monitoring

Set up a monitoring pipeline?

There are a number of open source monitoring tools that you can use to monitor your Kubernetes clusters. Prometheus + Grafana is one of the most widely used monitoring toolsets among DevOps teams. You can read our tutorial for setting up a monitoring pipeline using Prometheus and Grafana here.

Selected a list of metrics to monitor?

Setting up a metrics pipeline also involves identifying the list of metrics you want to track.

In the context of resource management, the most useful metrics to track are usage and utilization metrics for CPU, memory and filesystem. These can be tracked at many different levels of abstraction, from clusters and namespaces down to pods and nodes.


Bonus

Run an end-to-end (e2e) test?

End-to-end tests are a great way to ensure that your Kubernetes environment will behave in a consistent and reliable manner when pushed into production. End-to-end tests also enable developers to identify bugs before pushing their application out to end users.

You can run an e2e test by installing kubetest:

go get -u k8s.io/test-infra/kubetest

Here is how you can build Kubernetes, bring up a cluster, run the tests, and tear everything down once the test is finished:

kubetest --build --up --test --down

Mapped external services?

Most applications connect to services residing outside the Kubernetes cluster. In such cases, it is a best practice to use the native Kubernetes Service abstraction to connect to these external services.

A basic setup involves creating a Service without pod selectors and an Endpoints object with the external IP. The Endpoints object receives traffic from the Service and forwards it to the external IP.

Mapping external services in this way makes it easier to organize and manage the external services used by the team or organization as a whole. You also won't need to use IP addresses directly in your code, and updates are easier to make, since they only involve changing the IP address in the Endpoints object, without any changes to application code.
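A sketch of such a mapping for a hypothetical external database; 192.0.2.10 is a documentation-reserved placeholder IP:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-db            # no selector, so no endpoints are created automatically
spec:
  ports:
  - port: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-db            # must match the Service name
subsets:
- addresses:
  - ip: 192.0.2.10             # the external service's IP
  ports:
  - port: 5432
```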

Installed the DNS add-on?

Another best practice is to provision a DNS server as a cluster add-on. A DNS server is the recommended method for service discovery in Kubernetes: it watches the Kubernetes API for new Services and creates a set of DNS entries for each. Kubernetes recommends installing CoreDNS as the DNS server.

The DNS add-on makes it easy for pods to connect to services by doing a DNS query for the service name, or for servicename.namespacename when reaching across namespaces.

AUTHOR

Hasham Haider
Fan of all things cloud, containers and micro-services!


Get in touch
replex.io | [email protected]

*The information provided within this eBook is for general informational purposes only. While we try to keep the
information up-to-date and correct, there are no representations or warranties, express or implied, about the
completeness, accuracy, reliability, suitability or availability with respect to the information, products, services, or
related graphics contained in this eBook for any purpose. Any use of this information is at your own risk.
