
How to Design and Provision a Production-Ready EKS Cluster
A comprehensive guide to creating and configuring a production-grade Kubernetes cluster on AWS with Terraform, Helm, and other open-source tools.

Image Credit: Pixabay

According to Flexera’s 2021 State of Cloud report, AWS leads the container orchestration
market share with 51% of the respondents using Amazon EKS and ECS compared to 43% for
AKS and 31% for GKE.
Image Credit: Flexera

Flexera does not share the exact breakdown between ECS and EKS usage, but according to Datadog's latest Container Report, EKS has the lowest adoption among the managed Kubernetes services offered by the major cloud platforms.

Image Credit: Datadog


Although these survey results may be skewed by sample size, they anecdotally match my experience with EKS. AWS groups ECS, ECR, Fargate, and EKS under a single Containers offering, and EKS tends to lag behind AKS and GKE in Kubernetes-specific features such as support for the latest Kubernetes version, beta features (e.g. vertical pod autoscaler, custom kubelet arguments), and managed options (e.g. node auto-repair, automated upgrades, dashboard).

Despite these shortcomings, given AWS’s dominance in the broader cloud market, many
organizations will opt for EKS as the go-to, managed Kubernetes provider. If you are looking
to design and provision a production-grade EKS cluster, here are some common issues I’ve
encountered and workarounds to navigate the complex EKS ecosystem.

Note: For a more detailed comparison of managed Kubernetes offerings, see State of Managed
Kubernetes 2021.

Starting Point
There are two popular ways to deploy EKS: eksctl and Terraform. eksctl is an open-source tool managed by Weaveworks that leverages CloudFormation to bootstrap and configure EKS. If your workflow already revolves around CloudFormation, eksctl may be a good fit to get started, as AWS has well-documented workshops for this case. However, if you are looking for a more traditional IaC approach, Terraform is the better choice.

For Terraform, both AWS and Gruntwork provide a base architecture and templates for
getting started:

Provisioning production-ready Amazon EKS clusters using Terraform

How to deploy a production-grade Kubernetes cluster on AWS

Personally, I use the official AWS EKS module, so the examples shown here point to those docs, but they can be adapted to work with other Terraform templates.
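As a reference point, a bare-bones cluster definition with the community terraform-aws-modules/eks module looks roughly like the sketch below. The cluster name, Kubernetes version, node sizing, and the VPC module outputs are illustrative assumptions, and argument names shift between major versions of the module, so check them against the version you pin.

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.0" # pin to a major version you have validated

  cluster_name    = "prod-eks" # illustrative name
  cluster_version = "1.21"

  # Hypothetical VPC module outputs; point these at your own private subnets
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  eks_managed_node_groups = {
    default = {
      instance_types = ["m5.large"]
      min_size       = 3
      max_size       = 6
      desired_size   = 3
    }
  }
}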

Beware of EKS Exhausting IP Addresses


The first thing to consider before deploying an EKS cluster is VPC design and choosing a
container network interface (CNI). By default, EKS runs with AWS CNI, which was designed
to be compatible with other AWS services such as VPC flow logs. Due to this design choice,
each pod is assigned an IP address from the subnet. This means that not only do your
microservices consume an IP address, but other helper pods that AWS manages (e.g. aws-
node, coredns, kube-proxy) and common tools such as log aggregators, monitoring agents,
and node-specific tools (e.g. spot termination handler, update agents) deployed as
daemonsets will also eat up IP addresses. To compound this issue, each instance type has a maximum number of IPs that can be assigned, based on the maximum number of network interfaces and the number of private IP addresses per interface. For example, a t3.medium supports 3 ENIs with 6 IPv4 addresses each, which with the AWS CNI works out to only 17 pods per node. In other words, if you are using small instance types for your cluster, you will hit this limit very quickly.

Image Credit: AWS Blog

So what can be done to mitigate this issue?

1. Utilize secondary CIDR ranges (100.64.0.0/10 and 198.19.0.0/16) to expand the VPC network

2. Enable IPv6 at the time of cluster creation

3. Set ENABLE_PREFIX_DELEGATION to true on AWS CNI 1.9.0+, which assigns /28 IPv4 address prefixes to each network interface instead of individual addresses (see the sketch below)

4. Use a different CNI such as Calico, Flannel, Weave, or Cilium

The first three options stay within the AWS domain, which is a good path forward if you want full AWS support. On the other hand, alternative CNIs provide features such as eBPF, WireGuard encryption, and network policy support.
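As a rough sketch of option 3, prefix delegation can be enabled through the managed VPC CNI addon. This assumes an AWS provider version recent enough to support configuration_values on aws_eks_addon and VPC CNI 1.9.0+; the cluster reference should be whatever your module or resource exposes. The same environment variable can also be set directly on the aws-node daemonset if you do not use the managed addon.

resource "aws_eks_addon" "vpc_cni" {
  cluster_name = module.eks.cluster_id # or the cluster name output of your module version
  addon_name   = "vpc-cni"

  # Hands out /28 prefixes per ENI instead of individual secondary IPs,
  # raising the per-node pod ceiling well above the ENI/IP limits.
  configuration_values = jsonencode({
    env = {
      ENABLE_PREFIX_DELEGATION = "true"
    }
  })
}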

Use Bottlerocket
Similar to Google’s Container-Optimized OS, AWS provides Bottlerocket, a Linux-based
operating system designed for hosting containers. Since Bottlerocket only contains
components needed to run containers, it has a smaller attack surface than the default
Amazon Linux 2 AMI.

One of the best features of Bottlerocket is automated security updates. It follows The Update Framework (TUF) to securely update the Bottlerocket version with automated rollbacks, which reduces the toil of rolling out new AMIs to all underlying nodes every few weeks.
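If you provision nodes through the EKS Terraform module, moving a managed node group onto Bottlerocket is mostly a matter of setting the AMI type. The sketch below assumes a recent version of terraform-aws-modules/eks and uses illustrative sizing:

eks_managed_node_groups = {
  bottlerocket = {
    ami_type       = "BOTTLEROCKET_x86_64" # BOTTLEROCKET_ARM_64 for Graviton instances
    platform       = "bottlerocket"
    instance_types = ["m5.large"]
    min_size       = 3
    max_size       = 6
    desired_size   = 3
  }
}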

Note that, by default, Bottlerocket uses two storage volumes:

The root device, /dev/xvda, holds the active and passive partition sets. It also contains the bootloader, the dm-verity hash tree for verifying the immutable root filesystem, and the data store for the Bottlerocket API.

The data device, /dev/xvdb, is used as persistent storage for container images, container orchestration, host-containers, and bootstrap containers.

If you need to change any of those parameters, set those values accordingly, e.g.:

additional_ebs_volumes = [{
  block_device_name = "/dev/xvdb"
  volume_size       = "100"
  encrypted         = true
}]
Image Credit: Bottlerocket

If for some reason Bottlerocket does not work for your use case, install the SSM Agent and Amazon Inspector to secure your AMI. Make sure to also attach the AmazonSSMManagedInstanceCore policy to the node IAM role:

workers_additional_policies = ["arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"]

OR

self_managed_node_group_defaults = {
  ...
  iam_role_additional_policies = ["arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"]
}

Utilize EKS Addons Wisely


Until recently, EKS users had to self-manage Kubernetes components such as kube-proxy, core-dns, and the VPC CNI. For EKS clusters running 1.18 or later, Amazon now provides a managed version of each of these components so that the latest security patches and bug fixes are validated by AWS.

For new clusters, opt into the kube-proxy and core-dns addons (a configuration sketch follows below) unless you expect massive changes to your core-dns configuration. As for the VPC CNI, the answer depends on the design decisions made above (i.e. AWS VPC CNI vs. open-source alternatives). There is also an EBS CSI driver addon, but it is only available in preview, so I would wait until it reaches GA.
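For reference, recent versions of the terraform-aws-modules/eks module let you opt into these managed addons directly from the cluster definition. A minimal sketch, assuming module v18+ and that you keep the AWS VPC CNI:

cluster_addons = {
  coredns = {
    resolve_conflicts = "OVERWRITE" # let EKS take over an existing self-managed install
  }
  kube-proxy = {}
  vpc-cni = {
    resolve_conflicts = "OVERWRITE"
  }
}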

Encrypt etcd and EBS Volumes


Kubernetes secrets, by default, are stored unencrypted in etcd on the master node. EKS gives
the option to use KMS to enable envelope encryption of Kubernetes secrets, but it’s easy to
overlook. Configure the cluster_encryption_config block in the EKS module:

cluster_encryption_config = [{
  provider_key_arn = aws_kms_key.eks.arn
  resources        = ["secrets"]
}]

And create the KMS resources accordingly:

resource "aws_kms_key" "eks" {

description = "EKS Secret Encryption Key"

deletion_window_in_days = 7

enable_key_rotation = true

tags = local.tags

resource "aws_kms_key" "ebs" {

description = "Customer managed key to encrypt EKS managed


node group volumes"

deletion_window_in_days = 7

policy = data.aws_iam_policy_document.ebs.json

# This policy is required for the KMS key used for EKS root volumes, so
the cluster is allowed to enc/dec/attach encrypted EBS volumes

data "aws_iam_policy_document" "ebs" {

# Copy of default KMS policy that lets you manage it

statement {

sid = "Enable IAM User Permissions"

actions = ["kms:*"]

resources = ["*"]

principals {

type = "AWS"

identifiers =
["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]

# Required for EKS

statement {

sid = "Allow service-linked role use of the CMK"

actions = [

"kms:Encrypt",

"kms:Decrypt",

"kms:ReEncrypt*",

"kms:GenerateDataKey*",

"kms:DescribeKey"

resources = ["*"]

principals {

type = "AWS"

identifiers = [

"arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-
service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling", #
required for the ASG to manage encrypted volumes for nodes

module.eks.cluster_iam_role_arn,
# required for the cluster / persistentvolume-controller to create
encrypted PVCs

statement {

sid = "Allow attachment of persistent resources"

actions = ["kms:CreateGrant"]

resources = ["*"]

principals {

type = "AWS"

identifiers = [

"arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-
service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling", #
required for the ASG to manage encrypted volumes for nodes

module.eks.cluster_iam_role_arn,
# required for the cluster / persistentvolume-controller to create
encrypted PVCs

}
condition {

test = "Bool"

variable = "kms:GrantIsForAWSResource"

values = ["true"]

To configure EBS volumes to use KMS, you also need to create a new StorageClass object in Kubernetes and set its kmsKeyId to the key created in the Terraform code above, as sketched below. You can also make this new storage class the default so that EBS volumes provisioned via Persistent Volume Claims are always encrypted.
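A minimal sketch of such a default, encrypted StorageClass, defined with the Kubernetes Terraform provider and the in-tree EBS provisioner (the class name and volume type are illustrative choices):

resource "kubernetes_storage_class" "encrypted_gp2" {
  metadata {
    name = "encrypted-gp2"
    annotations = {
      # Make this the default class so new PVCs are encrypted automatically
      "storageclass.kubernetes.io/is-default-class" = "true"
    }
  }

  storage_provisioner = "kubernetes.io/aws-ebs"
  reclaim_policy      = "Delete"
  volume_binding_mode = "WaitForFirstConsumer"

  parameters = {
    type      = "gp2"
    encrypted = "true"
    kmsKeyId  = aws_kms_key.ebs.arn
  }
}

You will likely also want to remove the default annotation from the pre-existing gp2 class so that only one StorageClass is marked as default.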

Consider Managing aws-auth Manually


One of the known issues with the EKS module is that it uses the AWS provider and the Kubernetes provider in the same module. This presents a challenge when the EKS cluster's credentials are unknown or being updated, as Terraform can't reliably support using one provider to configure another provider in a single plan/apply. Users often run into this when a cluster configuration changes (e.g. enabling secrets encryption after creation) and Terraform thinks it has lost the credentials to the cluster.

There are two workarounds:

1. Remove the Kubernetes resource from state prior to applying changes (i.e. terraform state rm module.eks.kubernetes_config_map.aws_auth, then terraform plan)

2. Set manage_aws_auth = false in the EKS module and manage the configmap outside of Terraform (see how the module manages this here, and the sketch below).
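A minimal sketch of managing aws-auth outside the module with the Kubernetes provider; the node IAM role and admin role names here are hypothetical and should be replaced with the roles your Terraform code actually creates:

resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    # Nodes must map to system:bootstrappers/system:nodes to join the cluster
    mapRoles = yamlencode([
      {
        rolearn  = aws_iam_role.eks_nodes.arn # hypothetical node role
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      },
      {
        rolearn  = "arn:aws:iam::123456789012:role/platform-admin" # hypothetical admin role
        username = "platform-admin"
        groups   = ["system:masters"]
      },
    ])
  }
}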

Configure Cluster Autoscaler Appropriately


Unlike GKE, EKS does not ship with cluster autoscaler (perhaps this will become an EKS managed addon in the future). Use OIDC federated authentication and IAM Roles for Service Accounts to deploy cluster autoscaler, with auto-discovery enabled via the tags configured by the EKS Terraform module.

Under the hood, cluster autoscaler uses Amazon EC2 Auto Scaling Groups to manage each node group, which means it is subject to the same limitations that ASGs face. For example, since EBS volumes are zone-specific, a naive deployment of cluster autoscaler may not trigger a scaling event in the desired availability zone for StatefulSets backed by EBS volumes. To mitigate this, make sure to configure cluster autoscaler with the following (a deployment sketch follows the list):

balance-similar-node-groups=true

Node groups configured in different availability zones
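A sketch of such a deployment using the community Helm chart through Terraform; the IRSA role name is hypothetical, and the chart values should be checked against the chart version you pin:

resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"
  namespace  = "kube-system"

  set {
    name  = "autoDiscovery.clusterName"
    value = module.eks.cluster_id
  }

  set {
    name  = "awsRegion"
    value = "us-east-1" # illustrative region
  }

  # Spread scale-ups evenly across the per-AZ node groups
  set {
    name  = "extraArgs.balance-similar-node-groups"
    value = "true"
  }

  # IRSA: annotate the service account with the hypothetical IAM role for autoscaling
  set {
    name  = "rbac.serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
    value = aws_iam_role.cluster_autoscaler.arn
  }
}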

An alternative solution is to use Karpenter, which works similarly to GKE Autopilot's dynamic node provisioning process. For a deep-dive, check out "Karpenter: Open-Source, High-Performance Kubernetes Cluster Autoscaler" (itnext.io), which covers how this Kubernetes-native autoscaler is now production-ready according to AWS.

Utilize Open-Source Tools


Unfortunately, neither Terraform nor AWS provides a way to automatically roll out a change to the underlying instances in an EKS cluster. This means that every time an AMI needs to be updated or the Kubernetes version needs to be bumped, all of the nodes need to be manually drained and replaced. One way to implement this behavior via Terraform is to create a new node group and write a script that drains the old node groups and scales them down. Another option is the deploy functionality of the kubergrunt tool maintained by the Gruntwork team, which automates creating new node groups, cordoning and draining the old nodes, and removing the old worker nodes.

For other open-source tools to help with EKS management, look into "Useful Tools for Better Kubernetes Development" (yitaek.medium.com), a collection of Kubernetes tools and projects to deploy, secure, and monitor your clusters.

Final Notes
In recent years, AWS has made tremendous progress in making EKS cluster creation and management a smooth experience. EKS still lags behind GKE and AKS in terms of managed features, but with some minor tweaks to existing toolkits, you can easily put together a production-grade cluster. If there are other tips and tricks I missed for operating EKS at scale, please comment below.
