Kubernetes Security Cheat Sheet
Kubernetes
When you deploy Kubernetes, you get a cluster. A Kubernetes cluster consists of a set of worker
machines, called nodes, that run containerized applications. The control plane manages the
worker nodes and the Pods in the cluster.
Control Plane Components

kube-apiserver: exposes the Kubernetes API. The API server is the front end for the Kubernetes control plane.
kube-scheduler: watches for newly created Pods with no assigned node, and selects a node for them to run on.
kube-controller-manager: runs controller processes. Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.
cloud-controller-manager: lets you link your cluster into your cloud provider's API, and separates out the components that interact with that cloud platform from components that just interact with your cluster.
Node Components
Node components run on every node, maintaining running pods and providing the Kubernetes
runtime environment. These include the kubelet, kube-proxy and the container runtime.
kubelet: an agent that runs on each node in the cluster. It makes sure that containers are running in a Pod.
kube-proxy: a network proxy that runs on each node in the cluster, implementing part of the Kubernetes Service concept.
Container runtime: the software that is responsible for running containers.
This cheat sheet provides a starting point for securing a Kubernetes cluster. It is divided into the
following categories:
Securing Kubernetes hosts
Securing Kubernetes components
Kubernetes Security Best Practices: Build Phase
Kubernetes Security Best Practices: Deploy Phase
Kubernetes Security Best Practices: Runtime Phase
All of this potential customization of Kubernetes means it can be designed to fit a large variety of
scenarios; however, this flexibility is also its greatest weakness when it comes to security. Kubernetes is
designed out of the box to be customizable, and users must turn on certain functionality to
secure their cluster. This means that the engineers responsible for deploying the Kubernetes
platform need to know about all the potential attack vectors and vulnerabilities that poor
configuration can lead to.
Securing Kubernetes hosts
It is recommended to harden the underlying hosts by installing the latest version of the operating
system, hardening the operating system, implementing necessary patch and configuration
management, implementing essential firewall rules, and undertaking specific security measures
depending on the datacenter environment.
Kubernetes Version
It has become impossible to track all potential attack vectors, which is unfortunate, as nothing is
more vital than staying aware of and ahead of potential threats. The best defense is to make
sure that you are running the latest available version of Kubernetes.
The Kubernetes project maintains release branches for the most recent three minor releases and
it backports the applicable fixes, including security fixes, to those three release branches,
depending on severity and feasibility. Patch releases are cut from those branches at a regular
cadence, plus additional urgent releases, when required. Hence it is always recommended to
upgrade the Kubernetes cluster to the latest available stable version. It is recommended to refer
to the version skew policy for further details: https://kubernetes.io/docs/setup/release/version-skew-policy/.
There are several techniques such as rolling updates, and node pool migrations that allow you to
complete an update with minimal disruption and downtime.
Securing Kubernetes components
Here is an overview of the default ports used in Kubernetes. Make sure that your network blocks
access to these ports and consider limiting access to the Kubernetes API server except from trusted
networks.

Master node(s):
TCP 6443: Kubernetes API server
TCP 2379-2380: etcd server client API
TCP 10250: Kubelet API
TCP 10251: kube-scheduler
TCP 10252: kube-controller-manager

Worker nodes:
TCP 10250: Kubelet API
TCP 30000-32767: NodePort Services
You should limit SSH access to Kubernetes nodes, reducing the risk of unauthorized access to
host resources. Instead, you should ask users to use "kubectl exec", which provides direct
access to the container environment without the ability to access the host.
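For example, to open an interactive shell in a running container (a sketch; the pod name is a placeholder and the image must provide /bin/sh):

kubectl exec -it <pod-name> -- /bin/sh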
You can use Kubernetes Authorization Plugins to further control user access to resources. This
allows defining fine-grained access control rules for specific namespaces, containers and
operations.
Advances in network technology, such as the service mesh, have led to the creation of products
like Linkerd and Istio which can enable TLS by default while providing extra telemetry
information on transactions between services.
Kubernetes expects that all API communication in the cluster is encrypted by default with TLS,
and the majority of installation methods will allow the necessary certificates to be created and
distributed to the cluster components. Note that some components and installation methods
may enable local ports over HTTP and administrators should familiarize themselves with the
settings of each component to identify potentially unsecured traffic.
API Authentication
Kubernetes provides a number of in-built mechanisms for API server authentication, however
these are likely only suitable for non-production or small clusters.
Static Token File authentication makes use of clear text tokens stored in a CSV file on API
server node(s). Modifying credentials in this file requires an API server re-start to be
effective.
X509 Client Certs are available as well however these are unsuitable for production use, as
Kubernetes does not support certificate revocation meaning that user credentials cannot be
modified or revoked without rotating the root certificate authority key and re-issuing all
cluster certificates.
Service Accounts Tokens are also available for authentication. Their primary intended use is
to allow workloads running in the cluster to authenticate to the API server, however they can
also be used for user authentication.
OpenID Connect (OIDC) lets you externalize authentication, use short lived tokens, and
leverage centralized groups for authorization.
Managed Kubernetes distributions such as GKE, EKS and AKS support authentication using
credentials from their respective IAM providers.
Kubernetes Impersonation can be used with both managed cloud clusters and on-prem
clusters to externalize authentication without having to have access to the API server
configuration parameters.
In addition to choosing the appropriate authentication system, API access should be considered
privileged and protected with Multi-Factor Authentication (MFA) for all user access.
API Authorization
Kubernetes authorizes API requests using the API server. It evaluates all of the request attributes
against all policies and allows or denies the request. All parts of an API request must be allowed
by some policy in order to proceed. This means that permissions are denied by default.
Kubernetes ships an integrated Role-Based Access Control (RBAC) component that matches an
incoming user or group to a set of permissions bundled into roles. These permissions combine
verbs (get, create, delete) with resources (pods, services, nodes) and can be namespace or
cluster scoped. A set of out of the box roles are provided that offer reasonable default
separation of responsibility depending on what actions a client might want to perform. It is
recommended that you use the Node and RBAC authorizers together, in combination with the
NodeRestriction admission plugin.
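As an illustrative sketch, a namespaced Role granting read-only access to pods, following the format of the Kubernetes RBAC documentation, could look like:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]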
RBAC authorization uses the rbac.authorization.k8s.io API group to drive authorization
decisions, allowing you to dynamically configure policies through the Kubernetes API. To enable
RBAC, start the API server with the --authorization-mode flag set to a comma-separated list that
includes RBAC; for example:
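kube-apiserver --authorization-mode=Node,RBAC <other flags>

(Here <other flags> stands in for the rest of your API server configuration; only the authorization mode flag is shown.)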
Restricting access to etcd
The Kubernetes scheduler will search etcd for pod definitions that do not have a node. It then
sends the pods it finds to an available kubelet for scheduling. Validation for submitted pods is
performed by the API server before it writes them to etcd, so malicious users writing directly to
etcd can bypass many security mechanisms - e.g. PodSecurityPolicies.
Administrators should always use strong credentials from the API servers to their etcd server,
such as mutual auth via TLS client certificates, and it is often recommended to isolate the etcd
servers behind a firewall that only the API servers may access.
Caution
Allowing other components within the cluster to access the master etcd instance with read or
write access to the full keyspace is equivalent to granting cluster-admin access. Using separate
etcd instances for non-master components or using etcd ACLs to restrict read and write access
to a subset of the keyspace is strongly recommended.
Securing the Kubernetes Dashboard
To prevent attacks via the dashboard, you should follow some tips:
Do not expose the dashboard to the public without additional authentication. There is no
need to access such a powerful tool from outside your LAN
Turn on RBAC, so you can limit the service account the dashboard uses
Do not grant the service account of the dashboard high privileges
Grant permissions per user, so each user only can see what they are supposed to see
If you are using network policies, you can block requests to the dashboard even from
internal pods (this will not affect the proxy tunnel via kubectl proxy)
Before version 1.8, the dashboard had a service account with full privileges, so check that
there is no role binding for cluster-admin left.
Deploy the dashboard with an authenticating reverse proxy, with multi-factor authentication
enabled. This can be done with either embedded OIDC id_tokens or using Kubernetes
Impersonation. This allows you to use the dashboard with the user's credentials instead of
using a privileged ServiceAccount . This method can be used on both on-prem and
managed cloud clusters.
Kubernetes Security Best Practices: Build Phase
Container images must be built using approved and secure base images that are scanned and
monitored at regular intervals to ensure only secure and authentic images are used within the
cluster. It is recommended to configure strong governance policies regarding how images are
built and stored in trusted image registries.
Build a CI pipeline that integrates security assessment (like vulnerability scanning), making it
part of the build process. The CI pipeline should ensure that only vetted code (approved for
production) is used for building the images. Once an image is built, it should be scanned for
security vulnerabilities, and only if no issues are found should the image be pushed to a
private registry, from which deployment to production is done. A failure in the security
assessment should fail the pipeline, preventing images with poor security quality
from being pushed to the image registry.
Many source code repositories provide scanning capabilities (e.g. Github, GitLab), and many CI
tools offer integration with open source vulnerability scanners such as Trivy or Grype.
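As a sketch, a CI step using Trivy could fail the build whenever high-severity issues are found (the registry and image name are placeholders):

trivy image --exit-code 1 --severity HIGH,CRITICAL registry.example.com/myapp:1.0.0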
There is work in progress in Kubernetes on image authorization plugins, which will
allow preventing the shipping of unauthorized images. For more information, refer to the PR
https://github.com/kubernetes/kubernetes/pull/27129.
Restricting what's in your runtime container to precisely what's necessary for your app is a best
practice employed by Google and other tech giants that have used containers in production for
many years. It improves the signal to noise of scanners (e.g. CVE) and reduces the burden of
establishing provenance to just what you need.
Distroless images
Distroless images contain far fewer packages than other images and do not include a
shell, which reduces the attack surface.
Scratch image
An empty image, ideal for statically compiled languages like Go. Because the image is empty,
the attack surface is truly minimal: only your code!
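A minimal sketch of a multi-stage Dockerfile that produces a scratch-based image for a Go application (paths and versions are illustrative):

FROM golang:1.21 AS build
WORKDIR /src
COPY . .
# Disable cgo so the binary is fully static and can run in an empty image
RUN CGO_ENABLED=0 go build -o /app .

FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]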
Kubernetes Security Best Practices: Deploy Phase
Kubernetes infrastructure should be configured securely prior to workloads being deployed.
From a security perspective, you first need visibility into what you are deploying, including:
What is being deployed - including information about the image being used, such as
components or vulnerabilities, and the pods that will be deployed
Where it is going to be deployed - which clusters, namespaces, and nodes
How it is deployed - whether it runs privileged, what other deployments it can communicate
with, the pod security context that is applied, if any
What it can access - including secrets, volumes, and other infrastructure components such
as the host or orchestrator API
Is it compliant - whether it complies with your policies and security requirements
Use namespaces to isolate sensitive workloads
To set the namespace for a current request, use the --namespace flag. Refer to the following
examples:
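kubectl run nginx --image=nginx --namespace=<insert-namespace-name-here>
kubectl get pods --namespace=<insert-namespace-name-here>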
You can permanently save the namespace for all subsequent kubectl commands in that context.
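For example:

kubectl config set-context --current --namespace=<insert-namespace-name-here>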
Open Source projects such as ThreatMapper can assist in identifying and prioritizing
vulnerabilities.
Regularly Apply Security Updates to Your Environment
In case vulnerabilities are found in running containers, it is recommended to always update the
source image and redeploy the containers.
NOTE
Try to avoid direct updates to the running containers as this can break the image-container
relationship.
Example: running apt update directly inside a running container.
Upgrading containers is extremely easy with the Kubernetes rolling updates feature - this allows
gradually updating a running application by upgrading its images to the latest version.
Pod Security Policies are one way to control the security-related attributes of pods, including
container privilege levels; the specific controls they can enforce are covered in the Runtime
Phase section below.
When designing your containers and pods, make sure that you configure the security context for
your pods, containers and volumes to grant only the privileges needed for the resource to
function. Some of the important parameters are as follows:
securityContext->runAsNonRoot: indicates that containers should run as a non-root user.
apiVersion: v1
kind: Pod
metadata:
  name: hello-world
spec:
  containers:
  # specification of the pod's containers
  # ...
  - name: hello-world
    image: hello-world
    # security context: non-root user with a read-only root filesystem
    securityContext:
      readOnlyRootFilesystem: true
      runAsNonRoot: true
For more information on security context for Pods, refer to the documentation at
https://kubernetes.io/docs/tasks/configure-pod-container/security-context
Implement Service Mesh
A service mesh is an infrastructure layer for microservices applications that can help reduce the
complexity of managing microservices and deployments by handling infrastructure service
communication quickly, securely and reliably. Service meshes are great at solving operational
challenges and issues when running containers and microservices because they provide a
uniform way to secure, connect and monitor microservices. Service mesh provides the following
advantages:
Observability
Service Mesh provides tracing and telemetry metrics that make it easy to understand your
system and quickly root cause any problems.
Security
A service mesh provides security features aimed at securing the services inside your network
and quickly identifying any compromising traffic entering your cluster. A service mesh can help
you more easily manage security through mTLS, ingress and egress control, and more.
mTLS and Why it Matters: Securing microservices is hard. There are a multitude of tools that
address microservices security, but service mesh is the most elegant solution for
addressing encryption of on-the-wire traffic within the network. Service mesh provides
defense with mutual TLS (mTLS) encryption of the traffic between your services. The mesh
can automatically encrypt and decrypt requests and responses, removing that burden from
the application developer. It can also improve performance by prioritizing the reuse of
existing, persistent connections, reducing the need for the computationally expensive
creation of new ones. With service mesh, you can secure traffic over the wire and also make
strong identity-based authentication and authorizations for each microservice. We see a lot
of value in this for enterprise companies. With a good service mesh, you can see whether
mTLS is enabled and working between each of your services and get immediate alerts if
security status changes.
Ingress & Egress Control: Service mesh adds a layer of security that allows you to monitor
and address compromising traffic as it enters the mesh. Istio integrates with Kubernetes as
an ingress controller and takes care of load balancing for ingress. This allows you to add a
level of security at the perimeter with ingress rules. Egress control allows you to see and
manage external services and control how your services interact with them.
Operational Control
A service mesh allows security and platform teams to set the right macro controls to enforce
access controls, while allowing developers to make customizations they need to move quickly
within these guardrails.
RBAC
A strong Role Based Access Control (RBAC) system is arguably one of the most critical
requirements in large engineering organizations, since even the most secure system can be
easily circumvented by overprivileged users or employees. Restricting privileged users to the least
privileges necessary to perform their job responsibilities, ensuring access to systems is set to "deny
all" by default, and ensuring proper documentation detailing roles and responsibilities is in
place are among the most critical security concerns in the enterprise.
Disadvantages
Along with its many advantages, service mesh also brings its own set of challenges, a few of which
are listed below:
Added Complexity: The introduction of proxies, sidecars and other components into an
already sophisticated environment dramatically increases the complexity of development
and operations.
Required Expertise: Adding a service mesh such as Istio on top of an orchestrator such as
Kubernetes often requires operators to become experts in both technologies.
Slowness: Service meshes are an invasive and intricate technology that can add significant
slowness to an architecture.
Adoption of a Platform: The invasiveness of service meshes force both developers and
operators to adapt to a highly opinionated platform and conform to its rules.
Application authorization
OPA enables you to accelerate time to market by providing pre-cooked authorization technology
so you don’t have to develop it from scratch. It uses a declarative policy language purpose built
for writing and enforcing rules such as, “Alice can write to this repository,” or “Bob can update
this account.” It comes with a rich suite of tooling to help developers integrate those policies into
their applications and even allow the application’s end users to contribute policy for their tenants
as well.
If you have homegrown application authorization solutions in place, you may not want to rip
them out to swap in OPA. At least not yet. But if you are going to be decomposing those
monolithic apps and moving to microservices to scale and improve developer efficiency, you’re
going to need a distributed authorization system and OPA is the answer.
Kubernetes has given developers tremendous control over the traditional silos of compute,
networking and storage. Developers today can set up the network the way they want and set up
storage the way they want. Administrators and security teams responsible for the well-being of a
given container cluster need to make sure developers don’t shoot themselves (or their
neighbors) in the foot.
OPA can be used to build policies that require, for example, all container images to be from
trusted sources, that prevent developers from running software as root, that make sure storage
is always marked with the encrypt bit, that storage does not get deleted just because a pod gets
restarted, that limits internet access, etc.
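As a sketch of what such a policy can look like, assuming OPA Gatekeeper and the K8sAllowedRepos template from the Gatekeeper policy library are installed, a constraint restricting pods to a trusted registry (the registry is a placeholder) might be:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: pods-from-trusted-registry
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    repos:
      - "registry.example.com/"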
OPA integrates directly into the Kubernetes API server, so it has complete authority to reject any
resource—whether compute, networking, storage, etc.—that policy says doesn’t belong in a
cluster. Moreover, you can expose those policies earlier in the development lifecycle (e.g. the
CICD pipeline or even on developer laptops) so that developers can receive feedback as early as
possible. You can even run policies out-of-band to monitor results so that administrators can
ensure policy changes don’t inadvertently do more damage than good.
And finally, many organizations are using OPA to regulate use of service mesh architectures. So,
even if you’re not embedding OPA to implement application authorization logic (the top use case
discussed above), you probably still want control over the APIs your microservices expose. You
can achieve that by putting authorization policies into the service mesh. Or, you may be
motivated by security, and implement policies in the service mesh to limit lateral movement
within a microservice architecture. Another common practice is to build policies into the service
mesh to ensure your compliance regulations are satisfied even when modification to source
code is involved.
Running resource-unbound containers puts your system at risk of DoS or "noisy neighbor"
scenarios. To prevent and minimize those risks, you should define resource quotas. By default,
all resources in a Kubernetes cluster are created with unbounded CPU and memory
requests/limits. You can create resource quota policies, attached to a Kubernetes namespace, in
order to limit the CPU and memory a pod is allowed to consume. Limit ranges complement
quotas by restricting the maximum or minimum size of some of these resources, preventing
users from requesting unreasonably high or low values for commonly reserved resources like
memory, and by providing default limits when none are specified.

The following is an example of a namespace resource quota definition that limits the number of
pods in the namespace to 4, caps total CPU requests at 1 and CPU limits at 2, and caps total
memory requests at 1GB and memory limits at 2GB.
compute-resources.yaml:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    pods: "4"
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
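The quota can then be applied to a namespace (the namespace name is a placeholder):

kubectl create -f ./compute-resources.yaml --namespace=myspace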
Use network policies to limit and segment traffic between pods
By default, Kubernetes allows every pod to contact every other pod. Traffic to a pod from an
external network endpoint outside the cluster is allowed if ingress from that endpoint is allowed
to the pod. Traffic from a pod to an external network endpoint outside the cluster is allowed if
egress is allowed from the pod to that endpoint.
Network segmentation policies are a key security control that can prevent lateral movement
across containers in the case that an attacker breaks in. One of the challenges in Kubernetes
deployments is creating network segmentation between pods, services and containers. This is a
challenge due to the “dynamic” nature of container network identities (IPs), along with the fact
that containers can communicate both inside the same node or between nodes.
Users of Google Cloud Platform can benefit from automatic firewall rules, preventing cross-
cluster communication. A similar implementation can be deployed on-premises using network
firewalls or SDN solutions. There is work being done in this area by the Kubernetes Network SIG,
which will greatly improve the pod-to-pod communication policies. A new network policy API
should address the need to create firewall rules around pods, limiting the network access that a
containerized workload can have.
The following is an example of a network policy that controls the network for “backend” pods,
only allowing inbound network access from “frontend” pods:
POST /apis/net.alpha.kubernetes.io/v1alpha1/namespaces/tenant-a/networkpolicys
{
  "kind": "NetworkPolicy",
  "metadata": {
    "name": "pol1"
  },
  "spec": {
    "allowIncoming": {
      "from": [
        { "pods": { "segment": "frontend" } }
      ],
      "toPorts": [
        { "port": 80, "protocol": "TCP" }
      ]
    },
    "podSelector": {
      "segment": "backend"
    }
  }
}
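The policy above uses a historical alpha API. With the current networking.k8s.io/v1 API, the same intent can be expressed as follows (assuming the pods carry segment: frontend and segment: backend labels):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: pol1
  namespace: tenant-a
spec:
  podSelector:
    matchLabels:
      segment: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          segment: frontend
    ports:
    - protocol: TCP
      port: 80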
For more information on configuring network policies, refer to the Kubernetes documentation at
https://kubernetes.io/docs/concepts/services-networking/network-policies.
Securing data
In Kubernetes, a Secret is a small object that contains sensitive data, like a password or token. It
is important to understand how sensitive data such as credentials and keys are stored and
accessed. Even though a pod is not able to access the secrets of another pod, it is crucial to
keep the secret separate from an image or pod. Otherwise, anyone with access to the image
would have access to the secret as well. Complex applications that handle multiple processes
and have public access are especially vulnerable in this regard. It is best for secrets to be
mounted into read-only volumes in your containers, rather than exposing them as environment
variables.
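A minimal sketch of mounting a Secret as a read-only volume (the secret name and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-secret
spec:
  containers:
  - name: app
    image: registry.example.com/myapp:1.0.0
    volumeMounts:
    - name: secret-volume
      mountPath: /etc/secrets
      readOnly: true
  volumes:
  - name: secret-volume
    secret:
      secretName: mysecret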
The etcd database in general contains any information accessible via the Kubernetes API and
may grant an attacker significant visibility into the state of your cluster.
Always encrypt your backups using a well reviewed backup and encryption solution, and
consider using full disk encryption where possible.
Kubernetes supports encryption at rest, a feature introduced in 1.7, and v1 beta since 1.13. This
will encrypt Secret resources in etcd, preventing parties that gain access to your etcd backups
from viewing the content of those secrets. While this feature is currently beta, it offers an
additional level of defense when backups are not encrypted or an attacker gains read access to
etcd.
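A sketch of an encryption configuration enabling AES-CBC encryption of Secrets at rest, following the format in the Kubernetes documentation (the key material is a placeholder); the file is passed to the API server via the --encryption-provider-config flag:

apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>
      - identity: {}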
You may want to consider using an external secrets manager to store and manage your secrets
rather than storing them in Kubernetes Secrets. This provides a number of benefits over using
Kubernetes Secrets, including the ability to manage secrets across multiple clusters (or clouds),
and the ability to manage and rotate secrets centrally.
For more information on Secrets and their alternatives, refer to the documentation at
https://kubernetes.io/docs/concepts/configuration/secret/.
Open-source tools such as SecretScanner and ThreatMapper can scan container filesystems for
sensitive resources, such as API tokens, passwords, and keys. Such resources would be
accessible to any user who had access to the unencrypted container filesystem, whether during
build, at rest in a registry or backup, or running.
Review the secret material present on the container against the principle of 'least privilege',
and assess the risk posed by a compromise.
Kubernetes Security Best Practices: Runtime Phase
Proactively securing your containers and Kubernetes deployments at the build and deploy
phases can greatly reduce the likelihood of security incidents at runtime and the subsequent
effort needed to respond to them.
First, you must monitor the most security-relevant container activities, including:
Process activity
Network communications among containerized services
Network communications between containerized services and external clients and servers
Observing container behavior to detect anomalies is generally easier in containers than in virtual
machines because of the declarative nature of containers and Kubernetes. These attributes
allow easier introspection into what you have deployed and its expected activity.
Use Pod Security Policies to prevent risky containers/Pods from being used
PodSecurityPolicy is a cluster-level resource available in Kubernetes (via kubectl) that is highly
recommended. You must enable the PodSecurityPolicy admission controller to use it. Given the
nature of admission controllers, you must authorize at least one policy - otherwise no pods will
be allowed to be created in the cluster.
Pod Security Policies address several critical security use cases, including:
Preventing containers from running with the privileged flag - this type of container will have
most of the capabilities available to the underlying host. This flag also overwrites any rules
you set using CAP_DROP or CAP_ADD.
Preventing sharing of host PID/IPC namespace, networking, and ports - this step ensures
proper isolation between Docker containers and the underlying host
Limiting use of volume types - writable hostPath directory volumes, for example, allow
containers to write to the filesystem in a way that lets them traverse the host
filesystem outside the pathPrefix, so readOnly: true must be used
Putting limits on host filesystem use
Enforcing a read-only root file system via the readOnlyRootFilesystem setting
Preventing privilege escalation to root privileges
Rejecting containers with root privileges
Restricting Linux capabilities to the bare minimum in adherence with least privilege
principles (a sketch of such a policy follows)
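A sketch of a restrictive policy, along the lines of the restricted example in the Kubernetes documentation, that enforces several of the controls above:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  hostNetwork: false
  hostIPC: false
  hostPID: false
  readOnlyRootFilesystem: true
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
    - configMap
    - emptyDir
    - projected
    - secret
    - persistentVolumeClaim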
Container Sandboxing
Container runtimes are typically permitted to make direct calls to the host kernel, which then
interacts with hardware and devices to respond to the request. Cgroups and namespaces give
containers a certain amount of isolation, but the kernel still presents a large attack
surface. In multi-tenant and highly untrusted clusters, an additional layer of
sandboxing is often required to protect against container breakout and kernel exploits. Below
we explore a few OSS technologies that help further isolate running containers from the host
kernel:
Kata Containers: Kata Containers is an OSS project that uses stripped-down VMs to keep the
resource footprint minimal and maximize performance, ultimately isolating containers
further.
gVisor: gVisor is more lightweight than a VM (even a stripped-down one). It is an
independent kernel, written in Go, that sits between the container and the host kernel and
provides a strong sandbox. gVisor supports ~70% of the Linux system calls from the container,
but uses only about 20 system calls to the host kernel.
Firecracker: Firecracker is a super-lightweight VM that runs in user space. It is locked down by
seccomp, cgroup and namespace policies, so its system calls are very limited. Firecracker is
built with security in mind, but may not support all Kubernetes or container runtime
deployments.
Preventing containers from loading unwanted kernel modules
To prevent specific modules from being automatically loaded, you can uninstall them from the
node, or add rules to block them. On most Linux distributions, you can do that by creating a file
such as /etc/modprobe.d/kubernetes-blacklist.conf with contents like:
# SCTP is not used in most Kubernetes clusters, and has also had
# vulnerabilities in the past.
blacklist sctp
To block module loading more generically, you can use a Linux Security Module (such as
SELinux) to completely deny the module_request permission to containers, preventing the
kernel from loading modules for containers under any circumstances. (Pods would still be able to
use modules that had been loaded manually, or modules that were loaded by the kernel on
behalf of some more-privileged process.)
At the same time, comparing the active traffic with the traffic your policies allow gives you valuable
information about what is allowed but isn't happening. With that information, you can further
tighten your allowed network policies, removing superfluous connections and decreasing your
attack surface.
Logging
Kubernetes supplies cluster-based logging, allowing you to log container activity into a central log
hub. When a cluster is created, the standard output and standard error output of each container
can be ingested using a Fluentd agent running on each node into either Google Stackdriver
Logging or into Elasticsearch and viewed with Kibana.
The audit logger is a beta feature that records actions taken by the API server for later analysis in the
event of a compromise. It is recommended to enable audit logging and archive the audit file on a
secure server.
Ensure logs are monitored for anomalous or unwanted API calls, especially any authorization
failures (these log entries will have a status message “Forbidden”). Authorization failures could
mean that an attacker is trying to abuse stolen credentials.
Managed Kubernetes providers, including GKE, provide access to this data in their cloud console
and may allow you to set up alerts on authorization failures.
Audit logs
Audit logs can be useful for compliance as they should help you answer the questions of what
happened, who did what and when. Kubernetes provides flexible auditing of kube-apiserver
requests based on policies. These help you track all activities in chronological order.
{
  "kind":"Event",
  "apiVersion":"audit.k8s.io/v1beta1",
  "metadata":{ "creationTimestamp":"2019-08-22T12:00:00Z" },
  "level":"Metadata",
  "timestamp":"2019-08-22T12:00:00Z",
  "auditID":"23bc44ds-2452-242g-fsf2-4242fe3ggfes",
  "stage":"RequestReceived",
  "requestURI":"/api/v1/namespaces/default/persistentvolumeclaims",
  "verb":"list",
  "user": {
    "username":"[email protected]",
    "groups":[ "system:authenticated" ]
  },
  "sourceIPs":[ "172.12.56.1" ],
  "objectRef": {
    "resource":"persistentvolumeclaims",
    "namespace":"default",
    "apiVersion":"v1"
  },
  "requestReceivedTimestamp":"2019-08-22T12:00:00Z",
  "stageTimestamp":"2019-08-22T12:00:00Z"
}
Audit policy defines rules about what events should be recorded and what data they should
include. The audit policy object structure is defined in the audit.k8s.io API group. When an event
is processed, it's compared against the list of rules in order. The first matching rule sets the
"audit level" of the event.
You can pass a file with the policy to kube-apiserver using the --audit-policy-file flag. If the flag
is omitted, no events are logged. Note that the rules field must be provided in the audit policy
file. A policy with no (0) rules is treated as illegal.
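A minimal sketch of a policy file that records metadata for all requests, and the corresponding API server flag (the file path is illustrative):

# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata

kube-apiserver --audit-policy-file=/etc/kubernetes/audit-policy.yaml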
Understanding Logging
One main challenge with logging in Kubernetes is understanding what logs are generated and how
to use them. Let's start by examining the Kubernetes logging architecture from a bird's-eye view.
Container logging
The first layer of logs that can be collected from a Kubernetes cluster are those being generated
by your containerized applications.
The easiest method for logging containers is to write to the standard output (stdout) and
standard error (stderr) streams.
An example manifest is as follows.
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
  - name: example
    image: busybox
    args: [/bin/sh, -c, 'while true; do echo $(date); sleep 1; done']
For persisting container logs, the common approach is to write logs to a log file and then
use a sidecar container. As shown in the pod configuration below, a sidecar container
runs in the same pod along with the application container, mounting the same volume
and processing the logs separately.
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
  - name: example
    image: busybox
    args:
    - /bin/sh
    - -c
    - >
      while true;
      do
        echo "$(date)\n" >> /var/log/example.log;
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: sidecar
    image: busybox
    args: [/bin/sh, -c, 'tail -f /var/log/example.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}
Node logging
When a container running on Kubernetes writes its logs to stdout or stderr streams, the
container engine streams them to the logging driver configured in Kubernetes.
In most cases, these logs will end up in the /var/log/containers directory on your host. Docker
supports multiple logging drivers but unfortunately, driver configuration is not supported via the
Kubernetes API.
Once a container is terminated or restarted, kubelet stores logs on the node. To prevent these
files from consuming all of the host’s storage, the Kubernetes node implements a log rotation
mechanism. When a pod is evicted from the node, all of its containers are also evicted, along
with their corresponding log files.
Depending on what operating system and additional services you’re running on your host
machine, you might need to take a look at additional logs. For example, systemd logs can be
retrieved using the following command:
journalctl -u <service-name>
Cluster logging
On the level of the Kubernetes cluster itself, there is a long list of cluster components that can
be logged as well as additional data types that can be used (events, audit logs). Together, these
different types of data can give you visibility into how Kubernetes is performing as a system.
Some of these components run in a container, and some of them run on the operating system
level (in most cases, a systemd service). The systemd services write to journald, and
components running in containers write logs to the /var/log directory, unless the container
engine has been configured to stream logs differently.
Events
Kubernetes events can indicate any Kubernetes resource state changes and errors, such as
exceeded resource quota or pending pods, as well as any informational messages.
The following command will show the latest events for this specific Kubernetes resource:
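For example, for a pod (the pod name is a placeholder):

kubectl describe pod <pod-name>

which produces output like: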
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned
Normal Pulled 13m kubelet, aks-agentpool-42213468-1 Container image "aksre
Normal Created 13m kubelet, aks-agentpool-42213468-1 Created container core
Normal Started 13m kubelet, aks-agentpool-42213468-1 Started container core
Final thoughts
Remediation priority should reflect the full context of a deployment. For example, a deployment
containing a vulnerability with a severity score of 7 or greater should be moved up in remediation
priority if that deployment contains privileged containers and is open to the Internet, but moved
down if it is in a test environment and supporting a non-critical app.