ACK Container Storage Monitoring: Making Your Applications Run More Stably and Transparently

This article describes the upgraded container monitoring system of ACK, including the display and overview of major dashboard interfaces.

Background

As containerized applications become more popular and workloads grow in scale and complexity, accurate real-time monitoring of their storage performance and status becomes crucial. The container storage monitoring feature of Alibaba Cloud Container Service for Kubernetes (ACK) addresses this need and gives users more comprehensive, in-depth insight into storage resources. Built on Kubernetes, ACK provides a variety of container data volumes. The following types of data volumes are commonly used:

Local storage: Data volumes whose data is stored on the node where the pod resides, such as hostPath and emptyDir.

Secret and ConfigMap: Two special data volumes that contain information about objects in the cluster.

PVC: A PersistentVolumeClaim (PVC) defines a data volume that is external to the cluster, which allows external storage media to be connected to the cluster. In ACK clusters, we recommend three Alibaba Cloud storage services: disk, NAS, and OSS. They cover different business scenarios. Connecting these services to a cluster through PVCs lets you choose the storage that fits each workload and extends the scenarios your business can cover.

Secret and ConfigMap volumes only store cluster object information, so their storage observability requirements are low. We have therefore upgraded the container storage monitoring feature of ACK to focus on local storage and PVCs. This update improves the monitoring capabilities for the different storage types in a cluster. It optimizes the existing monitoring dashboards and also launches new dashboards for the different cloud storage types, so that users can better understand and manage the storage resources of their containerized applications.
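The grouping of volume types described above can be sketched in a few lines of Python. This is only an illustration of the categories, not how ACK implements its monitoring. The volume source keys (hostPath, emptyDir, secret, configMap, persistentVolumeClaim) are the standard fields under spec.volumes in a pod manifest. The function names and the example volumes are hypothetical.

```python
# Volume source keys as they appear under spec.volumes[] in a pod manifest.
LOCAL_SOURCES = {"hostPath", "emptyDir"}
CLUSTER_OBJECT_SOURCES = {"secret", "configMap"}
PVC_SOURCE = "persistentVolumeClaim"


def classify_volume(volume: dict) -> str:
    """Return 'local', 'cluster-object', 'pvc', or 'other' for one volume entry."""
    sources = set(volume) - {"name"}
    if sources & LOCAL_SOURCES:
        return "local"
    if sources & CLUSTER_OBJECT_SOURCES:
        return "cluster-object"
    if PVC_SOURCE in sources:
        return "pvc"
    return "other"


def monitored_volumes(volumes: list) -> list:
    """Names of volumes in the two categories the upgraded monitoring covers."""
    return [v["name"] for v in volumes if classify_volume(v) in ("local", "pvc")]


# Hypothetical pod volumes, for illustration only.
example_volumes = [
    {"name": "cache", "emptyDir": {}},
    {"name": "app-config", "configMap": {"name": "app-config"}},
    {"name": "data", "persistentVolumeClaim": {"claimName": "data-pvc"}},
]
print(monitored_volumes(example_volumes))
```

In this example the ConfigMap volume is skipped, while the emptyDir volume and the PVC-backed volume fall into the monitored categories.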

Optimization Ideas for New Dashboards

The new storage monitoring dashboards cover two categories of storage in Kubernetes clusters: internal storage and external storage.

Internal storage support:

RootFS monitoring: You can view the storage usage and real-time read/write rate of the container RootFS.

Pod ephemeral storage monitoring: You can view the ephemeral storage usage and inode usage of pods.
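A real-time read/write rate is typically derived from cumulative counters. cAdvisor, for example, exposes counters such as container_fs_reads_bytes_total and container_fs_writes_bytes_total. The article does not show the queries behind the dashboards, so the following Python sketch only illustrates the general idea, under the assumption that two samples of such a counter are available. The function name and sample values are hypothetical.

```python
def per_second_rate(prev_value, prev_ts, curr_value, curr_ts):
    """Average per-second rate between two samples of a cumulative counter.

    Timestamps are in seconds. If the counter went backwards, it is assumed
    to have been reset (for example, after a container restart), and the
    current value is taken as the increase since the reset.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        delta = curr_value
    return delta / elapsed


# Hypothetical samples: 3000 bytes written over 30 seconds.
print(per_second_rate(1000, 0, 4000, 30))
```

Handling the reset case keeps a container restart from showing up as a large negative rate in a chart.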

External storage support:

Disk volume monitoring: You can view the summary information of a disk volume (including the name, namespace, and storage usage), the real-time read/write rate, read/write latency, and read/write throughput.

NAS volume monitoring: You can view the summary information of a NAS volume (including the name, namespace, and space usage), real-time read/write rate, read/write latency, and read/write throughput.

OSS volume monitoring: You can view the summary information of an OSS volume (including the name, namespace, and storage usage), the real-time read/write rate, read/write throughput, the number of OSS operations performed per second, the number of POSIX operations performed per second, and hot files of OSS volumes.

Specifically, this update launches three new monitoring dashboards (disk, NAS, and OSS) and adds storage-related monitoring charts to three existing dashboards (cluster, node, and pod).

Content Display of New Dashboards

Disk Volume Monitoring

The following figures show the monitoring dashboard of a disk volume.

[Figures 1 and 2: Disk volume monitoring dashboard]

The PVC information table provides details about all disk volumes, including the PV and PVC names, namespaces, nodes, device names, and storage usage. The pod information table shows the mapping between PVCs and workloads. Other charts in the dashboard show disk space usage and real-time data read/write.

NAS Volume Monitoring

The following figures show the monitoring dashboard of NAS volumes.

[Figures 3 and 4: NAS volume monitoring dashboard]

The NAS monitoring dashboard is largely the same as the disk monitoring dashboard. The only difference is that the PVC information table displays the mount target of each NAS volume instead of the device name.

OSS Volume Monitoring

The monitoring dashboard of OSS volumes is as follows:

[Figures 5 and 6: OSS volume monitoring dashboard]

The PVC information table provides the bucket name and other basic information of each OSS volume. In addition to real-time data read/write, the OSS dashboard also provides statistics on the number of OSS operations and POSIX operations performed per second, as well as statistics on hot paths of read and write operations.
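To make the idea of hot path statistics concrete, here is a minimal Python sketch that ranks paths by how often they are read or written. It is not how the OSS dashboard collects its data. The record format (operation, path), the function name, and the sample paths are all assumptions made for illustration.

```python
from collections import Counter


def hot_paths(operations, top_n=3):
    """Return the top_n most frequently accessed paths with their counts.

    operations is an iterable of (operation, path) pairs, such as
    ("read", "/data/a.bin"). Reads and writes are counted together here.
    """
    counts = Counter(path for _operation, path in operations)
    return counts.most_common(top_n)


# Hypothetical access records, for illustration only.
sample_operations = [
    ("read", "/data/a.bin"),
    ("read", "/data/a.bin"),
    ("write", "/logs/x.log"),
    ("read", "/data/a.bin"),
    ("write", "/logs/x.log"),
    ("read", "/data/b.bin"),
]
print(hot_paths(sample_operations, top_n=2))
```

Counting reads and writes separately would only require filtering the records by operation before ranking them.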

Cluster Monitoring

The PVC summary table is added to the cluster monitoring dashboard. You can view the basic information of all PVCs in the cluster.

[Figure 7: PVC summary table in the cluster monitoring dashboard]

Node Monitoring

The node monitoring dashboard displays the PVC summary information and data read/write.

[Figure 8: Node monitoring dashboard]

Pod Monitoring

Monitoring data for the root file system (RootFS) and ephemeral storage has been added to the pod monitoring dashboard. The RootFS metrics show information for all containers in the pod, such as the total available storage space and the total amount of data read and written. The ephemeral storage metrics cover the following three parts:

• emptyDir volumes mounted to the pod, except those of the tmpfs type
• Pod log files on nodes
• Writable layers for all containers in a pod
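The three parts listed above add up to the ephemeral storage usage of a pod. The following Python sketch shows that sum. The class and field names are hypothetical, and the byte values are made-up inputs rather than values read from a real cluster.

```python
from dataclasses import dataclass


@dataclass
class PodEphemeralUsage:
    """Ephemeral storage usage of one pod, split into its three parts."""

    emptydir_bytes: int        # emptyDir volumes, excluding tmpfs-backed ones
    log_bytes: int             # pod log files on the node
    writable_layer_bytes: int  # writable layers of all containers in the pod

    def total_bytes(self) -> int:
        return self.emptydir_bytes + self.log_bytes + self.writable_layer_bytes


# Hypothetical usage figures, for illustration only.
usage = PodEphemeralUsage(
    emptydir_bytes=200 * 2**20,
    log_bytes=50 * 2**20,
    writable_layer_bytes=6 * 2**20,
)
print(usage.total_bytes() // 2**20, "MiB")
```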

Enhanced Monitoring Capabilities for Ephemeral Storage

Note that the Ephemeral Storage Usage (%) chart in the dashboard displays data only for containers that have resources.limits.ephemeral-storage configured.
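A usage percentage can only be computed when there is a limit to divide by, which is why containers without a limit do not appear in the chart. The Python sketch below illustrates this. The quantity parser is deliberately simplified and handles only a few common suffixes, not the full Kubernetes quantity syntax. The function names, the input format, and the container data are assumptions for illustration.

```python
# Simplified subset of Kubernetes quantity suffixes (an assumption of this sketch).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as '2Gi' or '500M' to bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # plain byte count


def usage_percent(containers: list) -> dict:
    """Map container name to usage percentage, skipping containers without a limit."""
    result = {}
    for container in containers:
        limit = container.get("limit")
        if limit is None:
            continue  # no resources.limits.ephemeral-storage, so nothing to chart
        result[container["name"]] = 100 * container["used_bytes"] / parse_quantity(limit)
    return result


# Hypothetical containers: only "app" has an ephemeral storage limit.
example_containers = [
    {"name": "app", "used_bytes": 512 * 2**20, "limit": "2Gi"},
    {"name": "sidecar", "used_bytes": 10 * 2**20},
]
print(usage_percent(example_containers))
```

Here the app container uses 512 MiB of a 2 GiB limit, and the sidecar is left out because it has no limit.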

Note also that RootFS-related metrics are provided by the cAdvisor component. When the container runtime is containerd, cAdvisor does not provide these metrics at the pod level. The community attempted a fix, but it was eventually rolled back because the implementation introduced a bidirectional dependency that violated the one-way dependency principle of Kubernetes. Because containerd is the default container runtime for ACK clusters of v1.24 or later, we have fixed this issue in the ACK csi-plugin component. To restore the missing ephemeral storage monitoring data for pods, install csi-plugin v1.28.3-eb95171-aliyun or later.
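A quick way to check the version requirement is to compare the numeric part of the installed csi-plugin tag with the minimum tag. The Python sketch below assumes that tags follow the vMAJOR.MINOR.PATCH-hash-aliyun pattern seen in the version above and that "or later" can be judged from the numeric part alone. Both are assumptions of this sketch, and the second tag in the example is hypothetical.

```python
import re

MIN_CSI_PLUGIN = "v1.28.3-eb95171-aliyun"


def numeric_version(tag: str) -> tuple:
    """Extract (major, minor, patch) from a tag such as 'v1.28.3-eb95171-aliyun'."""
    match = re.match(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(installed: str, minimum: str = MIN_CSI_PLUGIN) -> bool:
    """True if the numeric part of the installed tag is at least the minimum."""
    return numeric_version(installed) >= numeric_version(minimum)


print(meets_minimum("v1.28.3-eb95171-aliyun"))
print(meets_minimum("v1.26.0-abcdef0-aliyun"))  # hypothetical older tag
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating 1.9 as later than 1.28.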

[Figures 9 and 10]

Summary

Container storage safeguards the runtime data of containerized applications. This update to container storage monitoring lets users manage storage within their clusters comprehensively and at a fine granularity. It helps them quickly locate potential I/O bottlenecks and other problems during business operations, which helps keep the business running smoothly.
