OpenStack Undercloud & Overcloud Health Check

The document provides health check procedures for OpenStack undercloud and overcloud controllers. It lists commands to check services, systemd units, containers, databases, and message brokers to validate the health of an OpenStack deployment.

Uploaded by

Iki Arif

OpenStack Undercloud – Overcloud Health Check

STANDARD MAINTENANCE PROCEDURE

Prepared For:
PT Telekomunikasi Selular (Telkomsel)

Date delivered:
Table of Contents
Document Information
Originator
Owner
Copyright
Distribution
Confidentiality
Additional Copies
Purpose
Executive Summary
OpenStack Undercloud Health Check
OpenStack Overcloud Controller Health Check
Document Information
Originator

Red Hat Consulting

Owner

Red Hat Consulting - Confidential / Restricted Distribution

Copyright

This document contains proprietary information which is for exclusive use of Red Hat, Inc. and is not to be
shared with personnel other than Red Hat, Inc. This document, and any portion thereof, may not be
copied, reproduced, photocopied, stored electronically on a retrieval system, or transmitted without the
express written consent of the owner.
Red Hat Consulting does not warrant this document to be free of errors or omissions. Red Hat Consulting
reserves the right to make corrections, updates, revisions, or changes to the information contained
herein. Red Hat Consulting does not warrant the material described herein to be free of patent
infringement.
Unless provided otherwise in writing BY RED HAT Consulting, the information and programs described
herein are provided “as is” without warranty of any kind, including but not limited to the implied
warranties of merchantability and fitness for a particular purpose. In no event will RED HAT Consulting,
its officers, directors, or employees or affiliates of RED HAT Consulting, their respective officers,
directors, or employees be liable to any entity for any special, collateral, incidental, or consequential
damages, including without any limitation, for any lost profits or lost savings, related or arising in any
way from or out of the use or inability to use the information or programs set forth herein, even if it has
been notified of the possibility of such damage by the purchaser or any third party.

Distribution

Do not forward or copy without written permission from Red Hat Consulting.
Copies of this document are restricted to the following names:
Red Hat, Inc.
Telkomsel

Confidentiality

All information supplied to Telkomsel for the purpose of this engagement is to be considered Red Hat
confidential.

Additional Copies

Additional copies of this document can be obtained from the Service Delivery Manager listed in the Red
Hat Consulting Contact Information section.
Purpose

This document is written to acknowledge acceptance of services provided by Red Hat to Telkomsel via
authorized signatory for each test case.

Executive Summary

Unless provided otherwise in writing BY RED HAT Consulting Services, the information and programs
described herein are provided “as is” without warranty of any kind, including but not limited to the
implied warranties of merchantability and fitness for a particular purpose. In no event will RED HAT
Consulting Services, its officers, directors, or employees or affiliates of RED HAT Consulting Services,
their respective officers, directors, or employees be liable to any entity for any special, collateral,
incidental, or consequential damages, including without any limitation, for any lost profits or lost savings,
related or arising in any way from or out of the use or inability to use the information or programs set
forth herein, even if it has been notified of the possibility of such damage by the purchaser or any third
party.
OpenStack Undercloud Health Check
Note: For the overcloud controller health check, please refer to OpenStack Overcloud
Controller Health Check

• The best approach is to check the agents and services using the following
commands. These commands validate that the services have sent a heartbeat
recently:

<Red Hat OpenStack Platform 16 and earlier>


$ source stackrc
$ openstack network agent list
$ openstack compute service list
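
The service listing above can also be filtered so that only services that are not up are shown. The following is a minimal sketch, not part of the original procedure: report_down_services is a hypothetical helper name, and it assumes the listing is produced with the client options "-f value -c Binary -c Host -c State" so that each line reads "<binary> <host> <state>".

```shell
# Print every service whose State column is not "up".
# Input (assumed): one service per line as "<binary> <host> <state>".
# Returns non-zero when at least one service is not up.
report_down_services() {
    awk '$NF != "up" { print $1 " on " $2 " is " $NF; bad = 1 }
         END { exit (bad + 0) }'
}

# Possible usage after sourcing stackrc:
#   openstack compute service list -f value -c Binary -c Host -c State \
#       | report_down_services
```

An empty result with exit status 0 would then indicate that every listed service reported the "up" state.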

Validate the systemd units using this command. It checks only the unit state, not the heartbeat:

<Red Hat OpenStack Platform 16 and later>


# systemctl list-units --all | grep "tripleo_.*failed\|openvswitch.*failed"

• [Only applicable for Red Hat OpenStack Platform 16 and later] Validate the
containers using this command.

<Red Hat OpenStack Platform 16 and later>


# podman ps
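
Reading a long container list by eye is error-prone, so the output can be narrowed to containers that look suspect. This is a sketch under assumptions: report_suspect_containers is a hypothetical helper, the input is assumed to come from "podman ps -a" with a format of name followed by status (the "-a" option is an addition to the command above so that stopped containers appear), and a status is treated as healthy only when it starts with "Up" and does not contain "unhealthy". Some deployment containers may be expected to exit after running once, so a listed container is a prompt for investigation rather than proof of a fault.

```shell
# Print containers whose status does not start with "Up" or that
# report "unhealthy".
# Input (assumed): one container per line as "<name> <status text>".
report_suspect_containers() {
    awk '{
        name = $1
        status = $0
        sub(/^[^ ]+ +/, "", status)   # drop the name, keep the status text
        if (status !~ /^Up/ || status ~ /unhealthy/) print name ": " status
    }'
}

# Possible usage:
#   podman ps -a --format '{{.Names}} {{.Status}}' | report_suspect_containers
```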

• When RabbitMQ or MariaDB fails, the agents fail to report their status because
they use RPC communication to update the database.
• Validate a MySQL connection to the Galera cluster (on all nodes):

<Red Hat OpenStack Platform 16 and later>


# podman exec -u root mysql mysql -u root -e exit
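
The connection test above prints nothing on success, so the result is carried by the exit status. A small wrapper can make that visible when several checks are run in sequence. This is an illustrative sketch only; run_check is a hypothetical helper name and is not part of the original procedure.

```shell
# Run a command quietly and print OK or FAIL based on its exit status.
# Usage: run_check "<label>" <command> [args...]
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
        return 1
    fi
}

# Possible usage (as root on the undercloud):
#   run_check "MySQL connection" podman exec -u root mysql mysql -u root -e exit
```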

• Validate the health of RabbitMQ with this command (on all nodes):

<Red Hat OpenStack Platform 16 and later>


# podman exec -u root rabbitmq rabbitmqctl node_health_check
OpenStack Overcloud Controller Health Check
Issue

After an outage or a reboot, it's important to validate the health of various OpenStack services.

Resolution

Note: For undercloud health check, please refer to OpenStack Undercloud Health Check

• The best approach is to check the agents and services using the following
commands. These commands validate that the services have sent a heartbeat recently:


$ source overcloudrc
$ openstack network agent list
$ openstack volume service list
$ openstack compute service list

Validate the systemd units using this command. It checks only the unit state, not the heartbeat:

<Red Hat OpenStack Platform 16 and later>


# systemctl list-units --all | grep "tripleo_.*failed\|openvswitch.*failed"

• Note: Sometimes tripleo_XXXX_healthcheck.service has a bug and wrongly reports a
failed status. Please search for relevant bugs when you find failed healthcheck
services before judging that the environment is unhealthy.
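
Because failed healthcheck units need a different follow-up than failed service units, it can help to separate the two groups before investigating. The following is a minimal sketch, assuming the unit names follow the tripleo_XXXX_healthcheck.service pattern mentioned in the note; classify_failed_units is a hypothetical helper, and the systemctl options in the usage comment are one assumed way of producing input with the unit name in the first column.

```shell
# Split failed unit names into healthcheck units and everything else.
# Input: one unit per line; only the first field (the unit name) is used.
classify_failed_units() {
    awk '{
        if ($1 ~ /^tripleo_.*_healthcheck\.service$/)
            print "healthcheck-only: " $1
        else
            print "service: " $1
    }'
}

# Possible usage:
#   systemctl list-units --all --plain --no-legend --state=failed 'tripleo_*' \
#       | classify_failed_units
```

Lines marked "healthcheck-only" are the ones to compare against known bugs first; lines marked "service" point at the container unit itself.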
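• [Only applicable for Red Hat OpenStack Platform 13 and later] Validate the containers
using this command. The podman command shown applies to Red Hat OpenStack Platform 16
and later; Red Hat OpenStack Platform 13 uses docker instead.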

<Red Hat OpenStack Platform 16 and later>


# podman ps

• When RabbitMQ or MariaDB fails, the agents fail to report their status because
they use RPC communication to update the database.
• Validate a MySQL connection to the Galera cluster (on all nodes):

<Red Hat OpenStack Platform 16 and later>


# podman exec -u root $(podman ps -q -f name=galera) mysql -u root -e exit
• Validate the Galera cluster (on all nodes):

<Red Hat OpenStack Platform 16 and later>


# podman exec -u root $(podman ps -q -f name=galera) clustercheck
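
The clustercheck result can be turned into a pass or fail signal for scripting. This sketch assumes that a healthy node prints a line containing "node is synced" and an unhealthy one prints "node is not synced"; that wording should be confirmed against the clustercheck output in the environment. galera_node_synced is a hypothetical helper name.

```shell
# Succeed only if the clustercheck output on stdin reports a synced node.
# Assumption: a healthy node prints a line containing "node is synced".
galera_node_synced() {
    grep -q 'node is synced'
}

# Possible usage (as root on each controller):
#   podman exec -u root $(podman ps -q -f name=galera) clustercheck \
#       | galera_node_synced && echo "Galera node synced"
```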

• Check for any stopped resources or failed actions across the cluster (any node):

# pcs status
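
The pcs status output is long, so a keyword filter can surface the lines that usually matter. This is a rough sketch, not a replacement for reading the full output: pcs_problem_lines is a hypothetical helper, and the keywords ("Stopped", "FAILED", and a "Failed ... Actions" heading) are assumptions about the wording used by the installed pcs version.

```shell
# Print lines from "pcs status" output that suggest trouble: stopped or
# failed resources and the heading of a failed-actions section.
# Always returns 0; an empty result means no keyword matched.
pcs_problem_lines() {
    grep -E 'Stopped|FAILED|Failed (Resource |Fencing )?Actions' || true
}

# Possible usage (as root on any controller):
#   pcs status | pcs_problem_lines
```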

• Validate the health of RabbitMQ with this command (on all nodes):

<Red Hat OpenStack Platform 16 and later>


# podman exec -u root $(podman ps -q -f name=rabbitmq) rabbitmqctl node_health_check

• Make sure that there's no partition on the Rabbit cluster:

<Red Hat OpenStack Platform 16 and later>


# podman exec -u root $(podman ps -q -f name=rabbitmq) rabbitmqctl cluster_status
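
The partition check can be sketched as a filter over the cluster_status output. The output layout differs between RabbitMQ releases, so this sketch makes two assumptions that should be verified locally: older releases print an Erlang term containing "{partitions,[]}" when there is no partition, and newer releases print a "Network Partitions" heading followed by "(none)". rabbit_has_no_partition is a hypothetical helper name.

```shell
# Succeed only if the cluster_status output on stdin shows no partition.
# Assumed formats: "{partitions,[]}" (Erlang term output) or a
# "Network Partitions" heading followed by "(none)".
rabbit_has_no_partition() {
    awk '
        index($0, "{partitions,[]}") > 0 { ok = 1 }
        /^Network Partitions/            { in_section = 1; next }
        in_section && $1 == "(none)"     { ok = 1; in_section = 0 }
        in_section && NF > 0             { in_section = 0 }
        END { exit (ok ? 0 : 1) }
    '
}

# Possible usage (as root on each controller):
#   podman exec -u root $(podman ps -q -f name=rabbitmq) rabbitmqctl cluster_status \
#       | rabbit_has_no_partition || echo "possible partition, inspect output"
```

If neither assumed format is found, the function fails, which errs on the side of prompting a manual look at the full output.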

Reference:
https://access.redhat.com/solutions/7007549
https://access.redhat.com/solutions/3312561