Red Hat Enterprise Linux 9 Managing, monitoring, and updating the kernel
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons
Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is
available at
http://creativecommons.org/licenses/by-sa/3.0/
. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must
provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert,
Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift,
Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States
and other countries.
Linux ® is the registered trademark of Linus Torvalds in the United States and other countries.
XFS ® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States
and/or other countries.
MySQL ® is a registered trademark of MySQL AB in the United States, the European Union and
other countries.
Node.js ® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the
official Joyent Node.js open source or commercial project.
The OpenStack ® Word Mark and OpenStack logo are either registered trademarks/service marks
or trademarks/service marks of the OpenStack Foundation, in the United States and other
countries and are used with the OpenStack Foundation's permission. We are not affiliated with,
endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
Abstract
As a system administrator, you can configure the Linux kernel to optimize the operating system.
Changes to the Linux kernel can improve system performance, security, and stability, as well as your
ability to audit the system and troubleshoot problems.
Table of Contents

MAKING OPEN SOURCE MORE INCLUSIVE
PROVIDING FEEDBACK ON RED HAT DOCUMENTATION
CHAPTER 1. THE LINUX KERNEL
    1.1. WHAT THE KERNEL IS
    1.2. RPM PACKAGES
    1.3. THE LINUX KERNEL RPM PACKAGE OVERVIEW
    1.4. DISPLAYING CONTENTS OF A KERNEL PACKAGE
    1.5. INSTALLING SPECIFIC KERNEL VERSIONS
    1.6. UPDATING THE KERNEL
    1.7. SETTING A KERNEL AS DEFAULT
CHAPTER 2. THE 64K PAGE SIZE KERNEL
CHAPTER 3. MANAGING KERNEL MODULES
    3.1. INTRODUCTION TO KERNEL MODULES
    3.2. KERNEL MODULE DEPENDENCIES
    3.3. LISTING INSTALLED KERNEL MODULES
    3.4. LISTING CURRENTLY LOADED KERNEL MODULES
    3.5. DISPLAYING INFORMATION ABOUT KERNEL MODULES
    3.6. LOADING KERNEL MODULES AT SYSTEM RUNTIME
    3.7. UNLOADING KERNEL MODULES AT SYSTEM RUNTIME
    3.8. UNLOADING KERNEL MODULES AT EARLY STAGES OF THE BOOT PROCESS
    3.9. LOADING KERNEL MODULES AUTOMATICALLY AT SYSTEM BOOT TIME
    3.10. PREVENTING KERNEL MODULES FROM BEING AUTOMATICALLY LOADED AT SYSTEM BOOT TIME
    3.11. COMPILING CUSTOM KERNEL MODULES
CHAPTER 4. CONFIGURING KERNEL COMMAND-LINE PARAMETERS
    4.1. WHAT ARE KERNEL COMMAND-LINE PARAMETERS
    4.2. UNDERSTANDING BOOT ENTRIES
    4.3. CHANGING KERNEL COMMAND-LINE PARAMETERS FOR ALL BOOT ENTRIES
    4.4. CHANGING KERNEL COMMAND-LINE PARAMETERS FOR A SINGLE BOOT ENTRY
    4.5. CHANGING KERNEL COMMAND-LINE PARAMETERS TEMPORARILY AT BOOT TIME
    4.6. CONFIGURING GRUB SETTINGS TO ENABLE SERIAL CONSOLE CONNECTION
    4.7. CHANGING BOOT ENTRIES WITH THE GRUB CONFIGURATION FILE
CHAPTER 5. CONFIGURING KERNEL PARAMETERS AT RUNTIME
    5.1. WHAT ARE KERNEL PARAMETERS
    5.2. CONFIGURING KERNEL PARAMETERS TEMPORARILY WITH SYSCTL
    5.3. CONFIGURING KERNEL PARAMETERS PERMANENTLY WITH SYSCTL
    5.4. USING CONFIGURATION FILES IN /ETC/SYSCTL.D/ TO ADJUST KERNEL PARAMETERS
    5.5. CONFIGURING KERNEL PARAMETERS TEMPORARILY THROUGH /PROC/SYS/
    5.6. ADDITIONAL RESOURCES
CHAPTER 6. CONFIGURING KERNEL PARAMETERS PERMANENTLY BY USING THE KERNEL_SETTINGS RHEL SYSTEM ROLE
CHAPTER 7. APPLYING PATCHES WITH KERNEL LIVE PATCHING
    7.1. LIMITATIONS OF KPATCH
    7.2. SUPPORT FOR THIRD-PARTY LIVE PATCHING
CHAPTER 8. KEEPING KERNEL PANIC PARAMETERS DISABLED IN VIRTUALIZED ENVIRONMENTS
    8.1. WHAT IS A SOFT LOCKUP
    8.2. PARAMETERS CONTROLLING KERNEL PANIC
    8.3. SPURIOUS SOFT LOCKUPS IN VIRTUALIZED ENVIRONMENTS
CHAPTER 9. ADJUSTING KERNEL PARAMETERS FOR DATABASE SERVERS
    9.1. INTRODUCTION TO DATABASE SERVERS
    9.2. PARAMETERS AFFECTING PERFORMANCE OF DATABASE APPLICATIONS
CHAPTER 10. GETTING STARTED WITH KERNEL LOGGING
    10.1. WHAT IS THE KERNEL RING BUFFER
    10.2. ROLE OF PRINTK ON LOG-LEVELS AND KERNEL LOGGING
CHAPTER 11. INSTALLING KDUMP
    11.1. WHAT IS KDUMP
    11.2. INSTALLING KDUMP USING ANACONDA
    11.3. INSTALLING KDUMP ON THE COMMAND LINE
CHAPTER 12. CONFIGURING KDUMP ON THE COMMAND LINE
    12.1. ESTIMATING THE KDUMP SIZE
    12.2. CONFIGURING KDUMP MEMORY USAGE ON RHEL 9
    12.3. CONFIGURING THE KDUMP TARGET
    12.4. CONFIGURING THE KDUMP CORE COLLECTOR
    12.5. CONFIGURING THE KDUMP DEFAULT FAILURE RESPONSES
    12.6. CONFIGURATION FILE FOR KDUMP
    12.7. TESTING THE KDUMP CONFIGURATION
    12.8. ENABLING AND DISABLING THE KDUMP SERVICE
    12.9. PREVENTING KERNEL DRIVERS FROM LOADING FOR KDUMP
    12.10. RUNNING KDUMP ON SYSTEMS WITH ENCRYPTED DISK
CHAPTER 13. ENABLING KDUMP
    13.1. ENABLING KDUMP FOR ALL INSTALLED KERNELS
    13.2. ENABLING KDUMP FOR A SPECIFIC INSTALLED KERNEL
    13.3. DISABLING THE KDUMP SERVICE
CHAPTER 14. SUPPORTED KDUMP CONFIGURATIONS AND TARGETS
    14.1. MEMORY REQUIREMENTS FOR KDUMP
    14.2. MINIMUM THRESHOLD FOR AUTOMATIC MEMORY RESERVATION
    14.3. SUPPORTED KDUMP TARGETS
    14.4. SUPPORTED KDUMP FILTERING LEVELS
    14.5. SUPPORTED DEFAULT FAILURE RESPONSES
    14.6. USING FINAL_ACTION PARAMETER
    14.7. USING FAILURE_ACTION PARAMETER
CHAPTER 15. FIRMWARE ASSISTED DUMP MECHANISMS
    15.1. FIRMWARE ASSISTED DUMP ON IBM POWERPC HARDWARE
    15.2. ENABLING FIRMWARE ASSISTED DUMP MECHANISM
    15.3. FIRMWARE ASSISTED DUMP MECHANISMS ON IBM Z HARDWARE
    15.4. USING SADUMP ON FUJITSU PRIMEQUEST SYSTEMS
CHAPTER 16. ANALYZING A CORE DUMP
    16.1. INSTALLING THE CRASH UTILITY
    16.2. RUNNING AND EXITING THE CRASH UTILITY
    16.3. DISPLAYING VARIOUS INDICATORS IN THE CRASH UTILITY
    16.4. USING KERNEL OOPS ANALYZER
    16.5. THE KDUMP HELPER TOOL
CHAPTER 17. USING EARLY KDUMP TO CAPTURE BOOT TIME CRASHES
    17.1. WHAT IS EARLY KDUMP
    17.2. ENABLING EARLY KDUMP
CHAPTER 18. SIGNING A KERNEL AND MODULES FOR SECURE BOOT
    18.1. PREREQUISITES
    18.2. WHAT IS UEFI SECURE BOOT
    18.3. UEFI SECURE BOOT SUPPORT
    18.4. REQUIREMENTS FOR AUTHENTICATING KERNEL MODULES WITH X.509 KEYS
    18.5. SOURCES FOR PUBLIC KEYS
    18.6. GENERATING A PUBLIC AND PRIVATE KEY PAIR
    18.7. EXAMPLE OUTPUT OF SYSTEM KEYRINGS
    18.8. ENROLLING PUBLIC KEY ON TARGET SYSTEM BY ADDING THE PUBLIC KEY TO THE MOK LIST
    18.9. SIGNING A KERNEL WITH THE PRIVATE KEY
    18.10. SIGNING A GRUB BUILD WITH THE PRIVATE KEY
    18.11. SIGNING KERNEL MODULES WITH THE PRIVATE KEY
    18.12. LOADING SIGNED KERNEL MODULES
CHAPTER 19. UPDATING THE SECURE BOOT REVOCATION LIST
    19.1. PREREQUISITES
    19.2. WHAT IS UEFI SECURE BOOT
    19.3. THE SECURE BOOT REVOCATION LIST
    19.4. APPLYING AN ONLINE REVOCATION LIST UPDATE
    19.5. APPLYING AN OFFLINE REVOCATION LIST UPDATE
CHAPTER 20. ENHANCING SECURITY WITH THE KERNEL INTEGRITY SUBSYSTEM
    20.1. THE KERNEL INTEGRITY SUBSYSTEM
    20.2. TRUSTED AND ENCRYPTED KEYS
    20.3. WORKING WITH TRUSTED KEYS
    20.4. WORKING WITH ENCRYPTED KEYS
    20.5. ENABLING IMA AND EVM
    20.6. COLLECTING FILE HASHES WITH INTEGRITY MEASUREMENT ARCHITECTURE
    20.7. ADDING IMA SIGNATURES TO PACKAGE FILES
    20.8. ENABLING KERNEL RUNTIME INTEGRITY MONITORING
    20.9. CREATING CUSTOM IMA KEYS USING OPENSSL
    20.10. DEPLOYING A CUSTOM SIGNED IMA POLICY FOR UEFI SYSTEMS
CHAPTER 21. USING SYSTEMD TO MANAGE RESOURCES USED BY APPLICATIONS
    21.1. ROLE OF SYSTEMD IN RESOURCE MANAGEMENT
    21.2. DISTRIBUTION MODELS OF SYSTEM SOURCES
    21.3. ALLOCATING SYSTEM RESOURCES USING SYSTEMD
    21.4. OVERVIEW OF SYSTEMD HIERARCHY FOR CGROUPS
CHAPTER 22. UNDERSTANDING CONTROL GROUPS
    22.1. INTRODUCING CONTROL GROUPS
    22.2. INTRODUCING KERNEL RESOURCE CONTROLLERS
    22.3. INTRODUCING NAMESPACES
CHAPTER 23. USING CGROUPFS TO MANUALLY MANAGE CGROUPS
    23.1. CREATING CGROUPS AND ENABLING CONTROLLERS IN CGROUPS-V2 FILE SYSTEM
    23.2. CONTROLLING DISTRIBUTION OF CPU TIME FOR APPLICATIONS BY ADJUSTING CPU WEIGHT
    23.3. MOUNTING CGROUPS-V1
    23.4. SETTING CPU LIMITS TO APPLICATIONS USING CGROUPS-V1
CHAPTER 24. ANALYZING SYSTEM PERFORMANCE WITH BPF COMPILER COLLECTION
    24.1. INSTALLING THE BCC-TOOLS PACKAGE
    24.2. USING SELECTED BCC-TOOLS FOR PERFORMANCE ANALYSES
        Using execsnoop to examine the system processes
        Using opensnoop to track what files a command opens
        Using biotop to examine the I/O operations on the disk
        Using xfsslower to expose unexpectedly slow file system operations
PROVIDING FEEDBACK ON RED HAT DOCUMENTATION
4. Enter your suggestion for improvement in the Description field. Include links to the relevant
parts of the documentation.
CHAPTER 1. THE LINUX KERNEL
The Red Hat kernel is a custom-built kernel based on the upstream Linux mainline kernel that
Red Hat engineers further develop and harden with a focus on stability and compatibility with the latest
technologies and hardware.
Before Red Hat releases a new kernel version, the kernel needs to pass a set of rigorous quality
assurance tests.
The Red Hat kernels are packaged in the RPM format so that they are easily upgraded and verified by
the DNF package manager.
WARNING
Kernels that have not been compiled by Red Hat are not supported by Red Hat.
An RPM package consists of the following parts:

GPG signature
The GPG signature is used to verify the integrity of the package.
Header (package metadata)
The RPM package manager uses this metadata to determine package dependencies, where to install
files, and other information.
Payload
The payload is a cpio archive that contains files to install to the system.
There are two types of RPM packages. Both types share the file format and tooling, but have different contents and serve different purposes:

Source RPM (SRPM)
An SRPM contains source code and a SPEC file, which describes how to build the source code into a binary RPM. Optionally, the SRPM can contain patches to the source code.

Binary RPM
A binary RPM contains the binaries built from the sources and patches.

The kernel RPM is a meta package that does not contain any files, but rather ensures that the following sub-packages are properly installed:
kernel-core
Contains the binary image of the Linux kernel (vmlinuz).
kernel-modules-core
Contains the basic kernel modules to ensure core functionality. This includes the modules essential
for the proper functioning of the most commonly used hardware.
kernel-modules
Contains the remaining kernel modules that are not present in kernel-core.
The kernel-core and kernel-modules-core sub-packages can be used together in virtualized and cloud environments to provide a RHEL 9 kernel with a quick boot time and a small disk size footprint. The kernel-modules sub-package is usually unnecessary for such deployments.
kernel-modules-extra
Contains kernel modules for rare hardware and modules whose loading is disabled by default.
kernel-debug
Contains a kernel with numerous debugging options enabled for kernel diagnosis, at the expense of
reduced performance.
kernel-tools
Contains tools for manipulating the Linux kernel and supporting documentation.
kernel-devel
Contains the kernel headers and makefiles sufficient to build modules against the kernel package.
kernel-abi-stablelists
Contains information pertaining to the RHEL kernel ABI, including a list of kernel symbols that are
needed by external Linux kernel modules and a dnf plug-in to aid enforcement.
kernel-headers
Includes the C header files that specify the interface between the Linux kernel and user-space
libraries and programs. The header files define structures and constants that are needed for building
most standard programs.
kernel-uki-virt
Contains the Unified Kernel Image (UKI) of the RHEL kernel.
UKI combines the Linux kernel, initramfs, and the kernel command line into a single signed binary
which can be booted directly from the UEFI firmware.
kernel-uki-virt contains the required kernel modules to run in virtualized and cloud environments
and can be used instead of the kernel-core sub-package.
IMPORTANT
Additional resources
Use the dnf utility to query the file list, for example, of the kernel-core, kernel-modules-core, or
kernel-modules package. Note that the kernel package is a meta package that does not contain any
files.
Procedure
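For example, to query the file list of the kernel-core package (the output below is abbreviated and illustrative):

$ dnf repoquery -l kernel-core
/boot/System.map-5.14.0-1.el9.x86_64
/boot/config-5.14.0-1.el9.x86_64
…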
Additional resources
Procedure
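For example, to install a specific kernel version from the repositories (the version string shown is illustrative):

# dnf install kernel-5.14.0-1.el9.x86_64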
Additional resources
Procedure
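For example:

# dnf update kernel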
This command updates the kernel along with all dependencies to the latest available version.
Additional resources
package manager
Procedure
Enter the following command to set the kernel as default using the grubby tool:
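For example, pass the path of the kernel image that you want to set as default (the path shown is illustrative):

# grubby --set-default /boot/vmlinuz-5.14.0-1.el9.x86_64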
NOTE
List the boot entries using the id argument and then set an intended kernel as default:
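For example, list the entry IDs and then set the default by using the kernel path of the chosen entry; the ID and path shown are illustrative:

# grubby --info=ALL | grep "^id"
id="d8712ab6d4f14683c5625e87b52b6b6e-5.14.0-1.el9.x86_64"

# grubby --set-default /boot/vmlinuz-5.14.0-1.el9.x86_64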
NOTE
To list the boot entries using the title argument, execute the # grubby --info=ALL | grep title command.
Execute the following command to set the default kernel for only the next reboot using the
grub2-reboot command:
# grub2-reboot <index|title|id>
WARNING
Set the default kernel for only the next boot with care. Installing new kernel RPMs, self-built kernels, and manually adding entries to the /boot/loader/entries/ directory may change the index values.
CHAPTER 2. THE 64K PAGE SIZE KERNEL
Optimal system performance directly relates to different memory configuration requirements. These requirements are addressed by two variants of the kernel, each suitable for different workloads. RHEL 9 on 64-bit ARM hardware thus offers two MMU page sizes:

The 4k pages kernel (kernel)

The 64k page size kernel (kernel-64k)
The 4k pages kernel and kernel-64k do not differ in the user experience as the user space is the same.
You can choose the variant that addresses your situation the best.
4k pages kernel
Use 4k pages for more efficient memory usage in smaller environments, such as those in Edge and
lower-cost, small cloud instances. In these environments, increasing the physical system memory
amounts is not practical due to space, power, and cost constraints. Also, not all 64-bit ARM
architecture processors support a 64k page size.
The 4k pages kernel supports graphical installation using Anaconda, system or cloud image-based
installations, as well as advanced installations using Kickstart.
kernel-64k
The 64k page size kernel is a useful option for large datasets on ARM platforms. kernel-64k is
suitable for memory-intensive workloads as it has significant gains in overall system performance,
namely in large database, HPC, and high network performance.
You must choose the page size on 64-bit ARM architecture systems at the time of installation. You can install kernel-64k only by using Kickstart, by adding the kernel-64k package to the package list in the Kickstart file.
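A minimal sketch of such a Kickstart package list follows; the rest of the Kickstart file is omitted:

%packages
kernel-64k
%end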
Additional resources
CHAPTER 3. MANAGING KERNEL MODULES
Kernel modules are pieces of code that can be loaded into and unloaded from the kernel on demand. They typically provide functionality such as:

Device drivers

Filesystem drivers

System calls
On modern systems, kernel modules are automatically loaded when needed. However, in some cases it is
necessary to load or unload modules manually.
Like the kernel itself, the modules can take parameters that customize their behavior if needed.
Tooling is provided to inspect which modules are currently running, which modules are available to load
into the kernel and which parameters a module accepts. The tooling also provides a mechanism to load
and unload kernel modules into the running kernel.
depmod
The dependency file is generated by the depmod program, which is a part of the kmod package. Many
of the utilities provided by kmod take module dependencies into account when performing operations
so that manual dependency-tracking is rarely necessary.
WARNING
weak-modules
In addition to depmod, Red Hat Enterprise Linux provides the weak-modules script shipped also with
the kmod package. weak-modules determines which modules are kABI-compatible with installed
kernels. When checking module compatibility with kernels, weak-modules processes module symbol dependencies from the highest to the lowest release of the kernel for which they were built. This means that weak-modules processes each module independently of the kernel release against which it was built.
Additional resources
What is the purpose of weak-modules script shipped with Red Hat Enterprise Linux?
Procedure
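For example, you can list the installed kernel boot entries with the grubby utility; the output below is illustrative:

$ grubby --info=ALL | grep title
title="Red Hat Enterprise Linux (5.14.0-1.el9.x86_64) 9.0 (Plow)"
title="Red Hat Enterprise Linux (0-rescue-…) 9.0 (Plow)"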
The example above displays the list of installed kernels from the GRUB menu, as reported by grubby-8.40-17.
Prerequisites
Procedure
$ lsmod
nft_counter 16384 16
nf_nat_tftp 16384 0
nf_conntrack_tftp 16384 1 nf_nat_tftp
tun 49152 1
bridge 192512 0
stp 16384 1 bridge
llc 16384 2 bridge,stp
nf_tables_set 32768 5
nft_fib_inet 16384 1
…
b. The Size column displays the amount of memory per module in bytes.
c. The Used by column shows the number, and optionally the names of modules that are
dependent on a particular module.
Additional resources
Prerequisites
Procedure
$ modinfo <KERNEL_MODULE_NAME>
For example:
$ modinfo virtio_net
filename: /lib/modules/5.14.0-1.el9.x86_64/kernel/drivers/net/virtio_net.ko.xz
license: GPL
description: Virtio network driver
rhelversion: 9.0
srcversion: 8809CDDBE7202A1B00B9F1C
alias: virtio:d00000001v*
depends: net_failover
retpoline: Y
intree: Y
name: virtio_net
vermagic: 5.14.0-1.el9.x86_64 SMP mod_unload modversions
…
parm: napi_weight:int
parm: csum:bool
parm: gso:bool
parm: napi_tx:bool
You can query information about all available modules, regardless of whether they are loaded or
not. The parm entries show parameters the user is able to set for the module, and what type of
value they expect.
NOTE
When entering the name of a kernel module, do not append the .ko.xz extension
to the end of the name. Kernel module names do not have extensions; their
corresponding files do.
Additional resources
IMPORTANT
The changes described in this procedure will not persist after rebooting the system. For
information about how to load kernel modules to persist across system reboots, see
Loading kernel modules automatically at system boot time .
Prerequisites
Root permissions
The respective kernel module is not loaded. To ensure this is the case, list the loaded kernel
modules.
Procedure
# modprobe <MODULE_NAME>
NOTE
When entering the name of a kernel module, do not append the .ko.xz extension
to the end of the name. Kernel module names do not have extensions; their
corresponding files do.
Verification
Verify the result with the lsmod utility. If the module was loaded correctly, the command displays the relevant kernel module. For example, assuming the serio_raw module:
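# lsmod | grep serio_raw
serio_raw 16384 0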
Additional resources
WARNING
Do not unload kernel modules when they are used by the running system. Doing so
can lead to an unstable or non-operational system.
IMPORTANT
After finishing this procedure, the kernel modules that are defined to be automatically loaded on boot will not stay unloaded after rebooting the system. For information about
how to counter this outcome, see Preventing kernel modules from being automatically
loaded at system boot time.
Prerequisites
Root permissions
Procedure
# lsmod
# modprobe -r <MODULE_NAME>
When entering the name of a kernel module, do not append the .ko.xz extension to the end of
the name. Kernel module names do not have extensions; their corresponding files do.
Verification
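Check that the module is no longer loaded, for example, assuming the serio_raw module:

# lsmod | grep serio_raw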
If the module was unloaded successfully, this command does not display any output.
Additional resources
You can edit the relevant boot loader entry to unload the desired kernel module before the booting
sequence continues.
IMPORTANT
The changes described in this procedure will not persist after the next reboot. For
information about how to add a kernel module to a denylist so that it will not be
automatically loaded during the boot process, see Preventing kernel modules from being
automatically loaded at system boot time.
Prerequisites
You have a loadable kernel module that you want to prevent from loading for some reason.
Procedure
2. Use the cursor keys to highlight the relevant boot loader entry.
4. Use the cursor keys to navigate to the line that starts with linux.
The serio_raw kernel module illustrates a rogue module to be unloaded early in the boot
process.
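For example, appending the modprobe.blacklist= parameter to the end of the linux line gives a result similar to the following; the paths and other parameters shown are illustrative:

linux ($root)/vmlinuz-5.14.0-1.el9.x86_64 root=/dev/mapper/rhel-root ro crashkernel=auto rhgb quiet modprobe.blacklist=serio_raw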
Verification
Once the system fully boots, verify that the relevant kernel module is not loaded.
Additional resources
Prerequisites
Root permissions
Procedure
1. Select a kernel module you want to load during the boot process.
The modules are located in the /lib/modules/$(uname -r)/kernel/<SUBSYSTEM>/ directory.
NOTE
When entering the name of a kernel module, do not append the .ko.xz extension
to the end of the name. Kernel module names do not have extensions; their
corresponding files do.
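2. Load the module and create a configuration file for it in the /etc/modules-load.d/ directory so that systemd loads the module on boot; a minimal sketch, assuming the serio_raw module:

# modprobe serio_raw
# echo serio_raw > /etc/modules-load.d/serio_raw.conf

3. Verify that the module is loaded, for example:

$ lsmod | grep serio_raw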
The example command above should succeed and display the relevant kernel module.
IMPORTANT
The changes described in this procedure will persist after rebooting the system.
Additional resources
Prerequisites
The commands in this procedure require root privileges. Either use su - to switch to the root
user or preface the commands with sudo.
Ensure that your current system configuration does not require a kernel module you plan to
deny.
Procedure
1. List modules loaded to the currently running kernel by using the lsmod command:
$ lsmod
Module Size Used by
tls 131072 0
uinput 20480 1
snd_seq_dummy 16384 0
snd_hrtimer 16384 1
…
In the output, identify the module you want to prevent from being loaded.
Alternatively, identify an unloaded kernel module you want to prevent from potentially
loading in the /lib/modules/<KERNEL-VERSION>/kernel/<SUBSYSTEM>/ directory, for
example:
$ ls /lib/modules/4.18.0-477.20.1.el8_8.x86_64/kernel/crypto/
ansi_cprng.ko.xz chacha20poly1305.ko.xz md4.ko.xz
serpent_generic.ko.xz
anubis.ko.xz cmac.ko.xz…
# touch /etc/modprobe.d/denylist.conf
3. In a text editor of your choice, combine the names of modules you want to exclude from
automatic loading to the kernel with the blacklist configuration command, for example:
Because the blacklist command does not prevent the module from being loaded as a
dependency for another kernel module that is not in a denylist, you must also define the install
line. In this case, the system runs /bin/false instead of installing the module. The lines starting
with a hash sign are comments you can use to make the file more readable.
NOTE
When entering the name of a kernel module, do not append the .ko.xz extension
to the end of the name. Kernel module names do not have extensions; their
corresponding files do.
4. Create a backup copy of the current initial RAM disk image before rebuilding:
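For example:

# cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak.$(date +%m-%d-%H%M%S)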
Alternatively, create a backup copy of an initial RAM disk image which corresponds to the
kernel version for which you want to prevent kernel modules from automatic loading:
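For example, with <KERNEL_VERSION> standing for the target kernel release:

# cp /boot/initramfs-<KERNEL_VERSION>.img /boot/initramfs-<KERNEL_VERSION>.img.bak.$(date +%m-%d-%H%M%S)

5. Generate a new initial RAM disk image to apply the changes: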
# dracut -f -v
If you build an initial RAM disk image for a different kernel version than your system
currently uses, specify both target initramfs and kernel version:
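For example:

# dracut -f /boot/initramfs-<KERNEL_VERSION>.img <KERNEL_VERSION>

6. Restart the system: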
$ reboot
IMPORTANT
The changes described in this procedure will take effect and persist after rebooting the
system. If you incorrectly list a key kernel module in the denylist, you can switch the
system to an unstable or non-operational state.
Additional resources
Prerequisites
You created the /root/testmodule/ directory where you compile the custom kernel module.
Procedure
#include <linux/module.h>
#include <linux/kernel.h>
int init_module(void)
{
    printk("Hello World\n This is a test\n");
    return 0;
}

void cleanup_module(void)
{
    printk("Good Bye World");
}
MODULE_LICENSE("GPL");
The test.c file is a source file that provides the main functionality to the kernel module. The file
has been created in a dedicated /root/testmodule/ directory for organizational purposes. After
the module compilation, the /root/testmodule/ directory will contain multiple files.
The linux/kernel.h header file is necessary for the printk() function in the example code.
The linux/module.h file contains function declarations and macro definitions to be shared
between several source files written in C programming language.
2. The init_module() and cleanup_module() functions run when the module is loaded and unloaded, respectively. Both call the kernel logging function printk(), which prints text.
obj-m := test.o
The Makefile contains instructions for the build system to produce an object file named test.o. The obj-m directive specifies that the resulting test.ko file is to be compiled as a loadable kernel module. Alternatively, the obj-y directive would instruct the build system to build test.ko as a built-in kernel module.
The compiler creates an object file (test.o) for each source file ( test.c) as an intermediate step
before linking them together into the final kernel module (test.ko).
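To build the module against the headers of the currently running kernel, a command of the following form can be used; this assumes that the kernel-devel package is installed:

# make -C /lib/modules/$(uname -r)/build M=/root/testmodule modules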
After a successful compilation, /root/testmodule/ contains additional files that relate to the
compiled custom kernel module. The compiled module itself is represented by the test.ko file.
Verification
# ls -l /root/testmodule/
total 152
-rw-r--r--. 1 root root 16 Jul 26 08:19 Makefile
-rw-r--r--. 1 root root 25 Jul 26 08:20 modules.order
-rw-r--r--. 1 root root 0 Jul 26 08:20 Module.symvers
-rw-r--r--. 1 root root 224 Jul 26 08:18 test.c
-rw-r--r--. 1 root root 62176 Jul 26 08:20 test.ko
-rw-r--r--. 1 root root 25 Jul 26 08:20 test.mod
-rw-r--r--. 1 root root 849 Jul 26 08:20 test.mod.c
-rw-r--r--. 1 root root 50936 Jul 26 08:20 test.mod.o
-rw-r--r--. 1 root root 12912 Jul 26 08:20 test.o
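Copy the compiled kernel module into the module directory of the currently running kernel so that modprobe can locate it, for example:

# cp /root/testmodule/test.ko /lib/modules/$(uname -r)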
# depmod -a
# modprobe -v test
insmod /lib/modules/5.14.0-1.el9.x86_64/test.ko
# dmesg
[74422.545004] Hello World
This is a test
Additional resources
CHAPTER 4. CONFIGURING KERNEL COMMAND-LINE PARAMETERS
IMPORTANT
Changing the behavior of the system by modifying kernel command-line parameters may
have negative effects on your system. Always test changes prior to deploying them in
production. For further guidance, contact Red Hat Support.
By default, the kernel command-line parameters for systems using the GRUB boot loader are defined in
the boot entry configuration file for each kernel boot entry.
You can manipulate boot loader configuration files by using the grubby utility. With grubby, you can
perform these actions:
Additional resources
How to install and boot custom kernels in Red Hat Enterprise Linux 8
d8712ab6d4f14683c5625e87b52b6b6e-5.14.0-1.el9.x86_64.conf
The file name above consists of a machine ID stored in the /etc/machine-id file, and a kernel version.
The boot entry configuration file contains information about the kernel version, the initial ramdisk image,
and the kernel command-line parameters. The example contents of a boot entry config can be seen
below:
IMPORTANT
When installing a newer version of the kernel in RHEL 9 systems, the grubby tool passes
the kernel command-line arguments from the previous kernel version.
However, this does not apply to RHEL version 9.0 in which newly installed kernels lose
previous command-line options. You must run the grub2-mkconfig command on the
newly installed kernel to pass the parameters to your new kernel. For more information
about this known issue, see Boot loader.
Prerequisites
Procedure
To add a parameter:
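For example:

# grubby --update-kernel=ALL --args="<NEW_PARAMETER>"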
For systems that use the GRUB boot loader, and on IBM Z systems that use the zIPL boot loader, the command adds a new kernel parameter to each /boot/loader/entries/<ENTRY>.conf file.

On IBM Z, update the boot menu:
# zipl
To remove a parameter:
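For example:

# grubby --update-kernel=ALL --remove-args="<PARAMETER>"

On IBM Z, update the boot menu: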
# zipl
Additional resources
grubby tool
Prerequisites
Verify that the grubby and zipl utilities are installed on your system.
Procedure
To add a parameter:
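For example, to update the boot entry of the default kernel:

# grubby --update-kernel=$(grubby --default-kernel) --args="<NEW_PARAMETER>"

On IBM Z, update the boot menu: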
# zipl
To remove a parameter:
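For example:

# grubby --update-kernel=$(grubby --default-kernel) --remove-args="<PARAMETER>"

On IBM Z, update the boot menu: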
# zipl
IMPORTANT
Additional resources
grubby tool
NOTE
This procedure applies only for a single boot and does not persistently make the changes.
Procedure
4. Find the kernel command line by moving the cursor down. The kernel command line starts with
linux on 64-Bit IBM Power Series and x86-64 BIOS-based systems, or linuxefi on UEFI
systems.
NOTE
Press Ctrl+a to jump to the start of the line and Ctrl+e to jump to the end of the
line. On some systems, Home and End keys might also work.
6. Edit the kernel parameters as required. For example, to run the system in emergency mode, add
the emergency parameter at the end of the linux line:
To enable the system messages, remove the rhgb and quiet parameters.
7. Press Ctrl+x to boot with the selected kernel and the modified command line parameters.
IMPORTANT
If you press the Esc key to leave command-line editing, all the changes you made are dropped.
You need to configure some default GRUB settings to use the serial console connection.
Prerequisites
Procedure
GRUB_TERMINAL="serial"
GRUB_SERIAL_COMMAND="serial --speed=9600 --unit=0 --word=8 --parity=no --stop=1"
The first line disables the graphical terminal. The GRUB_TERMINAL key overrides values of
GRUB_TERMINAL_INPUT and GRUB_TERMINAL_OUTPUT keys.
The second line adjusts the baud rate (--speed), parity and other values to fit your environment
and hardware. Note that a much higher baud rate, for example 115200, is preferable for tasks
such as following log files.
On BIOS-based machines:
# grub2-mkconfig -o /boot/grub2/grub.cfg
On UEFI-based machines:
# grub2-mkconfig -o /boot/grub2/grub.cfg
GRUB_CMDLINE_LINUX="crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M
resume=/dev/mapper/rhel-swap rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap"
To change the boot entries, overwrite Boot Loader Specification (BLS) snippets with the contents of
the GRUB_CMDLINE_LINUX values.
Prerequisites
Procedure
1. Add or remove a kernel parameter for individual kernels in a post installation script with grubby:
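For example, a change of this kind for one kernel can take the following form; the kernel path is illustrative:

# grubby --update-kernel=/boot/vmlinuz-5.14.0-1.el9.x86_64 --args="<NEW_PARAMETER>"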
The parameter is propagated into the BLS snippets, but not into the /etc/default/grub file.
2. Overwrite BLS snippets with the contents of the GRUB_CMDLINE_LINUX values present in
the /etc/default/grub file:
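On recent RHEL 9 releases, the grub2-mkconfig command provides an option for this purpose; whether the flag is available depends on your grub2-tools version:

# grub2-mkconfig -o /boot/grub2/grub.cfg --update-bls-cmdline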
NOTE
Verification
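Check the command line of the currently running kernel:

# cat /proc/cmdline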
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-425.3.1.el8.x86_64 root=/dev/mapper/RHELCSB-
Root ro vconsole.keymap=us crashkernel=auto rd.lvm.lv=RHELCSB/Root rd.luks.uuid=luks-
d8a28c4c-96aa-4319-be26-96896272151d rhgb quiet noapic rd.luks.key=d8a28c4c-96aa-
4319-be26-96896272151d=/keyfile:UUID=c47d962e-4be8-41d6-8216-8cf7a0d3b911
ipv6.disable=1
CHAPTER 5. CONFIGURING KERNEL PARAMETERS AT RUNTIME
IMPORTANT
Tunables are divided into classes by the kernel subsystem. Red Hat Enterprise Linux has the following tunable classes: abi (execution domains and personalities), crypto (cryptographic interfaces), debug (kernel debugging interfaces), dev (device-specific information), fs (global and specific file system tunables), kernel (global kernel tunables), net (network tunables), sunrpc (Sun Remote Procedure Call), user (user namespace limits), and vm (tuning and management of memory).
Additional resources
Prerequisites
Root permissions
Procedure
# sysctl -a
NOTE
# sysctl <TUNABLE_CLASS>.<PARAMETER>=<TARGET_VALUE>
The sample command above changes the parameter value while the system is running. The
changes take effect immediately, without a need for restart.
NOTE
Additional resources
Prerequisites
Root permissions
Procedure
# sysctl -a
The command displays all kernel parameters that can be configured at runtime.
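For example, a change of the following form sets the value and appends it to the /etc/sysctl.conf file in one step, because sysctl -w prints the resulting assignment:

# sysctl -w <TUNABLE_CLASS>.<PARAMETER>=<TARGET_VALUE> >> /etc/sysctl.conf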
The sample command changes the tunable value and writes it to the /etc/sysctl.conf file, which
overrides the default values of kernel parameters. The changes take effect immediately and
persistently, without a need for restart.
NOTE
To permanently modify kernel parameters you can also make manual changes to the
configuration files in the /etc/sysctl.d/ directory.
Additional resources
Prerequisites
Root permissions
Procedure
# vim /etc/sysctl.d/<some_file.conf>
<TUNABLE_CLASS>.<PARAMETER>=<TARGET_VALUE>
<TUNABLE_CLASS>.<PARAMETER>=<TARGET_VALUE>
# sysctl -p /etc/sysctl.d/<some_file.conf>
The command enables you to read values from the configuration file, which you created
earlier.
Additional resources
Prerequisites
Root permissions
Procedure
# ls -l /proc/sys/<TUNABLE_CLASS>/
The writable files returned by the command can be used to configure the kernel. The files with
read-only permissions provide feedback on the current settings.
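For example, a write of the following form changes a parameter through the /proc/sys/ interface:

# echo <TARGET_VALUE> > /proc/sys/<TUNABLE_CLASS>/<PARAMETER>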
The command makes configuration changes that will disappear once the system is restarted.
# cat /proc/sys/<TUNABLE_CLASS>/<PARAMETER>
Additional resources
CHAPTER 6. CONFIGURING KERNEL PARAMETERS PERMANENTLY BY USING THE KERNEL_SETTINGS RHEL SYSTEM ROLE
After you run the kernel_settings role from the control machine, the kernel parameters are applied to
the managed systems immediately and persist across reboots.
IMPORTANT
Note that RHEL System Roles delivered over RHEL channels are available to RHEL customers as an RPM package in the default AppStream repository. RHEL System Roles are also available as a collection to customers with Ansible subscriptions over Ansible Automation Hub.
You can use the kernel_settings System Role for automated configuration of the kernel. The rhel-system-roles package contains this system role and also the reference documentation.
To apply the kernel parameters on one or more systems in an automated fashion, use the
kernel_settings role with one or more of its role variables of your choice in a playbook. A playbook is a
list of one or more plays that are human-readable, and are written in the YAML format.
With the kernel_settings role you can configure:

Kernel parameters using the kernel_settings_sysctl role variable

Various kernel subsystems, hardware devices, and device drivers using the kernel_settings_sysfs role variable

The CPU affinity for the systemd service manager and processes it forks using the kernel_settings_systemd_cpu_affinity role variable
Additional resources
Follow these steps to prepare and apply an Ansible playbook to remotely configure kernel parameters
with persisting effect on multiple managed operating systems.
Prerequisites
Entitled by your RHEL subscription, you installed the ansible-core and rhel-system-roles
packages on the control machine.
An inventory of managed hosts is present on the control machine and Ansible is able to connect
to them.
IMPORTANT
RHEL 8.0 - 8.5 provided access to a separate Ansible repository that contains Ansible
Engine 2.9 for automation based on Ansible. Ansible Engine contains command-line
utilities such as ansible, ansible-playbook; connectors such as docker and podman; and
the entire world of plugins and modules. For information about how to obtain and install
Ansible Engine, refer to How do I Download and Install Red Hat Ansible Engine? .
RHEL 8.6 and 9.0 have introduced Ansible Core (provided as ansible-core RPM), which
contains the Ansible command-line utilities, commands, and a small set of built-in Ansible
plugins. The AppStream repository provides ansible-core, which has a limited scope of
support. You can learn more by reviewing Scope of support for the ansible-core package
included in the RHEL 9 AppStream.
Procedure
# cat /home/jdoe/<ansible_project_name>/inventory
[testingservers]
[email protected]
[email protected]
[db-servers]
db1.example.com
db2.example.com
[webservers]
web1.example.com
web2.example.com
192.0.2.42
The file defines the [testingservers] group and other groups. It allows you to run Ansible more
effectively against a specific set of systems.
2. Create a configuration file to set defaults and privilege escalation for Ansible operations.
a. Create a new YAML file and open it in a text editor, for example:
# vi /home/jdoe/<ansible_project_name>/ansible.cfg
[defaults]
inventory = ./inventory
[privilege_escalation]
become = true
become_method = sudo
become_user = root
become_ask_pass = true
The [defaults] section specifies a path to the inventory file of managed hosts. The [privilege_escalation] section defines that user privileges be shifted to root on the specified managed hosts. This is necessary for successful configuration of kernel parameters. When the Ansible playbook is run, you will be prompted for the user password. The user automatically switches to root by means of sudo after connecting to a managed host.
a. Create a new YAML file and open it in a text editor, for example:
# vi /home/jdoe/<ansible_project_name>/kernel-roles.yml
This file represents a playbook and usually contains an ordered list of tasks, grouped into plays, that are run against specific managed hosts selected from your inventory file.
---
- hosts: testingservers
  name: "Configure kernel settings"
  roles:
    - rhel-system-roles.kernel_settings
  vars:
    kernel_settings_sysctl:
      - name: fs.file-max
        value: 400000
      - name: kernel.threads-max
        value: 65536
    kernel_settings_sysfs:
      - name: /sys/class/net/lo/mtu
        value: 65000
    kernel_settings_transparent_hugepages: madvise
The name key is optional. It associates an arbitrary string with the play as a label and
identifies what the play is for. The hosts key in the play specifies the hosts against which
the play is run. The value or values for this key can be provided as individual names of
managed hosts or as groups of hosts as defined in the inventory file.
The vars section represents a list of variables containing selected kernel parameter names
and values to which they have to be set.
The roles key specifies what system role is going to configure the parameters and values
mentioned in the vars section.
NOTE
You can modify the kernel parameters and their values in the playbook to fit
your needs.
4. Optionally, verify the syntax of the playbook:

# ansible-playbook --syntax-check kernel-roles.yml

playbook: kernel-roles.yml

5. Run the playbook:
# ansible-playbook kernel-roles.yml
...
BECOME password:
PLAY RECAP
********************************************************************************************************
[email protected] : ok=10 changed=4 unreachable=0 failed=0 skipped=6
rescued=0 ignored=0
[email protected] : ok=10 changed=4 unreachable=0 failed=0 skipped=6
rescued=0 ignored=0
Before Ansible runs your playbook, you are prompted for your password so that a user on managed hosts can be switched to root, which is necessary for configuring kernel parameters.
The recap section shows that the play finished successfully (failed=0) for all managed hosts,
and that 4 kernel parameters have been applied (changed=4).
6. Restart your managed hosts and check the affected kernel parameters to verify that the
changes have been applied and persist across reboots.
Additional resources
Preparing a control node and managed nodes to use RHEL System Roles
Configuring Ansible
Using Variables
Roles
CHAPTER 7. APPLYING PATCHES WITH KERNEL LIVE PATCHING
With the Red Hat Enterprise Linux kernel live patching solution, you can patch a running kernel without rebooting or restarting any processes. With this solution, system administrators:

Can immediately apply critical security patches to the kernel.

Do not have to wait for long-running tasks to complete, for users to log off, or for scheduled downtime.

Control the system's uptime more and do not sacrifice security or stability.
Note that not every critical or important CVE will be resolved using the kernel live patching solution. Our
goal is to reduce the required reboots for security-related patches, not to eliminate them entirely. For
more details about the scope of live patching, see the Customer Portal Solutions article .
WARNING
Some incompatibilities exist between kernel live patching and other kernel subcomponents. Read the Limitations of kpatch section carefully before using kernel live patching.
Do not use the SystemTap or kprobe tools during or after loading a patch. The patch could fail
to take effect until after such probes have been removed.
If you require support for an issue that arises with a third-party live patch, Red Hat recommends that you
open a case with the live patching vendor at the outset of any investigation in which a root cause
determination is necessary. This allows the source code to be supplied if the vendor allows, and for their
support organization to provide assistance in root cause determination prior to escalating the
investigation to Red Hat Support.
For any system running with third-party live patches, Red Hat reserves the right to ask for reproduction
with Red Hat shipped and supported software. In the event that this is not possible, we require a similar
system and workload be deployed on your test environment without live patches applied, to confirm if
the same behavior is observed.
For more information about third-party software support policies, see How does Red Hat Global
Support Services handle third-party software, drivers, and/or uncertified hardware/hypervisors or guest
operating systems?
All customers have access to kernel live patches, which are delivered through the usual channels.
However, customers who do not subscribe to an extended support offering will lose access to new
patches for the current minor release once the next minor release becomes available. For example,
customers with standard subscriptions will only be able to live patch RHEL 9.1 kernel until the RHEL 9.2
kernel is released.
A kernel module which is built specifically for the kernel being patched.
The patch module contains the code of the desired fixes for the kernel.
The patch modules register with the livepatch kernel subsystem and provide information
about original functions to be replaced, with corresponding pointers to the replacement
functions. Kernel patch modules are delivered as RPMs.
1. The kernel patch module is copied to the /var/lib/kpatch/ directory and registered for re-
application to the kernel by systemd on next boot.
2. The kpatch module is loaded into the running kernel and the new functions are registered to the
ftrace mechanism with a pointer to the location in memory of the new code.
3. When the kernel accesses the patched function, it is redirected by the ftrace mechanism, which bypasses the original function and redirects the kernel to the patched version of the function.
The following procedure explains how to subscribe to all future cumulative live patching updates for a
given kernel. Because live patches are cumulative, you cannot select which individual patches are
deployed for a given kernel.
WARNING
Red Hat does not support any third party live patches applied to a Red Hat
supported system.
Prerequisites
Root permissions
Procedure
# uname -r
5.14.0-1.el9.x86_64
2. Search for a live patching package that corresponds to the version of your kernel:
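For example, a search of this kind uses the kernel version string, and the subsequent installation relies on the version provide of the kpatch-patch packages:

# dnf search $(uname -r)

3. Install the live patching package:

# dnf install "kpatch-patch = $(uname -r)"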
The command above installs and applies the latest cumulative live patches for that specific
kernel only.
If the version of a live patching package is 1-1 or higher, the package will contain a patch module.
In that case the kernel will be automatically patched during the installation of the live patching
package.
The kernel patch module is also installed into the /var/lib/kpatch/ directory to be loaded by the
systemd system and service manager during the future reboots.
NOTE
An empty live patching package will be installed when there are no live patches available for a given kernel. An empty live patching package will have a kpatch_version-kpatch_release of 0-0, for example kpatch-patch-5_14_0-1-0-0.el9.x86_64.rpm. The installation of the empty RPM subscribes the system to all future live patches for the given kernel.
Verification
# kpatch list
Loaded patch modules:
kpatch_5_14_0_1_0_1 [enabled]
The output shows that the kernel patch module has been loaded into the kernel that is now
patched with the latest fixes from the kpatch-patch-5_14_0-1-0-1.el9.x86_64.rpm package.
NOTE
Entering the kpatch list command does not return an empty live patching
package. Use the rpm -qa | grep kpatch command instead.
Additional resources
Prerequisites
Procedure
1. Optionally, check all installed kernels and the kernel you are currently running:
# uname -r
5.14.0-2.el9.x86_64
Transaction Summary
===================================================
Install 2 Packages
…
This command subscribes all currently installed kernels to receiving kernel live patches. The
command also installs and applies the latest cumulative live patches, if any, for all installed
kernels.
In the future, when you update the kernel, live patches will automatically be installed during the
new kernel installation process.
The kernel patch module is also installed into the /var/lib/kpatch/ directory to be loaded by the
systemd system and service manager during future reboots.
NOTE
An empty live patching package will be installed when there are no live patches
available for a given kernel. An empty live patching package will have a
kpatch_version-kpatch_release of 0-0, for example kpatch-patch-5_14_0-1-0-
0.el9.x86_64.rpm. The installation of the empty RPM subscribes the system to all
future live patches for the given kernel.
Verification
# kpatch list
Loaded patch modules:
kpatch_5_14_0_2_0_1 [enabled]
The output shows that both the kernel you are running, and the other installed kernel have been
patched with fixes from kpatch-patch-5_14_0-1-0-1.el9.x86_64.rpm and kpatch-patch-
5_14_0-2-0-1.el9.x86_64.rpm packages respectively.
NOTE
Entering the kpatch list command does not return an empty live patching
package. Use the rpm -qa | grep kpatch command instead.
Additional resources
When you subscribe your system to fixes delivered by the kernel patch module, your subscription is
automatic. You can disable this feature, and thus disable automatic installation of kpatch-patch
packages.
Prerequisites
Procedure
1. Optionally, check all installed kernels and the kernel you are currently running:
# uname -r
5.14.0-2.el9.x86_64
Verification step
Additional resources
Prerequisites
The system is subscribed to the live patching stream, as described in Subscribing the currently
installed kernels to the live patching stream.
Procedure
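For example, an update of this kind targets the live patching package of the currently running kernel through its version provide:

# dnf update "kpatch-patch = $(uname -r)"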
The command above automatically installs and applies any updates that are available for the currently running kernel, including any future released cumulative live patches.
NOTE
When the system reboots into the same kernel, the kernel is automatically live patched
again by the kpatch.service systemd service.
Additional resources
Prerequisites
Root permissions
Procedure
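1. List the installed live patching packages, for example (the output is illustrative):

# dnf list installed | grep kpatch-patch
kpatch-patch-5_14_0-1.x86_64 0-1.el9 …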
The example output above lists live patching packages that you installed.
When a live patching package is removed, the kernel remains patched until the next reboot, but
the kernel patch module is removed from disk. On future reboot, the corresponding kernel will
no longer be patched.
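2. Remove the live patching package; the package name shown is illustrative:

# dnf remove kpatch-patch-5_14_0-1

3. Check for remaining live patching packages:

# dnf list installed | grep kpatch-patch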
The command displays no output if the package has been successfully removed.
# kpatch list
Loaded patch modules:
The example output shows that the kernel is not patched and the live patching solution is not
active because there are no patch modules that are currently loaded.
IMPORTANT
Currently, Red Hat does not support reverting live patches without rebooting your
system. In case of any issues, contact our support team.
Additional resources
Prerequisites
Root permissions
Procedure
# kpatch list
Loaded patch modules:
kpatch_5_14_0_1_0_1 [enabled]
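2. Uninstall the selected kernel patch module, for example:

# kpatch uninstall kpatch_5_14_0_1_0_1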
When the selected module is uninstalled, the kernel remains patched until the next reboot,
but the kernel patch module is removed from disk.
4. Optionally, verify that the kernel patch module has been uninstalled.
# kpatch list
Loaded patch modules:
…
The example output above shows no loaded or installed kernel patch modules, therefore the
kernel is not patched and the kernel live patching solution is not active.
IMPORTANT
Currently, Red Hat does not support reverting live patches without rebooting your
system. In case of any issues, contact our support team.
Additional resources
Prerequisites
Root permissions
Procedure
2. Disable kpatch.service:
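For example:

# systemctl disable kpatch.service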
# kpatch list
Loaded patch modules:
kpatch_5_14_0_1_0_1 [enabled]
The example output shows that kpatch.service has been disabled and is not running. Thereby, the kernel live patching solution is not active.
# kpatch list
Loaded patch modules:
The example output above shows that a kernel patch module is still installed but the kernel is
not patched.
IMPORTANT
Currently, Red Hat does not support reverting live patches without rebooting your
system. In case of any issues, contact our support team.
Additional resources
CHAPTER 8. KEEPING KERNEL PANIC PARAMETERS DISABLED IN VIRTUALIZED ENVIRONMENTS
Additional resources
softlockup_panic
Controls whether or not the kernel will panic when a soft lockup is detected.
The system needs to detect a hard lockup first to be able to panic. The detection is controlled by the
nmi_watchdog parameter.
nmi_watchdog
Controls whether lockup detection mechanisms (watchdogs) are active or not. This parameter is of
integer type.
Value	Effect
0	Disables the lockup detector
1	Enables the lockup detector
The hard lockup detector monitors each CPU for its ability to respond to interrupts.
watchdog_thresh
Controls the frequency of the watchdog hrtimer, NMI events, and the soft and hard lockup thresholds. The default threshold is 10 seconds, and the soft lockup threshold is 2 * watchdog_thresh.
Additional resources
Kernel sysctl
Heavy workload on a host or high contention over some specific resource, such as memory, usually causes a spurious soft lockup to fire. This is because the host may schedule out the guest CPU for a period longer than 20 seconds. Then, when the guest CPU is again scheduled to run on the host, it experiences a time jump which triggers due timers. The timers also include the watchdog hrtimer, which can consequently report a soft lockup on the guest CPU.
Because a soft lockup in a virtualized environment may be spurious, you should not enable the kernel
parameters that would cause a system panic when a soft lockup is reported on a guest CPU.
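For example, you can check the current values and keep the panic behavior disabled on a guest through the standard sysctl interface; this is a sketch, not a required step:
# sysctl kernel.softlockup_panic kernel.nmi_watchdog
# sysctl -w kernel.softlockup_panic=0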
IMPORTANT
To understand soft lockups in guests, it is essential to know that the host schedules the
guest as a task, and the guest then schedules its own tasks.
Additional resources
CHAPTER 9. ADJUSTING KERNEL PARAMETERS FOR DATABASE SERVERS
Red Hat Enterprise Linux 9 provides the following database management systems:
MariaDB 10.5
MySQL 8.0
PostgreSQL 13
Redis 6
fs.aio-max-nr
Defines the maximum number of asynchronous I/O operations the system can handle on the server.
fs.file-max
Defines the maximum number of file handles (temporary file names or IDs assigned to open files) the
system supports at any instance.
The kernel dynamically allocates file handles whenever a file handle is requested by an application.
The kernel however does not free these file handles when they are released by the application. The
kernel recycles these file handles instead. This means that over time the total number of allocated
file handles will increase even though the number of currently used file handles may be low.
kernel.shmall
Defines the total number of shared memory pages that can be used system-wide. To use the entire
main memory, the value of the kernel.shmall parameter should be ≤ total main memory size.
kernel.shmmax
Defines the maximum size in bytes of a single shared memory segment that a Linux process can
allocate in its virtual address space.
kernel.shmmni
Defines the maximum number of shared memory segments the database server is able to handle.
net.ipv4.ip_local_port_range
Defines the port range the system can use for programs which want to connect to a database server
without a specific port number.
net.core.rmem_default
Defines the default receive socket memory through Transmission Control Protocol (TCP).
net.core.rmem_max
Defines the maximum receive socket memory through Transmission Control Protocol (TCP).
net.core.wmem_default
Defines the default send socket memory through Transmission Control Protocol (TCP).
net.core.wmem_max
Defines the maximum send socket memory through Transmission Control Protocol (TCP).
vm.dirty_bytes / vm.dirty_ratio
Defines a threshold, in bytes or as a percentage of dirty-able memory respectively, at which a process generating dirty data starts writing it back in the write() function.
vm.dirty_background_bytes / vm.dirty_background_ratio
Defines a threshold, in bytes or as a percentage of dirty-able memory respectively, at which the kernel tries to actively write dirty data to the hard disk.
vm.dirty_writeback_centisecs
Defines a time interval between periodic wake-ups of the kernel threads responsible for writing dirty
data to hard-disk.
This kernel parameter is measured in hundredths of a second.
vm.dirty_expire_centisecs
Defines the time after which dirty data is old enough to be written to hard-disk.
This kernel parameter is measured in hundredths of a second.
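As a sketch of how such parameters are applied, you can set them persistently in /etc/sysctl.conf and load them with sysctl -p; the values below are illustrative placeholders, not recommendations:
fs.aio-max-nr = 1048576
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
Then apply the settings as root:
# sysctl -p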
Additional resources
CHAPTER 10. GETTING STARTED WITH KERNEL LOGGING
The kernel ring buffer is a cyclic data structure that has a fixed size and is hard-coded into the kernel. Users can display data stored in the kernel ring buffer through the dmesg command or the /var/log/boot.log file. When the ring buffer is full, the new data overwrites the old.
Additional resources
0
Kernel emergency. The system is unusable.
1
Kernel alert. Action must be taken immediately.
2
Condition of the kernel is considered critical.
3
General kernel error condition.
4
General kernel warning condition.
5
Kernel notice of a normal but significant condition.
6
Kernel informational message.
7
Kernel debug-level messages.
# sysctl kernel.printk
kernel.printk = 7 4 1 7
1. Console log-level, defines the lowest priority of messages printed to the console.
2. Default log-level for messages without an explicit priority attached to them.
3. Sets the lowest possible log-level configuration for the console log-level.
4. Sets the default value for the console log-level at boot time.
IMPORTANT
The default 7 4 1 7 printk value allows for better debugging of kernel activity. However,
when coupled with a serial console, this printk setting might cause intense I/O bursts that
might lead to a RHEL system becoming temporarily unresponsive. To avoid these
situations, setting a printk value of 4 4 1 7 typically works, but at the expense of losing
the extra debugging information.
Also note that certain kernel command line parameters, such as quiet or debug, change
the default kernel.printk values.
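For example, to apply the less verbose setting at runtime without persisting it, you can use sysctl:
# sysctl -w kernel.printk="4 4 1 7"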
Additional resources
CHAPTER 11. INSTALLING KDUMP
IMPORTANT
A kernel crash dump can be the only information available if a system failure occurs. Therefore, operational kdump is important in mission-critical environments. Red Hat advises you to regularly update and test kexec-tools in your normal kernel update cycle. This
is especially important when you install new kernel features.
You can enable kdump for all installed kernels on a machine or only for specified kernels. This is useful
when there are multiple kernels used on a machine, some of which are stable enough that there is no
concern that they could crash. When you install kdump, a default /etc/kdump.conf file is created. The
/etc/kdump.conf file includes the default minimum kdump configuration, which you can edit to
customize the kdump configuration.
Procedure
2. Under Kdump Memory Reservation, select Manual if you must customize the memory reserve.
3. Under KDUMP field, in Memory To Be Reserved (MB), set the required memory reserve for
kdump.
Prerequisites
A repository containing the kexec-tools package for your system CPU architecture.
Fulfilled requirements for kdump configurations and targets. For details, see Supported kdump
configurations and targets.
Procedure
# rpm -q kexec-tools
kexec-tools-2.0.22-13.el9.x86_64
CHAPTER 12. CONFIGURING KDUMP ON THE COMMAND LINE
The makedumpfile --mem-usage command estimates how much space the crash dump file requires. It
generates a memory usage report. The report helps you determine the dump level and which pages are
safe to be excluded.
Procedure
IMPORTANT
By default the RHEL kernel uses 4 KB sized pages on AMD64 and Intel 64 CPU
architectures, and 64 KB sized pages on IBM POWER architectures.
The automatic memory allocation for kdump also varies based on the system hardware architecture and
available memory size. For example, on AMD and Intel 64-bit architectures, the default value for the
crashkernel= parameter will work only when the available memory is more than 1 GB. The kexec-tools
utility configures the following default memory reserves on AMD64 and Intel 64-bit architecture:
crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M
You can also run kdumpctl estimate to query a rough estimate value without triggering a crash. The
estimated crashkernel= value might not be an accurate one but can serve as a reference to set an
appropriate crashkernel= value.
NOTE
The crashkernel=auto option in the boot command line is no longer supported on RHEL
9 and later releases.
Prerequisites
You have fulfilled kdump requirements for configurations and targets. For details, see
Supported kdump configurations and targets .
Procedure
When configuring the crashkernel= value, test the configuration by rebooting with kdump
enabled. If the kdump kernel fails to boot, increase the memory size gradually to set an
acceptable value.
crashkernel=192M
Alternatively, you can set the amount of reserved memory to a variable depending on the
total amount of installed memory using the syntax crashkernel=<range1>:<size1>,
<range2>:<size2>. For example:
crashkernel=1G-4G:192M,2G-64G:256M
The example reserves 192 MB of memory if the total amount of system memory is between 1 GB and 4 GB. If the total amount of memory is more than 4 GB, 256 MB is reserved for kdump.
crashkernel=192M@16M
If the offset parameter is set to 0 or omitted entirely, kdump offsets the reserved memory automatically. You can also offset memory when setting a variable memory reservation by specifying the offset as the last value. For example, crashkernel=1G-4G:192M,2G-64G:256M@16M.
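To apply your chosen value to the boot entries of all installed kernels, you can use, for example, grubby; <custom-value> stands for the crashkernel= setting you prepared:
# grubby --update-kernel=ALL --args="crashkernel=<custom-value>"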
The <custom-value> must contain the custom crashkernel= value that you have
configured for the crash kernel.
# reboot
Verification
Cause the kernel to crash by activating the sysrq key. The address-YYYY-MM-DD-HH:MM:SS/vmcore
file is saved to the target location as specified in the /etc/kdump.conf file. If you choose the default
target location, the vmcore file is saved in the partition mounted under /var/crash/.
WARNING
The commands to test kdump configuration will cause the kernel to crash with data
loss. Follow the instructions with care and do not use an active production system to
test the kdump configuration.
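Activate the sysrq crash trigger, for example:
# echo c > /proc/sysrq-trigger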
The command causes the kernel to crash and reboots the kernel if required.
2. Display the /etc/kdump.conf file and check if the vmcore file is saved in the target destination.
Additional resources
How to manually modify the boot parameter in grub before the system boots
Prerequisites
Fulfilled requirements for kdump configurations and targets. For details, see Supported kdump
configurations and targets.
Procedure
To store the crash dump file in /var/crash/ directory of the local file system, edit the
/etc/kdump.conf file and specify the path:
path /var/crash
The option path /var/crash represents the path to the file system in which kdump saves the
crash dump file.
NOTE
When you specify a dump target in the /etc/kdump.conf file, then the path is
relative to the specified dump target.
When you do not specify a dump target in the /etc/kdump.conf file, then the
path represents the absolute path from the root directory.
Depending on what is mounted in the current system, the dump target and the adjusted dump
path are taken automatically.
To change the local directory in which the crash dump is to be saved, as root, edit the
/etc/kdump.conf configuration file:
a. Remove the hash sign (#) from the beginning of the #path /var/crash line.
b. Replace the value with the intended directory path. For example:
path /usr/local/cores
IMPORTANT
In Red Hat Enterprise Linux 9, the directory defined as the kdump target
using the path directive must exist when the kdump systemd service starts
to avoid failures. This behavior is different from earlier versions of RHEL, where the directory was created automatically if it did not exist when the service started.
To write the file to a different partition, edit the /etc/kdump.conf configuration file:
a. Remove the hash sign (#) from the beginning of the #ext4 line, depending on your choice.
b. Change the file system type and the device name, label, or UUID to the required values. Both UUID="correct-uuid" and UUID=correct-uuid are valid syntax for specifying UUID values. For example:
ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937
IMPORTANT
When you use Direct Access Storage Device (DASD) on IBM Z hardware,
ensure the dump devices are correctly specified in /etc/dasd.conf before
you proceed with kdump.
To write the crash dump directly to a device, edit the /etc/kdump.conf configuration file:
a. Remove the hash sign (#) from the beginning of the #raw /dev/vg/lv_kdump line.
b. Replace the value with the intended device name. For example:
raw /dev/sdb1
To store the crash dump to a remote machine using the NFS protocol:
a. Remove the hash sign (#) from the beginning of the #nfs my.server.com:/export/tmp line.
b. Replace the value with a valid hostname and directory path. For example:
nfs penguin.example.com:/export/cores
To store the crash dump to a remote machine using the SSH protocol:
a. Remove the hash sign (#) from the beginning of the #ssh [email protected] line.
i. Remove the hash sign from the beginning of the #sshkey /root/.ssh/kdump_id_rsa
line.
ii. Change the value to the location of a key valid on the server you are trying to dump to.
For example:
ssh [email protected]
sshkey /root/.ssh/mykey
You can reduce the size of a crash dump file by compressing it and copying only the necessary pages, using various dump levels.
Syntax
Options
-c, -l, or -p: specifies the compression format of the dump file per page, using zlib for the -c option, lzo for the -l option, or snappy for the -p option.
-d (dump_level): excludes pages so that they are not copied to the dump file.
--message-level: specifies the message types. You can restrict the outputs printed by specifying message_level with this option. For example, specifying 7 as message_level prints common messages and error messages. The maximum value of message_level is 31.
Prerequisites
Fulfilled requirements for kdump configurations and targets. For details, see Supported kdump
configurations and targets.
Procedure
1. As root, edit the /etc/kdump.conf configuration file and remove the hash sign (#) from the beginning of the #core_collector makedumpfile -l --message-level 1 -d 31 line.
The -l option specifies the compressed dump file format. The -d option specifies the dump level as 31. The --message-level option specifies the message level as 1.
Additional resources
dump_to_rootfs
Saves the core dump to the root file system.
reboot
Reboots the system, losing the core dump in the process.
halt
Stops the system, losing the core dump in the process.
poweroff
Powers the system off, losing the core dump in the process.
shell
Runs a shell session from within the initramfs, from which you can record the core dump manually.
final_action
Enables additional operations such as reboot, halt, and poweroff after a successful kdump or when
shell or dump_to_rootfs failure action completes. The default is reboot.
failure_action
Specifies the action to perform when a dump might fail in a kernel crash. The default is reboot.
Prerequisites
Root permissions.
Fulfilled requirements for kdump configurations and targets. For details, see Supported kdump
configurations and targets.
Procedure
1. As root, remove the hash sign (#) from the beginning of the #failure_action line in the
/etc/kdump.conf configuration file.
failure_action poweroff
Additional resources
KDUMP_COMMANDLINE_REMOVE
This option removes arguments from the current kdump command line. It removes parameters that
may cause kdump errors or kdump kernel boot failures. These parameters may have been parsed
from the previous KDUMP_COMMANDLINE process or inherited from the /proc/cmdline file.
When this variable is not configured, it inherits all values from the /proc/cmdline file. Configuring this
option also provides information that is helpful in debugging an issue.
KDUMP_COMMANDLINE_APPEND
This option appends arguments to the current command line. These arguments may have been
parsed by the previous KDUMP_COMMANDLINE_REMOVE variable.
For the kdump kernel, disabling certain modules such as mce, cgroup, numa, hest_disable can help
prevent kernel errors. These modules may consume a significant portion of the kernel memory
reserved for kdump or cause kdump kernel boot failures.
To disable memory cgroups on the kdump kernel command line, run the command as follows:
KDUMP_COMMANDLINE_APPEND="cgroup_disable=memory"
Additional resources
WARNING
Do not test kdump on active production systems. The commands to test kdump
will cause the kernel to crash with loss of data. Depending on your system
architecture, ensure that you schedule significant maintenance time because
kdump testing might require several reboots with a long boot time.
If the vmcore file is not generated during the kdump test, identify and fix the issues before you run the test again.
IMPORTANT
Ensure that you schedule significant maintenance time, because kdump testing might
require several reboots with a long boot time.
If you make any manual system modifications, you must test the kdump configuration at
the end of any system modification. For example, if you make any of the following
changes, ensure that you test the kdump configuration for an optimal kdump
performance:
Package upgrades.
New installation and application upgrades that include third party modules.
If you use the hot-plugging mechanism to add more memory on hardware that supports this mechanism.
Prerequisites
You have saved all important data. The commands to test kdump cause the kernel to crash with
loss of data.
You have scheduled significant machine maintenance time depending on the system
architecture.
Procedure
# kdumpctl restart
2. Check the status of the kdump service. With the kdumpctl command, you can print the output
at the console.
# kdumpctl status
kdump: Kdump is operational
Alternatively, if you use the systemctl command, the output prints in the systemd journal.
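For example:
# systemctl status kdump.service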
3. Initiate a kernel crash to test the kdump configuration. The sysrq-trigger key combination
causes the kernel to crash and might reboot the system if required.
Additional resources
Prerequisites
You have completed kdump requirements for configurations and targets. See Supported
kdump configurations and targets.
Procedure
WARNING
If kptr_restrict is not set to 1 and KASLR is enabled, the contents of the /proc/kcore file are generated as all zeros. The kdumpctl service then fails to access the /proc/kcore file and load the crash kernel. The kexec-kdump-howto.txt file displays a warning message, which recommends that you set kptr_restrict=1. To ensure that the kdumpctl service loads the crash kernel, verify that the sysctl.conf file contains the following:
kernel.kptr_restrict=1
You can append the KDUMP_COMMANDLINE_APPEND= variable using one of the following
configuration options:
rd.driver.blacklist=<modules>
modprobe.blacklist=<modules>
Prerequisites
Procedure
1. Display the list of modules that are loaded to the currently running kernel. Select the kernel
module that you intend to block from loading.
$ lsmod
KDUMP_COMMANDLINE_APPEND="rd.driver.blacklist=hv_vmbus,hv_storvsc,hv_utils,hv_net
vsc,hid-hyperv"
KDUMP_COMMANDLINE_APPEND="modprobe.blacklist=emcp modprobe.blacklist=bnx2fc
modprobe.blacklist=libfcoe modprobe.blacklist=fcoe"
Additional resources
The kdumpctl estimate command helps you estimate the amount of memory you need for kdump.
kdumpctl estimate prints the recommended crashkernel value, which is the most suitable memory size
required for kdump.
The recommended crashkernel value is calculated based on the current kernel size, kernel module,
initramfs, and the LUKS encrypted target memory requirement.
If you use the custom crashkernel= option, kdumpctl estimate prints the LUKS required size value, which is the memory size required for the LUKS encrypted target.
Procedure
# kdumpctl estimate
Encrypted kdump target requires extra memory, assuming using the keyslot with minimum
memory requirement
Reserved crashkernel: 256M
Recommended crashkernel: 652M
NOTE
If the kdump service still fails to save the dump file to the encrypted target, increase the
crashkernel= value as required.
CHAPTER 13. ENABLING KDUMP
Prerequisites
Procedure
Verification
Prerequisites
Procedure
# ls -a /boot/vmlinuz-*
/boot/vmlinuz-0-rescue-2930657cd0dc43c2b75db480e5e5b4a9
/boot/vmlinuz-4.18.0-330.el8.x86_64
/boot/vmlinuz-4.18.0-330.rt7.111.el8.x86_64
2. Add a specific kdump kernel to the system’s Grand Unified Bootloader (GRUB) configuration.
For example:
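A grubby invocation along these lines adds the option to one of the listed kernels; the kernel file name and memory value are illustrative:
# grubby --update-kernel=/boot/vmlinuz-4.18.0-330.el8.x86_64 --args="crashkernel=512M"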
Verification
Prerequisites
Fulfilled requirements for kdump configurations and targets. For details, see Supported kdump
configurations and targets.
All configurations for installing kdump are set up according to your needs. For details, see
Installing kdump .
Procedure
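To enable and start the kdump service, for example:
# systemctl enable --now kdump.service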
Additional resources
Managing systemd
CHAPTER 14. SUPPORTED KDUMP CONFIGURATIONS AND TARGETS
With the provided information and procedures, you can understand the supported configurations and targets on your Red Hat Enterprise Linux 9 systems, properly configure kdump, and validate that it works.
The memory requirements vary based on certain system parameters. One of the major factors is the
system’s hardware architecture. To find out the exact machine architecture (such as Intel 64 and
AMD64, also known as x86_64) and print it to standard output, use the following command:
$ uname -m
With the stated list of minimum memory requirements, you can set the appropriate memory size to automatically reserve memory for kdump on the latest available versions. The memory size depends on the system's architecture and total available physical memory.
4 GB to 64 GB 256 MB of RAM
4 GB to 64 GB 320 MB of RAM
4 GB to 64 GB 420 MB of RAM
4 GB to 16 GB 512 MB of RAM
16 GB to 64 GB 1 GB of RAM
64 GB to 128 GB 2 GB of RAM
4 GB to 64 GB 256 MB of RAM
On many systems, kdump is able to estimate the amount of required memory and reserve it
automatically. This behavior is enabled by default, but only works on systems that have more than a
certain amount of total available memory, which varies based on the system architecture.
IMPORTANT
The automatic configuration of reserved memory based on the total amount of memory in the system is a best-effort estimation. The actual required memory may vary due to other factors, such as I/O devices. Using insufficient memory might prevent a debug kernel from booting as a capture kernel in the case of a kernel panic. To avoid this problem, sufficiently increase the crash kernel memory.
Additional resources
IBM Z (s390x) 1 GB
64-bit ARM 1 GB
NOTE
The crashkernel=auto option in the boot command line is no longer supported on RHEL
9 and later releases.
Physical Storage
Supported targets:
Logical Volume Manager (LVM).
Thin provisioning volume.
Fibre Channel (FC) disks such as qla2xxx, lpfc, bnx2fc, and bfa.
An iSCSI software-configured logical device on a networked storage server.
The mdraid subsystem as a software RAID solution.
Hardware RAID such as smartpqi, hpsa, megaraid, mpt3sas, aacraid, and mpi3mr.
SCSI and SATA disks.
Unsupported targets:
BIOS RAID.
Software iSCSI with iBFT. Currently supported transports are bnx2i, cxgb3i, and cxgb4i.
Software iSCSI with a hybrid device driver such as be2iscsi.
Fibre Channel over Ethernet (FCoE).
Legacy IDE.
GlusterFS servers.
GFS2 file system.
Clustered Logical Volume Manager (CLVM).
Network
Supported targets:
Hardware using kernel modules such as igb, ixgbe, ice, i40e, e1000e, igc, tg3, bnx2x, bnxt_en, qede, cxgb4, be2net, enic, sfc, mlx4_en, mlx5_core, r8169, atlantic, nfp, and nicvf (nicvf on 64-bit ARM architecture only).
Unsupported targets:
Hardware using kernel modules such as sfc SRIOV, cxgb4vf, and pch_gbe.
IPv6 protocol.
Wireless connections.
InfiniBand networks.
VLAN network over bridge and team.
Hypervisor
Supported targets:
Kernel-based virtual machines (KVM).
Hyper-V 2012 R2 on RHEL Gen1 UP Guest only and later versions.
Firmware
Supported targets:
BIOS-based systems.
Additional resources
Option Description
1 Zero pages
2 Cache pages
4 Cache private
8 User pages
16 Free pages
Additional resources
dump_to_rootfs
Attempt to save the core dump to the root file system. This option is especially useful in combination
with a network target: if the network target is unreachable, this option configures kdump to save the
core dump locally. The system is rebooted afterwards.
reboot
Reboot the system, losing the core dump in the process.
halt
Halt the system, losing the core dump in the process.
poweroff
Power off the system, losing the core dump in the process.
shell
Run a shell session from within the initramfs, allowing the user to record the core dump manually.
final_action
Enable additional operations such as reboot, halt, and poweroff actions after a successful kdump or
when shell or dump_to_rootfs failure action completes. The default final_action option is reboot.
failure_action
Specifies the action to perform when a dump might fail in the event of a kernel crash. The default
failure_action option is reboot.
Additional resources
Procedure
1. To configure final_action, edit the /etc/kdump.conf file and add one of the following options:
final_action reboot
final_action halt
final_action poweroff
# kdumpctl restart
The failure_action parameter specifies the action to perform when a dump fails in the event of a kernel
crash. The default action for failure_action is reboot, which reboots the system.
reboot
Reboots the system after a dump failure.
dump_to_rootfs
Saves the dump file on a root file system when a non-root dump target is configured.
halt
Halts the system.
poweroff
Powers off the system.
shell
Starts a shell session inside initramfs, from which you can manually perform additional recovery
actions.
Procedure
1. To configure an action to take if the dump fails, edit the /etc/kdump.conf file and specify one
of the failure_action options:
failure_action reboot
failure_action halt
failure_action poweroff
failure_action shell
failure_action dump_to_rootfs
# kdumpctl restart
CHAPTER 15. FIRMWARE ASSISTED DUMP MECHANISMS
The fadump mechanism offers improved reliability over the traditional dump type by rebooting the partition and using a new kernel to dump the data from the previous kernel crash. fadump requires an IBM POWER6 processor-based or later hardware platform.
For further details about the fadump mechanism, including PowerPC specific methods of resetting
hardware, see the /usr/share/doc/kexec-tools/fadump-howto.txt file.
NOTE
The area of memory that is not preserved, known as boot memory, is the amount of RAM
required to successfully boot the kernel after a crash event. By default, the boot memory
size is 256MB or 5% of total system RAM, whichever is larger.
Unlike the kexec-initiated dump process, the fadump mechanism uses the production kernel to recover a crash dump. When booting after a crash, PowerPC hardware makes the device node /proc/device-tree/rtas/ibm.kernel-dump available to the proc file system (procfs). The fadump-aware kdump scripts check for the stored vmcore and then complete the system reboot cleanly.
In the Secure Boot environment, the GRUB boot loader allocates a boot memory region, known as the
Real Mode Area (RMA). The RMA has a size of 512 MB, which is divided among the boot components
and, if a component exceeds its size allocation, GRUB fails with an out-of-memory (OOM) error.
WARNING
Do not enable the firmware assisted dump (fadump) mechanism in a Secure Boot environment on RHEL 9.1 and earlier versions. The GRUB boot loader fails with an out-of-memory (OOM) error.
The system is recoverable only if you increase the default initramfs size due to the
fadump configuration.
For information about workaround methods to recover the system, see the System
boot ends in GRUB Out of Memory (OOM) article.
Prerequisites
Procedure
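One way to enable fadump is to add the fadump=on kernel option to all boot entries, for example with grubby:
# grubby --update-kernel=ALL --args="fadump=on"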
# reboot
VMDUMP
The kdump infrastructure is supported and utilized on IBM Z systems. However, using one of the
firmware assisted dump (fadump) methods for IBM Z can provide various benefits:
The sadump mechanism is initiated and controlled from the system console, and is stored on an
IPL bootable device.
The VMDUMP mechanism is similar to sadump. This tool is also initiated from the system
console, but retrieves the resulting dump from hardware and copies it to the system for analysis.
These methods (similarly to other hardware based dump mechanisms) have the ability to
capture the state of a machine in the early boot phase, before the kdump service starts.
Although VMDUMP contains a mechanism to receive the dump file into a Red Hat Enterprise
Linux system, the configuration and control of VMDUMP is managed from the IBM Z Hardware
console.
Additional resources
Stand-alone dump
Procedure
1. Add or edit the following lines in the /etc/sysctl.conf file to ensure that kdump starts as
expected for sadump:
kernel.panic=0
kernel.unknown_nmi_panic=1
WARNING
In particular, ensure that after kdump, the system does not reboot. If the
system reboots after kdump has failed to save the vmcore file, then it is
not possible to invoke the sadump.
failure_action shell
Additional resources
CHAPTER 16. ANALYZING A CORE DUMP
Procedure
The kernel-debuginfo package corresponds to the running kernel and provides the data necessary for the dump analysis.
Prerequisites
Procedure
1. To start the crash utility, pass two necessary parameters to the command: the debug kernel image (vmlinux) and the vmcore dump file.
The following example shows analyzing a core dump created on 13 September 2021 at 14:05, using the 5.14.0-1.el9.x86_64 kernel.
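The invocation matching the KERNEL and DUMPFILE paths in the output below would be:
# crash /usr/lib/debug/lib/modules/5.14.0-1.el9.x86_64/vmlinux /var/crash/127.0.0.1-2021-09-13-14:05:33/vmcore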
...
WARNING: kernel relocated [202MB]: patching 90160 gdb minimal_symbol values
KERNEL: /usr/lib/debug/lib/modules/5.14.0-1.el9.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2021-09-13-14:05:33/vmcore [PARTIAL DUMP]
CPUS: 2
DATE: Mon Sep 13 14:05:16 2021
UPTIME: 01:03:57
LOAD AVERAGE: 0.00, 0.00, 0.00
TASKS: 586
NODENAME: localhost.localdomain
RELEASE: 5.14.0-1.el9.x86_64
VERSION: #1 SMP Wed Aug 29 11:51:55 UTC 2018
MACHINE: x86_64 (2904 Mhz)
MEMORY: 2.9 GB
PANIC: "sysrq: SysRq : Trigger a crash"
PID: 10635
COMMAND: "bash"
TASK: ffff8d6c84271800 [THREAD_INFO: ffff8d6c84271800]
CPU: 1
STATE: TASK_RUNNING (SYSRQ)
crash>
crash> exit
~]#
NOTE
The crash command can also be used as a powerful tool for debugging a live system.
However, use it with caution so as not to break your system.
Additional resources
To display the kernel message buffer, type the log command at the interactive prompt:
crash> log
... several lines omitted ...
EIP: 0060:[<c068124f>] EFLAGS: 00010096 CPU: 2
EIP is at sysrq_handle_crash+0xf/0x20
EAX: 00000063 EBX: 00000063 ECX: c09e1c8c EDX: 00000000
ESI: c0a09ca0 EDI: 00000286 EBP: 00000000 ESP: ef4dbf24
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process bash (pid: 5591, ti=ef4da000 task=f196d560 task.ti=ef4da000)
Stack:
c068146b c0960891 c0968653 00000003 00000000 00000002 efade5c0 c06814d0
<0> fffffffb c068150f b7776000 f2600c40 c0569ec4 ef4dbf9c 00000002 b7776000
<0> efade5c0 00000002 b7776000 c0569e60 c051de50 ef4dbf9c f196d560 ef4dbfb4
Call Trace:
[<c068146b>] ? __handle_sysrq+0xfb/0x160
[<c06814d0>] ? write_sysrq_trigger+0x0/0x50
[<c068150f>] ? write_sysrq_trigger+0x3f/0x50
[<c0569ec4>] ? proc_reg_write+0x64/0xa0
[<c0569e60>] ? proc_reg_write+0x0/0xa0
[<c051de50>] ? vfs_write+0xa0/0x190
[<c051e8d1>] ? sys_write+0x41/0x70
[<c0409adc>] ? syscall_call+0x7/0xb
Code: a0 c0 01 0f b6 41 03 19 d2 f7 d2 83 e2 03 83 e0 cf c1 e2 04 09 d0 88 41 03 f3 c3 90 c7
05 c8 1b 9e c0 01 00 00 00 0f ae f8 89 f6 <c6> 05 00 00 00 00 01 c3 89 f6 8d bc 27 00 00 00
00 8d 50 d0 83
EIP: [<c068124f>] sysrq_handle_crash+0xf/0x20 SS:ESP 0068:ef4dbf24
CR2: 0000000000000000
Type help log for more information about the command usage.
NOTE
The kernel message buffer includes the most essential information about the
system crash and, as such, it is always dumped first into the vmcore-dmesg.txt
file. This is useful when an attempt to get the full vmcore file failed, for example
because of lack of space on the target location. By default, vmcore-dmesg.txt is
located in the /var/crash/ directory.
Displaying a backtrace
crash> bt
PID: 5591 TASK: f196d560 CPU: 2 COMMAND: "bash"
#0 [ef4dbdcc] crash_kexec at c0494922
#1 [ef4dbe20] oops_end at c080e402
#2 [ef4dbe34] no_context at c043089d
Type bt <pid> to display the backtrace of a specific process or type help bt for more
information about bt usage.
crash> ps
PID PPID CPU TASK ST %MEM VSZ RSS COMM
> 0 0 0 c09dc560 RU 0.0 0 0 [swapper]
> 0 0 1 f7072030 RU 0.0 0 0 [swapper]
0 0 2 f70a3a90 RU 0.0 0 0 [swapper]
> 0 0 3 f70ac560 RU 0.0 0 0 [swapper]
1 0 1 f705ba90 IN 0.0 2828 1424 init
... several lines omitted ...
5566 1 1 f2592560 IN 0.0 12876 784 auditd
5567 1 2 ef427560 IN 0.0 12876 784 auditd
5587 5132 0 f196d030 IN 0.0 11064 3184 sshd
> 5591 5587 2 f196d560 RU 0.0 5084 1648 bash
Use ps <pid> to display the status of a single specific process. Use help ps for more information
about ps usage.
To display basic virtual memory information, type the vm command at the interactive prompt.
crash> vm
PID: 5591 TASK: f196d560 CPU: 2 COMMAND: "bash"
MM PGD RSS TOTAL_VM
f19b5900 ef9c6000 1648k 5084k
VMA START END FLAGS FILE
f1bb0310 242000 260000 8000875 /lib/ld-2.12.so
f26af0b8 260000 261000 8100871 /lib/ld-2.12.so
efbc275c 261000 262000 8100873 /lib/ld-2.12.so
efbc2a18 268000 3ed000 8000075 /lib/libc-2.12.so
efbc23d8 3ed000 3ee000 8000070 /lib/libc-2.12.so
Use vm <pid> to display information about a single specific process, or use help vm for more
information about vm usage.
crash> files
PID: 5591 TASK: f196d560 CPU: 2 COMMAND: "bash"
ROOT: / CWD: /root
FD FILE DENTRY INODE TYPE PATH
0 f734f640 eedc2c6c eecd6048 CHR /pts/0
1 efade5c0 eee14090 f00431d4 REG /proc/sysrq-trigger
2 f734f640 eedc2c6c eecd6048 CHR /pts/0
10 f734f640 eedc2c6c eecd6048 CHR /pts/0
255 f734f640 eedc2c6c eecd6048 CHR /pts/0
Use files <pid> to display files opened by only one selected process, or use help files for more
information about files usage.
Prerequisites
Procedure
2. To diagnose a kernel crash issue, upload a kernel oops log generated in vmcore.
Alternatively, you can also diagnose a kernel crash issue by providing a text message or a vmcore-dmesg.txt file as input.
3. Click DETECT to compare the oops message based on information from the makedumpfile
against known solutions.
Additional resources
Additional resources
Kdump Helper
CHAPTER 17. USING EARLY KDUMP TO CAPTURE BOOT TIME CRASHES
Prerequisites
A repository containing the kexec-tools package for your system CPU architecture.
Fulfilled kdump configuration and targets requirements. For more information, see Supported kdump configurations and targets.
Procedure
If kdump is not enabled and running, set all required configurations and verify that the kdump service is enabled.
2. Rebuild the initramfs image of the booting kernel with the early kdump functionality:
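Assuming the earlykdump module that ships with kexec-tools, a typical command is:
# dracut -f --add earlykdump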
# reboot
Verification step
Verify that rd.earlykdump was successfully added and the early kdump feature is enabled:
# cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.14.0-1.el9.x86_64 root=/dev/mapper/rhel-root ro
crashkernel=auto resume=/dev/mapper/rhel-swap rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap
rhgb quiet rd.earlykdump
Additional resources
Enabling kdump
CHAPTER 18. SIGNING A KERNEL AND MODULES FOR SECURE BOOT
If Secure Boot is enabled, all of the following components have to be signed with a private key and authenticated with the corresponding public key:
UEFI operating system boot loader
The Red Hat Enterprise Linux kernel
Kernel modules
If any of these components are not signed and authenticated, the system cannot finish the booting process.
process.
Signed kernels
In addition, the signed first-stage boot loader and the signed kernel include embedded Red Hat public
keys. These signed executable binaries and embedded keys enable Red Hat Enterprise Linux 9 to install,
boot, and run with the Microsoft UEFI Secure Boot Certification Authority keys that are provided by the
UEFI firmware on systems that support UEFI Secure Boot.
NOTE
The build system, where you build and sign your kernel module, does not need to
have UEFI Secure Boot enabled and does not even need to be a UEFI-based
system.
18.1. PREREQUISITES
To be able to sign externally built kernel modules, install the utilities from the following
packages:
efikeygen: provided by the pesign package, used on the build system. Generates a public and private X.509 key pair.
mokutil: provided by the mokutil package, used on the target system. Optional utility used to manually enroll the public key.
keyctl: provided by the keyutils package, used on the target system. Optional utility used to display public keys in the system keyring.
UEFI Secure Boot establishes a chain of trust from the firmware to the signed drivers and kernel
modules as follows:
A UEFI private key signs, and a public key authenticates, the shim first-stage boot loader. A certificate authority (CA) in turn signs the public key. The CA is stored in the firmware database.
The shim file contains the Red Hat public key Red Hat Secure Boot (CA key 1) to authenticate the GRUB boot loader and the kernel.
The kernel in turn contains public keys to authenticate drivers and modules.
Secure Boot is the boot path validation component of the UEFI specification. The specification defines:
A programming interface for cryptographically protected UEFI variables in non-volatile storage.
The storage of the trusted X.509 root certificates in UEFI variables.
The validation of UEFI applications, such as boot loaders and drivers.
The procedures to revoke known-bad certificates and application hashes.
UEFI Secure Boot helps in the detection of unauthorized changes but does not:
Stop boot path manipulations. Signatures are verified during booting, not when the boot loader
is installed or updated.
If the boot loader or the kernel are not signed by a system trusted key, Secure Boot prevents them from
starting.
If you want to load externally built kernels or drivers, you must sign them as well.
The system only runs the kernel-mode code after its signature has been properly authenticated.
GRUB module loading is disabled because there is no infrastructure for signing and verification
of GRUB modules. Allowing them to be loaded constitutes execution of untrusted code inside
the security perimeter that Secure Boot defines.
Red Hat provides a signed GRUB binary that contains all the supported modules on Red Hat
Enterprise Linux 9.
Additional resources
You need to meet certain conditions to load kernel modules on systems with enabled UEFI Secure Boot
functionality:
If UEFI Secure Boot is enabled or if the module.sig_enforce kernel parameter has been
specified:
You can only load those signed kernel modules whose signatures were authenticated
against keys from the system keyring (.builtin_trusted_keys) and the platform keyring
(.platform).
The public key must not be on the system revoked keys keyring (.blacklist).
If UEFI Secure Boot is disabled and the module.sig_enforce kernel parameter has not been
specified:
You can load unsigned kernel modules and signed kernel modules without a public key.
Only the keys embedded in the kernel are loaded onto .builtin_trusted_keys and
.platform.
You have no ability to augment that set of keys without rebuilding the kernel.
Module signed | Public key found and signature valid | UEFI Secure Boot state | sig_enforce | Module load | Kernel tainted
Unsigned | - | Enabled | - | Fails | -
Signed | No | Enabled | - | Fails | -
Signed | Yes | Enabled | - | Succeeds | No
.builtin_trusted_keys
.platform
Contains keys from third-party platform providers and custom public keys
.blacklist
A module signed by a key from .blacklist will fail authentication even if your public key is in
.builtin_trusted_keys
A signature database
Stores keys (hashes) of UEFI applications, UEFI drivers, and boot loaders
The revoked keys from this database are added to the .blacklist keyring
To use a custom kernel or custom kernel modules on a Secure Boot-enabled system, you must generate
a public and private X.509 key pair. You can use the generated private key to sign the kernel or the
kernel modules. You can also validate the signed kernel or kernel modules by adding the corresponding
public key to the Machine Owner Key (MOK) for Secure Boot.
WARNING
Apply strong security measures and access policies to guard the contents of your
private key. In the wrong hands, the key could be used to compromise any system
which is authenticated by the corresponding public key.
Procedure
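1. Generate the key pair with efikeygen; a sketch in which the common name and nickname are illustrative placeholders:
# efikeygen --dbdir /etc/pki/pesign \
--self-sign \
--module \
--common-name 'CN=Organization signing key' \
--nickname 'Custom Secure Boot key'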
NOTE
In FIPS mode, you must use the --token option so that efikeygen finds the
default "NSS Certificate DB" token in the PKI database.
The public and private keys are now stored in the /etc/pki/pesign/ directory.
IMPORTANT
It is a good security practice to sign the kernel and the kernel modules within the validity period of their signing key. However, the sign-file utility does not warn you, and the key will be usable in Red Hat Enterprise Linux 9 regardless of the validity dates.
Additional resources
Enrolling public key on target system by adding the public key to the MOK list
Prerequisites
You have installed the keyctl utility from the keyutils package.
The .builtin_trusted_keys keyring in the example shows the addition of two keys from the UEFI
Secure Boot db keys as well as the Red Hat Secure Boot (CA key 1), which is embedded in the
shim boot loader.
The following example shows the kernel console output. The messages identify the keys with an
UEFI Secure Boot related source. These include UEFI Secure Boot db, embedded shim, and MOK
list.
Additional resources
When RHEL 9 boots on a UEFI-based system with Secure Boot enabled, the kernel loads onto the
platform keyring (.platform) all public keys that are in the Secure Boot db key database. At the same
time, the kernel excludes the keys in the dbx database of revoked keys.
You can use the Machine Owner Key (MOK) facility feature to expand the UEFI Secure Boot key
database. When RHEL 9 boots on an UEFI-enabled system with Secure Boot enabled, the keys on the
MOK list are also added to the platform keyring (.platform) in addition to the keys from the key
database. The MOK list keys are also stored persistently and securely in the same fashion as the Secure
Boot database keys, but these are two separate facilities. The MOK facility is supported by shim,
MokManager, GRUB, and the mokutil utility.
Prerequisites
You have generated a public and private key pair and know the validity dates of your public
keys. For details, see Generating a public and private key pair .
Procedure
# certutil -d /etc/pki/pesign \
-n 'Custom Secure Boot key' \
-Lr \
> sb_cert.cer
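To request enrollment, you can import the exported certificate with mokutil, set a one-time password when prompted, and reboot so that the shim boot loader launches MokManager; for example:
# mokutil --import sb_cert.cer
# reboot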
5. Choose Enroll MOK, enter the password you previously associated with this request when
prompted, and confirm the enrollment.
Your public key is added to the MOK list, which is persistent.
Once a key is on the MOK list, it will be automatically propagated to the .platform keyring on
this and subsequent boots when UEFI Secure Boot is enabled.
Prerequisites
You have generated a public and private key pair and know the validity dates of your public
keys. For details, see Generating a public and private key pair .
You have enrolled your public key on the target system. For details, see Enrolling public key on
target system by adding the public key to the MOK list.
You have a kernel image in the ELF format available for signing.
Procedure
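A pesign invocation along these lines signs the kernel image; the file names follow the placeholders explained below:
# pesign --certificate 'Custom Secure Boot key' \
--in vmlinuz-version \
--sign \
--out vmlinuz-version.signed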
Replace version with the version suffix of your vmlinuz file, and Custom Secure Boot key
with the name that you chose earlier.
# pesign --show-signature \
--in vmlinuz-version.signed
# mv vmlinuz-version.signed vmlinuz-version
# pesign --show-signature \
--in vmlinux-version.signed
# rm vmlinux-version*
Prerequisites
You have generated a public and private key pair and know the validity dates of your public
keys. For details, see Generating a public and private key pair .
You have enrolled your public key on the target system. For details, see Enrolling public key on
target system by adding the public key to the MOK list.
Procedure
Replace Custom Secure Boot key with the name that you chose earlier.
# mv /boot/efi/EFI/redhat/grubx64.efi.signed \
/boot/efi/EFI/redhat/grubx64.efi
Replace Custom Secure Boot key with the name that you chose earlier.
# mv /boot/efi/EFI/redhat/grubaa64.efi.signed \
/boot/efi/EFI/redhat/grubaa64.efi
Your signed kernel module is also loadable on systems where UEFI Secure Boot is disabled or on a non-
UEFI system. As a result, you do not need to provide both a signed and unsigned version of your kernel
module.
Prerequisites
You have generated a public and private key pair and know the validity dates of your public
keys. For details, see Generating a public and private key pair .
You have enrolled your public key on the target system. For details, see Enrolling public key on
target system by adding the public key to the MOK list.
You have a kernel module in ELF image format available for signing.
Procedure
# certutil -d /etc/pki/pesign \
-n 'Custom Secure Boot key' \
-Lr \
> sb_cert.cer
2. Extract the key from the NSS database as a PKCS #12 file:
# pk12util -o sb_cert.p12 \
-n 'Custom Secure Boot key' \
-d /etc/pki/pesign
3. When the previous command prompts you, enter a new password that encrypts the private key.
# openssl pkcs12 \
-in sb_cert.p12 \
-out sb_cert.priv \
-nocerts \
-noenc
5. Sign your kernel module. The following command appends the signature directly to the ELF
image in your kernel module file:
# /usr/src/kernels/$(uname -r)/scripts/sign-file \
sha256 \
sb_cert.priv \
sb_cert.cer \
my_module.ko
IMPORTANT
In Red Hat Enterprise Linux 9, the validity dates of the key pair matter. The key does not
expire, but the kernel module must be signed within the validity period of its signing key.
The sign-file utility will not warn you of this. For example, a key that is only valid in 2021
can be used to authenticate a kernel module signed in 2021 with that key. However, users
cannot use that key to sign a kernel module in 2022.
Verification
Check that the signature lists your name as entered during generation.
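For example, the signer field in the modinfo output shows it; the module file name is the one used in this procedure:
# modinfo my_module.ko | grep signer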
NOTE
The appended signature is not contained in an ELF image section and is not a
formal part of the ELF image. Therefore, utilities such as readelf cannot display
the signature on your kernel module.
# insmod my_module.ko
# modprobe -r my_module.ko
Additional resources
Prerequisites
You have generated the public and private key pair. For details, see Generating a public and
private key pair.
You have enrolled the public key into the system keyring. For details, see Enrolling public key on
target system by adding the public key to the MOK list.
You have signed a kernel module with the private key. For details, see Signing kernel modules
with the private key.
Procedure
2. Copy the kernel module into the extra/ directory of the kernel that you want:
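For example, assuming the module name used in this procedure:
# cp my_module.ko /lib/modules/$(uname -r)/extra/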
# depmod -a
# modprobe -v my_module
Verification
Additional resources
CHAPTER 19. UPDATING THE SECURE BOOT REVOCATION LIST
19.1. PREREQUISITES
Secure Boot is enabled on your system.
UEFI Secure Boot establishes a chain of trust from the firmware to the signed drivers and kernel
modules as follows:
A UEFI private key signs, and a public key authenticates, the shim first-stage boot loader. A certificate authority (CA) in turn signs the public key. The CA is stored in the firmware database.
The shim file contains the Red Hat public key Red Hat Secure Boot (CA key 1) to authenticate the GRUB boot loader and the kernel.
The kernel in turn contains public keys to authenticate drivers and modules.
Secure Boot is the boot path validation component of the UEFI specification. The specification defines:
A programming interface for cryptographically protected UEFI variables in non-volatile storage.
The storage of the trusted X.509 root certificates in UEFI variables.
The validation of UEFI applications, such as boot loaders and drivers.
The procedures to revoke known-bad certificates and application hashes.
UEFI Secure Boot helps in the detection of unauthorized changes but does not:
Stop boot path manipulations. Signatures are verified during booting, not when the boot loader
is installed or updated.
If the boot loader or the kernel are not signed by a system trusted key, Secure Boot prevents them from
starting.
The UEFI Secure Boot Revocation List, or the Secure Boot Forbidden Signature Database (dbx), is a list
that identifies software that Secure Boot no longer allows to run.
When a security issue or a stability problem is found in software that interfaces with Secure Boot, such as the GRUB boot loader, the Revocation List stores its hash signature. Software with such a recognized signature cannot run during boot, and the system boot fails in order to prevent compromising the system.
For example, a certain version of GRUB might contain a security issue that allows an attacker to bypass
the Secure Boot mechanism. When the issue is found, the Revocation List adds hash signatures of all
GRUB versions that contain the issue. As a result, only secure GRUB versions can boot on the system.
The Revocation List requires regular updates to recognize newly found issues. When updating the
Revocation List, make sure to use a safe update method that does not cause your currently installed
system to no longer boot.
Prerequisites
Procedure
# fwupdmgr get-devices
# fwupdmgr refresh
# fwupdmgr update
5. At the end of the update, fwupdmgr or Software asks you to reboot the system. Confirm the
reboot.
Verification
After the reboot, check the current version of the Revocation List again:
# fwupdmgr get-devices
Procedure
# fwupdmgr get-devices
# ls /usr/share/dbxtool/
3. Select the most recent update file for your architecture. The file names use the following
format:
DBXUpdate-date-architecture.cab
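4. Install the selected update file, for example:
# fwupdmgr install /usr/share/dbxtool/DBXUpdate-date-architecture.cab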
5. At the end of the update, fwupdmgr asks you to reboot the system. Confirm the reboot.
Verification
After the reboot, check the current version of the Revocation List again:
# fwupdmgr get-devices
CHAPTER 20. ENHANCING SECURITY WITH THE KERNEL INTEGRITY SUBSYSTEM
NOTE
You can use the features with cryptographic signatures only for Red Hat products
because the kernel keyring system includes only the certificates for Red Hat signature
keys. Using other hash features results in incomplete tamper-proofing.
IMA places the measured values within the kernel’s memory space. This prevents users of the
system from modifying the measured values.
IMA allows local and remote parties to verify the measured values.
IMA provides local validation of the current content of files against the values previously
stored in the measurement list within the kernel memory. This extension forbids performing
any operation on a specific file in case the current and the previous measures do not match.
EVM protects extended attributes of files (also known as xattr) that are related to system
security, such as IMA measurements and SELinux attributes. EVM cryptographically hashes
their corresponding values or signs them with cryptographic keys. The keys are stored in the
kernel keyring subsystem.
The kernel integrity subsystem can use the Trusted Platform Module (TPM) to further harden system
security.
A TPM is a hardware, firmware, or virtual component with integrated cryptographic keys, which is built
according to the TPM specification by the Trusted Computing Group (TCG) for important
cryptographic functions. TPMs are usually built as dedicated hardware attached to the platform’s
motherboard. By providing cryptographic functions from a protected and tamper-proof area of the
hardware chip, TPMs are protected from software-based attacks. TPMs provide the following features:
Random-number generator
Hashing generator
Remote attestation
Additional resources
Security hardening
Trusted and encrypted keys are variable-length symmetric keys generated by the kernel that use the
kernel keyring service. The integrity of the keys can be verified, which means that they can be used, for
example, by the extended verification module (EVM) to verify and confirm the integrity of a running
system. User-level programs can only access the keys in the form of encrypted blobs.
Trusted keys
Trusted keys need the Trusted Platform Module (TPM) chip, which is used to both create and
encrypt (seal) the keys. Each TPM has a master wrapping key, called the storage root key, which is
stored within the TPM itself.
NOTE
RHEL 9 supports only TPM 2.0. If you must use TPM 1.2, use RHEL 8. For more
information, see the Is Trusted Platform Module (TPM) supported by Red Hat?
solution.
You can verify that a TPM 2.0 chip has been enabled by entering the following command:
$ cat /sys/class/tpm/tpm0/tpm_version_major
2
You can also enable a TPM 2.0 chip and manage the TPM 2.0 device through settings in the machine
firmware.
In addition to that, you can seal the trusted keys with a specific set of the TPM’s platform
configuration register (PCR) values. PCR contains a set of integrity-management values that reflect
the firmware, boot loader, and operating system. This means that PCR-sealed keys can only be
decrypted by the TPM on the same system on which they were encrypted. However, when a PCR-
sealed trusted key is loaded (added to a keyring), and thus its associated PCR values are verified, it
can be updated with new (or future) PCR values, so that a new kernel, for example, can be booted.
You can save a single key also as multiple blobs, each with a different PCR value.
Encrypted keys
Encrypted keys do not require a TPM, because they use the kernel Advanced Encryption Standard
(AES), which makes them faster than trusted keys. Encrypted keys are created using kernel-
generated random numbers and encrypted by a master key when they are exported into user-space
blobs.
The master key is either a trusted key or a user key. If the master key is not trusted, the encrypted key is
only as secure as the user key used to encrypt it.
Prerequisites
Trusted Platform Module (TPM) is enabled and active. See The kernel integrity subsystem and
Trusted and encrypted keys.
You can verify that your system has a TPM by entering the tpm2_pcrread command. If the
output from this command displays several hashes, you have a TPM.
Procedure
1. Create a 2048-bit RSA key with an SHA-256 primary storage key with a persistent handle of, for
example, 81000001, by using one of the following utilities:
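With the tpm2-tools utilities, for example, a sketch of the two steps might look as follows; treat the exact option spelling as an assumption against your installed tpm2-tools version:
# tpm2_createprimary --key-algorithm=rsa2048 --key-context=key.ctxt
# tpm2_evictcontrol --object-context=key.ctxt 0x81000001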
2. Create a trusted key by using a TPM 2.0 with the syntax of keyctl add trusted <NAME> "new
<KEY_LENGTH> keyhandle=<PERSISTENT-HANDLE> [options]" <KEYRING>. In this
example, the persistent handle is 81000001.
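A sketch of the resulting command:
# keyctl add trusted kmk "new 32 keyhandle=0x81000001" @u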
The command creates a trusted key called kmk with the length of 32 bytes (256 bits) and
places it in the user keyring (@u). The keys may have a length of 32 to 128 bytes (256 to 1024
bits).
# keyctl show
Session Keyring
-3 --alswrv 500 500 keyring: ses 97833714 --alswrv 500 -1 \ keyring: uid.1000
642500861 --alswrv 500 500 \ trusted: kmk
4. Export the key to a user-space blob by using the serial number of the trusted key:
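For example, with the serial number shown in the previous keyctl show output:
# keyctl pipe 642500861 > kmk.blob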
The command uses the pipe subcommand and the serial number of kmk.
6. Create secure encrypted keys that use the TPM-sealed trusted key (kmk). Follow this syntax:
keyctl add encrypted <NAME> "new [FORMAT] <KEY_TYPE>:<PRIMARY_KEY_NAME>
<KEY_LENGTH>" <KEYRING>:
Additional resources
Procedure
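1. Generate a user key from random bytes; a sketch mirroring the kmk example later in this chapter:
# keyctl add user kmk-user "$(dd if=/dev/urandom bs=1 count=32 2> /dev/null)" @u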
The command generates a user key called kmk-user which acts as a primary key and is used to
seal the actual encrypted keys.
2. Generate an encrypted key using the primary key from the previous step:
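A sketch following the encrypted key syntax shown earlier; the key name encr-key is illustrative:
# keyctl add encrypted encr-key "new user:kmk-user 32" @u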
# keyctl list @u
2 keys in keyring:
427069434: --alswrv 1000 1000 user: kmk-user
IMPORTANT
Encrypted keys that are not sealed by a trusted primary key are only as secure as the user
primary key (random-number key) that was used to encrypt them. Therefore, load the
primary user key as securely as possible and preferably early during the boot process.
Additional resources
Prerequisites
NOTE
The securityfs file system is mounted on the /sys/kernel/security/ directory and the
/sys/kernel/security/integrity/ima/ directory exists. You can verify where securityfs is mounted
by using the mount command:
# mount
...
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
...
The systemd service manager is patched to support IMA and EVM at boot time. Verify by using
the following command:
For example:
Procedure
1. Enable IMA and EVM in the fix mode for the current boot entry and allow users to gather and
update the IMA measurements by adding the following kernel command-line parameters:
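A sketch using the grubby utility with the parameters named below; the kernel path assumes the currently running kernel:
# grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="ima_policy=appraise_tcb ima_appraise=fix evm=fix"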
The command enables IMA and EVM in the fix mode for the current boot entry and allows users
to gather and update the IMA measurements.
The ima_policy=appraise_tcb kernel command-line parameter ensures that the kernel uses
the default Trusted Computing Base (TCB) measurement policy and the appraisal step. The
appraisal step forbids access to files whose prior and current measures do not match.
3. Optional: Verify that the parameters have been added to the kernel command line:
# cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.14.0-1.el9.x86_64 root=/dev/mapper/rhel-root ro
crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=/dev/mapper/rhel-swap
rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet ima_policy=appraise_tcb ima_appraise=fix
evm=fix
# keyctl add user kmk "$(dd if=/dev/urandom bs=1 count=32 2> /dev/null)" @u
748544121
The kmk is kept entirely in the kernel space memory. The 32-byte long value of the kmk is
generated from random bytes from the /dev/urandom file and placed in the user (@u) keyring.
The key serial number is on the first line of the previous output.
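Create an EVM key based on the kmk. A sketch following the encrypted key syntax shown earlier in this chapter; the serial number in the output matches the keyctl show listing below:
# keyctl add encrypted evm-key "new user:kmk 64" @u
641780271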
The command uses the kmk to generate and encrypt a 64-byte long user key (named evm-
key) and places it in the user (@u) keyring. The key serial number is on the first line of the
previous output.
IMPORTANT
It is necessary to name the user key evm-key because that is the name the EVM subsystem expects and works with.
# mkdir -p /etc/keys/
7. Search for the kmk and export its unencrypted value into the new directory.
8. Search for the evm-key and export its encrypted value into the new directory.
The evm-key has been encrypted by the kernel master key earlier.
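A sketch for both steps, using the keyctl search and pipe subcommands; the output file names match the listing below:
# keyctl pipe $(keyctl search @u user kmk) > /etc/keys/kmk
# keyctl pipe $(keyctl search @u encrypted evm-key) > /etc/keys/evm-key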
# keyctl show
Session Keyring
974575405 --alswrv      0     0  keyring: ses
299489774 --alswrv      0 65534   \_ keyring: uid.0
748544121 --alswrv      0     0       \_ user: kmk
641780271 --alswrv      0     0       \_ encrypted: evm-key
# ls -l /etc/keys/
total 8
-rw-r--r--. 1 root root 246 Jun 24 12:44 evm-key
-rw-r--r--. 1 root root 32 Jun 24 12:43 kmk
10. Optional: If the keys have been removed from the keyring, for example after system reboot, you
can import the already exported kmk and evm-key instead of creating new ones.
WARNING
Enabling IMA and EVM without relabeling the system might make the
majority of the files on the system inaccessible.
Verification
# dmesg | tail -1
[…] evm: key initialized
Additional resources
grep(1) manpage
Prerequisites
IMA and EVM are enabled. For more information, see Enabling integrity measurement
architecture and extended verification module.
Procedure
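For example, create a test file first. This is a minimal sketch; the file name test_file matches the verification below:
# echo "test content" > test_file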
IMA and EVM ensure that the test_file example file has assigned hash values that are stored as
its extended attributes.
# getfattr -m . -d test_file
# file: test_file
security.evm=0sAnDIy4VPA0HArpPO/EqiutnNyBql
security.ima=0sAQOEDeuUnWzwwKYk+n66h/vby3eD
The example output shows extended attributes with the IMA and EVM hash values and SELinux
context. EVM adds a security.evm extended attribute related to the other attributes. At this
point, you can use the evmctl utility on security.evm to generate either an RSA-based digital
signature or a Hash-based Message Authentication Code (HMAC-SHA1).
Additional resources
Security hardening
Procedure
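For example, you can reinstall a package so that its files are written together with their IMA signatures. This is a sketch and assumes that the repository provides IMA-signed packages:
# dnf -y reinstall bash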
Verification
1. Confirm that the reinstalled package file has the valid IMA signature. For example, to check the
IMA signature of the /usr/bin/bash file, run:
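A minimal check using the getfattr utility, as shown earlier in this chapter:
# getfattr -m security.ima -d /usr/bin/bash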
2. Verify the IMA signature of a file with a specified certificate. The IMA code signing key is available at /usr/share/doc/kernel-keys/$(uname -r)/ima.cer.
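A sketch using the evmctl utility from the ima-evm-utils package:
# evmctl ima_verify -k /usr/share/doc/kernel-keys/$(uname -r)/ima.cer /usr/bin/bash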
Prerequisites
The files covered by the policy have valid signatures. For instructions, see Adding IMA
signatures to package files.
Procedure
1. To copy the Red Hat IMA code signing key to the /etc/keys/ima/ directory, run:
# mkdir -p /etc/keys/ima
# cp /usr/share/doc/kernel-keys/$(uname -r)/ima.cer /etc/keys/ima
2. To add the IMA code signing key to the .ima keyring, run:
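A sketch that loads the certificate into the .ima keyring by using keyctl:
# keyctl padd asymmetric "" %:.ima < /etc/keys/ima/ima.cer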
3. Depending on your threat model, define an IMA policy in the /etc/sysconfig/ima-policy file. For
example, the following IMA policy checks the integrity of both executables and involved
memory mapping library files:
# PROC_SUPER_MAGIC = 0x9fa0
dont_appraise fsmagic=0x9fa0
# SYSFS_MAGIC = 0x62656572
dont_appraise fsmagic=0x62656572
# DEBUGFS_MAGIC = 0x64626720
dont_appraise fsmagic=0x64626720
# TMPFS_MAGIC = 0x01021994
dont_appraise fsmagic=0x1021994
# RAMFS_MAGIC
dont_appraise fsmagic=0x858458f6
# DEVPTS_SUPER_MAGIC=0x1cd1
dont_appraise fsmagic=0x1cd1
# BINFMTFS_MAGIC=0x42494e4d
dont_appraise fsmagic=0x42494e4d
# SECURITYFS_MAGIC=0x73636673
dont_appraise fsmagic=0x73636673
# SELINUX_MAGIC=0xf97cff8c
dont_appraise fsmagic=0xf97cff8c
# SMACK_MAGIC=0x43415d53
dont_appraise fsmagic=0x43415d53
# NSFS_MAGIC=0x6e736673
dont_appraise fsmagic=0x6e736673
# EFIVARFS_MAGIC
dont_appraise fsmagic=0xde5e81e4
# CGROUP_SUPER_MAGIC=0x27e0eb
dont_appraise fsmagic=0x27e0eb
# CGROUP2_SUPER_MAGIC=0x63677270
dont_appraise fsmagic=0x63677270
appraise func=BPRM_CHECK
appraise func=FILE_MMAP mask=MAY_EXEC
4. To load the IMA policy and ensure that the kernel accepts it, run:
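A sketch that writes the policy path to the securityfs interface described in the prerequisites:
# echo /etc/sysconfig/ima-policy > /sys/kernel/security/integrity/ima/policy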
5. To enable the dracut integrity module to automatically load the IMA code signing key and the
IMA policy, run:
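A sketch that rebuilds the initial RAM disk with the integrity dracut module included:
# dracut --force --add integrity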
The kernel searches the .ima keyring for a code signing key to verify an IMA signature. Before you add a
code signing key to the .ima keyring, you must ensure that an IMA CA key in the .builtin_trusted_keys or .secondary_trusted_keys keyrings signed this key.
Prerequisites
The custom IMA CA key has the KeyUsage extension with the keyCertSign bit asserted but without the digitalSignature bit asserted.
The custom IMA code signing key meets the following criteria:
The IMA CA key signed this custom IMA code signing key.
Procedure
# openssl req -new -x509 -utf8 -sha256 -days 3650 -batch -config ima_ca.conf -outform
DER -out custom_ima_ca.der -keyout custom_ima_ca.priv
# cat ima_ca.conf
[ req ]
default_bits = 2048
distinguished_name = req_distinguished_name
prompt = no
string_mask = utf8only
x509_extensions = ca
[ req_distinguished_name ]
O = YOUR_ORG
CN = YOUR_COMMON_NAME IMA CA
emailAddress = YOUR_EMAIL
[ ca ]
basicConstraints=critical,CA:TRUE
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid:always,issuer
keyUsage=critical,keyCertSign,cRLSign
3. To generate a private key and a certificate signing request (CSR) for the IMA code signing key,
run:
# openssl req -new -utf8 -sha256 -days 365 -batch -config ima.conf -out
custom_ima.csr -keyout custom_ima.priv
# cat ima.conf
[ req ]
default_bits = 2048
distinguished_name = req_distinguished_name
prompt = no
string_mask = utf8only
x509_extensions = code_signing
[ req_distinguished_name ]
O = YOUR_ORG
CN = YOUR_COMMON_NAME IMA signing key
emailAddress = YOUR_EMAIL
[ code_signing ]
basicConstraints=critical,CA:FALSE
keyUsage=digitalSignature
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid:always,issuer
5. Use the IMA CA private key to sign the CSR to create the IMA code signing certificate:
# openssl x509 -req -in custom_ima.csr -days 365 -extfile ima.conf -extensions
code_signing -CA custom_ima_ca.der -CAkey custom_ima_ca.priv -CAcreateserial -
outform DER -out ima.der
In the Secure Boot environment, you may want to load only an IMA policy signed by your custom IMA key.
Prerequisites
The MOK list contains the custom IMA key. For guidance, see Enrolling public key on target
system by adding the public key to the MOK list.
Procedure
4. Sign the policy with your custom IMA code signing key by running the command:
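A sketch using the evmctl utility; it assumes the policy file path from the previous sections and the custom_ima.priv key created earlier:
# evmctl ima_sign -k custom_ima.priv /etc/sysconfig/ima-policy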
To achieve this, systemd reads various configuration options from the unit files or accepts them directly through the
systemctl command. Then systemd applies those options to specific process groups by using the Linux
kernel system calls and features like cgroups and namespaces.
NOTE
You can review the full set of configuration options for systemd in the following manual
pages:
systemd.resource-control(5)
systemd.exec(5)
ensures that managed services start at the right time and in the correct order during the boot
process.
ensures that managed services run smoothly to use the underlying hardware platform optimally.
provides capabilities to tune various options, which can improve the performance of the service.
IMPORTANT
In general, Red Hat recommends you use systemd for controlling the usage of system
resources. You should manually configure the cgroups virtual file system only in special
cases. For example, when you need to use cgroup-v1 controllers that have no
equivalents in cgroup-v2 hierarchy.
Weights
You can distribute the resource by adding up the weights of all sub-groups and giving each sub-
group the fraction matching its ratio against the sum.
For example, if you have 10 cgroups, each with weight of value 100, the sum is 1000. Each cgroup
receives one tenth of the resource.
Weight is usually used to distribute stateless resources. For example, the CPUWeight= option is an
implementation of this resource distribution model.
Limits
A cgroup can consume up to the configured amount of the resource. The sum of sub-group limits can exceed the limit of the parent cgroup. Therefore, it is possible to overcommit resources in this model.
For example, the MemoryMax= option is an implementation of this resource distribution model.
Protections
You can set up a protected amount of a resource for a cgroup. If the resource usage is below the
protection boundary, the kernel will try not to penalize this cgroup in favor of other cgroups that
compete for the same resource. An overcommit is also possible.
For example, the MemoryLow= option is an implementation of this resource distribution model.
Allocations
Exclusive allocations of an absolute amount of a finite resource. An overcommit is not possible. An
example of this resource type in Linux is the real-time budget.
unit file option
A setting for resource control configuration.
For example, you can configure CPU resource with options like CPUAccounting=, or CPUQuota=.
Similarly, you can configure memory or I/O resources with options like AllowedMemoryNodes= and
IOAccounting=.
2. Set the required value of the CPU time allocation policy option:
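For example, a sketch that caps the illustrative example.service at 20% of one CPU:
# systemctl set-property example.service CPUQuota=20%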
Verification steps
Check the newly assigned values for the service of your choice.
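For example, a 20% CPUQuota= setting is reported through the CPUQuotaPerSecUSec property; the service name is illustrative:
# systemctl show --property CPUQuotaPerSecUSec example.service
CPUQuotaPerSecUSec=200ms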
Additional resources
creating custom unit files or using the systemctl command. Also, systemd automatically mounts
hierarchies for important kernel resource controllers at the /sys/fs/cgroup/ directory.
For resource control, you can use the following three systemd unit types:
Service
A process or a group of processes, which systemd started according to a unit configuration file.
Services encapsulate the specified processes so that they can be started and stopped as one set.
Services are named in the following way:
<name>.service
Scope
A group of externally created processes. Scopes encapsulate processes that are started and stopped
by arbitrary processes through the fork() function and then registered by systemd at runtime.
For example, user sessions, containers, and virtual machines are treated as scopes. Scopes are
named as follows:
<name>.scope
Slice
A group of hierarchically organized units. Slices organize a hierarchy in which scopes and services are
placed.
The actual processes are contained in scopes or in services. Every name of a slice unit corresponds to
the path to a location in the hierarchy.
The dash (-) character acts as a separator of the path components to a slice from the -.slice root
slice. In the following example:
<parent-name>.slice
parent-name.slice is a sub-slice of parent.slice, which is a sub-slice of the -.slice root slice. parent-
name.slice can have its own sub-slice named parent-name-name2.slice, and so on.
The service, the scope, and the slice units directly map to objects in the control group hierarchy. When
these units are activated, they map directly to control group paths built from the unit names.
Control group /:
-.slice
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 967 gdm-session-worker [pam/gdm-launch-environment]
│ │ │ ├─1035 /usr/libexec/gdm-x-session gnome-session --autostart
/usr/share/gdm/greeter/autostart
│ │ │ ├─1054 /usr/libexec/Xorg vt1 -displayfd 3 -auth /run/user/42/gdm/Xauthority -background none
-noreset -keeptty -verbose 3
│ │ │ ├─1212 /usr/libexec/gnome-session-binary --autostart /usr/share/gdm/greeter/autostart
│ │ │ ├─1369 /usr/bin/gnome-shell
│ │ │ ├─1732 ibus-daemon --xim --panel disable
│ │ │ ├─1752 /usr/libexec/ibus-dconf
│ │ │ ├─1762 /usr/libexec/ibus-x11 --kill-daemon
│ │ │ ├─1912 /usr/libexec/gsd-xsettings
│ │ │ ├─1917 /usr/libexec/gsd-a11y-settings
│ │ │ ├─1920 /usr/libexec/gsd-clipboard
…
├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
└─system.slice
├─rngd.service
│ └─800 /sbin/rngd -f
├─systemd-udevd.service
│ └─659 /usr/lib/systemd/systemd-udevd
├─chronyd.service
│ └─823 /usr/sbin/chronyd
├─auditd.service
│ ├─761 /sbin/auditd
│ └─763 /usr/sbin/sedispatch
├─accounts-daemon.service
│ └─876 /usr/libexec/accounts-daemon
├─example.service
│ ├─ 929 /bin/bash /home/jdoe/example.sh
│ └─4902 sleep 1
…
The example above shows that services and scopes contain processes and are placed in slices that do
not contain processes of their own.
Additional resources
Understanding cgroups
Procedure
List all active units on the system with the systemctl utility. The terminal returns an output
similar to the following example:
# systemctl
UNIT LOAD ACTIVE SUB DESCRIPTION
…
init.scope loaded active running System and Service Manager
session-2.scope loaded active running Session 2 of user jdoe
abrt-ccpp.service loaded active exited Install ABRT coredump hook
abrt-oops.service loaded active running ABRT kernel log watcher
abrt-vmcore.service loaded active exited Harvest vmcores for ABRT
UNIT
A name of a unit that also reflects the unit position in a control group hierarchy. The units
relevant for resource control are a slice, a scope, and a service.
LOAD
Indicates whether the unit configuration file was properly loaded. If the unit file failed to load,
the field contains the state error instead of loaded. Other unit load states are: stub, merged,
and masked.
ACTIVE
The high-level unit activation state, which is a generalization of SUB.
SUB
The low-level unit activation state. The range of possible values depends on the unit type.
DESCRIPTION
The description of the unit content and functionality.
# systemctl --all
The --type option requires a comma-separated list of unit types such as a service and a slice, or
unit load states such as loaded and masked.
Additional resources
Procedure
Display the whole cgroups hierarchy on your system with the systemd-cgls command.
# systemd-cgls
Control group /:
-.slice
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 965 gdm-session-worker [pam/gdm-launch-environment]
│ │ │ ├─1040 /usr/libexec/gdm-x-session gnome-session --autostart
/usr/share/gdm/greeter/autostart
…
├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
└─system.slice
…
├─example.service
│ ├─6882 /bin/bash /home/jdoe/example.sh
│ └─6902 sleep 1
├─systemd-journald.service
└─629 /usr/lib/systemd/systemd-journald
…
The example output returns the entire cgroups hierarchy, where the highest level is formed by
slices.
Display the cgroups hierarchy filtered by a resource controller with the systemd-cgls
<resource_controller> command.
# systemd-cgls memory
Controller memory; Control group /:
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 965 gdm-session-worker [pam/gdm-launch-environment]
…
└─system.slice
|
…
├─chronyd.service
│ └─844 /usr/sbin/chronyd
├─example.service
│ ├─8914 /bin/bash /home/jdoe/example.sh
│ └─8916 sleep 1
…
The example output lists the services that interact with the selected controller.
Display detailed information about a certain unit and its part of the cgroups hierarchy with the
systemctl status <system_unit> command.
Additional resources
Procedure
1. To view which cgroup a process belongs to, run the cat /proc/<PID>/cgroup command:
# cat /proc/2467/cgroup
0::/system.slice/example.service
The example output relates to a process of interest. In this case, it is a process identified by PID
2467, which belongs to the example.service unit. You can determine whether the process was
placed in a correct control group as defined by the systemd unit file specifications.
2. To display what controllers and respective configuration files the cgroup uses, check the
cgroup directory:
# cat /sys/fs/cgroup/system.slice/example.service/cgroup.controllers
memory pids
# ls /sys/fs/cgroup/system.slice/example.service/
cgroup.controllers
cgroup.events
…
cpu.pressure
cpu.stat
io.pressure
memory.current
memory.events
…
pids.current
pids.events
pids.max
NOTE
The version 1 hierarchy of cgroups uses a per-controller model. Therefore, the output
from the /proc/PID/cgroup file shows which cgroups under each controller the PID
belongs to. You can find the respective cgroups under the controller directories at
/sys/fs/cgroup/<controller_name>/.
Additional resources
Procedure
1. Display a dynamic account of currently running cgroups with the systemd-cgtop command.
# systemd-cgtop
Control Group Tasks %CPU Memory Input/s Output/s
/ 607 29.8 1.5G - -
/system.slice 125 - 428.7M - -
/system.slice/ModemManager.service 3 - 8.6M - -
/system.slice/NetworkManager.service 3 - 12.8M - -
/system.slice/accounts-daemon.service 3 - 1.8M - -
/system.slice/boot.mount - - 48.0K - -
/system.slice/chronyd.service 1 - 2.0M - -
/system.slice/cockpit.socket - - 1.3M - -
/system.slice/colord.service 3 - 3.5M - -
/system.slice/crond.service 1 - 1.8M - -
/system.slice/cups.service 1 - 3.1M - -
/system.slice/dev-hugepages.mount - - 244.0K - -
/system.slice/dev-mapper-rhel\x2dswap.swap - - 912.0K - -
/system.slice/dev-mqueue.mount - - 48.0K - -
/system.slice/example.service 2 - 2.0M - -
/system.slice/firewalld.service 2 - 28.8M - -
...
The example output displays currently running cgroups ordered by their resource usage (CPU,
memory, disk I/O load). The list refreshes every 1 second by default. Therefore, it offers a
dynamic insight into the actual resource usage of each control group.
Additional resources
Additional resources
You can use systemd unit file options, for example, to set limits on and prioritize the resources that services use.
Prerequisites
Procedure
…
[Service]
MemoryMax=1500K
…
The configuration sets the maximum memory that the processes in a control group can use. The example.service service is part of such a control group which has imposed limitations. You can use suffixes K, M, G, or T to identify kilobyte, megabyte, gigabyte, or terabyte as a unit of measurement.
# systemctl daemon-reload
Verification
# cat /sys/fs/cgroup/system.slice/example.service/memory.max
1536000
The example output shows that the memory consumption was limited to around 1,500 KB.
Additional resources
Understanding cgroups
The default CPU affinity mask applies to all services managed by systemd.
To configure the CPU affinity mask for a particular systemd service, systemd provides CPUAffinity= both as a unit file option and as an option in the [Manager] section of the /etc/systemd/system.conf file.
The CPUAffinity= unit file option sets a list of CPUs or CPU ranges that are merged and used as the
affinity mask.
Procedure
To set CPU affinity mask for a particular systemd service using the CPUAffinity unit file option:
1. Check the values of the CPUAffinity unit file option in the service of your choice:
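A sketch, with the illustrative example.service unit:
$ systemctl show --property CPUAffinity example.service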
2. As the root user, set the required value of the CPUAffinity unit file option for the CPU ranges
used as the affinity mask:
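A sketch; the service name and the CPU range are illustrative, and the restart applies the new mask:
# systemctl set-property example.service CPUAffinity=0-3
# systemctl restart example.service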
Additional resources
To set the default CPU affinity mask for all systemd services using the /etc/systemd/system.conf file:
1. Set the CPU numbers for the CPUAffinity= option in the [Manager] section of the
/etc/systemd/system.conf file.
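For example, a sketch with illustrative CPU numbers:
[Manager]
CPUAffinity=0 1 2 3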
# systemctl daemon-reload
Additional resources
Memory close to the CPU has lower latency (local memory) than memory that is local for a different
CPU (foreign memory) or is shared between a set of CPUs.
In terms of the Linux kernel, NUMA policy governs where (for example, on which NUMA nodes) the
kernel allocates physical memory pages for the process.
systemd provides unit file options NUMAPolicy and NUMAMask to control memory allocation policies
for services.
Procedure
To set the NUMA memory policy through the NUMAPolicy unit file option:
1. Check the values of the NUMAPolicy unit file option in the service of your choice:
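A sketch, with the illustrative example.service unit:
$ systemctl show --property NUMAPolicy example.service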
2. As the root user, set the required policy type of the NUMAPolicy unit file option:
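A sketch using a drop-in file created with systemctl edit; the bind policy and the node mask are illustrative:
# systemctl edit example.service
Add, for example, the following lines, then restart the service:
[Service]
NUMAPolicy=bind
NUMAMask=0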
1. Search in the /etc/systemd/system.conf file for the NUMAPolicy option in the [Manager]
section of the file.
# systemctl daemon-reload
IMPORTANT
When you configure a strict NUMA policy, for example bind, make sure that you also
appropriately set the CPUAffinity= unit file option.
Additional resources
NUMAPolicy
Controls the NUMA memory policy of the executed processes. You can use these policy types:
default
preferred
bind
interleave
local
NUMAMask
Controls the NUMA node list which is associated with the selected NUMA policy.
Note that you do not have to specify the NUMAMask option for the following policies:
default
local
For the preferred policy, the list specifies only a single NUMA node.
Additional resources
Procedure
To create a transient control group, use the systemd-run command in the following format:
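The general format; names in angle brackets are placeholders:
# systemd-run --unit=<name> --slice=<name>.slice <command>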
This command creates and starts a transient service or a scope unit and runs a custom command
in such a unit.
The --unit=<name> option gives a name to the unit. If --unit is not specified, the name is
generated automatically.
The --slice=<name>.slice option makes your service or scope unit a member of a specified
slice. Replace <name>.slice with the name of an existing slice (as shown in the output of
systemctl -t slice), or create a new slice by passing a unique name. By default, services and
scopes are created as members of the system.slice.
Replace <command> with the command you want to enter in the service or the scope unit.
The following message is displayed to confirm that you created and started the service or
the scope successfully:
Optional: Keep the unit running after its processes finished to collect run-time information:
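A sketch; the placeholders follow the format above:
# systemd-run --unit=<name> --slice=<name>.slice --remain-after-exit <command>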
The command creates and starts a transient service unit and runs a custom command in the unit.
The --remain-after-exit option ensures that the service keeps running after its processes have
finished.
Additional resources
Transient cgroups are automatically released once all the processes that a service or a scope unit
contains finish.
Procedure
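A sketch using the systemctl kill command; the unit name, the process selection, and the signal are illustrative:
# systemctl kill example.service --kill-who=main --signal=SIGTERM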
The command uses the --kill-who option to select process(es) from the control group you want
to terminate. To kill multiple processes at the same time, pass a comma-separated list of PIDs.
The --signal option determines the type of POSIX signal to be sent to the specified processes.
The default signal is SIGTERM.
Additional resources
The systemd service manager uses cgroups to organize all units and services that it governs. You can manually manage the hierarchies of cgroups by creating and removing sub-directories in the /sys/fs/cgroup/ directory.
The resource controllers in the kernel then modify the behavior of processes in cgroups by limiting, prioritizing, or allocating the system resources of those processes. These resources include the following:
CPU time
Memory
Network bandwidth
The primary use case of cgroups is aggregating system processes and dividing hardware resources
among applications and users. This makes it possible to increase the efficiency, stability, and security of
your environment.
Additional resources
cgroups-v1
cgroups-v2
A resource controller, also called a control group subsystem, is a kernel subsystem that represents a
single resource, such as CPU time, memory, network bandwidth or disk I/O. The Linux kernel provides a
range of resource controllers that are mounted automatically by the systemd service manager. You can
find a list of the currently mounted resource controllers in the /proc/cgroups file.
blkio
Sets limits on input/output access to and from block devices.
cpu
Adjusts the parameters of the Completely Fair Scheduler (CFS) for a control group’s tasks. The cpu
controller is mounted together with the cpuacct controller on the same mount.
cpuacct
Creates automatic reports on CPU resources used by tasks in a control group. The cpuacct
controller is mounted together with the cpu controller on the same mount.
cpuset
Restricts control group tasks to run only on a specified subset of CPUs and to direct the tasks to use
memory only on specified memory nodes.
devices
Controls access to devices for tasks in a control group.
freezer
Suspends or resumes tasks in a control group.
memory
Sets limits on memory use by tasks in a control group and generates automatic reports on memory
resources used by those tasks.
net_cls
Tags network packets with a class identifier (classid) that enables the Linux traffic controller (the tc
command) to identify packets that originate from a particular control group task. A subsystem of
net_cls, the net_filter (iptables), can also use this tag to perform actions on such packets. The
net_filter tags network sockets with a firewall identifier (fwid) that allows the Linux firewall to identify
packets that originate from a particular control group task (by using the iptables command).
net_prio
Sets the priority of network traffic.
pids
Sets limits for a number of processes and their children in a control group.
perf_event
Groups tasks for monitoring by the perf performance monitoring and reporting utility.
rdma
Sets limits on Remote Direct Memory Access/InfiniBand specific resources in a control group.
hugetlb
Can be used to limit the usage of large size virtual memory pages by tasks in a control group.
io
Sets limits on input/output access to and from block devices.
memory
Sets limits on memory use by tasks in a control group and generates automatic reports on memory
resources used by those tasks.
pids
Sets limits for a number of processes and their children in a control group.
rdma
Sets limits on Remote Direct Memory Access/InfiniBand specific resources in a control group.
cpu
Adjusts the parameters of the Completely Fair Scheduler (CFS) for a control group’s tasks and
creates automatic reports on CPU resources used by tasks in a control group.
cpuset
Restricts control group tasks to run only on a specified subset of CPUs and to direct the tasks to use
memory only on specified memory nodes. Supports only the core functionality (cpus{,.effective},
mems{,.effective}) with a new partition feature.
perf_event
Groups tasks for monitoring by the perf performance monitoring and reporting utility. perf_event is
enabled automatically on the v2 hierarchy.
Additional resources
Documentation in /usr/share/doc/kernel-doc-<kernel_version>/Documentation/cgroups-v1/
directory (after installing the kernel-doc package).
Namespaces are one of the most important methods for organizing and identifying software objects.
A namespace wraps a global system resource (for example, a mount point, a network device, or a
hostname) in an abstraction that makes it appear to processes within the namespace that they have
their own isolated instance of the global resource. One of the most common technologies that use namespaces is containers.
Changes to a particular global resource are visible only to processes in that namespace and do not
affect the rest of the system or other namespaces.
To inspect which namespaces a process is a member of, you can check the symbolic links in the
/proc/<PID>/ns/ directory.
Namespace         Isolates
Mount             Mount points
UTS               Hostname and NIS domain name
IPC               System V IPC, POSIX message queues
PID               Process IDs
Network           Network devices, stacks, ports, and so on
User              User and group IDs
Control groups    Control group root directory
Additional resources
IMPORTANT
In general, Red Hat recommends you use systemd for controlling the usage of system
resources. You should manually configure the cgroups virtual file system only in special
cases. For example, when you need to use cgroup-v1 controllers that have no
equivalents in cgroup-v2 hierarchy.
Prerequisites
Procedure
# mkdir /sys/fs/cgroup/Example/
The /sys/fs/cgroup/Example/ directory defines a child group. When you create the
/sys/fs/cgroup/Example/ directory, some cgroups-v2 interface files are automatically created
in the directory. The /sys/fs/cgroup/Example/ directory also contains controller-specific files
for the memory and pids controllers.
# ll /sys/fs/cgroup/Example/
-r--r--r--. 1 root root 0 Jun 1 10:33 cgroup.controllers
-r--r--r--. 1 root root 0 Jun 1 10:33 cgroup.events
-rw-r--r--. 1 root root 0 Jun 1 10:33 cgroup.freeze
-rw-r--r--. 1 root root 0 Jun 1 10:33 cgroup.procs
…
-rw-r--r--. 1 root root 0 Jun 1 10:33 cgroup.subtree_control
-r--r--r--. 1 root root 0 Jun 1 10:33 memory.events.local
-rw-r--r--. 1 root root 0 Jun 1 10:33 memory.high
-rw-r--r--. 1 root root 0 Jun 1 10:33 memory.low
…
The example output shows general cgroup control interface files such as cgroup.procs or
cgroup.controllers. These files are common to all control groups, regardless of enabled
controllers.
The files such as memory.high and pids.max relate to the memory and pids controllers, which
are in the root control group (/sys/fs/cgroup/), and are enabled by default by systemd.
By default, the newly created child group inherits all settings from the parent cgroup. In this
case, there are no limits from the root cgroup.
3. Verify that the desired controllers are available in the /sys/fs/cgroup/cgroup.controllers file:
# cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory hugetlb pids rdma
4. Enable the desired controllers. In this example, these are the cpu and cpuset controllers:
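A sketch using the cgroup.subtree_control interface file of the root control group:
# echo "+cpu" >> /sys/fs/cgroup/cgroup.subtree_control
# echo "+cpuset" >> /sys/fs/cgroup/cgroup.subtree_control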
These commands enable the cpu and cpuset controllers for the immediate child groups of the /sys/fs/cgroup/ root control group, including the newly created Example control group. A child
group is where you can specify processes and apply control checks to each of the processes
based on your criteria.
Users can read the contents of the cgroup.subtree_control file at any level to get an idea of
what controllers are going to be available for enablement in the immediate child group.
5. Enable the desired controllers for child cgroups of the Example control group:
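A sketch using the cgroup.subtree_control interface file of the Example control group:
# echo "+cpu +cpuset" >> /sys/fs/cgroup/Example/cgroup.subtree_control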
This command ensures that the immediate child control group will only have controllers relevant to regulating the CPU time distribution, not the memory or pids controllers.
# mkdir /sys/fs/cgroup/Example/tasks/
The /sys/fs/cgroup/Example/tasks/ directory defines a child group with files that relate purely
to cpu and cpuset controllers. You can now assign processes to this control group and utilize
cpu and cpuset controller options for your processes.
# ll /sys/fs/cgroup/Example/tasks
-r--r--r--. 1 root root 0 Jun 1 11:45 cgroup.controllers
-r--r--r--. 1 root root 0 Jun 1 11:45 cgroup.events
-rw-r--r--. 1 root root 0 Jun 1 11:45 cgroup.freeze
-rw-r--r--. 1 root root 0 Jun 1 11:45 cgroup.max.depth
-rw-r--r--. 1 root root 0 Jun 1 11:45 cgroup.max.descendants
-rw-r--r--. 1 root root 0 Jun 1 11:45 cgroup.procs
-r--r--r--. 1 root root 0 Jun 1 11:45 cgroup.stat
-rw-r--r--. 1 root root 0 Jun 1 11:45 cgroup.subtree_control
-rw-r--r--. 1 root root 0 Jun 1 11:45 cgroup.threads
-rw-r--r--. 1 root root 0 Jun 1 11:45 cgroup.type
-rw-r--r--. 1 root root 0 Jun 1 11:45 cpu.max
-rw-r--r--. 1 root root 0 Jun 1 11:45 cpu.pressure
-rw-r--r--. 1 root root 0 Jun 1 11:45 cpuset.cpus
-r--r--r--. 1 root root 0 Jun 1 11:45 cpuset.cpus.effective
-rw-r--r--. 1 root root 0 Jun 1 11:45 cpuset.cpus.partition
-rw-r--r--. 1 root root 0 Jun 1 11:45 cpuset.mems
-r--r--r--. 1 root root 0 Jun 1 11:45 cpuset.mems.effective
-r--r--r--. 1 root root 0 Jun 1 11:45 cpu.stat
-rw-r--r--. 1 root root 0 Jun 1 11:45 cpu.weight
-rw-r--r--. 1 root root 0 Jun 1 11:45 cpu.weight.nice
-rw-r--r--. 1 root root 0 Jun 1 11:45 io.pressure
-rw-r--r--. 1 root root 0 Jun 1 11:45 memory.pressure
IMPORTANT
The cpu controller is only activated if the relevant child control group has at least 2
processes which compete for time on a single CPU.
Verification steps
Optional: confirm that you have created a new cgroup with only the desired controllers active:
# cat /sys/fs/cgroup/Example/tasks/cgroup.controllers
cpuset cpu
Additional resources
Mounting cgroups-v1
Prerequisites
You have applications for which you want to control distribution of CPU time.
You created a two level hierarchy of child control groups inside the /sys/fs/cgroup/ root control
group as in the following example:
…
├── Example
│ ├── g1
│ ├── g2
│ └── g3
…
You enabled the cpu controller in the parent control group and in child control groups similarly
as described in Creating cgroups and enabling controllers in cgroups-v2 file system .
Procedure
1. Configure desired CPU weights to achieve resource restrictions within the control groups:
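A sketch using the cpu.weight interface files; the weights match the allocation table below:
# echo "150" > /sys/fs/cgroup/Example/g1/cpu.weight
# echo "100" > /sys/fs/cgroup/Example/g2/cpu.weight
# echo "50" > /sys/fs/cgroup/Example/g3/cpu.weight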
2. Add the applications' PIDs to the g1, g2, and g3 child groups:
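A sketch using the cgroup.procs interface files; the PIDs match the verification below:
# echo "33373" > /sys/fs/cgroup/Example/g1/cgroup.procs
# echo "33374" > /sys/fs/cgroup/Example/g2/cgroup.procs
# echo "33377" > /sys/fs/cgroup/Example/g3/cgroup.procs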
The example commands ensure that desired applications become members of the Example/g*/
child cgroups and will get their CPU time distributed as per the configuration of those cgroups.
The weights of the children cgroups (g1, g2, g3) that have running processes are summed up at
the level of the parent cgroup (Example). The CPU resource is then distributed proportionally
based on the respective weights.
As a result, when all processes run at the same time, the kernel allocates to each of them the
proportionate CPU time based on their respective cgroup’s cpu.weight file:
cgroup    cpu.weight    CPU time allocation
g1        150           ~50% (150/300)
g2        100           ~33% (100/300)
g3        50            ~16% (50/300)
If one process stopped running, leaving cgroup g2 with no running processes, the calculation
would omit the cgroup g2 and only account weights of cgroups g1 and g3:
cgroup    cpu.weight    CPU time allocation
g1        150           ~75% (150/200)
g3        50            ~25% (50/200)
IMPORTANT
If a child cgroup had multiple running processes, the CPU time allocated to the
respective cgroup would be distributed equally to the member processes of that
cgroup.
Verification
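A sketch that checks the cgroup membership of the example PIDs through the /proc file system, as shown earlier in this guide:
# cat /proc/33373/cgroup /proc/33374/cgroup /proc/33377/cgroup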
The command output shows the processes of the specified applications that run in the
Example/g*/ child cgroups.
# top
top - 05:17:18 up 1 day, 18:25, 1 user, load average: 3.03, 3.03, 3.00
Tasks: 95 total, 4 running, 91 sleeping, 0 stopped, 0 zombie
%Cpu(s): 18.1 us, 81.6 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.3 hi, 0.0 si, 0.0 st
MiB Mem : 3737.0 total, 3233.7 free, 132.8 used, 370.5 buff/cache
MiB Swap: 4060.0 total, 4060.0 free, 0.0 used. 3373.1 avail Mem
NOTE
We forced all the example processes to run on a single CPU for clearer
illustration. The CPU weight applies the same principles also when used on
multiple CPUs.
Notice that the CPU resource for the PID 33373, PID 33374, and PID 33377 was allocated
based on the weights, 150, 100, 50, you assigned to the respective child cgroups. The weights
correspond to around 50%, 33%, and 16% allocation of CPU time for each application.
Additional resources
NOTE
Both cgroup-v1 and cgroup-v2 are fully enabled in the kernel. There is no default control group version from the kernel's point of view; systemd decides which version to mount at startup.
Prerequisites
Procedure
1. Configure the system to mount cgroups-v1 by default during system boot by the systemd
system and service manager:
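A sketch using the grubby utility; the kernel path assumes the currently running kernel:
# grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller"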
This adds the necessary kernel command-line parameters to the current boot entry.
Verification
# ll /sys/fs/cgroup/
dr-xr-xr-x. 10 root root 0 Mar 16 09:34 blkio
lrwxrwxrwx. 1 root root 11 Mar 16 09:34 cpu -> cpu,cpuacct
lrwxrwxrwx. 1 root root 11 Mar 16 09:34 cpuacct -> cpu,cpuacct
dr-xr-xr-x. 10 root root 0 Mar 16 09:34 cpu,cpuacct
dr-xr-xr-x. 2 root root 0 Mar 16 09:34 cpuset
dr-xr-xr-x. 10 root root 0 Mar 16 09:34 devices
dr-xr-xr-x. 2 root root 0 Mar 16 09:34 freezer
dr-xr-xr-x. 2 root root 0 Mar 16 09:34 hugetlb
dr-xr-xr-x. 10 root root 0 Mar 16 09:34 memory
dr-xr-xr-x. 2 root root 0 Mar 16 09:34 misc
lrwxrwxrwx. 1 root root 16 Mar 16 09:34 net_cls -> net_cls,net_prio
dr-xr-xr-x. 2 root root 0 Mar 16 09:34 net_cls,net_prio
lrwxrwxrwx. 1 root root 16 Mar 16 09:34 net_prio -> net_cls,net_prio
dr-xr-xr-x. 2 root root 0 Mar 16 09:34 perf_event
dr-xr-xr-x. 10 root root 0 Mar 16 09:34 pids
dr-xr-xr-x. 2 root root 0 Mar 16 09:34 rdma
dr-xr-xr-x. 11 root root 0 Mar 16 09:34 systemd
The /sys/fs/cgroup/ directory, also called the root control group, by default, contains controller-
specific directories such as cpuset. In addition, there are some directories related to systemd.
Additional resources
Prerequisites
You configured the system to mount cgroups-v1 by default during system boot by the
systemd system and service manager:
This adds the necessary kernel command-line parameters to the current boot entry.
Procedure
1. Identify the process ID (PID) of the application that you want to restrict in CPU consumption:
# top
top - 11:34:09 up 11 min, 1 user, load average: 0.51, 0.27, 0.22
Tasks: 267 total, 3 running, 264 sleeping, 0 stopped, 0 zombie
%Cpu(s): 49.0 us, 3.3 sy, 0.0 ni, 47.5 id, 0.0 wa, 0.2 hi, 0.0 si, 0.0 st
MiB Mem : 1826.8 total, 303.4 free, 1046.8 used, 476.5 buff/cache
MiB Swap: 1536.0 total, 1396.0 free, 140.0 used. 616.4 avail Mem
This example output of the top program reveals that the illustrative application sha1sum with PID
6955 consumes a lot of CPU resources.
# mkdir /sys/fs/cgroup/cpu/Example/
This directory represents a control group, where you can place specific processes and apply
certain CPU limits to the processes. At the same time, a number of cgroups-v1 interface files
and cpu controller-specific files will be created in the directory.
# ll /sys/fs/cgroup/cpu/Example/
-rw-r--r--. 1 root root 0 Mar 11 11:42 cgroup.clone_children
-rw-r--r--. 1 root root 0 Mar 11 11:42 cgroup.procs
-r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.stat
-rw-r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage
-r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_all
-r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_percpu
-r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_percpu_sys
-r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_percpu_user
-r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_sys
-r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_user
-rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.cfs_period_us
-rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.cfs_quota_us
-rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.rt_period_us
-rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.rt_runtime_us
-rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.shares
-r--r--r--. 1 root root 0 Mar 11 11:42 cpu.stat
-rw-r--r--. 1 root root 0 Mar 11 11:42 notify_on_release
-rw-r--r--. 1 root root 0 Mar 11 11:42 tasks
This example output shows files, such as cpuacct.usage and cpu.cfs_period_us, that represent
specific configurations and/or limits, which can be set for processes in the Example control
group. Note that the respective file names are prefixed with the name of the control group
controller to which they belong.
By default, the newly created control group inherits access to the system’s entire CPU
resources without a limit.
The cpu.cfs_quota_us file represents the total amount of time in microseconds for which
all processes collectively in a control group can run during one period (as defined by
cpu.cfs_period_us). When processes in a control group, during a single period, use up all
the time specified by the quota, they are throttled for the remainder of the period and not
allowed to run until the next period. The lower limit is 1000 microseconds.
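A sketch that sets the period and the quota; the values match the verification output below:
# echo "1000000" > /sys/fs/cgroup/cpu/Example/cpu.cfs_period_us
# echo "200000" > /sys/fs/cgroup/cpu/Example/cpu.cfs_quota_us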
The example commands above set the CPU time limits so that all processes collectively in
the Example control group will be able to run only for 0.2 seconds (defined by
cpu.cfs_quota_us) out of every 1 second (defined by cpu.cfs_period_us).
# cat /sys/fs/cgroup/cpu/Example/cpu.cfs_period_us /sys/fs/cgroup/cpu/Example/cpu.cfs_quota_us
1000000
200000
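Add the PID of the application to the Example control group; a sketch using the tasks interface file shown in the listing above:
# echo "6955" > /sys/fs/cgroup/cpu/Example/tasks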
This command ensures that a specific application becomes a member of the Example control
group and hence does not exceed the CPU limits configured for the Example control group.
The PID should represent an existing process in the system. The PID 6955 here was assigned to
process sha1sum /dev/zero &, used to illustrate the use case of the cpu controller.
Verification
# cat /proc/6955/cgroup
12:cpuset:/
11:hugetlb:/
10:net_cls,net_prio:/
9:memory:/user.slice/user-1000.slice/[email protected]
8:devices:/user.slice
7:blkio:/
6:freezer:/
5:rdma:/
4:pids:/user.slice/user-1000.slice/[email protected]
3:perf_event:/
2:cpu,cpuacct:/Example
1:name=systemd:/user.slice/user-1000.slice/[email protected]/gnome-terminal-
server.service
This example output shows that the process of the desired application runs in the Example
control group, which applies CPU limits to the application’s process.
# top
top - 12:28:42 up 1:06, 1 user, load average: 1.02, 1.02, 1.00
Tasks: 266 total, 6 running, 260 sleeping, 0 stopped, 0 zombie
%Cpu(s): 11.0 us, 1.2 sy, 0.0 ni, 87.5 id, 0.0 wa, 0.2 hi, 0.0 si, 0.2 st
MiB Mem : 1826.8 total, 287.1 free, 1054.4 used, 485.3 buff/cache
MiB Swap: 1536.0 total, 1396.7 free, 139.2 used. 608.3 avail Mem
Note that the CPU consumption of the PID 6955 has decreased from 99% to 20%.
Additional resources
CHAPTER 24. ANALYZING SYSTEM PERFORMANCE WITH BPF COMPILER COLLECTION
Procedure
1. Install bcc-tools.
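For example:
# dnf install bcc-tools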
# ll /usr/share/bcc/tools/
...
-rwxr-xr-x. 1 root root 4198 Dec 14 17:53 dcsnoop
-rwxr-xr-x. 1 root root 3931 Dec 14 17:53 dcstat
-rwxr-xr-x. 1 root root 20040 Dec 14 17:53 deadlock_detector
-rw-r--r--. 1 root root 7105 Dec 14 17:53 deadlock_detector.c
drwxr-xr-x. 3 root root 8192 Mar 11 10:28 doc
-rwxr-xr-x. 1 root root 7588 Dec 14 17:53 execsnoop
-rwxr-xr-x. 1 root root 6373 Dec 14 17:53 ext4dist
-rwxr-xr-x. 1 root root 10401 Dec 14 17:53 ext4slower
...
The doc directory in the listing above contains documentation for each tool.
Prerequisites
Root permissions
# /usr/share/bcc/tools/execsnoop
$ ls /usr/share/bcc/tools/doc/
3. The terminal running execsnoop shows the output similar to the following:
The execsnoop program prints a line of output for each new process that consumes system resources. It even detects processes of programs that run very briefly, such as ls, which most monitoring tools would not register.
PCOMM
The process command name. (ls)
PID
The process ID. (8382)
PPID
The parent process ID. (8287)
RET
The return value of the exec() system call (0), which loads program code into new processes.
ARGS
The location of the started program with arguments.
To see more details, examples, and options for execsnoop, refer to the
/usr/share/bcc/tools/doc/execsnoop_example.txt file.
# /usr/share/bcc/tools/opensnoop -n uname
The command above prints output for files that are opened only by the process of the uname command.
$ uname
The command above opens certain files, which are captured in the next step.
3. The terminal running opensnoop shows the output similar to the following:
The opensnoop program watches the open() system call across the whole system, and prints a
line of output for each file that uname tried to open along the way.
PID
The process ID. (8596)
COMM
The process name. (uname)
FD
The file descriptor - a value that open() returns to refer to the open file. (3)
ERR
Any errors.
PATH
The location of files that open() tried to open.
If a command tries to read a non-existent file, then the FD column returns -1 and the ERR
column prints a value corresponding to the relevant error. As a result, opensnoop can help you
identify an application that does not behave properly.
To see more details, examples, and options for opensnoop, refer to the
/usr/share/bcc/tools/doc/opensnoop_example.txt file.
# /usr/share/bcc/tools/biotop 30
The command enables you to monitor the top processes that perform I/O operations on the disk. The 30 argument ensures that the command produces a 30-second summary.
# dd if=/dev/vda of=/dev/zero
The command above reads the content from the local hard disk device and writes the output to
the /dev/zero file. This step generates certain I/O traffic to illustrate biotop.
3. The terminal running biotop shows the output similar to the following:
PID
The process ID. (9568)
COMM
The process name. (dd)
DISK
The disk performing the read operations. (vda)
I/O
The number of read operations performed. (16294)
Kbytes
The amount of data in Kbytes read by the operations. (14,440,636)
AVGms
The average I/O time of read operations. (3.69)
To see more details, examples, and options for biotop, refer to the
/usr/share/bcc/tools/doc/biotop_example.txt file.
# /usr/share/bcc/tools/xfsslower 1
The command above measures the time the XFS file system spends in performing read, write,
open or sync (fsync) operations. The 1 argument ensures that the program shows only the
operations that are slower than 1 ms.
$ vim text
The command above creates a text file in the vim editor to initiate certain interaction with the
XFS file system.
3. The terminal running xfsslower shows something similar upon saving the file from the previous
step:
Each line above represents an operation in the file system that took more time than a certain threshold. xfsslower is good at exposing possible file system problems, which can take the form of unexpectedly slow operations.
COMM
The process name. (b’bash')
T
The operation type. (R)
Read
Write
Sync
OFF_KB
The file offset in KB. (0)
FILENAME
The file being read, written, or synced.
To see more details, examples, and options for xfsslower, refer to the
/usr/share/bcc/tools/doc/xfsslower_example.txt file.