The document provides an overview of kernel crash dump analysis, including:
1) It defines a kernel panic as the operating system's response to an unrecoverable internal error, and lists the primary causes as software bugs, hardware faults, firmware issues, or manual/conditional triggers.
2) It explains that a kernel crash dump (vmcore) captured during a panic or system hang is required to determine the root cause.
3) It describes a system hang as a state in which the system appears unresponsive (what counts as unresponsive depends on the observer), with possible causes including bugs, hardware/firmware faults, resource overload, or hypervisor issues.
Kdump and the kernel crash dump analysis (Buland Singh)
Kdump is a kernel crash dumping mechanism that uses kexec to load a separate crash kernel, which captures a kernel memory dump (vmcore file) when the primary kernel crashes. It can be configured to dump the vmcore file to local storage or over the network. Testing involves triggering a kernel panic using the SysRq keys, which causes the crash kernel to load and dump diagnostic information to the configured target path for analysis.
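A rough sketch of that test, assuming kdump is already configured and its service is running (note that this deliberately crashes the machine):

echo 1 > /proc/sys/kernel/sysrq    # enable all SysRq functions, if not already enabled
echo c > /proc/sysrq-trigger       # force a kernel panic; the crash kernel boots and writes the vmcore
ls /var/crash/                     # after the reboot, look for the dump under the configured target path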
The document discusses analyzing Linux kernel crash dumps. It covers various ways to gather crash data like serial console, netconsole, kmsg dumpers, Kdump, and Pstore. It then discusses analyzing the crashed kernel using tools like ksymoops, crash utility, and examining the backtrace, kernel logs, processes, and file descriptors. The document provides examples of gathering data from Pstore and using commands like bt, log, and ps with the crash utility to extract information from a crash dump.
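For orientation, these are the kinds of crash-utility prompts the document walks through (the exact output depends on the vmcore being analyzed):

crash> bt       # backtrace of the task that panicked
crash> log      # kernel ring buffer captured in the dump
crash> ps       # processes present at the time of the crash
crash> files    # open file descriptors of a task, e.g. "files <PID>"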
The document provides an overview of kernel crash dump analysis including:
- The tools and data required such as the crash utility, kernel symbol files, vmcore files
- How to install and use these components
- Basic crash commands to analyze system, memory, storage, and network subsystems
- How to dynamically load crash extension modules to add custom commands
This document provides an introduction to kdump and kernel crash dump analysis. It discusses kexec, which allows fast rebooting by loading a new kernel from an already running kernel. Kdump uses kexec to boot a capture kernel to analyze the state of a crashed production kernel and capture a vmcore dump file. The document outlines how to configure kdump by reserving memory, setting the dump target, enabling the kdump service, and testing a crash. Kernel crash dumps can be analyzed using the crash utility to help debug issues.
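A hedged sketch of those configuration steps on a RHEL-style system (the reservation size and dump path are only illustrative):

grubby --update-kernel=ALL --args="crashkernel=256M"   # reserve memory for the capture kernel, then reboot
vi /etc/kdump.conf                                      # set the dump target, e.g. "path /var/crash"
systemctl enable --now kdump                            # enable and start the kdump service
kdumpctl status                                         # confirm the capture kernel is loaded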
Video: https://www.facebook.com/atscaleevents/videos/1693888610884236/ . Talk by Brendan Gregg from Facebook's Performance @Scale: "Linux performance analysis has been the domain of ancient tools and metrics, but that's now changing in the Linux 4.x series. A new tracer is available in the mainline kernel, built from dynamic tracing (kprobes, uprobes) and enhanced BPF (Berkeley Packet Filter), aka, eBPF. It allows us to measure latency distributions for file system I/O and run queue latency, print details of storage device I/O and TCP retransmits, investigate blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. This talk will summarize this new technology and some long-standing issues that it can solve, and how we intend to use it at Netflix."
eBPF is an exciting new technology that is poised to transform Linux performance engineering. eBPF enables users to dynamically and programmatically trace any kernel or user space code path, safely and efficiently. However, understanding eBPF is not so simple. The goal of this talk is to give audiences a fundamental understanding of eBPF, how it interconnects with existing Linux tracing technologies, and how it provides a powerful platform to solve any Linux performance problem.
Note: When you view the slide deck via a web browser, the screenshots may be blurred. You can download and view them offline (the screenshots are clear).
Linux has become an integral part of embedded systems. This three-part presentation gives a deeper perspective on Linux from a system programming point of view. Starting with the basics of Linux, it goes on to advanced topics such as thread and IPC programming.
Building Network Functions with eBPF & BCC (Kernel TLV)
eBPF (Extended Berkeley Packet Filter) is an in-kernel virtual machine that allows running user-supplied sandboxed programs inside of the kernel. It is especially well-suited to network programs and it's possible to write programs that filter traffic, classify traffic and perform high-performance custom packet processing.
BCC (BPF Compiler Collection) is a toolkit for creating efficient kernel tracing and manipulation programs. It makes use of eBPF.
BCC provides an end-to-end workflow for developing eBPF programs and supplies Python bindings, making eBPF programs much easier to write.
Together, eBPF and BCC allow you to develop and deploy network functions safely and easily, focusing on your application logic (instead of kernel datapath integration).
In this session, we will introduce eBPF and BCC, explain how to implement a network function using BCC, discuss some real-life use-cases and show a live demonstration of the technology.
About the speaker
Shmulik Ladkani, Chief Technology Officer at Meta Networks,
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Meta Networks where he's been busy architecting secure, multi-tenant, large-scale network infrastructure as a cloud-based service.
Netronome's half-day tutorial on host data plane acceleration at ACM SIGCOMM 2018 introduced attendees to models for host data plane acceleration and provided an in-depth understanding of SmartNIC deployment models at hyperscale cloud vendors and telecom service providers.
Presenter Bios
Jakub Kicinski is a long-term Linux kernel contributor who has been leading the kernel team at Netronome for the last two years. Jakub's major contributions include the creation of the BPF hardware offload mechanisms in the kernel and the bpftool user space utility, as well as work on the Linux kernel side of OVS offload.
David Beckett is a Software Engineer at Netronome with a strong technical background in computer networks, including academic research on DDoS. David has expertise in the areas of Linux architecture and computer programming. David has a Master's degree in Electrical and Electronic Engineering from Queen's University Belfast and continues as a PhD student studying emerging application-layer DDoS threats.
This document discusses how eBPF (extended Berkeley Packet Filter) can be used for kernel tracing. It provides an overview of BPF and eBPF, how eBPF programs are compiled and run in the kernel, the use of BPF maps, and how eBPF enables new possibilities for dynamic kernel instrumentation through techniques like Kprobes and ftrace.
- The document discusses Linux network stack monitoring and configuration. It begins with definitions of key concepts like RSS, RPS, RFS, LRO, GRO, DCA, XDP and BPF.
- It then provides an overview of how the network stack works from the hardware interrupts and driver level up through routing, TCP/IP and to the socket level.
- Monitoring tools like ethtool, ftrace and /proc/interrupts are described for viewing hardware statistics, software stack traces and interrupt information.
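Illustrative invocations of the monitoring tools mentioned above (the interface name is a placeholder):

ethtool -S eth0                                                   # NIC/driver hardware statistics
cat /proc/interrupts                                              # interrupt counts per CPU and IRQ line
echo function_graph > /sys/kernel/debug/tracing/current_tracer    # ftrace: record kernel call graphs
head /sys/kernel/debug/tracing/trace                              # view the collected trace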
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for the Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that use BPF for performance analysis have been written in the past 12 months. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
The Linux Block Layer - Built for Fast Storage (Kernel TLV)
The arrival of flash storage introduced a radical change in the performance profiles of direct attached devices. At the time, it was obvious that the Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPS with extremely low latency.
In this talk we revisit the changes made to the Linux block layer over the last decade or so that made it what it is today: a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
YOW2018 Cloud Performance Root Cause Analysis at Netflix (Brendan Gregg)
Keynote by Brendan Gregg for YOW! 2018. Video: https://www.youtube.com/watch?v=03EC8uA30Pw . Description: "At Netflix, improving the performance of our cloud means happier customers and lower costs, and involves root cause analysis of applications, runtimes, operating systems, and hypervisors, in an environment of 150k cloud instances that undergo numerous production changes each week. Apart from the developers who regularly optimize their own code, we also have a dedicated performance team to help with any issue across the cloud, and to build tooling to aid in this analysis. In this session we will summarize the Netflix environment, procedures, and tools we use and build to do root cause analysis on cloud performance issues. The analysis performed may be cloud-wide, using self-service GUIs such as our open source Atlas tool, or focused on individual instances, and use our open source Vector tool, flame graphs, Java debuggers, and tooling that uses Linux perf, ftrace, and bcc/eBPF. You can use these open source tools in the same way to find performance wins in your own environment."
The Linux kernel is undergoing one of the most fundamental architecture evolutions in its history and is becoming more like a microkernel, potentially the biggest change ever to happen to it. Why is the Linux kernel evolving in this direction? This talk covers how companies like Facebook and Google use BPF to patch 0-day exploits, how BPF will change the way features are added to the kernel forever, and how BPF is introducing a new type of application deployment method for the Linux kernel.
Kernel Recipes 2015 - Kernel dump analysis (Anne Nicolas)
Kernel dump analysis
Cloud this, cloud that... it's making everything easier, especially for web hosted services. But what about the servers that are not supposed to crash? For applications built on the assumption that the OS won't fault or go down, what can you write in your post-mortem once the server has frozen and been restarted? How do you track down the bug that led to the service unavailability?
In this talk, we'll see how to set up kdump and how to panic a server to generate a coredump. Once you have the vmcore file, we'll see how to track down the issue with the "crash" tool to find why your OS went down. Last but not least: with "crash" you can also modify your live kernel, the same way you would with gdb.
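For reference, that workflow usually boils down to commands along these lines (package names and paths vary by distribution and are only indicative):

yum install kexec-tools crash kernel-debuginfo        # kdump tooling, the crash utility and matching debug symbols
crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux /var/crash/<dump-dir>/vmcore   # post-mortem analysis
crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux                                # live-kernel session, gdb-style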
Adrien Mahieux – System administrator obsessed with performance and uptime, tracking down microseconds from hardware to software since 2011. The application must be seen as a whole to provide the requested service efficiently. This includes searching for bottlenecks and tradeoffs, design issues or hardware optimizations.
Accelerating Envoy and Istio with Cilium and the Linux Kernel (Thomas Graf)
The document discusses how Cilium can accelerate Envoy and Istio by using eBPF/XDP to provide transparent acceleration of network traffic between Kubernetes pods and sidecars without any changes required to applications or Envoy. Cilium also provides features like service mesh datapath, network security policies, load balancing, and visibility/tracing capabilities. BPF/XDP in Cilium allows for transparent TCP/IP acceleration during the data phase of communications between pods and sidecars.
The Linux Kernel Scheduler (For Beginners) - SFO17-421 (Linaro)
Session ID: SFO17-421
Session Name: The Linux Kernel Scheduler (For Beginners) - SFO17-421
Speaker: Viresh Kumar
Track: Power Management
★ Session Summary ★
This talk will take you through the internals of the Linux Kernel scheduler.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/sfo17/sfo17-421/
Presentation:
Video: https://www.youtube.com/watch?v=q283Wm__QQ0
---------------------------------------------------
★ Event Details ★
Linaro Connect San Francisco 2017 (SFO17)
25-29 September 2017
Hyatt Regency San Francisco Airport
---------------------------------------------------
Keyword:
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961
Linux Kernel Booting Process (1) - For NLKB (shimosawa)
Describes the bootstrapping part in Linux and some related technologies.
This is part one of the slides; the succeeding slides will contain the errata for this one.
Process Address Space: The way to create virtual address (page table) of user... (Adrian Huang)
Process Address Space: how the virtual address space (page tables) of a userspace application is created.
Note: When you view the slide deck via a web browser, the screenshots may be blurred. You can download and view them offline (the screenshots are clear).
Launch the First Process in Linux System (Jian-Hong Pan)
The session: https://coscup.org/2022/en/session/AGCMDJ
After the Linux kernel boots, it tries to launch the first process, "init", in user space. Then, the system begins the featured journey of the Linux distribution.
This sharing takes Busybox as the example and shows how the Linux kernel finds the "init" that points to Busybox, what Busybox then does, and how it gets the console, trying to make it work like a simple Linux system.
Before the Linux kernel launches the "init" process, the corresponding file system and storage drivers/modules must be loaded so that "init" can be found. Besides, to mount the root file system correctly, the kernel boot command line must include the root device and file system format parameters.
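For illustration, a kernel command line carrying those parameters might look like the following (device names and paths are placeholders):

root=/dev/mmcblk0p2 rootfstype=ext4 rootwait rw init=/sbin/init console=ttyS0,115200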
On the other hand, the Busybox that "init" points to is a lightweight program with rich functionality, just like a Swiss Army knife, so it is usually used in simple environments such as embedded Linux systems.
This sharing will have a demo on a virtual machine first, then on the Raspberry Pi.
Drafts:
* https://hackmd.io/@starnight/Busbox_as_the_init
* https://hackmd.io/@starnight/Build_Alpines_Root_Filesystem_Bootstrap
Relate idea: https://hackmd.io/@starnight/Systems_init_and_Containers_COMMAND_Dockerfiles_CMD
This document provides an introduction to eBPF and XDP. It discusses the history of BPF and how it evolved into eBPF. Key aspects of eBPF covered include the instruction set, JIT compilation, verifier, helper functions, and maps. XDP is introduced as a way to program the data plane using eBPF programs attached early in the receive path. Example use cases and performance benchmarks for XDP are also mentioned.
This talk discusses Linux profiling using perf_events (also called "perf") based on Netflix's use of it. It covers how to use perf to get CPU profiling working and overcome common issues. The speaker will give a tour of perf_events features and show how Netflix uses it to analyze performance across their massive Amazon EC2 Linux cloud. They rely on tools like perf for customer satisfaction, cost optimization, and developing open source tools like NetflixOSS. Key aspects covered include why profiling is needed, a crash course on perf, CPU profiling workflows, and common "gotchas" to address like missing stacks, symbols, or profiling certain languages and events.
The document discusses challenges with processor benchmarking and provides recommendations. It summarizes a case study where a popular CPU benchmark claimed a new processor was 2.6x faster than Intel, but detailed analysis found the benchmark was testing division speed, which accounted for only 0.1% of cycles on Netflix servers. The document advocates for low-level, active benchmarking and profiling over statistical analysis. It also provides a checklist for evaluating benchmarks and cautions that increased processor complexity and cloud environments make accurate benchmarking more difficult.
qemu + gdb + sample_code: Run sample code in QEMU OS and observe Linux Kernel... (Adrian Huang)
This document describes setting up a QEMU virtual machine with Ubuntu 20.04.1 to debug Linux kernel code using gdb. It has a 2-socket CPU configuration with 16GB of memory and disabled KASAN and ASLR. The QEMU VM can be used to run sample code and observe Linux kernel behavior under gdb, such as setting conditional breakpoints to analyze page fault behavior for mmap addresses by referencing a gdb debugging text file.
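A minimal sketch of such a setup, assuming a kernel built with debug info (image paths, the breakpoint and the port are illustrative):

qemu-system-x86_64 -kernel bzImage -initrd initrd.img -m 1G -s -S -append "console=ttyS0 nokaslr" -nographic
# in another terminal:
gdb vmlinux
(gdb) target remote :1234      # attach to the QEMU gdb stub (-s listens on tcp::1234, -S halts the guest at startup)
(gdb) break do_page_fault      # e.g. a breakpoint (possibly conditional) to study page-fault behavior for mmap addresses
(gdb) continue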
This document discusses Linux kernel crash capture and analysis. It begins with an overview of what constitutes a kernel crash and reasons crashes may occur, both from hardware and software issues. It then covers using kdump to capture virtual memory cores (vmcores) when a crash happens, and configuring kdump for optimal core collection. Finally, it discusses analyzing vmcores after collection using the crash utility, including commands to inspect system information, backtraces, logs, and more.
Debugging linux kernel tools and techniques (Satpal Parmar)
This document discusses tools and techniques for debugging the Linux kernel, including debuggers like gdb, built-in debugging facilities, system logs, and crash dump analysis tools like LKCD. It outlines common issues like kernel crashes and hangs, and provides an example of analyzing an "oops" crash dump to identify the failing line of code through tools like ksymoops. It also covers generating a full system memory dump using LKCD for thorough crash investigation.
The document discusses techniques for debugging issues in the Linux kernel. It begins by explaining the differences between debugging in user space versus kernel space. Kernel problems are then categorized as kernel panics, which halt the system, or kernel oops, which are recoverable errors. The rest of the document demonstrates debugging outputs for a kernel panic and oops, including register dumps and call traces, and discusses common causes of kernel faults.
This document outlines the Linux I/O stack as of kernel version 3.3. It shows the path that I/O requests take from applications through the various layers including direct I/O, the page cache, block I/O layer, I/O scheduler, storage devices, filesystems, and network filesystems. Optional components are shown that can be stacked on top of the basic I/O stack like LVM, device mapper targets, multipath, and network transports.
The document describes the ARIES recovery algorithm, which involves three passes - analysis, redo, and undo.
The analysis pass scans the log from the most recent checkpoint and determines which transactions need to be undone, which pages were dirty at the time of crash, and the redo start point. The redo pass repeats history by redoing log records from the redo start point. The undo pass rolls back incomplete transactions by scanning the log backwards and undoing actions of uncommitted transactions.
7 ways to crash Postgres
1. Do not apply updates and remain on outdated versions of PostgreSQL.
2. Run out of disk space by allowing the database to grow without monitoring disk usage. This can result in errors and panics.
3. Delete important database files and directories which causes the database to fail to start.
4. Set memory settings too high and overload the system memory, triggering out of memory kills of the PostgreSQL process.
5. Use faulty hardware without monitoring for failures which can lead to corrupted blocks and index errors.
6. Allow too many open connections without connection pooling which can prevent new connections.
7. Accumulate zombie locks by not closing transactions, slowing down
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS (Stephan Cadene)
This thesis evaluates how the energy efficiency of the ARMv7 architecture based processors Cortex-A9 MPCore and Cortex-A8 compares with that of Intel Xeon processors in applications such as a SIP Proxy and a web server. The focus is on comparing the energy efficiency between the two architectures rather than just the performance. As the processors used in servers today have more computational power than the Cortex-A9 MPCore, several of these slower but more energy efficient processors are needed. Depending on the application, benchmarks indicate an energy efficiency 3-11 times greater for the ARM Cortex-A9 in comparison to the Intel Xeon. The topics of interconnects between processors and the overhead caused by using an increasing number of processors are left for later research.
Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs (Jiannan Ouyang, PhD)
These slides were presented at the 12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE '16).
Virtual Machine based approaches to workload consolidation, as seen in IaaS cloud as well as datacenter platforms, have long had to contend with performance degradation caused by synchronization primitives inside the guest environments. These primitives can be affected by virtual CPU preemptions by the host scheduler that can introduce delays that are orders of magnitude longer than those primitives were designed for. While a significant amount of work has focused on the behavior of spinlock primitives as a source of these performance issues, spinlocks do not represent the entirety of synchronization mechanisms that are susceptible to scheduling issues when running in a virtualized environment. In this paper we address the virtualized performance issues introduced by TLB shootdown operations. Our profiling study, based on the PARSEC benchmark suite, has shown that up to 64% of a VM's CPU time can be spent on TLB shootdown operations under certain workloads. In order to address this problem, we present a paravirtual TLB shootdown scheme named Shoot4U. Shoot4U completely eliminates TLB shootdown preemptions by invalidating guest TLB entries from the VMM and allowing guest TLB shootdown operations to complete without waiting for remote virtual CPUs to be scheduled. Our performance evaluation using the PARSEC benchmark suite demonstrates that Shoot4U can reduce benchmark runtime by up to 85% compared to an unmodified Linux kernel, and by up to 44% over a state-of-the-art paravirtual TLB shootdown scheme.
Achieving Performance Isolation with Lightweight Co-Kernels (Jiannan Ouyang, PhD)
These slides were presented at the 24th International Symposium on High-Performance Parallel and Distributed Computing (HPDC '15).
Performance isolation is emerging as a requirement for High Performance Computing (HPC) applications, particularly as HPC architectures turn to in situ data processing and application composition techniques to increase system throughput. These approaches require the co-location of disparate workloads on the same compute node, each with different resource and runtime requirements. In this paper we claim that these workloads cannot be effectively managed by a single Operating System/Runtime (OS/R). Therefore, we present Pisces, a system software architecture that enables the co-existence of multiple independent and fully isolated OS/Rs, or enclaves, that can be customized to address the disparate requirements of next generation HPC workloads. Each enclave consists of a specialized lightweight OS co-kernel and runtime, which is capable of independently managing partitions of dynamically assigned hardware resources. Contrary to other co-kernel approaches, in this work we consider performance isolation to be a primary requirement and present a novel co-kernel architecture to achieve this goal. We further present a set of design requirements necessary to ensure performance isolation, including: (1) elimination of cross OS dependencies, (2) internalized management of I/O, (3) limiting cross enclave communication to explicit shared memory channels, and (4) using virtualization techniques to provide missing OS features. The implementation of the Pisces co-kernel architecture is based on the Kitten Lightweight Kernel and Palacios Virtual Machine Monitor, two system software architectures designed specifically for HPC systems. Finally we will show that lightweight isolated co-kernels can provide better performance for HPC applications, and that isolated virtual machines are even capable of outperforming native environments in the presence of competing workloads.
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux (brouer)
This talk presents my 2009 updates on the progress of doing 10Gbit/s routing on standard hardware running Linux. The results are good, BUT to achieve them, a lot of tuning is required of hardware queues, MSI interrupts and SMP affinity, together with some (now) submitted patches. I'll explain the concept of network hardware queues and why interrupt and SMP tuning is essential. I'll present results from different hardware, both 10GbE netcards and CPUs (the current CPUs under test are AMD Phenom and Core i7). Many future challenges still exist, especially in the area of easier tuning. A high level of knowledge about the Linux kernel is required to follow all the details.
Denser, cooler, faster, stronger: PHP on ARM microservers (Jez Halford)
This is the story of how I helped make arm.com run on a small collection of state-of-the-art ARM-based microservers, and the lessons I learned along the way, from how not to crash your entire multi-node MySQL database several times a day, to how not to have nginx consume all your disk space within a few minutes, all the way through to how to build a rock solid PHP application that is *truly* scalable to almost any size.
Docker by Demo presented at Hyderabad Scalability Meetup - http://www.meetup.com/hyderabad-scalability/events/218796914/ by Srikanth
Peemuperf is a Linux kernel module and userspace tool that uses the Performance Monitoring Unit (PMU) on ARM processors to monitor performance metrics like CPU cycles, cache misses, and stalls. It can profile the ARM Cortex A8 and A9 by dynamically configuring the number and types of performance counters. The tool outputs profiling data to the Linux proc filesystem for inspection in userspace. Peemuperf aims to provide cache monitoring capabilities for ARM devices where the oprofile tool is currently limited.
Preemptable ticket spinlocks: improving consolidated performance in the cloud (Jiannan Ouyang, PhD)
These slides were presented at the 9th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE '13).
When executing inside a virtual machine environment, OS level synchronization primitives are faced with significant challenges due to the scheduling behavior of the underlying virtual machine monitor. Operations that are ensured to last only a short amount of time on real hardware, are capable of taking considerably longer when running virtualized. This change in assumptions has significant impact when an OS is executing inside a critical region that is protected by a spinlock. The interaction between OS level spinlocks and VMM scheduling is known as the Lock Holder Preemption problem and has a significant impact on overall VM performance. However, with the use of ticket locks instead of generic spinlocks, virtual environments must also contend with waiters being preempted before they are able to acquire the lock. This has the effect of blocking access to a lock, even if the lock itself is available. We identify this scenario as the Lock Waiter Preemption problem. In order to solve both problems we introduce Preemptable Ticket spinlocks, a new locking primitive that is designed to enable a VM to always make forward progress by relaxing the ordering guarantees offered by ticket locks. We show that the use of Preemptable Ticket spinlocks improves VM performance by 5.32X on average, when running on a non paravirtual VMM, and by 7.91X when running on a VMM that supports a paravirtual locking interface, when executing a set of microbenchmarks as well as a realistic e-commerce benchmark.
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc... (Eric Van Hensbergen)
The document discusses ARM's approach to future high performance computing (HPC) node architectures. It advocates for balance, flexibility, and partnership in developing exascale systems. ARM believes that a heterogeneous approach using different core types optimized for various workloads, flexible memory technologies, and partnerships to develop the software ecosystem will help overcome challenges in achieving exascale performance while improving power efficiency.
If you are a cloud computing provider, soon you might start facing problems with the network side of things. Conventional network solutions don't apply very well to cloud environments. SDN gives us a new way of thinking about the network, embracing innovation. In this session, you will see how Locaweb implemented SDN to solve their network problems after 3 years of providing cloud solutions in Brazil. A new era for networking is on the way...
The PROSE approach allows running applications in stand-alone partitions with an easy-to-use execution environment. It enables the creation of specialized kernels as easily as developing an application library. Resource sharing between library-OS partitions and traditional partitions keeps library-OS kernels simple and reliable. Extensions allow bridging resource sharing and management across an entire cluster with a unified communication protocol.
[Question Paper] Linux Administration (75:25 Pattern) [April / 2015] (Mumbai B.Sc.IT Study)
This is a question paper of Mumbai University for B.Sc.IT students of Semester V [Linux Administration] (75:25 Pattern). [Year: April 2015] The solution set of this paper is coming soon.
This document provides an introduction and overview of the Microsoft SQL Server Black Book. It discusses:
1) The purpose of the book is to provide step-by-step guidance on installing, configuring, and troubleshooting Microsoft SQL Server to create a solid production database server.
2) Each chapter contains explanatory material and a practical guide section with hands-on exercises.
3) The book assumes basic Windows NT knowledge and is intended for readers with a range of SQL Server experience, from beginners to experienced professionals.
Westlake Olive Configuration Scenario - Michael Boddie (philipnelson29183)
Westlake Olive: Configuration Scenario
Michael Boddie
Advanced Windows Services
NTC/328
April 2, 2018
-Welcome-
Introduction
In this presentation I will play the role of an IT manager for Westlake Olive. As IT manager, I will be responsible for the administration and configuration of all of the organization's servers.
Microsoft has made large improvements to clustering in Server 2012, and more new features are available in Windows Server 2012 R2.
Configuring Clustering services in windows server 2012 R2
The reason Westlake Olive needs to implement cluster services is to increase availability and scalability.
Overview of Westlake Olive Company
Westlake is a large producer of olive oil which produces only one product. The company has dominated the industry for the last five years.
The company has two facilities which are Sacramento and California. The production facility is in Colorado and Denver
Previously my team configured network services between the two locations, using an MPLS network and the already established westlakeolives.local domain.
My team will consist of a system administrator and a network administrator.
Overview of Failover clustering
Failover clustering is a group of independent computers that work together to increase the scalability and availability of cluster roles.
The clustered servers, called nodes, are connected by physical cables and by software.
One of the reasons we are implementing this service in this organization is that if one of the cluster nodes fails, the other nodes are able to provide the service.
Clustering begins with definitions of terms:
Nodes are individual servers. A node can be an active or inactive member of a cluster, depending on whether it is currently online and in communication with the other cluster nodes.
A cluster service is the collection of clustering software that manages all cluster-specific activity.
A resource dependency is a reliance between two resources that makes it necessary for both resources to run on the same node.
Failover is a widely used term for the process of moving a group of resources from one node to another in case of failure.
Requirements of Failover Clustering
Failover clustering is available in both the Datacenter and Standard editions of Server 2012 R2.
For failover clustering to be implemented successfully, each of the organization's servers must be equipped with four NICs.
All the hardware components must carry the Microsoft logo for the Windows Server 2012 designation.
The role of the four NICs
First NIC: For external connections
Second NIC: Dedicated to iSCSI storage connectivity
Third NIC: To handle the VMs' network traffic
Fourth NIC: To handle cluster node communication
Autonomous Transaction Processing (ATP): In Heavy Traffic, Why Drive Stick? (Jim Czuprynski)
Autonomous Transaction Processing (ATP) - the second in the family of Oracle’s Autonomous Databases – offers Oracle DBAs the ability to apply a force multiplier for their OLTP database application workloads. However, it’s important to understand both the benefits and limitations of ATP before migrating any workloads to that environment. I'll offer a quick but deep dive into how best to take advantage of ATP - including how to load data quickly into the underlying database – and some ideas on how ATP will impact the role of Oracle DBA in the immediate future. (Hint: Think automatic transmission instead of stick-shift.)
[Question Paper] Linux Administration (75:25 Pattern) [November / 2014] (Mumbai B.Sc.IT Study)
This is a question paper of Mumbai University for B.Sc.IT students of Semester V [Linux Administration] (75:25 Pattern). [Year: November 2014] The solution set of this paper is coming soon.
[Question Paper] Embedded System (Revised Course) [June / 2016] (Mumbai B.Sc.IT Study)
This is a question paper of Mumbai University for B.Sc.IT students of Semester IV [Embedded System] (Revised Course). [Year: June 2016] The solution set of this paper is coming soon.
The document contains a list of technical interview questions related to networking, Active Directory, and Exchange Server. Some of the networking questions include what an IP address and subnet mask are, what ARP and DHCP are used for, and tools used to monitor network traffic. The Active Directory questions cover topics like the AD database, FSMO roles, GPOs and how they are applied. The Exchange questions focus on the different Exchange versions, installation requirements, management tools, and permissions.
The document discusses NetWare, a network operating system developed by Novell. It provides an overview of NetWare's history and versions. The key advantages of NetWare include centralized management, support for multiple protocols, and integration with other network operating systems. The document also describes planning and installing a NetWare server, including hardware requirements, the installation process, establishing user accounts and groups, and providing client access and interoperability with other operating systems.
The document discusses various technical requirements to gather during the requirement gathering phase for setting up new or growing customers in a datacenter. It covers questions around network requirements like load balancing, firewalls, subnets; hardware requirements like servers, storage, backups; operating system requirements; and database requirements. The key things to inquire about include load balancing configuration, firewall rules, hardware specifications, operating systems, software dependencies, database versions and clustering.
This document outlines the steps for building a SQL Server cluster for high availability, including planning considerations, required hardware, installing Windows clustering features, configuring storage, installing and configuring SQL Server across nodes, and testing the cluster configuration. Key aspects that are discussed include defining recovery time and point objectives, installing SQL Server using the "Create New Failover Cluster" option, installing SQL on each node to enable failover, and performing backups and restores from cluster-owned drives. Testing the applications on the clustered environment is also emphasized.
Fine-grained fault tolerance using device checkpoints (asimkadav)
The document discusses an approach called Fine-Grained Fault Tolerance (FGFT) that aims to provide fault isolation and recovery for device drivers. FGFT allows select driver entry points to run as transactions and uses checkpoint-based recovery to quickly and correctly restore driver and device state after failures. This removes the need for slow device reinitialization during recovery. The approach requires only incremental changes to device drivers and has low overhead.
Operating system assignment: what is an interrupt (TayyabKhan61)
What is the purpose of interrupts? How does an interrupt differ from a trap? Can traps be generated intentionally by a user program? If so, for what purpose?
Application Performance Troubleshooting 1x1 - Part 2 - Noch mehr Schweine und... (rschuppe)
Application Performance doesn't come easy. How to find the root cause of performance issues in modern and complex applications? All you have is a complaining user to start with?
In this presentation (mainly in German, but understandable for English speakers) I reprise the fundamentals of troubleshooting and give some new examples of how to tackle issues.
Follow up presentation to "Performance Trouble Shooting 101 - Schweine, Schlangen und Papierschnitte"
Active Directory is a directory service that stores information about objects on a network and makes this information available to users and administrators. It provides single sign-on access to resources and a centralized point of administration. The Global Catalog is a partial replica of every object in the Active Directory forest that allows for faster searches across domains. Various support tools like LDP, REPLMON, ADSIEDIT, NETDOM and REPADMIN can be used to manage Active Directory, diagnose replication issues, and access directory objects.
This document contains a list of technical interview questions related to networking, Active Directory, and Exchange 2003. For networking, it asks about IP addresses, subnet masks, ARP, default gateways, subnets, DHCP, and network monitoring tools. For Active Directory, questions cover domains, sites, replication, Global Catalog, FSMO roles, backups and restores, and Group Policy. For Exchange 2003, it asks about installation requirements, management tools, recipient types, mailboxes, permissions, and distribution groups.
DockerCon Europe 2018 Monitoring & Logging Workshop (Brian Christner)
This is the Docker Logging & Monitoring workshop delivered during DockerCon 2018 Europe. We cover Docker's native tools, how to deploy an ELK stack, and how to deploy a Prometheus stack with cAdvisor, node-exporter, and Grafana.
There are many tasks that need to be done. But why get bored by doing the exact same series of actions every time? Especially when a machine can do it faster and in a more reliable way! That's what incited us to automate the way we build, test, and deploy apps so we can focus on more creative tasks.
1.
Kernel Crash Dump Analysis
PREPARED BY: Buland Singh, Red Hat
REVIEWED BY: Jijesh Kalliyat, Red Hat
Version 1.0
Jan 2016
Send feedback to [email protected]
12.
sshkey <path>
o Used to specify the path of the SSH key to use when doing an SSH dump; the default value is /root/.ssh/kdump_id_rsa.
o kdump will use this key to authenticate when dumping over SSH.
Eg:
sshkey /root/.ssh/kdump_id_rsa
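For context, a dump over SSH typically pairs sshkey with an ssh target directive; a hedged sketch (the hostname and user are placeholders):

# in /etc/kdump.conf:
ssh kdump@dumpserver.example.com
sshkey /root/.ssh/kdump_id_rsa
# then, from a shell, push the key to the target and restart the service:
kdumpctl propagate
systemctl restart kdump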
path <path>
o "path" represents the filesystem path in which vmcore will be saved.
o If a dump target is specified in kdump.conf, then "path" is relative to the specified dump target.
o Interpretation of "path" changes a bit if the user has not specified a dump target explicitly in kdump.conf. In this case, "path" represents the absolute path from root, and the dump target and adjusted path are arrived at automatically depending on what is mounted in the current system.
o Ignored for raw device dumps.
o If unset, will default to /var/crash.
Eg:
path /var/crash
core_collector <command> <options>
o This allows you to specify the command used to copy the vmcore.
o The default core_collector for other targets is: "makedumpfile -c --message-level 1 -d 31"
Eg:
core_collector makedumpfile -c --message-level 1 -d 31
o The -c option tells the makedumpfile command to compress the vmcore file.
o The "-d" option is used to set the dump_level.
o "dump_level" is used to decide which pages are removed from the resultant vmcore file.
o The option is a bit mask, with each page type specified as follows (a worked example follows the list):
zero pages = 1
cache pages = 2
cache private = 4
user pages = 8
free pages = 16
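A small worked example of how these bits combine (the alternative value is only an illustration):

# -d 31 = 1+2+4+8+16  -> filter out zero, cache, cache-private, user and free pages
# -d 23 = 1+2+4+16    -> the same, but keep user pages in the vmcore
core_collector makedumpfile -c --message-level 1 -d 23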
47.
Eg:
crash> extend swap_usage.so
./swap_usage.so: shared object loaded
o Use "pswap" command to check swap usage per task.
Eg:
crash> pswap -G | sort -n -k 2 | tail
6946 3099 opcmona
6964 3334 agtrep
30548 9362 R
30575 9377 R
30566 9378 R
30557 9427 R
6123 14284 splunkd
2098 18739 python
19406 164229 rsession
28882 8107112 rsession
[Q:5] How to determine memory usage in kernel space?
crash> kmem -i
PAGES TOTAL PERCENTAGE
TOTAL MEM 4051286 15.5 GB
FREE 33838 132.2 MB 0% of TOTAL MEM
USED 4017448 15.3 GB 99% of TOTAL MEM
SHARED 957 3.7 MB 0% of TOTAL MEM
BUFFERS 58 232 KB 0% of TOTAL MEM
CACHED 900 3.5 MB 0% of TOTAL MEM
SLAB 3954789 15.1 GB 97% of TOTAL MEM
TOTAL SWAP 4194303 16 GB
SWAP USED 249162 973.3 MB 5% of TOTAL SWAP
SWAP FREE 3945141 15 GB 94% of TOTAL SWAP
COMMIT LIMIT 6219946 23.7 GB
COMMITTED 4054963 15.5 GB 65% of TOTAL LIMIT
"kmem s" command displays basic kmalloc() slab data.