Linux Huge Pages
Why? How? When?
Agenda
• What are you talking about?
• Linux kernel map
• Memory Allocation
• Paging Model
• Page Fault
• Swapping
• Why Huge Pages
• How to configure
• When to configure
• Summary
Premises
• This is mainly about x86-64
(Intel and AMD CPUs produced after 2004)
• Huge pages differ between hardware architectures;
those differences are out of our scope
• We will not explore the MMU, the TLB or the other internals of virtual memory
management
• Some images are outdated
(e.g. Linux kernel 2.6, while the current version is 5.5),
but they still illustrate very well the aspects discussed in this presentation
What are you talking about?
[Figure: Linux kernel map, version 2.6.36]
Although the map is about 10 years old, it still gives us the big picture
[Figure: Memory Allocation]
[Figure: Paging Model]
[Figure: Page Fault]
[Figure: Swapping]
Why Huge Pages
• As we can see, memory management is a complicated process
involving many ‘round-trips’
• Huge pages allocate memory in larger blocks,
cutting the ‘round-trips’ associated with small pages
• Huge pages from the HugeTLB pool cannot be swapped out
• A set of 4 KB pages can turn into a single 2 MB page (with PAE or in 64-bit mode),
a 4 MB page (32-bit without PAE) or even a 1 GB page

Number of Pages (4 KB)   Number of Huge Pages   Huge Page Equivalence
512                      1                      2 MB (2048 KB)
1024                     1                      4 MB (4096 KB)
262,144                  1                      1 GB (1024 MB or 1,048,576 KB)
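
The equivalences in the table come from dividing the huge page size by the 4 KB base page size. A minimal Python sketch of that arithmetic follows; the function name pages_per_huge_page is only illustrative and is not part of any tool mentioned in this deck.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

BASE_PAGE = 4 * KIB  # standard x86-64 page size


def pages_per_huge_page(huge_page_size, base_page_size=BASE_PAGE):
    """Return how many base pages fit in one huge page."""
    if huge_page_size % base_page_size:
        raise ValueError("huge page size must be a multiple of the base page size")
    return huge_page_size // base_page_size


if __name__ == "__main__":
    for size in (2 * MIB, 4 * MIB, 1 * GIB):
        print(size // KIB, "KB huge page =", pages_per_huge_page(size), "pages of 4 KB")
```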
Why Huge Pages
• There are 2 huge page variants
  • HugeTLB File System (hugetlbfs)
    • Works as a pseudo filesystem where you define the allocation manually
    • We will use this approach
  • Transparent Huge Pages (THP)
    • Works transparently: the Linux kernel decides on its own whether the application gets
      huge pages, but it is not recommended for latency-sensitive applications
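
To see which Transparent Huge Pages policy a machine is using, the kernel exposes /sys/kernel/mm/transparent_hugepage/enabled, where the active option is shown in square brackets. The Python sketch below parses that file; the helper names are illustrative, and it assumes a Linux kernel built with THP support (otherwise the file is missing and None is returned).

```python
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed option, e.g. 'madvise' from 'always [madvise] never'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the file cannot be read."""
    try:
        return active_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode:", read_thp_mode())
```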
How to Configure
• Checking whether huge pages can be enabled
netto@bella:~$ getconf PAGESIZE
4096
netto@bella:~$ cat /proc/cpuinfo | grep -E 'pse|pdpe' | tail -1
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss
ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault
epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2
smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln
pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
getconf PAGESIZE returns the standard page size
for the CPU architecture, in bytes
/proc/cpuinfo lists the CPU details,
including its feature flags
pse => supports 2 MB huge pages (4 MB on 32-bit without PAE)
pdpe1gb => supports 1 GB huge pages
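
The same check can be scripted. The Python sketch below looks for the pse and pdpe1gb tokens in the flags line of /proc/cpuinfo; the function name huge_page_support is illustrative. It compares whole tokens, so pse36 alone does not count as pse.

```python
def huge_page_support(cpuinfo_text):
    """Report 2 MB / 1 GB huge page support based on the CPU flags line."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first ':' is a space-separated flag list.
            flags = set(line.partition(":")[2].split())
            break
    return {"2M": "pse" in flags, "1G": "pdpe1gb" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(huge_page_support(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```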
• Installing the packages required to configure huge pages (as root)
• WARNING: your distribution might require a slightly different setup
(e.g. a different package manager, different package names, fewer steps)

Red Hat / CentOS:
root@bella:~$ yum -y install libhugetlbfs libhugetlbfs-utils

Debian / Ubuntu:
root@bella:~$ apt-get -y install hugepages
• Next, we select whichever huge page size is more
convenient for the application
# this is the pseudo directory where huge pages will be mapped; it must be an existing directory
# Red Hat configuration differs a little
root@bella:~$ mkdir -p /dev/hugepages
# this can be converted to an /etc/fstab entry
root@bella:~$ mount -t hugetlbfs -o gid=<group id>,pagesize=<2M or 1G>,... none /dev/hugepages
• Add to sysctl.conf
# formula: number of pages = memory required for your scenario / huge page size (2 MB or 1 GB)
# in some cases, such as Oracle DB, it is recommended to allocate huge pages only for the SGA
vm.nr_hugepages = <number of pages>
# must be the same gid used on mount, i.e. the group your application runs as
vm.hugetlb_shm_group = <group id>
• Reboot
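
The number of huge pages to reserve is the memory the application needs divided by the huge page size, rounded up. A small Python sketch of that calculation is below; the function name and the 8 GB workload are hypothetical values chosen only for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def nr_hugepages(required_bytes, huge_page_bytes=2 * MIB):
    """Return the number of huge pages needed to cover required_bytes (rounded up)."""
    if required_bytes < 0 or huge_page_bytes <= 0:
        raise ValueError("sizes must be positive")
    return -(-required_bytes // huge_page_bytes)  # integer ceiling division


if __name__ == "__main__":
    needed = 8 * GIB  # hypothetical workload, e.g. a database shared memory area
    print("vm.nr_hugepages =", nr_hugepages(needed), "(2 MB pages)")
    print("vm.nr_hugepages =", nr_hugepages(needed, 1 * GIB), "(1 GB pages)")
```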
# if huge pages are set up correctly, at least one pool will be displayed
netto@bella:~$ hugeadm --pool-list
      Size  Minimum  Current  Maximum  Default
   2097152        0        0        0        *
1073741824        1        1        1
# huge pages are enabled if HugePages_Total is > 0
netto@bella:~$ cat /proc/meminfo
...
HugePages_Total: <huge pages pool size>
HugePages_Free: <number of huge pages that are not allocated>
HugePages_Rsvd: <number of huge pages that are reserved but not yet allocated>
HugePages_Surp: <number of surplus huge pages allocated above vm.nr_hugepages>
Hugepagesize: 2048 kB
Hugetlb: 1048576 kB
DirectMap4k: 572400 kB
DirectMap2M: 12943360 kB
DirectMap1G: 19922944 kB
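
The /proc/meminfo check can also be automated. The Python sketch below extracts the huge page fields from the file's text and applies the "HugePages_Total > 0" rule; the function names are illustrative. The HugePages_* values are page counts, while Hugepagesize and Hugetlb are reported in kB.

```python
def parse_hugepage_info(meminfo_text):
    """Return the huge page fields of /proc/meminfo as a dict of integers."""
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        wanted = key.startswith("HugePages_") or key in ("Hugepagesize", "Hugetlb")
        if sep and wanted and rest.split():
            info[key] = int(rest.split()[0])  # drop the optional 'kB' unit
    return info


def huge_pages_enabled(info):
    """Huge pages are configured when the pool holds at least one page."""
    return info.get("HugePages_Total", 0) > 0


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            fields = parse_hugepage_info(f.read())
        print(fields, "enabled:", huge_pages_enabled(fields))
    except OSError:
        print("/proc/meminfo is not available on this system")
```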
How to Configure
Application          Where                               Syntax
Oracle JDK/OpenJDK   Command line argument               -XX:+UseLargePages
MySQL                my.cnf, inside the [mysqld] block   large_pages=ON
PHP                  php.ini, opcache block              opcache.huge_code_pages=1
Python               Using the mmap module               MADV_HUGEPAGE
PostgreSQL           postgresql.conf                     huge_pages = on
Docker               Command line argument               --device=/dev/hugepages:/dev/hugepages
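
For the Python row, a minimal sketch of what using MADV_HUGEPAGE through the mmap module might look like is shown below. It assumes Linux and Python 3.8 or newer (where mmap objects have madvise()); the function name is illustrative. Note that MADV_HUGEPAGE is a hint for Transparent Huge Pages rather than an allocation from the hugetlbfs pool, and the kernel may ignore or reject it, so the sketch reports whether the hint was accepted instead of assuming it.

```python
import mmap

LENGTH = 4 * 1024 * 1024  # 4 MiB, a multiple of the 2 MB huge page size


def map_with_huge_page_hint(length=LENGTH):
    """Create an anonymous private mapping and ask the kernel to back it with huge pages.

    Returns (mapping, hinted); hinted is False when the hint is unavailable or rejected.
    """
    buf = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    hint = getattr(mmap, "MADV_HUGEPAGE", None)  # only defined where the OS provides it
    hinted = False
    if hint is not None and hasattr(buf, "madvise"):
        try:
            buf.madvise(hint)
            hinted = True
        except OSError:
            hinted = False  # e.g. kernel built without THP support
    return buf, hinted


if __name__ == "__main__":
    region, accepted = map_with_huge_page_hint()
    region[:5] = b"hello"  # touching the memory triggers the actual allocation
    print("hint accepted:", accepted)
    region.close()
```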
When to Configure

Advantages
• Huge pages can reduce pressure on the TLB/MMU
• Huge pages are not swappable
• Any data-intensive application that properly uses mmap(), madvise(), shmget(), shmat() and some other calls can benefit from them
• Any memory-bound application can benefit from them
• They help when latency/response time is critical

Disadvantages
• Internal and external memory fragmentation gets worse if huge pages are not configured properly
• “Swappability” avoids quick memory starvation at some performance cost; huge pages give that safety net up
• It is an extension beyond POSIX: other Unix-like systems such as Solaris and FreeBSD, and even Windows, have a similar feature with a totally different setup
• NUMA (non-uniform memory access) systems may not get all the benefits that a UMA system gets (hardware with uniform/unified memory management)
• Transparent Huge Pages are not recommended in general (they have very specific use cases)

• Many other advantages and disadvantages can come up, but most importantly: test!
• It may be necessary to raise the memory limits (such as memlock) in /etc/security/limits.conf
Thank you!
Geraldo Netto
geraldo.netto@gmail.com
Ad

More Related Content

What's hot (20)

Memory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux KernelMemory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux Kernel
Adrian Huang
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...
Adrian Huang
 
Anatomy of the loadable kernel module (lkm)
Anatomy of the loadable kernel module (lkm)Anatomy of the loadable kernel module (lkm)
Anatomy of the loadable kernel module (lkm)
Adrian Huang
 
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
The Linux Foundation
 
re:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
Brendan Gregg
 
Vmlinux: anatomy of bzimage and how x86 64 processor is booted
Vmlinux: anatomy of bzimage and how x86 64 processor is bootedVmlinux: anatomy of bzimage and how x86 64 processor is booted
Vmlinux: anatomy of bzimage and how x86 64 processor is booted
Adrian Huang
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernel
Adrian Huang
 
Linux Synchronization Mechanism: RCU (Read Copy Update)
Linux Synchronization Mechanism: RCU (Read Copy Update)Linux Synchronization Mechanism: RCU (Read Copy Update)
Linux Synchronization Mechanism: RCU (Read Copy Update)
Adrian Huang
 
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedKernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Anne Nicolas
 
Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)
shimosawa
 
The e820 trap of Linux kernel hibernation
The e820 trap of Linux kernel hibernationThe e820 trap of Linux kernel hibernation
The e820 trap of Linux kernel hibernation
joeylikernel
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
shimosawa
 
malloc & vmalloc in Linux
malloc & vmalloc in Linuxmalloc & vmalloc in Linux
malloc & vmalloc in Linux
Adrian Huang
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
Adrian Huang
 
Memory management in Linux kernel
Memory management in Linux kernelMemory management in Linux kernel
Memory management in Linux kernel
Vadim Nikitin
 
Static partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-VStatic partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-V
RISC-V International
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
Brendan Gregg
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKB
shimosawa
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
Ray Jenkins
 
spinlock.pdf
spinlock.pdfspinlock.pdf
spinlock.pdf
Adrian Huang
 
Memory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux KernelMemory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux Kernel
Adrian Huang
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...
Adrian Huang
 
Anatomy of the loadable kernel module (lkm)
Anatomy of the loadable kernel module (lkm)Anatomy of the loadable kernel module (lkm)
Anatomy of the loadable kernel module (lkm)
Adrian Huang
 
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
The Linux Foundation
 
re:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
Brendan Gregg
 
Vmlinux: anatomy of bzimage and how x86 64 processor is booted
Vmlinux: anatomy of bzimage and how x86 64 processor is bootedVmlinux: anatomy of bzimage and how x86 64 processor is booted
Vmlinux: anatomy of bzimage and how x86 64 processor is booted
Adrian Huang
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernel
Adrian Huang
 
Linux Synchronization Mechanism: RCU (Read Copy Update)
Linux Synchronization Mechanism: RCU (Read Copy Update)Linux Synchronization Mechanism: RCU (Read Copy Update)
Linux Synchronization Mechanism: RCU (Read Copy Update)
Adrian Huang
 
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedKernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Anne Nicolas
 
Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)
shimosawa
 
The e820 trap of Linux kernel hibernation
The e820 trap of Linux kernel hibernationThe e820 trap of Linux kernel hibernation
The e820 trap of Linux kernel hibernation
joeylikernel
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
shimosawa
 
malloc & vmalloc in Linux
malloc & vmalloc in Linuxmalloc & vmalloc in Linux
malloc & vmalloc in Linux
Adrian Huang
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
Adrian Huang
 
Memory management in Linux kernel
Memory management in Linux kernelMemory management in Linux kernel
Memory management in Linux kernel
Vadim Nikitin
 
Static partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-VStatic partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-V
RISC-V International
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
Brendan Gregg
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKB
shimosawa
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
Ray Jenkins
 

Similar to Linux Huge Pages (20)

PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
Equnix Business Solutions
 
MySQL Oslayer performace optimization
MySQL  Oslayer performace optimizationMySQL  Oslayer performace optimization
MySQL Oslayer performace optimization
Louis liu
 
The Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux KernelThe Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux Kernel
Yasunori Goto
 
Running MySQL on Linux
Running MySQL on LinuxRunning MySQL on Linux
Running MySQL on Linux
Great Wide Open
 
Presentation db2 best practices for optimal performance
Presentation   db2 best practices for optimal performancePresentation   db2 best practices for optimal performance
Presentation db2 best practices for optimal performance
solarisyougood
 
os
osos
os
lavanya lalu
 
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
Joao Galdino Mello de Souza
 
Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016
Colin Charles
 
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld
 
Time For D.I.M.E?
Time For D.I.M.E?Time For D.I.M.E?
Time For D.I.M.E?
Martin Packer
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis
PyData
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
Mike Pittaro
 
LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)
Pekka Männistö
 
LizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-webLizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-web
Szymon Haly
 
Comparison of foss distributed storage
Comparison of foss distributed storageComparison of foss distributed storage
Comparison of foss distributed storage
Marian Marinov
 
Presentation db2 best practices for optimal performance
Presentation   db2 best practices for optimal performancePresentation   db2 best practices for optimal performance
Presentation db2 best practices for optimal performance
xKinAnx
 
Time For DIME
Time For DIMETime For DIME
Time For DIME
Martin Packer
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
Splunk
 
Open Source Data Deduplication
Open Source Data DeduplicationOpen Source Data Deduplication
Open Source Data Deduplication
RedWireServices
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big Data
DataStax Academy
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
Equnix Business Solutions
 
MySQL Oslayer performace optimization
MySQL  Oslayer performace optimizationMySQL  Oslayer performace optimization
MySQL Oslayer performace optimization
Louis liu
 
The Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux KernelThe Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux Kernel
Yasunori Goto
 
Presentation db2 best practices for optimal performance
Presentation   db2 best practices for optimal performancePresentation   db2 best practices for optimal performance
Presentation db2 best practices for optimal performance
solarisyougood
 
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
z/VM 6.3 - Mudanças de Comportamento do hypervisor para suporte de partições ...
Joao Galdino Mello de Souza
 
Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016
Colin Charles
 
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis
PyData
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
Mike Pittaro
 
LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)
Pekka Männistö
 
LizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-webLizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-web
Szymon Haly
 
Comparison of foss distributed storage
Comparison of foss distributed storageComparison of foss distributed storage
Comparison of foss distributed storage
Marian Marinov
 
Presentation db2 best practices for optimal performance
Presentation   db2 best practices for optimal performancePresentation   db2 best practices for optimal performance
Presentation db2 best practices for optimal performance
xKinAnx
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
Splunk
 
Open Source Data Deduplication
Open Source Data DeduplicationOpen Source Data Deduplication
Open Source Data Deduplication
RedWireServices
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big Data
DataStax Academy
 
Ad

Recently uploaded (20)

How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025
kashifyounis067
 
Shift Left using Lean for Agile Software Development
Shift Left using Lean for Agile Software DevelopmentShift Left using Lean for Agile Software Development
Shift Left using Lean for Agile Software Development
SathyaShankar6
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
Agentic AI Use Cases using GenAI LLM models
Agentic AI Use Cases using GenAI LLM modelsAgentic AI Use Cases using GenAI LLM models
Agentic AI Use Cases using GenAI LLM models
Manish Chopra
 
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdfMicrosoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
TechSoup
 
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
Andre Hora
 
Who Watches the Watchmen (SciFiDevCon 2025)
Who Watches the Watchmen (SciFiDevCon 2025)Who Watches the Watchmen (SciFiDevCon 2025)
Who Watches the Watchmen (SciFiDevCon 2025)
Allon Mureinik
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025
mu394968
 
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Eric D. Schabell
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025
mu394968
 
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
wareshashahzadiii
 
The Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdfThe Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdf
drewplanas10
 
Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]
saniaaftab72555
 
Download YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full ActivatedDownload YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full Activated
saniamalik72555
 
Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025
kashifyounis067
 
Adobe Photoshop Lightroom CC 2025 Crack Latest Version
Adobe Photoshop Lightroom CC 2025 Crack Latest VersionAdobe Photoshop Lightroom CC 2025 Crack Latest Version
Adobe Photoshop Lightroom CC 2025 Crack Latest Version
usmanhidray
 
Salesforce Aged Complex Org Revitalization Process .pdf
Salesforce Aged Complex Org Revitalization Process .pdfSalesforce Aged Complex Org Revitalization Process .pdf
Salesforce Aged Complex Org Revitalization Process .pdf
SRINIVASARAO PUSULURI
 
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Dele Amefo
 
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025
kashifyounis067
 
Shift Left using Lean for Agile Software Development
Shift Left using Lean for Agile Software DevelopmentShift Left using Lean for Agile Software Development
Shift Left using Lean for Agile Software Development
SathyaShankar6
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
Agentic AI Use Cases using GenAI LLM models
Agentic AI Use Cases using GenAI LLM modelsAgentic AI Use Cases using GenAI LLM models
Agentic AI Use Cases using GenAI LLM models
Manish Chopra
 
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdfMicrosoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
TechSoup
 
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)
Andre Hora
 
Who Watches the Watchmen (SciFiDevCon 2025)
Who Watches the Watchmen (SciFiDevCon 2025)Who Watches the Watchmen (SciFiDevCon 2025)
Who Watches the Watchmen (SciFiDevCon 2025)
Allon Mureinik
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Linux Huge Pages

  • 1. Linux Huge Pages: Why? How? When?
  • 2. Agenda
       • What are you talking about?
       • Linux kernel map
       • Memory Allocation
       • Paging Model
       • Page Fault
       • Swapping
       • Why Huge Pages
       • How to configure
       • When to configure
       • Summary
  • 3. Premises
       • This is mainly about x86-64 (Intel and AMD CPUs produced after 2004)
       • Huge pages differ across hardware architectures; those differences are out of our scope
       • We will not explore the MMU, the TLB, or the other internals of virtual memory management
       • Some images are outdated (e.g., Linux kernel 2.6, while the current version is 5.5), but they illustrate the aspects discussed in this presentation very well
  • 4. What are you talking about?
  • 5. This is the Linux kernel map for version 2.6.36. Although it is 10 years old, it gives us the big picture.
  • 10. Why Huge Pages
       • As we can see, memory management is a complicated process involving many ‘round-trips’
       • Huge pages allocate larger blocks of memory at once, cutting the ‘round-trips’ associated with small pages
       • Huge pages cannot be swapped out
       • A set of 4 KB pages can turn into a single 2 MB (with PAE), 4 MB, or even 1 GB page

       Number of Pages (4 KB)   Number of Huge Pages   Huge Page Equivalence
       512                      1                      2 MB (2,048 KB)
       1,024                    1                      4 MB (4,096 KB)
       262,144                  1                      1 GB (1,024 MB or 1,048,576 KB)
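
The equivalences in the table come from dividing the huge page size by the 4 KB base page size. A minimal sketch of that arithmetic follows; the helper name `small_pages_per_huge_page` is illustrative and not part of any tool.

```shell
#!/bin/sh
# Number of 4 KB base pages covered by one huge page.
# Argument: huge page size in KB (2048 for 2 MB, 4096 for 4 MB, 1048576 for 1 GB).
# small_pages_per_huge_page is a hypothetical helper name.
small_pages_per_huge_page() {
    base_kb=4
    echo $(( $1 / base_kb ))
}

small_pages_per_huge_page 2048      # 2 MB huge page
small_pages_per_huge_page 4096      # 4 MB huge page
small_pages_per_huge_page 1048576   # 1 GB huge page
```

The three calls print 512, 1024, and 262144, matching the rows of the table.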
  • 11. Why Huge Pages
       • There are 2 huge page variants
         • HugeTLB File System
           • Works as a pseudo filesystem where you need to define the allocation manually
           • We will use this approach
         • Transparent Huge Pages
           • Works transparently: the Linux kernel decides on its own whether the application gets huge pages, but it is not recommended for latency-sensitive applications
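
Since Transparent Huge Pages act without any application involvement, it can help to see which mode the kernel is in before tuning anything. The sketch below assumes the usual sysfs location `/sys/kernel/mm/transparent_hugepage/enabled`, whose content marks the active mode with square brackets (for example `always [madvise] never`). The helper name `active_thp_mode` is hypothetical.

```shell
#!/bin/sh
# Extract the active mode from a line such as "always [madvise] never".
# active_thp_mode is a hypothetical helper name.
active_thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Assumed sysfs path; it is absent on kernels built without THP support.
thp_file=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$thp_file" ]; then
    echo "THP mode: $(active_thp_mode "$(cat "$thp_file")")"
else
    echo "THP not available on this kernel"
fi
```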
  • 12. How to Configure
       • Checking if it is possible to enable huge pages

       netto@bella:~$ getconf PAGESIZE
       4096
       netto@bella:~$ cat /proc/cpuinfo | grep -E 'pse|pdpe' | tail -1
       flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d

       getconf returns the standard page size for a given CPU architecture, in bytes
       /proc/cpuinfo contains all data related to the CPU
       pse => supports huge pages of 2 MB
       pdpe1gb => supports huge pages of 1 GB
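
Scanning the long flags line by eye is error-prone, and `pse36` is easy to mistake for `pse`. Here is a small sketch that matches the two flags as whole words and reports the sizes the slide associates with them. The helper name `supported_huge_pages` is hypothetical.

```shell
#!/bin/sh
# Report which huge page sizes a CPU flags line advertises.
# supported_huge_pages is a hypothetical helper name; it takes the flags as one string.
supported_huge_pages() {
    # Padding with spaces makes " pse " match only the whole word, not pse36.
    case " $1 " in
        *" pse "*)     echo "2MB" ;;
    esac
    case " $1 " in
        *" pdpe1gb "*) echo "1GB" ;;
    esac
}

if [ -r /proc/cpuinfo ]; then
    # Take the flags of the first CPU only; all cores normally report the same set.
    flags=$(grep -m 1 '^flags' /proc/cpuinfo | cut -d: -f2)
    supported_huge_pages "$flags"
fi
```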
  • 13. How to Configure
       • Installing the required packages to configure huge pages, as root
       • WARNING: your distribution might require a slightly different setup (e.g., a different package manager or package names, fewer steps)

       Red Hat / CentOS:
       root@bella:~$ yum -y install libhugetlbfs libhugetlbfs-utils

       Debian / Ubuntu:
       root@bella:~$ apt-get -y install hugepages
  • 14. How to Configure
       • In the following case, we can select which huge page size is more convenient for the application

       # this is the pseudo directory where huge pages will be mapped; it needs to be an existing directory
       # Red Hat configuration differs a little
       root@bella:~$ mkdir -p /dev/hugepages

       # this can be converted to an /etc/fstab entry
       root@bella:~$ mount -t hugetlbfs -o gid=<group id>,pagesize=<2M or 1G>,... none /dev/hugepages

       • Add to sysctl.conf

       # number of pages = memory required for your scenario / huge page size (2 MB or 1 GB), rounded up
       # there are situations, like Oracle DB, where it is recommended to allocate huge pages only for the SGA
       vm.nr_hugepages = <number of pages>

       # the same gid used on mount; it must be the group your application runs under
       vm.hugetlb_shm_group = <group id>

       • Reboot
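
The value for `vm.nr_hugepages` is a count of huge pages, so it comes from dividing the memory the application needs by the huge page size and rounding up. A minimal sketch of that calculation follows; the helper name `nr_hugepages_for` and the 8 GB figure are illustrative assumptions, not values from the slides.

```shell
#!/bin/sh
# Huge pages needed to cover a memory requirement, rounded up.
# Arguments: required memory in MB, huge page size in KB (2048 or 1048576).
# nr_hugepages_for is a hypothetical helper name.
nr_hugepages_for() {
    required_kb=$(( $1 * 1024 ))
    # Adding (size - 1) before the integer division rounds the result up.
    echo $(( (required_kb + $2 - 1) / $2 ))
}

# Example (assumed workload): an 8 GB shared memory area backed by 2 MB pages.
echo "vm.nr_hugepages = $(nr_hugepages_for 8192 2048)"
```

Rounding up matters when the requirement is not a multiple of the page size: 1500 MB on 1 GB pages needs 2 pages, not 1.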
  • 15. How to Configure

       # if huge pages are correctly set up, at least one pool will be displayed
       netto@bella:~$ hugeadm --pool-list
             Size  Minimum  Current  Maximum  Default
          2097152        0        0        0        *
       1073741824        1        1        1

       # huge pages are enabled if HugePages_Total is > 0
       netto@bella:~$ cat /proc/meminfo
       ...
       HugePages_Total: <huge pages pool size>
       HugePages_Free: <number of huge pages that are not allocated>
       HugePages_Rsvd: <number of huge pages that are reserved but not allocated>
       HugePages_Surp: <number of surplus huge pages in the pool above vm.nr_hugepages>
       Hugepagesize: 2048 kB
       Hugetlb: 1048576 kB
       DirectMap4k: 572400 kB
       DirectMap2M: 12943360 kB
       DirectMap1G: 19922944 kB
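
The `HugePages_Total > 0` check can be scripted. The sketch below reads meminfo-formatted text from standard input and prints how many huge pages are in use. The helper name `hugepage_summary` and the wording of its output are illustrative.

```shell
#!/bin/sh
# Summarise the HugePages_* counters from meminfo-formatted text on stdin.
# hugepage_summary is a hypothetical helper name.
hugepage_summary() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        /^Hugepagesize:/    { size_kb = $2 }
        END {
            if (total + 0 == 0) { print "huge pages disabled"; exit }
            printf "%d of %d huge pages in use (%d kB each)\n", total - free, total, size_kb
        }'
}

if [ -r /proc/meminfo ]; then
    hugepage_summary < /proc/meminfo
fi
```

Reading from standard input keeps the function usable on a saved copy of `/proc/meminfo` from another host.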
  • 16. How to Configure

       Application          Where                               Syntax
       Oracle JDK/OpenJDK   Command line argument               -XX:+UseLargePages
       MySQL                my.cnf, inside the [mysqld] block   large_pages=ON
       PHP                  php.ini, opcache block              opcache.huge_code_pages=1
       Python               Using the mmap module               MADV_HUGEPAGE
       PostgreSQL           postgresql.conf                     huge_pages=ON
       Docker               Command line argument               --device=/dev/hugepages:/dev/hugepages
  • 17. When to Configure

       Advantages
       • Huge pages can reduce pressure on the TLB/MMU
       • Huge pages are not swappable
       • Any data-intensive application that properly uses mmap(), madvise(), shmget(), shmat(), and some other calls can benefit
       • Any memory-bound application can benefit
       • Useful when latency/response time is critical

       Disadvantages
       • Internal and external memory fragmentation get worse if huge pages are not configured properly
       • “Swappability” avoids quick memory starvation, at some performance cost
       • It is a POSIX extension; other Unix systems like Solaris and FreeBSD, and even Windows, have a similar feature with a totally different setup
       • NUMA (non-uniform memory access) systems may not get all the benefits that a UMA (uniform memory access) system does
       • Transparent Huge Pages is not recommended in general (it has very specific use cases)

       • Many other advantages and disadvantages can come up, but most importantly: test!
       • It might be required to increase the memory allocation limits in /etc/security/limits.conf
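
The slide does not say which limit in `/etc/security/limits.conf` to raise. A common candidate is `memlock` (locked memory, expressed in KB), sized to cover the whole huge page pool; treat that choice as an assumption. The sketch below prints the two lines one might add. The helper name `memlock_limits` and the group name `dbgroup` are hypothetical.

```shell
#!/bin/sh
# Print limits.conf lines that let a group lock enough memory for its huge pages.
# Arguments: group name, number of huge pages, huge page size in KB.
# memlock_limits is a hypothetical helper; choosing "memlock" as the item to raise is an assumption.
memlock_limits() {
    limit_kb=$(( $2 * $3 ))
    printf '@%s soft memlock %d\n' "$1" "$limit_kb"
    printf '@%s hard memlock %d\n' "$1" "$limit_kb"
}

# Example: 4096 huge pages of 2 MB each for a hypothetical group "dbgroup".
memlock_limits dbgroup 4096 2048
```

Each line follows the limits.conf layout of domain, type, item, value, where the `@` prefix marks a group.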