SlideShare a Scribd company logo
Experience on porting
HIGHMEM to 32bit RISC-V Linux
Eric Lin
2020.8.2
About me
• 2016 ~ 2018 NCKU
– MS
• 2018.12 ~ 2020.7 Andes technology
– Software engineer in Linux kernel team
• Experience
– RISC-V Linux kernel
– Device driver
– U-BOOT
Outline
• Why need highmem?
• Porting highmem
• An end to high memory?
Why need high memory ?
• Earliest days, kernel maintained ′direct map ′ to map all physical memory in
kernel space.
– It easy for the kernel to manipulate any page in the system
• 32bit platform only have 4GB virtual address space
– To reduce TLB flush cost between kernel and user space.
– Split 4G address space to 1:3 ( 1GB => kernel, 3GB => user )
• With direct map, kernel can only map 1GB physical memory.
1G
3G
1G
0x00000000
0xC0000000
0xFFFFFFFF
Physical memory
User
kernel
(direct map)
va_pa_offset
Why need high memory? (cont.)
• Reserved ~896MB for linear mapping (direct map) => low memory
• > 896MB virtual address space :
– Temporary mapping
• PKMAP => kmap() 、VMALLOC => vmalloc()、vmap()
– Permanent mapping
• FIXMAP => DTB、early_ioremap()
(ZONE_NORMAL)
low mem
3G
1G
Physical memory > 1G
user
kernel
(ZONE_HIGHMEM)
high mem
linear mapping
( direct map )
896MB
0xC0000000
VMALLOC
PKMAP
FIXMAP
896MB
i386
va_pa_offset
Porting highmem
• Decide RV32 linux memory layout
– Refer other architecture (arm, x86, nds32)
– (option) Move VMALLOC、FIXMAP..etc after PAGE_OFFSET
• Leave user-process more address space.
• Add a PKMAP region in virtual address space
– for kmap()
• Temporary mapping for a single page
• alloc_page(__GFP_HIGHMEM) from ZONE_HIGHMEM
• Create a page table for pkmap
• Add memory slots in FIXMAP for kmap_atomic()
• Add architecture kmap() and kmap_atomic()
7
• 64bit platform needn’t highmem => kernel have 128GB address space.
• After porting highmem, we would like RV32 linux memory layout as below:
RISC-V 5.6 Linux Memory layout
linear mapping
(direct map)
PKMAP
kernel
user 3G
VMALLOC
FIXMAP
Reserved
VMEMMAP
PCI_IO
linear mapping
(direct map)
kernel
user
VMALLOC
FIXMAP
VMEMMAP
PCI_IO
0xffffffe0_00000000
0xffffffff_ffffffff
0xffffffff
0xc0000000
0x00000000 0x00000000_00000000
128 GB
16777215 TB
PAGE_OFFSET
PAGE_OFFSET
RV64
RV32
Move memory layout
(arch/riscv/include/asm/pgtable.h)
+#define VMALLOC_SIZE (SZ_128M)
+/* Reserve 4MB from top of RAM to align with PGDIR_SIZE */
+#define VMALLOC_END (0xffc00000UL)
+#define VMALLOC_START (VMALLOC_END - VMALLOC_SIZE)
…..
51 #define VMEMMAP_END (VMALLOC_START - 1)
52 #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE)
+#ifdef CONFIG_HIGHMEM
+/* Set LOWMEM_END alignment with PGDIR_SIZE */
+#define LOWMEM_END (ALIGN_DOWN(PKMAP_BASE, SZ_4M))
+#define LOWMEM_SIZE (LOWMEM_END - PAGE_OFFSET)
+#endif /* CONFIG_HIGHMEM */
#define TASK_SIZE PAGE_OFFSET
---------------------------
(arch/riscv/include/asm/highmem.h)
+#define PKMAP_BASE (FIXADDR_START - SZ_2M)
linear mapping
(direct map)
PKMAP 2M
user 3G
VMALLOC 128M
FIXMAP 4M
Reserved 4M
VMEMMAP 16M
PCI_IO 16M
0xffffffff
0xc0000000
0x00000000
LOWMEM_END
• Locate VMALLOC_END and LOWMEM_END
• Must Reserved 4MB from top of RAM to
align with PGDIR_SIZE
• Add a new region for PKMAP
VMALLOC_END
TASK_SIZE
Porting highmem
setup_bootm()
– max_low_pfn => the end of low_memory
– max_pfn => the end of physical memory
(arch/riscv/mm/init.c)
150 void __init setup_bootmem(void)
151 {
…
188 #ifdef CONFIG_HIGHMEM
189 max_low_pfn = (PFN_DOWN(__pa(LOWMEM_END)));
190 max_pfn = PFN_DOWN(memblock_end_of_DRAM());
191 memblock_set_current_limit(__pa(LOWMEM_END));
192 #else
low mem
Physical memory
high mem
max_low_pfn
max_pfn
27 static void __init zone_sizes_init(void)
28 {
…….
34 #endif
35 max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
36 #ifdef CONFIG_HIGHMEM
37 max_zone_pfns[ZONE_HIGHMEM] = max_pfn;
38 #endif
39 free_area_init_nodes(max_zone_pfns);
40 }
Porting highmem (cont.)
• Prepare a page table and set it to swapper_pg_dir
– swapper_pg_dir is a page directory pointer for kernel.
• 32 bit use 2 level page table (RISC-V)
pte entry
pgd
pmd (pkmap_p )
PAGE
swapper_pg_dir
Physical memory
Porting highmem (cont.)
+static void __init pkmap_init(void)
+{
.
+ /*
+ * Permanent kmaps:
+ */
+ vaddr = PKMAP_BASE;
+
+ pgd = swapper_pg_dir + pgd_index(vaddr);
+ p4d = p4d_offset(pgd, vaddr);
+ pud = pud_offset(p4d, vaddr);
+ pmd = pmd_offset(pud, vaddr);
+ pkmap_p = (pte_t *)__va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
…….
+ memset(pkmap_p, 0, PAGE_SIZE);
+ pfn = PFN_DOWN(__pa(pkmap_p));
+ set_pmd(pmd, __pmd((pfn << _PAGE_PFN_SHIFT) |
+ pgprot_val(__pgprot(_PAGE_TABLE))));
+
+ /* Adjust pkmap page table base */
+ pkmap_page_table = pkmap_p + pte_index(vaddr);
start_kernel()
-> setup_arch
-> paging_init
->pkmap_init
• Add new function pkmap_init() for creating pkmap page table
Porting highmem (cont.)
//arch/riscv/include/asm/fixmap.h
enum fixed_addresses {
FIX_PTE,
FIX_PMD,
FIX_EARLYCON_MEM_BASE,
+#ifdef CONFIG_HIGHMEM
+ FIX_KMAP_RESERVED,
+ FIX_KMAP_BEGIN,
+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS),
+#endif
+ __end_of_fixed_addresses,
};
23 #define FIXADDR_TOP (PKMAP_BASE)
24 #define FIXADDR_SIZE ((__end_of_fixed_addresses) << PAGE_SHIFT)
25 #define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
FIXADDR_TOP
FIXADDR_START
(__end_of_fixed_address)
FIX_KMAP_BEGIN
4K
4K
4K
4K
FIX_KMAP_END
kmap_atomic
• Add memory slots
in FIXMAP
After porting
• If success, you will see …
After porting
• If fail, you will see …
????
An end to high memory?
• Upstream my first kernel patch, but …
• Arnd Bergmann (Linaro) reply …
– I would much prefer to not see highmem added to new architectures
at all if possible, see https://lwn.net/Articles/813201/
• Weiner like to improve memory-reclaim performance
– Inode-cache shrinking vs. highmem
• Inodes, being kernel data structures => low memory
• page-cache pages => can be placed in high memory
• With a large number of one-byte files on a 7G machine, it invoke inode
shrinker to reclaim inode with populated page cache. It can drop gigabytes of
hot and active page cache.
• Linus Torvalds say …
Reference
• https://www.kernel.org/doc/Documentation/vm/highmem.txt
• https://lwn.net/Articles/813201/
• https://lkml.org/lkml/2020/4/2/253
END
Ad

More Related Content

What's hot (20)

Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux Kernel
Adrian Huang
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Anne Nicolas
 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDP
Daniel T. Lee
 
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
Linaro
 
Memory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux KernelMemory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux Kernel
Adrian Huang
 
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
Adrian Huang
 
Memory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdfMemory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdf
Adrian Huang
 
eBPF maps 101
eBPF maps 101eBPF maps 101
eBPF maps 101
SUSE Labs Taipei
 
Physical Memory Management.pdf
Physical Memory Management.pdfPhysical Memory Management.pdf
Physical Memory Management.pdf
Adrian Huang
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
shimosawa
 
Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)
shimosawa
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
Thomas Graf
 
Memory model
Memory modelMemory model
Memory model
Yi-Hsiu Hsu
 
malloc & vmalloc in Linux
malloc & vmalloc in Linuxmalloc & vmalloc in Linux
malloc & vmalloc in Linux
Adrian Huang
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and Drivers
Kernel TLV
 
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Valeriy Kravchuk
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
Adrian Huang
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
Brendan Gregg
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
Kernel TLV
 
Jagan Teki - U-boot from scratch
Jagan Teki - U-boot from scratchJagan Teki - U-boot from scratch
Jagan Teki - U-boot from scratch
linuxlab_conf
 
Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux Kernel
Adrian Huang
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Anne Nicolas
 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDP
Daniel T. Lee
 
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
Linaro
 
Memory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux KernelMemory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux Kernel
Adrian Huang
 
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
Adrian Huang
 
Memory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdfMemory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdf
Adrian Huang
 
Physical Memory Management.pdf
Physical Memory Management.pdfPhysical Memory Management.pdf
Physical Memory Management.pdf
Adrian Huang
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
shimosawa
 
Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)
shimosawa
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
Thomas Graf
 
malloc & vmalloc in Linux
malloc & vmalloc in Linuxmalloc & vmalloc in Linux
malloc & vmalloc in Linux
Adrian Huang
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and Drivers
Kernel TLV
 
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Valeriy Kravchuk
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
Adrian Huang
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
Brendan Gregg
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
Kernel TLV
 
Jagan Teki - U-boot from scratch
Jagan Teki - U-boot from scratchJagan Teki - U-boot from scratch
Jagan Teki - U-boot from scratch
linuxlab_conf
 

Similar to COSCUP 2020 RISC-V 32 bit linux highmem porting (20)

Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020
Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020
Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020
Eric Lin
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
libfetion
 
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Akihiro Hayashi
 
Raspberry Pi tutorial
Raspberry Pi tutorialRaspberry Pi tutorial
Raspberry Pi tutorial
艾鍗科技
 
OSインストーラーの自作方法
OSインストーラーの自作方法OSインストーラーの自作方法
OSインストーラーの自作方法
LINE Corporation
 
Decompressed vmlinux: linux kernel initialization from page table configurati...
Decompressed vmlinux: linux kernel initialization from page table configurati...Decompressed vmlinux: linux kernel initialization from page table configurati...
Decompressed vmlinux: linux kernel initialization from page table configurati...
Adrian Huang
 
Kvm performance optimization for ubuntu
Kvm performance optimization for ubuntuKvm performance optimization for ubuntu
Kvm performance optimization for ubuntu
Sim Janghoon
 
Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?
inside-BigData.com
 
BPF Hardware Offload Deep Dive
BPF Hardware Offload Deep DiveBPF Hardware Offload Deep Dive
BPF Hardware Offload Deep Dive
Netronome
 
Exploiting GPU's for Columnar DataFrrames by Kiran Lonikar
Exploiting GPU's for Columnar DataFrrames by Kiran LonikarExploiting GPU's for Columnar DataFrrames by Kiran Lonikar
Exploiting GPU's for Columnar DataFrrames by Kiran Lonikar
Spark Summit
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey Kovalenko
Valeriia Maliarenko
 
MySQLinsanity
MySQLinsanityMySQLinsanity
MySQLinsanity
Stanley Huang
 
Lrz kurs: gpu and mic programming with r
Lrz kurs: gpu and mic programming with rLrz kurs: gpu and mic programming with r
Lrz kurs: gpu and mic programming with r
Ferdinand Jamitzky
 
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
Edge AI and Vision Alliance
 
LCU14 302- How to port OP-TEE to another platform
LCU14 302- How to port OP-TEE to another platformLCU14 302- How to port OP-TEE to another platform
LCU14 302- How to port OP-TEE to another platform
Linaro
 
Unified Memory on POWER9 + V100
Unified Memory on POWER9 + V100Unified Memory on POWER9 + V100
Unified Memory on POWER9 + V100
inside-BigData.com
 
Basic Linux kernel
Basic Linux kernelBasic Linux kernel
Basic Linux kernel
Morteza Nourelahi Alamdari
 
Db2
Db2Db2
Db2
rishabshare
 
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry code
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry codeKernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry code
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry code
Anne Nicolas
 
eBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to Userspace
SUSE Labs Taipei
 
Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020
Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020
Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020
Eric Lin
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
libfetion
 
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Akihiro Hayashi
 
Raspberry Pi tutorial
Raspberry Pi tutorialRaspberry Pi tutorial
Raspberry Pi tutorial
艾鍗科技
 
OSインストーラーの自作方法
OSインストーラーの自作方法OSインストーラーの自作方法
OSインストーラーの自作方法
LINE Corporation
 
Decompressed vmlinux: linux kernel initialization from page table configurati...
Decompressed vmlinux: linux kernel initialization from page table configurati...Decompressed vmlinux: linux kernel initialization from page table configurati...
Decompressed vmlinux: linux kernel initialization from page table configurati...
Adrian Huang
 
Kvm performance optimization for ubuntu
Kvm performance optimization for ubuntuKvm performance optimization for ubuntu
Kvm performance optimization for ubuntu
Sim Janghoon
 
BPF Hardware Offload Deep Dive
BPF Hardware Offload Deep DiveBPF Hardware Offload Deep Dive
BPF Hardware Offload Deep Dive
Netronome
 
Exploiting GPU's for Columnar DataFrrames by Kiran Lonikar
Exploiting GPU's for Columnar DataFrrames by Kiran LonikarExploiting GPU's for Columnar DataFrrames by Kiran Lonikar
Exploiting GPU's for Columnar DataFrrames by Kiran Lonikar
Spark Summit
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey Kovalenko
Valeriia Maliarenko
 
Lrz kurs: gpu and mic programming with r
Lrz kurs: gpu and mic programming with rLrz kurs: gpu and mic programming with r
Lrz kurs: gpu and mic programming with r
Ferdinand Jamitzky
 
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
Edge AI and Vision Alliance
 
LCU14 302- How to port OP-TEE to another platform
LCU14 302- How to port OP-TEE to another platformLCU14 302- How to port OP-TEE to another platform
LCU14 302- How to port OP-TEE to another platform
Linaro
 
Unified Memory on POWER9 + V100
Unified Memory on POWER9 + V100Unified Memory on POWER9 + V100
Unified Memory on POWER9 + V100
inside-BigData.com
 
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry code
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry codeKernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry code
Kernel Recipes 2016 - entry_*.S: A carefree stroll through kernel entry code
Anne Nicolas
 
eBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to Userspace
SUSE Labs Taipei
 
Ad

Recently uploaded (20)

Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko
Fwdays
 
"PHP and MySQL CRUD Operations for Student Management System"
"PHP and MySQL CRUD Operations for Student Management System""PHP and MySQL CRUD Operations for Student Management System"
"PHP and MySQL CRUD Operations for Student Management System"
Jainul Musani
 
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
Lynda Kane
 
Automation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From AnywhereAutomation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From Anywhere
Lynda Kane
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Network Security. Different aspects of Network Security.
Network Security. Different aspects of Network Security.Network Security. Different aspects of Network Security.
Network Security. Different aspects of Network Security.
gregtap1
 
Learn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step GuideLearn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step Guide
Marcel David
 
Automation Dreamin' 2022: Sharing Some Gratitude with Your Users
Automation Dreamin' 2022: Sharing Some Gratitude with Your UsersAutomation Dreamin' 2022: Sharing Some Gratitude with Your Users
Automation Dreamin' 2022: Sharing Some Gratitude with Your Users
Lynda Kane
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Salesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docxSalesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docx
José Enrique López Rivera
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5..."Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
Fwdays
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Buckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug LogsBuckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug Logs
Lynda Kane
 
Leading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael JidaelLeading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael Jidael
Michael Jidael
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko"Rebranding for Growth", Anna Velykoivanenko
"Rebranding for Growth", Anna Velykoivanenko
Fwdays
 
"PHP and MySQL CRUD Operations for Student Management System"
"PHP and MySQL CRUD Operations for Student Management System""PHP and MySQL CRUD Operations for Student Management System"
"PHP and MySQL CRUD Operations for Student Management System"
Jainul Musani
 
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
#AdminHour presents: Hour of Code2018 slide deck from 12/6/2018
Lynda Kane
 
Automation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From AnywhereAutomation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From Anywhere
Lynda Kane
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Network Security. Different aspects of Network Security.
Network Security. Different aspects of Network Security.Network Security. Different aspects of Network Security.
Network Security. Different aspects of Network Security.
gregtap1
 
Learn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step GuideLearn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step Guide
Marcel David
 
Automation Dreamin' 2022: Sharing Some Gratitude with Your Users
Automation Dreamin' 2022: Sharing Some Gratitude with Your UsersAutomation Dreamin' 2022: Sharing Some Gratitude with Your Users
Automation Dreamin' 2022: Sharing Some Gratitude with Your Users
Lynda Kane
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Salesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docxSalesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docx
José Enrique López Rivera
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5..."Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
"Client Partnership — the Path to Exponential Growth for Companies Sized 50-5...
Fwdays
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Buckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug LogsBuckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug Logs
Lynda Kane
 
Leading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael JidaelLeading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael Jidael
Michael Jidael
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Ad

COSCUP 2020 RISC-V 32 bit linux highmem porting

  • 1. Experience on porting HIGHMEM to 32bit RISC-V Linux Eric Lin 2020.8.2
  • 2. About me • 2016 ~ 2018 NCKU – MS • 2018.12 ~ 2020.7 Andes technology – Software engineer in Linux kernel team • Experience – RISC-V Linux kernel – Device driver – U-BOOT
  • 3. Outline • Why need highmem? • Porting highmem • An end to high memory?
  • 4. Why need high memory ? • Earliest days, kernel maintained ′direct map ′ to map all physical memory in kernel space. – It easy for the kernel to manipulate any page in the system • 32bit platform only have 4GB virtual address space – To reduce TLB flush cost between kernel and user space. – Split 4G address space to 1:3 ( 1GB => kernel, 3GB => user ) • With direct map, kernel can only map 1GB physical memory. 1G 3G 1G 0x00000000 0xC0000000 0xFFFFFFFF Physical memory User kernel (direct map) va_pa_offset
  • 5. Why need high memory? (cont.) • Reserved ~896MB for linear mapping (direct map) => low memory • > 896MB virtual address space : – Temporary mapping • PKMAP => kmap() 、VMALLOC => vmalloc()、vmap() – Permanent mapping • FIXMAP => DTB、early_ioremap() (ZONE_NORMAL) low mem 3G 1G Physical memory > 1G user kernel (ZONE_HIGHMEM) high mem linear mapping ( direct map ) 896MB 0xC0000000 VMALLOC PKMAP FIXMAP 896MB i386 va_pa_offset
  • 6. Porting highmem • Decide RV32 linux memory layout – Refer other architecture (arm, x86, nds32) – (option) Move VMALLOC、FIXMAP..etc after PAGE_OFFSET • Leave user-process more address space. • Add a PKMAP region in virtual address space – for kmap() • Temporary mapping for a single page • alloc_page(__GFP_HIGHMEM) from ZONE_HIGHMEM • Create a page table for pkmap • Add memory slots in FIXMAP for kmap_atomic() • Add architecture kmap() and kmap_atomic()
  • 7. 7 • 64bit platform needn’t highmem => kernel have 128GB address space. • After porting highmem, we would like RV32 linux memory layout as below: RISC-V 5.6 Linux Memory layout linear mapping (direct map) PKMAP kernel user 3G VMALLOC FIXMAP Reserved VMEMMAP PCI_IO linear mapping (direct map) kernel user VMALLOC FIXMAP VMEMMAP PCI_IO 0xffffffe0_00000000 0xffffffff_ffffffff 0xffffffff 0xc0000000 0x00000000 0x00000000_00000000 128 GB 16777215 TB PAGE_OFFSET PAGE_OFFSET RV64 RV32
  • 8. Move memory layout (arch/riscv/include/asm/pgtable.h) +#define VMALLOC_SIZE (SZ_128M) +/* Reserve 4MB from top of RAM to align with PGDIR_SIZE */ +#define VMALLOC_END (0xffc00000UL) +#define VMALLOC_START (VMALLOC_END - VMALLOC_SIZE) ….. 51 #define VMEMMAP_END (VMALLOC_START - 1) 52 #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE) +#ifdef CONFIG_HIGHMEM +/* Set LOWMEM_END alignment with PGDIR_SIZE */ +#define LOWMEM_END (ALIGN_DOWN(PKMAP_BASE, SZ_4M)) +#define LOWMEM_SIZE (LOWMEM_END - PAGE_OFFSET) +#endif /* CONFIG_HIGHMEM */ #define TASK_SIZE PAGE_OFFSET --------------------------- (arch/riscv/include/asm/highmem.h) +#define PKMAP_BASE (FIXADDR_START - SZ_2M) linear mapping (direct map) PKMAP 2M user 3G VMALLOC 128M FIXMAP 4M Reserved 4M VMEMMAP 16M PCI_IO 16M 0xffffffff 0xc0000000 0x00000000 LOWMEM_END • Locate VMALLOC_END and LOWMEM_END • Must Reserved 4MB from top of RAM to align with PGDIR_SIZE • Add a new region for PKMAP VMALLOC_END TASK_SIZE
  • 9. Porting highmem setup_bootm() – max_low_pfn => the end of low_memory – max_pfn => the end of physical memory (arch/riscv/mm/init.c) 150 void __init setup_bootmem(void) 151 { … 188 #ifdef CONFIG_HIGHMEM 189 max_low_pfn = (PFN_DOWN(__pa(LOWMEM_END))); 190 max_pfn = PFN_DOWN(memblock_end_of_DRAM()); 191 memblock_set_current_limit(__pa(LOWMEM_END)); 192 #else low mem Physical memory high mem max_low_pfn max_pfn 27 static void __init zone_sizes_init(void) 28 { ……. 34 #endif 35 max_zone_pfns[ZONE_NORMAL] = max_low_pfn; 36 #ifdef CONFIG_HIGHMEM 37 max_zone_pfns[ZONE_HIGHMEM] = max_pfn; 38 #endif 39 free_area_init_nodes(max_zone_pfns); 40 }
  • 10. Porting highmem (cont.) • Prepare a page table and set it to swapper_pg_dir – swapper_pg_dir is a page directory pointer for kernel. • 32 bit use 2 level page table (RISC-V) pte entry pgd pmd (pkmap_p ) PAGE swapper_pg_dir Physical memory
  • 11. Porting highmem (cont.) +static void __init pkmap_init(void) +{ . + /* + * Permanent kmaps: + */ + vaddr = PKMAP_BASE; + + pgd = swapper_pg_dir + pgd_index(vaddr); + p4d = p4d_offset(pgd, vaddr); + pud = pud_offset(p4d, vaddr); + pmd = pmd_offset(pud, vaddr); + pkmap_p = (pte_t *)__va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE)); ……. + memset(pkmap_p, 0, PAGE_SIZE); + pfn = PFN_DOWN(__pa(pkmap_p)); + set_pmd(pmd, __pmd((pfn << _PAGE_PFN_SHIFT) | + pgprot_val(__pgprot(_PAGE_TABLE)))); + + /* Adjust pkmap page table base */ + pkmap_page_table = pkmap_p + pte_index(vaddr); start_kernel() -> setup_arch -> paging_init ->pkmap_init • Add new function pkmap_init() for creating pkmap page table
  • 12. Porting highmem (cont.) //arch/riscv/include/asm/fixmap.h enum fixed_addresses { FIX_PTE, FIX_PMD, FIX_EARLYCON_MEM_BASE, +#ifdef CONFIG_HIGHMEM + FIX_KMAP_RESERVED, + FIX_KMAP_BEGIN, + FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS), +#endif + __end_of_fixed_addresses, }; 23 #define FIXADDR_TOP (PKMAP_BASE) 24 #define FIXADDR_SIZE ((__end_of_fixed_addresses) << PAGE_SHIFT) 25 #define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE) FIXADDR_TOP FIXADDR_START (__end_of_fixed_address) FIX_KMAP_BEGIN 4K 4K 4K 4K FIX_KMAP_END kmap_atomic • Add memory slots in FIXMAP
  • 13. After porting • If success, you will see …
  • 14. After porting • If fail, you will see … ????
  • 15. An end to high memory? • Upstream my first kernel patch, but … • Arnd Bergmann (Linaro) reply … – I would much prefer to not see highmem added to new architectures at all if possible, see https://lwn.net/Articles/813201/ • Weiner like to improve memory-reclaim performance – Inode-cache shrinking vs. highmem • Inodes, being kernel data structures => low memory • page-cache pages => can be placed in high memory • With a large number of one-byte files on a 7G machine, it invoke inode shrinker to reclaim inode with populated page cache. It can drop gigabytes of hot and active page cache. • Linus Torvalds say …
  • 17. END

Editor's Notes

  • #2: Hi 大家好 我是Eric~ 今天很高興來到coscup跟大家分享如何把 HIGHMEM porting到 32bit RISC-V linux kernel
  • #3: 這邊簡單自我介紹一下,主要在碩士到晶心科技Linux kernel team 對於Linux kernel也還在學習階段~
  • #4: 主要跟大家簡介Linux 為何HIGHMEM這個機制 如何porting highmem Highmem 機制是不是要被deprecate , 主要來自於今年2月LWN 一篇文章
  • #5: 早期kernel使用direct map,來map所有的physical memory到kernel space,這樣好處方便Linux kernel 管理這些page 在32bit platform只有4GB address sapce Kernel 為了要減少, kernel與user space之間切換 TLB flush的overhead,就把4G address 切成1:3 這樣kernel direct map 就只能夠有1GB physical memory 如果physical memory > 1GB,如何讓kernel可以map超過1GB的記憶體
  • #6: highmem 主要是保留896MB給direct map,對應到physical就稱為low mem(normal zone),大於896M就稱為high mem (high memory zone) 在virtual address space 部分主要會有2種mapping機制 Temporary mapping (暫時映射) => PKMAP (kmap) 、VMALLOC(vmamlloc,vmap) 永久映射: FIXADDR or FIXMAP => (DTB or early ioremap )
  • #7: 有了上面的觀念後,接下來就可以開始porting highmem部分 這邊主要分享porting的幾個重點, 要知道rv32 linux memory layout,這邊主要是參考arm、x86 新增一個PKMAP個空間給kmap()、kmap主要是當我們從highmem zone allocate 一塊page時候,建立va ->pa之間的mapping,因此需要建立一塊page table 給pkmap 新增一些memory slot 新增architecture kmap kmap atomic()
  • #8: 首先我們要先了解rv64 memory layout 以5.6 ,右邊這張圖 這邊順便跟提醒一下64bit是不需要highmem這個機制,因為它的direct map就有128G,已經很夠用 再來就是希望把vmlloc..fixmap部分往上移,讓user 可以比較多空間
  • #9: 主要是在pagetable.h檔案,定出整個layout開始跟結束的位置, 另外需要注意的是保留4MB空間,因為要align PGDIR_SIZE,如果不align會無法開機
  • #10: 接下來就是要在setup bootm()定義2個重要參數 max_low_pfn => low memory結束的pfn max_pfn => 整個physical memory Linux在對每個zone做初始化的時候就需要這2個參數,也就是開機畫面會看到 這是在計算每個zone的範圍
  • #11: 再來就是剛才提到我們需要幫PKMAP region建立page table ,主要就是幫kmap的建立va 與pa 對應的關係 需要先跟系統要一塊page (紅色框) 5這塊就是pkamp的page table 把它assign給swapper_pg_dir ,swapper_pg_dir主要是kernel page directory pointer 剛才有提到為何要保留4MB,主要是我們可以看到一個pgd entry 管理4MB,一開始如果沒有把它會找不到level的pmd entry
  • #12: 這邊主要就是在做上一張投影片的實作, 需要在start_kernel的paging_init新增一個pkmap_init
  • #14: 在porting之後,如果順利的話我們會看到機畫面會有2個zone,在memory layout可以看到我們剛才加入的PKMAP region 之後進入shell, 在memory info 可以看到kernel的high memory page有多少
  • #15: 通常就要開始debug,
  • #16: 把highmem porting好之後就想要upstream人生的第一個kernel patch,不過送完patch之後可以linaro