Memory Mapping Implementation (mmap) in Linux Kernel
Adrian Huang | May, 2022
* Based on kernel 5.11 (x86_64) – QEMU
* SMP (4 CPUs) and 8GB memory
* Kernel parameter: nokaslr norandmaps
* Userspace: ASLR is disabled
* Legacy BIOS
Agenda
• Three different IO types
• Process Address Space – mm_struct & VMA
• Four types of memory mappings
• mmap system call implementation
• Demand paging: page fault handling for four types of memory mappings
• fork()
• COW mapping configuration: set ‘write protect’ when calling fork()
• COW fault call path
Three different IO types
• Buffered IO
o Leverage page cache
• Direct IO
o Bypass page cache
• Memory-mapped IO (file-based mapping, memory-mapped file)
o File content can be accessed by operations on the bytes in the corresponding memory region.
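A minimal sketch of the difference between going through read() and going through a mapping, assuming a Linux system and Python's standard mmap module. The helper name buffered_vs_mmap is hypothetical. Direct IO is left out because O_DIRECT needs aligned buffers.

```python
import mmap
import os
import tempfile


def buffered_vs_mmap():
    """Read and modify the same file through read() and through a mapping."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello mmap\n")

        # Buffered IO: an explicit read-style system call that is served
        # from the kernel page cache.
        via_read = os.pread(fd, 5, 0)

        # Memory-mapped IO: the file content is addressed like a byte array.
        with mmap.mmap(fd, 0, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as region:
            via_map = bytes(region[0:5])
            region[0:5] = b"HELLO"   # a plain memory store, no write() call
            region.flush()           # msync(): write the dirty page back

        after = os.pread(fd, 5, 0)
        return via_read, via_map, after
    finally:
        os.close(fd)
        os.unlink(path)
```

Both access paths return the same bytes, and the store made through the mapping is visible to a later read of the file.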
Process Address Space – mm_struct & VMA (1/2)
Process Address Space – mm_struct & VMA (2/2)
Four types of memory mappings
Reference from: Chapter 49, The Linux Programming Interface
Mapping Type (File / Anonymous) and Visibility of Modification (Private / Shared)
• Private file mapping (modification is not visible to other processes)
o Initializing memory from contents of file
o Example: a process's .text and .data segments
o Changes are not carried through to the underlying file
• Private anonymous mapping (modification is not visible to other processes)
o Memory allocation
• Shared file mapping (modification is visible to other processes)
o Memory-mapped IO: changes are carried through to the underlying file
o Sharing memory between processes (IPC)
• Shared anonymous mapping (modification is visible to other processes)
o Sharing memory between processes (IPC)
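The four combinations can be tried from userspace. The following is a sketch, assuming Linux and Python's mmap module; the helper names file_mapping_write and anonymous_mapping are hypothetical. It shows that a write through a private file mapping stays out of the file, while a write through a shared file mapping reaches it.

```python
import mmap
import os
import tempfile

PROT_RW = mmap.PROT_READ | mmap.PROT_WRITE


def file_mapping_write(flags):
    """Write through a file mapping; return (mapping content, file content)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        with mmap.mmap(fd, 0, flags=flags, prot=PROT_RW) as region:
            region[0:8] = b"modified"
            if flags & mmap.MAP_SHARED:
                region.flush()       # msync() for the shared case
            seen_in_mapping = bytes(region[0:8])
        return seen_in_mapping, os.pread(fd, 8, 0)
    finally:
        os.close(fd)
        os.unlink(path)


def anonymous_mapping(flags):
    """Create an anonymous mapping (no backing file, fd = -1) and use it."""
    with mmap.mmap(-1, mmap.PAGESIZE, flags=flags | mmap.MAP_ANONYMOUS,
                   prot=PROT_RW) as region:
        region[0:4] = b"heap"
        return bytes(region[0:4])
```

Calling file_mapping_write(mmap.MAP_PRIVATE) leaves the file at b"original", while file_mapping_write(mmap.MAP_SHARED) changes it to b"modified".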
mmap system call implementation (1/2)
mmap system call implementation (2/2)
1. thp_get_unmapped_area() eventually calls arch_get_unmapped_area_topdown()
2. unmapped_area_topdown() is the key point for allocating a userspace address
inode_operations & file_operations: file
thp_get_unmapped_area(): DAX supported for huge page
inode_operations & file_operations: directory
[Figure: VMAs in the process address space, user space address 0 to 0x7ffffffff000]
GAP
vma: vm_start = 0x400000, vm_end = 0x401000
vma: vm_start = 0x401000, vm_end = 0x496000
vma: vm_start = 0x496000, vm_end = 0x4bd000
GAP
vma: vm_start = 0x4be000, vm_end = 0x4c1000
vma: vm_start = 0x4c1000, vm_end = 0x4c4000
vma (heap): vm_start = 0x4c4000, vm_end = 0x4e8000
GAP
vma (vvar): vm_start = 0x7ffff7ffa000, vm_end = 0x7ffff7ffe000
vma (vdso): vm_start = 0x7ffff7ffe000, vm_end = 0x7ffff7fff000
GAP
vma (stack): vm_start = 0x7ffffffde000, vm_end = 0x7ffffffff000
Process address space: Doubly linked-list
Process address space: VMA rbtree
Process address space: VMA rbtree -> rb_subtree_gap
rb_subtree_gap calculation
• Maximum of the following items:
o gap between this vma and previous one
o rb_left.subtree_gap
o rb_right.subtree_gap
• Example: max(0x7ffff7ffa000 – 0x4e8000, 0, 0) = 0x7ffff7b12000
Process address space: VMA rbtree -> rb_subtree_gap
• VM_GROWSDOWN is set in vma->vm_flags
• max(0x7ffffffde000 – 0x7ffff7fff000 - stack_guard_gap, 0, 0) = 0x7edf000
• stack_guard_gap = 256UL<<PAGE_SHIFT
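The gap arithmetic on these slides can be checked with a short script. This is a sketch, not kernel code: the VMA list is copied from the layout shown earlier, the names vma1 to vma5 are placeholders for the unnamed VMAs, and the helpers vma_gaps and subtree_gap are hypothetical.

```python
PAGE_SHIFT = 12
STACK_GUARD_GAP = 256 << PAGE_SHIFT   # 0x100000

# (name, vm_start, vm_end, VM_GROWSDOWN), sorted by address
VMAS = [
    ("vma1",  0x400000,       0x401000,       False),
    ("vma2",  0x401000,       0x496000,       False),
    ("vma3",  0x496000,       0x4bd000,       False),
    ("vma4",  0x4be000,       0x4c1000,       False),
    ("vma5",  0x4c1000,       0x4c4000,       False),
    ("heap",  0x4c4000,       0x4e8000,       False),
    ("vvar",  0x7ffff7ffa000, 0x7ffff7ffe000, False),
    ("vdso",  0x7ffff7ffe000, 0x7ffff7fff000, False),
    ("stack", 0x7ffffffde000, 0x7ffffffff000, True),
]


def vma_gaps(vmas):
    """Gap between each VMA and the previous one (guard gap removed for stacks)."""
    gaps = {}
    prev_end = 0
    for name, start, end, grows_down in vmas:
        gap_end = start - STACK_GUARD_GAP if grows_down else start
        gaps[name] = max(gap_end - prev_end, 0)
        prev_end = end
    return gaps


def subtree_gap(own_gap, left_subtree_gap=0, right_subtree_gap=0):
    """rb_subtree_gap: maximum of the node's own gap and its children's values."""
    return max(own_gap, left_subtree_gap, right_subtree_gap)
```

With this layout, vma_gaps(VMAS)["vvar"] is 0x7ffff7b12000 and vma_gaps(VMAS)["stack"] is 0x7edf000, matching the two calculations above.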
unmapped_area_topdown()
unmapped_area_topdown() - Principle
unmapped_area_topdown(): Preparation
vm_unmapped_area_info
• length = 0x1000
• low_limit = PAGE_SIZE
• high_limit = mm->mmap_base = 0x7ffff7fff000
• gap_end = info->high_limit (0x7ffff7fff000)
• high_limit = gap_end – info->length = 0x7ffff7ffe000
• gap_start = mm->highest_vm_end = 0x7ffffffff000
[Efficiency] Traverse vma rbtree to find gap space instead of traversing vma linked list
unmapped_area_topdown(): rbtree walk (step-by-step figures)
• Stack guard gap: gap_end = 0x7ffffffde000 – 0x100000 = 0x7fffffede000 (stack_guard_gap = 256UL<<PAGE_SHIFT = 0x100000)
• Not found yet
• Found: return 0x7ffff7ffa000 – 0x1000 = 0x7ffff7ff9000
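The result of the search can be reproduced with a simplified model. The sketch below scans a sorted list from the highest VMA downwards instead of walking the rbtree with rb_subtree_gap, so it only illustrates the outcome, not the kernel's traversal. The VMA list is copied from the layout shown earlier; the function name and the tuple layout are assumptions made for this example.

```python
PAGE_SHIFT = 12
STACK_GUARD_GAP = 256 << PAGE_SHIFT   # 0x100000

# (vm_start, vm_end, VM_GROWSDOWN), sorted by address
VMAS = [
    (0x400000,       0x401000,       False),
    (0x401000,       0x496000,       False),
    (0x496000,       0x4bd000,       False),
    (0x4be000,       0x4c1000,       False),
    (0x4c1000,       0x4c4000,       False),
    (0x4c4000,       0x4e8000,       False),   # heap
    (0x7ffff7ffa000, 0x7ffff7ffe000, False),   # vvar
    (0x7ffff7ffe000, 0x7ffff7fff000, False),   # vdso
    (0x7ffffffde000, 0x7ffffffff000, True),    # stack
]


def unmapped_area_topdown(vmas, length, low_limit, high_limit):
    """Return the highest address of a free range of `length` bytes, or None."""
    # Gap above the highest VMA, if it lies below high_limit.
    highest_vm_end = vmas[-1][1]
    if high_limit - highest_vm_end >= length:
        return high_limit - length
    # Gaps in front of each VMA, from the highest VMA to the lowest.
    for i in range(len(vmas) - 1, -1, -1):
        start, _, grows_down = vmas[i]
        gap_end = start - STACK_GUARD_GAP if grows_down else start
        gap_end = min(gap_end, high_limit)
        gap_start = max(vmas[i - 1][1] if i > 0 else 0, low_limit)
        if gap_end - gap_start >= length:
            return gap_end - length
    return None
```

For length = 0x1000, low_limit = 0x1000 and high_limit = 0x7ffff7fff000, the gaps in front of the stack and vdso VMAs are too small once clamped, and the gap in front of vvar yields 0x7ffff7ff9000.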
vma_merge() – Possible ways to merge
Note: VMAs must have the same attributes
mmap_region()->munmap_vma_range()->find_vma_links()
find_vma_links(…, 0x7ffff7ff9000, …)
Iterate rbtree to find an empty VMA link (rb_link)
• rb_parent: The parent of rb_link
• rb_prev: Previous vma of rb_parent
• rb_prev is used in vma_merge()
Note
• rb_link, rb_parent and rb_prev are used in vma_link().
• rb_prev is the previous VMA after the new VMA is inserted.
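The descent can be sketched on a plain binary search tree. This is a simplified model: a balanced tree built from a sorted list stands in for the kernel's red-black tree, and the Node class, the build helper, and the (rb_prev, rb_parent, side) return shape are assumptions made for this example.

```python
class Node:
    """One VMA in a search tree ordered by address."""
    def __init__(self, vm_start, vm_end):
        self.vm_start, self.vm_end = vm_start, vm_end
        self.left = self.right = None


def build(vmas):
    """Build a balanced search tree from (vm_start, vm_end) pairs sorted by address."""
    if not vmas:
        return None
    mid = len(vmas) // 2
    node = Node(*vmas[mid])
    node.left = build(vmas[:mid])
    node.right = build(vmas[mid + 1:])
    return node


def find_vma_links(root, addr, end):
    """Return (rb_prev, rb_parent, side), or None if [addr, end) overlaps a VMA."""
    node, parent, prev, side = root, None, None, None
    while node is not None:
        parent = node
        if node.vm_end > addr:
            if node.vm_start < end:
                return None                  # overlap with an existing VMA
            node, side = node.left, "rb_left"
        else:
            prev = node                      # last VMA ending at or below addr
            node, side = node.right, "rb_right"
    return prev, parent, side


VMAS = [
    (0x400000, 0x401000), (0x401000, 0x496000), (0x496000, 0x4bd000),
    (0x4be000, 0x4c1000), (0x4c1000, 0x4c4000), (0x4c4000, 0x4e8000),
    (0x7ffff7ffa000, 0x7ffff7ffe000), (0x7ffff7ffe000, 0x7ffff7fff000),
    (0x7ffffffde000, 0x7ffffffff000),
]
```

For the new range 0x7ffff7ff9000 to 0x7ffff7ffa000, the descent ends at an empty link and reports the heap VMA (vm_start = 0x4c4000) as rb_prev. Which node ends up as rb_parent depends on the tree shape, so it may differ from the kernel's rbtree.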
vm_area_alloc() & vma_link()
__vma_link_file(): i_mmap for recording all memory mappings (vma) of the memory-mapped file
[Figure: data structures linking a process to the page cache]
• task_struct: mm, files
• files_struct: fd_array[] -> file
• file: f_inode, f_pos, f_mapping, f_path (mnt, dentry)
• inode: *i_mapping, i_atime, i_mtime, i_ctime
• address_space (i_data): host, page_tree, i_mmap
o page_tree: Radix Tree (v4.19 or earlier), Xarray (v4.20 or later)
o i_mmap: Interval tree implemented via red-black tree
• page (page cache): mapping, index
• mm_struct: struct vm_area_struct *mmap, get_unmapped_area, mmap_base, pgd
• vm_area_struct: vm_mm, vm_ops, vm_file
• Radix tree example: radix_tree_root height = 2; each radix_tree_node has slots 0 … 63; pages at index = 1 (slots[0] -> slots[1]), index = 3 (slots[0] -> slots[3]) and index = 194 (slots[3] -> slots[2])
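The slot numbers in the radix tree figure follow from the page index. A small sketch, assuming 64 slots per node (6 index bits per level) as the figure's 0 … 63 numbering suggests; the function name slot_path is hypothetical.

```python
RADIX_TREE_MAP_SHIFT = 6                       # 64 slots per radix_tree_node
RADIX_TREE_MAP_MASK = (1 << RADIX_TREE_MAP_SHIFT) - 1


def slot_path(index, height):
    """Slot numbers visited from the root node down to the leaf for a page index."""
    return [(index >> (RADIX_TREE_MAP_SHIFT * level)) & RADIX_TREE_MAP_MASK
            for level in range(height - 1, -1, -1)]
```

With height = 2, index 194 resolves to slots[3] and then slots[2], and indexes 1 and 3 both go through slots[0] at the top level.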
RMAP
1. i_mmap: Reverse mapping (RMAP) for the memory-mapped file (page cache) -> check __vma_link_file()
2. anon_vma: Reverse mapping for anonymous pages -> anon_vma_prepare() invoked during page fault
mm_populate()
Anonymous Page: Page Fault – Discussion List
• Private (MAP_PRIVATE)
• Write
• Read before a write
• Shared (MAP_SHARED)
Demand Paging: page fault – Anonymous page (MAP_PRIVATE)
Demand Paging: page fault → Page table configuration - Anonymous page (MAP_PRIVATE)
[Figure: 4-level page table walk for Linear Address: 0x7ffff7ff9000, starting from task_struct -> mm -> mm_struct -> pgd]
• Linear address fields: bits 63-48 Sign-extend, bits 47-39 Page Map Level-4 Offset, bits 38-30 Page Directory Pointer Offset, bits 29-21 Page Directory Offset, bits 20-12 Page Table offset, bits 11-0 page offset
• mmap: PML4E #255 -> PDPTE #511 -> PDE #447 -> PTE #505
• stack: PML4E #255 -> PDPTE #511 -> PDE #511 -> PTE #478 … PTE #510
• .text, .data, … and heap: PML4E #0 -> PDPTE #0 -> PDE #2 -> PTE #0 …
• PML4E for kernel
• Legend: allocated pages or page table entry; will be allocated if page fault occurs
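The table indexes in the figure can be derived from the linear address with shifts and masks. A short sketch, assuming x86_64 4-level paging with 4 KB pages and 9 index bits per level; the function name and dictionary keys are chosen for this example.

```python
def split_linear_address(addr):
    """Split an x86_64 linear address into 4-level page table indexes."""
    return {
        "pml4":   (addr >> 39) & 0x1FF,   # Page Map Level-4 Offset, bits 47-39
        "pdpt":   (addr >> 30) & 0x1FF,   # Page Directory Pointer Offset, bits 38-30
        "pd":     (addr >> 21) & 0x1FF,   # Page Directory Offset, bits 29-21
        "pt":     (addr >> 12) & 0x1FF,   # Page Table offset, bits 20-12
        "offset": addr & 0xFFF,           # byte offset in the page, bits 11-0
    }
```

For 0x7ffff7ff9000 this gives PML4E #255, PDPTE #511, PDE #447 and PTE #505, the entries highlighted for the mmap region.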
Demand Paging: page fault → Page table configuration
[anonymous page: MAP_PRIVATE] page fault flow
man mmap
[gdb] backtrace
Demand Paging: page fault → Page table configuration - Anonymous page (MAP_PRIVATE)
[anonymous page: MAP_PRIVATE] page fault flow
[Figure: page table walk for Linear Address: 0x7ffff7ff9000 (PML4E #255 -> PDPTE #511 -> PDE #447 -> PTE #505 mmap), shown next to the page fault flow]
Demand Paging: page fault → call trace - Anonymous page (MAP_PRIVATE)
Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE)
[Figure: page table walk for Linear Address: 0x7ffff7ff9000 together with the regions .text, .data, …, heap, mmap and stack; PTE #505 belongs to the mmap region. Legend: allocated pages or page table entry; will be allocated if page fault occurs]
Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE) – Read before writing data (read fault)
1. A pre-allocated zero page is initialized during system init -> Check init_zero_pfn()
2. Anonymous private page + read fault -> Link to the pre-allocated zero page
[Figure: read fault on Linear Address: 0x7ffff7ff9000; PTE #505 links to the pre-allocated (dedicated) zero page in physical memory. Legend: allocated pages or page table entry; will be allocated if page fault occurs]
Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE) – Read before writing data (read fault) -> Write fault
mmap (anonymous): write fault
1. Allocate a zeroed page
2. Link to the newly allocated page
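The userspace-visible side of this sequence can be shown with a private anonymous mapping. This is a sketch, assuming Linux and Python's mmap module; the function name is hypothetical. The faults themselves are not observable from this code; only their effect is: the first read returns zero, and the write sticks.

```python
import mmap


def read_then_write():
    """Read an untouched anonymous private page, then write to it."""
    with mmap.mmap(-1, mmap.PAGESIZE,
                   flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                   prot=mmap.PROT_READ | mmap.PROT_WRITE) as region:
        first_read = region[0]    # read before any write: content is zero
        region[0] = 0x41          # first write to the page
        second_read = region[0]   # now backed by the process's own page
        untouched = region[1]     # rest of the page is still zero
        return first_read, second_read, untouched
```

On the kernel side, the read is expected to be served by the pre-allocated zero page and the write by a newly allocated zeroed page, as the steps above describe.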
Page fault handler for userspace address
Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE & MAP_SHARED): first-time access
• Anonymous page + MAP_PRIVATE
• Anonymous page + MAP_SHARED + WRITE
• Anonymous page + MAP_SHARED + READ
[do_anonymous_page]
• Read fault: Apply the pre-allocated zero page
Demand Paging: page fault → VMAs – Anonymous page (MAP_SHARED): first-time write
Anonymous page + MAP_SHARED
Types of memory mappings: update
Reference from: Chapter 49, The Linux Programming Interface
Mapping Type (File / Anonymous), Visibility of Modification (Private / Shared), Description and Backing file
• Private file mapping (modification is not visible to other processes)
o Initializing memory from contents of file
o Example: a process's .text and .data segments
o Changes are not carried through to the underlying file
• Private anonymous mapping (modification is not visible to other processes)
o Memory allocation
o Backing file: No
• Shared file mapping (modification is visible to other processes)
o Memory-mapped IO: changes are carried through to the underlying file
o Sharing memory between processes (IPC)
• Shared anonymous mapping (modification is visible to other processes)
o Sharing memory between processes (IPC)
o Backing file: /dev/zero
Memory-mapped file
vma->vm_ops: invoke vm_ops callbacks in page fault handler
• struct vm_operations_struct
o fault: Invoked by page fault handler to read the corresponding data into a physical page.
o map_pages: Map the page if it is in page cache (check function ‘do_fault_around’: warm/cold page cache).
o page_mkwrite: Notification that a previously read-only page is about to become writable.
Memory-mapped file: related data structures
NULL: anonymous page
Page fault handler for userspace address: Memory-mapped file
Look at this first
Demand Paging: page fault → do_read_fault(): warm page cache
Memory-mapped file: read page fault
[Figure: page table walk for Linear Address: 0x7ffff7ff9000 (PML4E #255 -> PDPTE #511 -> PDE #447 -> PTE #505 mmap, memory-mapped file = page cache). The page is already in page cache (file->f_mapping); Disk is shown as the backing store. Steps are numbered 1 and 2. Legend: allocated pages or page table entry; will be allocated if page fault occurs]
Demand Paging: page fault → filemap_map_pages(): warm page cache
Memory-mapped file: read page fault
map_pages callback: map a page that is already available in the page cache into the process address space (process page table) → warm page cache
Demand Paging: page fault → call path for warm/cold page cache
do_read_fault
  do_fault_around (warm page cache)
    vmf->vma->vm_ops->map_pages
      filemap_map_pages: Find the page cache from address_space (page cache pool)
        alloc_set_pte
  __do_fault (cold page cache)
    vmf->vma->vm_ops->fault
      ext4_filemap_fault
        filemap_fault
          find_get_page
            page in page cache: do_async_mmap_readahead
            No page in page cache: do_sync_mmap_readahead
  finish_fault
    alloc_set_pte
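The warm and cold paths can be contrasted with a toy model. This is not kernel code: dictionaries stand in for the page cache, the disk and the page table, and the function only mirrors the order of the calls named above. All names and parameters are assumptions made for illustration.

```python
def do_read_fault(page_cache, disk, page_table, vaddr, index):
    """Toy model of a read fault on a file mapping; returns 'warm' or 'cold'."""
    # do_fault_around -> map_pages: succeeds only if the page is already cached.
    page = page_cache.get(index)
    if page is not None:
        page_table[vaddr] = page          # alloc_set_pte
        return "warm"
    # __do_fault -> vm_ops->fault: read the page from disk into the page cache.
    page = disk[index]
    page_cache[index] = page
    # finish_fault -> alloc_set_pte: install the mapping.
    page_table[vaddr] = page
    return "cold"
```

A first fault on an uncached page takes the cold path and fills the cache; a later fault on the same file page from a fresh page table takes the warm path and maps the very same cached object.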
warm page cache: make sure “page cache = mmap physical page” (1/3)
Page Cache - Verification
[Figure: page table walk for Linear Address: 0x7ffff7ff9000; PTE #505 (mmap, memory-mapped file = page cache) and a page already in page cache (file->f_mapping) in physical memory]
page->mapping
• [bit 0] = 1: anonymous page → mapping field = anon_vma descriptor
• [bit 0] = 0: page cache → mapping field = address_space descriptor
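The bit 0 test can be written out as follows. A sketch only: the constant mirrors the kernel's PAGE_MAPPING_ANON flag, the pointer values in the usage note are made up, and the second low flag bit the kernel also keeps in this field is ignored here.

```python
PAGE_MAPPING_ANON = 0x1   # bit 0 of page->mapping


def classify_mapping(mapping):
    """Return (descriptor kind, pointer with the flag bit cleared)."""
    if mapping & PAGE_MAPPING_ANON:
        return "anon_vma", mapping & ~PAGE_MAPPING_ANON
    return "address_space", mapping
```

For a hypothetical value 0xffff888100a3c001 the result is an anon_vma pointer 0xffff888100a3c000; a value with bit 0 clear is taken as an address_space pointer unchanged.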
warm page cache: make sure “page cache = mmap physical page” (2/3)
[Figure: state after alloc_set_pte() for Linear Address: 0x7ffff7ff9000 (PTE #505, mmap, memory-mapped file = page cache); annotation: << PAGE_SHIFT]
warm page cache: make sure “page cache = mmap physical page” (3/3)
[Figure: state after alloc_set_pte() for Linear Address: 0x7ffff7ff9000 (PTE #505, mmap, memory-mapped file = page cache); annotation: will generate a write fault for next write]
write fault (write-protected fault)
Memory-mapped file: write fault
write fault (write-protected fault): do_wp_page
Memory-mapped file: write fault
[MAP_PRIVATE] COW: Quote from `man mmap`
write fault (write-protected fault) – call path
do_wp_page
  wp_page_copy: [MAP_PRIVATE] COW: Copy On Write
    new_page = alloc_page_vma(…)
    cow_user_page
      copy_user_highpage
    maybe_mkwrite
  wp_page_shared: [MAP_SHARED] vma is (VM_WRITE|VM_SHARED)
write fault without previously reading (MAP_SHARED): do_shared_fault
Memory-mapped file: write fault
do_shared_fault
  __do_fault
    vmf->vma->vm_ops->fault
      ext4_filemap_fault
        filemap_fault
          find_get_page
            page in page cache: do_async_mmap_readahead
            No page in page cache: do_sync_mmap_readahead
  do_page_mkwrite
  finish_fault
    alloc_set_pte
  fault_dirty_shared_page
write fault without previously reading (MAP_PRIVATE): do_cow_fault
Memory-mapped file: write fault
Quote from `man mmap`
do_cow_fault
  vmf->cow_page = alloc_page_vma(…)
  __do_fault
    vmf->vma->vm_ops->fault
      ext4_filemap_fault
        filemap_fault
          find_get_page
            page in page cache (COW): do_async_mmap_readahead
            No page in page cache: do_sync_mmap_readahead
  copy_user_highpage(vmf->cow_page, vmf->page, …)
  finish_fault
    1. Apply vmf->page if not a COW page
    2. Apply vmf->cow_page if a COW page
    alloc_set_pte
[Figures: breakpoints 1 to 4 marked along this call path]
[Recap] Think about this: #1
Memory-mapped file: write fault, read fault
Think about this: #2
fork(): Which function (do_cow_fault or do_wp_page) is called when COW is triggered?
Think about this…
this one?
or, this one?
fork(): COW mapping – set ‘write protect’ for src/dst PTEs
fork(): COW Fault Call Path
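The effect of the COW configuration after fork() can be observed from userspace. This is a sketch, assuming Linux and Python's os.fork and mmap modules; the function name is hypothetical. The parent populates a page, the child writes to it, and the parent reports what it sees afterwards.

```python
import mmap
import os


def child_write_visible(flags):
    """Fork, let the child write to an anonymous mapping, return the parent's view."""
    region = mmap.mmap(-1, mmap.PAGESIZE, flags=flags | mmap.MAP_ANONYMOUS,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
    region[0:6] = b"parent"          # populate the page before fork()
    pid = os.fork()
    if pid == 0:                     # child process
        try:
            region[0:6] = b"child!"  # private case: write hits a write-protected PTE
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    seen = bytes(region[0:6])
    region.close()
    return seen
```

With MAP_PRIVATE the parent still reads b"parent", because the child's write went to its own copy. With MAP_SHARED the parent reads b"child!". In the private case the PTE is already present and only write-protected, so the write fault is expected to be handled by do_wp_page rather than do_cow_fault.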
Backup
vdso & vvar
• vsyscall (Virtual System Call)
o The context switch overhead (user <-> kernel) of some system calls (gettimeofday, time, getcpu) is greater than the execution time of those functions
o Built on top of the fixed-mapped address
o Machine code format
o Core dump: debugger cannot provide the debugging info because symbols of this area are unavailable
• vDSO (Virtual Dynamic Shared Object)
o ELF format
o A small shared library that the kernel automatically maps into the address space of all userspace applications
• VVAR (vDSO Variable)
Hsien-Hsin Sean Lee, Ph.D.
 
cPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemycPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB Alchemy
Ryan Robson
 
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...
The Linux Foundation
 
Ad

Recently uploaded (20)

Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage DashboardsAdobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
BradBedford3
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025
mu394968
 
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Dele Amefo
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
Adobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest VersionAdobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest Version
kashifyounis067
 
Who Watches the Watchmen (SciFiDevCon 2025)
Who Watches the Watchmen (SciFiDevCon 2025)Who Watches the Watchmen (SciFiDevCon 2025)
Who Watches the Watchmen (SciFiDevCon 2025)
Allon Mureinik
 
Xforce Keygen 64-bit AutoCAD 2025 Crack
Xforce Keygen 64-bit AutoCAD 2025  CrackXforce Keygen 64-bit AutoCAD 2025  Crack
Xforce Keygen 64-bit AutoCAD 2025 Crack
usmanhidray
 
EASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License CodeEASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License Code
aneelaramzan63
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
ssuserb14185
 
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
wareshashahzadiii
 
Agentic AI Use Cases using GenAI LLM models
Agentic AI Use Cases using GenAI LLM modelsAgentic AI Use Cases using GenAI LLM models
Agentic AI Use Cases using GenAI LLM models
Manish Chopra
 
Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025
kashifyounis067
 
Adobe Photoshop CC 2025 Crack Full Serial Key With Latest
Adobe Photoshop CC 2025 Crack Full Serial Key  With LatestAdobe Photoshop CC 2025 Crack Full Serial Key  With Latest
Adobe Photoshop CC 2025 Crack Full Serial Key With Latest
usmanhidray
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
FL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full VersionFL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full Version
tahirabibi60507
 
Download YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full ActivatedDownload YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full Activated
saniamalik72555
 
The Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdfThe Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdf
drewplanas10
 
Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage DashboardsAdobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
BradBedford3
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025
mu394968
 
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Dele Amefo
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
Adobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest VersionAdobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest Version
kashifyounis067
 
Who Watches the Watchmen (SciFiDevCon 2025)
Who Watches the Watchmen (SciFiDevCon 2025)Who Watches the Watchmen (SciFiDevCon 2025)
Who Watches the Watchmen (SciFiDevCon 2025)
Allon Mureinik
 
Xforce Keygen 64-bit AutoCAD 2025 Crack
Xforce Keygen 64-bit AutoCAD 2025  CrackXforce Keygen 64-bit AutoCAD 2025  Crack
Xforce Keygen 64-bit AutoCAD 2025 Crack
usmanhidray
 
EASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License CodeEASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License Code
aneelaramzan63
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
ssuserb14185
 
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
Minitab 22 Full Crack Plus Product Key Free Download [Latest] 2025
wareshashahzadiii
 
Agentic AI Use Cases using GenAI LLM models
Agentic AI Use Cases using GenAI LLM modelsAgentic AI Use Cases using GenAI LLM models
Agentic AI Use Cases using GenAI LLM models
Manish Chopra
 
Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025
kashifyounis067
 
Adobe Photoshop CC 2025 Crack Full Serial Key With Latest
Adobe Photoshop CC 2025 Crack Full Serial Key  With LatestAdobe Photoshop CC 2025 Crack Full Serial Key  With Latest
Adobe Photoshop CC 2025 Crack Full Serial Key With Latest
usmanhidray
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
FL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full VersionFL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full Version
tahirabibi60507
 
Download YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full ActivatedDownload YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full Activated
saniamalik72555
 
The Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdfThe Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdf
drewplanas10
 
Ad

Memory Mapping Implementation (mmap) in Linux Kernel

  • 1. Memory Mapping Implementation (mmap) in Linux Kernel Adrian Huang | May, 2022 * Based on kernel 5.11 (x86_64) – QEMU * SMP (4 CPUs) and 8GB memory * Kernel parameter: nokaslr norandmaps * Userspace: ASLR is disabled * Legacy BIOS
  • 2. Agenda • Three different IO types • Process Address Space – mm_struct & VMA • Four types of memory mappings • mmap system call implementation • Demand page: page fault handling for four types of memory mappings • fork() • COW mapping configuration: set ‘write protect’ when calling fork() • COW fault call path
  • 3. Three different IO types • Buffered IO o Leverage page cache • Direct IO o Bypass page cache • Memory-mapped IO (file-based mapping, memory-mapped file) • File content can be accessed by operations on the bytes in the corresponding memory region.
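As a rough userspace illustration of the difference between buffered IO and memory-mapped IO, the sketch below reads the same bytes both ways. It assumes a Unix-like system; the helper name `read_both_ways` and the temporary file are invented for this example. Direct IO is left out because it needs `O_DIRECT` and aligned buffers.

```python
import mmap
import os
import tempfile


def read_both_ways(payload: bytes = b"hello, page cache\n"):
    """Read the same file bytes with buffered IO and with memory-mapped IO."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.fsync(fd)

        # Buffered IO: pread() copies data from the page cache into a user buffer.
        buffered = os.pread(fd, len(payload), 0)

        # Memory-mapped IO: the file's pages are mapped into the address space
        # and accessed like ordinary bytes in memory.
        with mmap.mmap(fd, len(payload), prot=mmap.PROT_READ) as mapping:
            mapped = mapping[:]
        return buffered, mapped
    finally:
        os.close(fd)
        os.unlink(path)
```

Both return values should hold the same bytes; only the access path differs.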
  • 4. Process Address Space – mm_struct & VMA (1/2)
  • 5. Process Address Space – mm_struct & VMA (2/2)
  • 6. Four types of memory mappings (reference: Chapter 49, The Linux Programming Interface). Mappings are classified by mapping type (file or anonymous) and by visibility of modification (private or shared): o Private file mapping (modification is not visible to other processes): initializes memory from the contents of a file; changes are not carried through to the underlying file. Example: a process's .text and .data segments o Private anonymous mapping: memory allocation o Shared file mapping (modification is visible to other processes): 1. Memory-mapped IO: changes are carried through to the underlying file 2. Sharing memory between processes (IPC) o Shared anonymous mapping: sharing memory between processes (IPC)
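The private/shared distinction for file mappings can be observed from userspace. The sketch below is a minimal illustration, assuming a Unix-like system; `modify_through_mapping` is a hypothetical helper that writes through a mapping and then reads the file back with `pread()`.

```python
import mmap
import os
import tempfile


def modify_through_mapping(flags: int) -> bytes:
    """Overwrite bytes via a file mapping, then return the file contents."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        with mmap.mmap(fd, 8, flags=flags,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as mapping:
            mapping[0:8] = b"modified"
            if flags & mmap.MAP_SHARED:
                mapping.flush()  # msync(): write dirty shared pages back
        return os.pread(fd, 8, 0)
    finally:
        os.close(fd)
        os.unlink(path)
```

With `mmap.MAP_SHARED` the file should read back as `b"modified"`; with `mmap.MAP_PRIVATE` it should still read `b"original"`, since the write only touched a private copy.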
  • 7. mmap system call implementation (1/2)
  • 8. mmap system call implementation (2/2) 1. thp_get_unmapped_area() eventually calls arch_get_unmapped_area_topdown() 2. unmapped_area_topdown() is the key point for allocating a userspace address
  • 9. inode_operations & file_operations: file thp_get_unmapped_area(): DAX supported for huge page
  • 11. Process address space: doubly linked list of VMAs, covering user space addresses 0 to 0x7ffffffff000; GAP marks unmapped space between VMAs that are not adjacent o vma: vm_start = 0x400000, vm_end = 0x401000 o vma: vm_start = 0x401000, vm_end = 0x496000 o vma: vm_start = 0x496000, vm_end = 0x4bd000 o vma: vm_start = 0x4be000, vm_end = 0x4c1000 o vma: vm_start = 0x4c1000, vm_end = 0x4c4000 o vma (heap): vm_start = 0x4c4000, vm_end = 0x4e8000 o vma (vvar): vm_start = 0x7ffff7ffa000, vm_end = 0x7ffff7ffe000 o vma (vdso): vm_start = 0x7ffff7ffe000, vm_end = 0x7ffff7fff000 o vma (stack): vm_start = 0x7ffffffde000, vm_end = 0x7ffffffff000
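On a running Linux system, a comparable VMA list can be read from `/proc/<pid>/maps`. The parser below is a small sketch of that idea; the names `Vma`, `parse_maps` and `gaps` are made up for this example, and only the first word of a mapping name is kept.

```python
from typing import List, NamedTuple


class Vma(NamedTuple):
    vm_start: int
    vm_end: int
    name: str


def parse_maps(text: str) -> List[Vma]:
    """Parse lines in /proc/<pid>/maps format into (vm_start, vm_end, name)."""
    vmas = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        name = fields[5] if len(fields) > 5 else ""
        vmas.append(Vma(start, end, name))
    return vmas


def gaps(vmas: List[Vma]) -> List[int]:
    """Sizes of the holes between consecutive VMAs (0 when adjacent)."""
    ordered = sorted(vmas)
    return [nxt.vm_start - cur.vm_end for cur, nxt in zip(ordered, ordered[1:])]
```

A typical call would be `parse_maps(open("/proc/self/maps").read())`, which should list entries such as `[heap]`, `[vvar]`, `[vdso]` and `[stack]` when they exist.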
  • 13. Process address space: VMA rbtree -> rb_subtree_gap • Maximum of the following items: o gap between this vma and previous one o rb_left.subtree_gap o rb_right.subtree_gap rb_subtree_gap calculation
  • 14. Process address space: VMA rbtree -> rb_subtree_gap • Maximum of the following items: o gap between this vma and previous one o rb_left.subtree_gap o rb_right.subtree_gap rb_subtree_gap calculation: max(0x7ffff7ffa000 – 0x4e8000, 0, 0) = 0x7ffff7b12000
  • 15. Process address space: VMA rbtree -> rb_subtree_gap • VM_GROWSDOWN is set in vma->vm_flags • max(0x7ffffffde000 – 0x7ffff7fff000 - stack_guard_gap, 0, 0) = 0x7edf000 • stack_guard_gap = 256UL<<PAGE_SHIFT
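The rb_subtree_gap rule on these slides can be restated as a short model. This is only a sketch: the `Node` class and its fields are invented for illustration, and the recursion recomputes the value on demand, whereas the kernel caches it in each rbtree node.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SHIFT = 12
STACK_GUARD_GAP = 256 << PAGE_SHIFT  # 0x100000, as stated on the slide


@dataclass
class Node:
    vm_start: int
    vm_end: int
    prev_end: int = 0          # vm_end of the previous VMA in address order
    grows_down: bool = False   # models VM_GROWSDOWN in vma->vm_flags
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def own_gap(node: Node) -> int:
    """Gap between this VMA and the previous one, minus any stack guard gap."""
    start = node.vm_start - (STACK_GUARD_GAP if node.grows_down else 0)
    return max(start - node.prev_end, 0)


def subtree_gap(node: Optional[Node]) -> int:
    """Largest gap in this node's subtree (0 for an empty subtree)."""
    if node is None:
        return 0
    return max(own_gap(node), subtree_gap(node.left), subtree_gap(node.right))
```

For a leaf node both child terms are 0, so the result is just the VMA's own gap, which is the case worked through for the vvar and stack VMAs.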
  • 17. unmapped_area_topdown() - Principle. [Efficiency] Traverse the vma rbtree to find gap space instead of traversing the vma linked list • vm_unmapped_area_info: o length = 0x1000 o low_limit = PAGE_SIZE o high_limit = mm->mmap_base = 0x7ffff7fff000 • unmapped_area_topdown(): Preparation o gap_end = info->high_limit (0x7ffff7fff000) o high_limit = gap_end – info->length = 0x7ffff7ffe000 o gap_start = mm->highest_vm_end = 0x7ffffffff000
  • 22. unmapped_area_topdown() Stack guard gap gap_end = 0x7ffffffde000 – 0x100000 = 0x7fffffede000 * stack_guard_gap = 256UL<<PAGE_SHIFT = 0x100000
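The top-down search can be approximated with a linear scan over a sorted VMA list. This sketch deliberately swaps the kernel's rbtree walk for a plain loop, so it shows the result rather than the efficiency trick; `find_topdown` and the tuple layout are assumptions made for this example.

```python
PAGE_SIZE = 0x1000
STACK_GUARD_GAP = 256 * PAGE_SIZE  # 0x100000


def find_topdown(vmas, length, low_limit, high_limit):
    """Return the highest start address of a free range that fits, or None.

    vmas: list of (vm_start, vm_end, grows_down) tuples.
    """
    # Upper edge of the current candidate gap. The first candidate is the
    # space above the last VMA, bounded by high_limit.
    gap_end = high_limit
    for vm_start, vm_end, grows_down in reversed(sorted(vmas)):
        gap_start = max(vm_end, low_limit)
        if gap_end >= gap_start and gap_end - gap_start >= length:
            return gap_end - length
        # Continue below this VMA, leaving a stack guard gap if needed.
        below = vm_start - (STACK_GUARD_GAP if grows_down else 0)
        gap_end = min(below, high_limit)
    if gap_end - low_limit >= length:
        return gap_end - length
    return None
```

Fed with the VMAs listed earlier in the deck, a length of 0x1000, low_limit = PAGE_SIZE and high_limit = 0x7ffff7fff000, the scan settles on the page just below the vvar VMA.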
  • 27. vma_merge() – Possible ways to merge Note: VMAs must have the same attributes
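The merge idea can be sketched on plain intervals. This is a simplified model, not the kernel's vma_merge(): a single attribute string stands in for everything the kernel compares, and `insert_with_merge` is a hypothetical helper.

```python
from typing import List, Tuple

Vma = Tuple[int, int, str]  # (vm_start, vm_end, attributes)


def insert_with_merge(vmas: List[Vma], new: Vma) -> List[Vma]:
    """Insert a non-overlapping range, merging with equal-attribute neighbours."""
    start, end, attrs = new
    result: List[Vma] = []
    for vm_start, vm_end, vm_attrs in sorted(vmas):
        if vm_attrs == attrs and vm_end == start:
            start = vm_start   # extend the new range down into its predecessor
        elif vm_attrs == attrs and vm_start == end:
            end = vm_end       # extend the new range up into its successor
        else:
            result.append((vm_start, vm_end, vm_attrs))
    result.append((start, end, attrs))
    return sorted(result)
```

A new range that touches both neighbours and shares their attributes collapses all three into one VMA; with different attributes it is inserted as a separate VMA.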
  • 28. mmap_region()->munmap_vma_range()->find_vma_links: find_vma_links(…, 0x7ffff7ff9000, …) iterates the rbtree to find an empty VMA link (rb_link). Note: • rb_parent: the parent of rb_link • rb_prev: the previous vma of rb_parent • rb_prev is used in vma_merge() • rb_link, rb_parent and rb_prev are used in vma_link() • rb_prev is the previous VMA after the new VMA is inserted
  • 30. __vma_link_file(): i_mmap for recording all memory mappings (vma) of the memory-mapped file [Figure: task_struct (mm, files) -> files_struct (fd_array[]) -> file (f_inode, f_pos, f_mapping, f_path: mnt, dentry) -> inode (i_mapping, i_atime, i_mtime, i_ctime, i_data) -> address_space (host, page_tree, i_mmap). page_tree holds the page cache pages (radix tree in v4.19 or earlier, Xarray in v4.20 or later; the example tree has height 2 and pages at index 1, 3 and 194). i_mmap is an interval tree implemented via a red-black tree and points at vm_area_struct entries (vm_mm, vm_ops, vm_file) of the mm_struct (mmap, get_unmapped_area, mmap_base, pgd).] RMAP: 1. i_mmap: reverse mapping (RMAP) for the memory-mapped file (page cache) -> check __vma_link_file() 2. anon_vma: reverse mapping for anonymous pages -> anon_vma_prepare() is invoked during page fault
  • 32. Anonymous Page: Page Fault – Discussion List • Private (MAP_PRIVATE) • Write • Read before a write • Shared (MAP_SHARED)
  • 33. Demand Paging: page fault – Anonymous page (MAP_PRIVATE)
  • 34. Demand Paging: page fault → Page table configuration - Anonymous page (MAP_PRIVATE) [Figure: four-level page table walk for linear address 0x7ffff7ff9000, from task_struct -> mm (mm_struct) -> pgd down to physical memory. Address bits: 63:48 sign-extend, 47:39 Page Map Level-4 offset, 38:30 Page Directory Pointer offset, 29:21 Page Directory offset, 20:12 Page Table offset, 11:0 page offset. Entries shown: PML4E #0 -> PDPTE #0 -> PDE #2 -> PTE #0 … (.text, .data, …, heap); PML4E #255 -> PDPTE #511 -> PDE #511 -> PTE #478 … PTE #510 (stack); PML4E #255 -> PDPTE #511 -> PDE #447 -> PTE #505 (mmap); PML4E for kernel. Legend: allocated pages or page table entries; entries that will be allocated if a page fault occurs.]
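The entry numbers on this slide follow from the linear address. The helper below is a small sketch of x86_64 four-level paging with 4 KiB pages; the function name `page_table_indices` is invented here.

```python
def page_table_indices(addr: int):
    """Split a 48-bit x86_64 linear address into 4-level page-table indices."""
    pml4 = (addr >> 39) & 0x1FF   # bits 47:39, Page Map Level-4 offset
    pdpt = (addr >> 30) & 0x1FF   # bits 38:30, Page Directory Pointer offset
    pd = (addr >> 21) & 0x1FF     # bits 29:21, Page Directory offset
    pt = (addr >> 12) & 0x1FF     # bits 20:12, Page Table offset
    offset = addr & 0xFFF         # bits 11:0, offset within the 4 KiB page
    return pml4, pdpt, pd, pt, offset
```

For 0x7ffff7ff9000 this yields indices 255, 511, 447 and 505, matching the PML4E, PDPTE, PDE and PTE numbers drawn for the mmap region.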
  • 35. Demand Paging: page fault → Page table configuration [anonymous page: MAP_PRIVATE] page fault flow
  • 36. man mmap [gdb] backtrace Demand Paging: page fault → Page table configuration - Anonymous page (MAP_PRIVATE) [anonymous page: MAP_PRIVATE] page fault flow
  • 37. Demand Paging: page fault → Page table configuration - Anonymous page (MAP_PRIVATE) [Figure: four-level page table walk for linear address 0x7ffff7ff9000 (PML4E #255 -> PDPTE #511 -> PDE #447 -> PTE #505, mmap region), from task_struct -> mm_struct -> pgd down to physical memory. Legend: allocated pages or page table entries; entries that will be allocated if a page fault occurs.]
  • 38. Demand Paging: page fault → call trace - Anonymous page (MAP_PRIVATE)
  • 39. Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE) [Figure: four-level page table walk for linear address 0x7ffff7ff9000 (PML4E #255 -> PDPTE #511 -> PDE #447 -> PTE #505, mmap region), from task_struct -> mm_struct -> pgd down to physical memory.]
  • 40. Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE) – Read before writing data (read fault) [Figure: the same page table walk, with a read fault on PTE #505 and a pre-allocated zero page in physical memory.] 1. A pre-allocated zero page is initialized during system init -> check init_zero_pfn() 2. Anonymous private page + read fault -> link to the pre-allocated zero page
  • 41. Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE) – Read before writing data (read fault) [Figure: the same page table walk; PTE #505 is now linked to the pre-allocated zero page.] 1. A pre-allocated zero page is initialized during system init -> check init_zero_pfn() 2. Anonymous private page + read fault -> link to the pre-allocated zero page
  • 42. Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE) – Read before writing data (read fault) -> Write fault [Figure: the same page table walk, showing the dedicated zero page and a new page for the anonymous mmap region.] On the write fault: 1. Allocate a zeroed page 2. Link to the newly allocated page
  • 43. Page fault handler for userspace address
  • 44. Demand Paging: page fault → VMAs – Anonymous page (MAP_PRIVATE & MAP_SHARED): first-time access Anonymous page + MAP_PRIVATE Anonymous page + MAP_SHARED + WRITE Anonymous page + MAP_SHARED + READ [do_anonymous_page] • Read fault: Apply the pre-allocated zero page
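From userspace, the read-then-write sequence on a private anonymous mapping looks as follows. This is a sketch assuming a Unix-like system; the script cannot observe which physical page backs the mapping, so the comments describe the kernel behaviour laid out on the slides rather than anything the code verifies.

```python
import mmap


def touch_anonymous_private(length: int = mmap.PAGESIZE):
    """Read, then write, a fresh private anonymous mapping."""
    with mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE,
                   prot=mmap.PROT_READ | mmap.PROT_WRITE) as mapping:
        first_read = mapping[0]    # read fault: the page reads as zero
        mapping[0] = 0x41          # write fault: a private page is installed
        after_write = mapping[0]
    return first_read, after_write
```

The first read should return 0 and the read after the write should return 0x41.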
  • 45. Demand Paging: page fault → VMAs – Anonymous page (MAP_SHARED): first-time write
  • 46. Demand Paging: page fault → VMAs – Anonymous page (MAP_SHARED): first-time write Anonymous page + MAP_SHARED
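A shared anonymous mapping is typically used between related processes. The sketch below, assuming a Unix-like system with `fork()`, lets a child write one byte and checks whether the parent sees it; `child_write_visible_to_parent` is a made-up helper name.

```python
import mmap
import os


def child_write_visible_to_parent(flags: int) -> bool:
    """Fork, let the child write one byte, report whether the parent sees it."""
    mapping = mmap.mmap(-1, mmap.PAGESIZE, flags=flags,
                        prot=mmap.PROT_READ | mmap.PROT_WRITE)
    try:
        pid = os.fork()
        if pid == 0:               # child process
            mapping[0] = 0x7F
            os._exit(0)
        os.waitpid(pid, 0)         # parent: wait until the child has written
        return mapping[0] == 0x7F
    finally:
        mapping.close()
```

With `mmap.MAP_SHARED` the parent should see the child's byte; with `mmap.MAP_PRIVATE` it should not.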
  • 47. Types of memory mappings: update (reference: Chapter 49, The Linux Programming Interface) o Private file mapping (modification is not visible to other processes): initializes memory from the contents of a file; changes are not carried through to the underlying file. Example: a process's .text and .data segments o Private anonymous mapping: memory allocation; backing file: no o Shared file mapping (modification is visible to other processes): 1. Memory-mapped IO: changes are carried through to the underlying file 2. Sharing memory between processes (IPC) o Shared anonymous mapping: sharing memory between processes (IPC); backing file: /dev/zero
  • 48. Memory-mapped file vma->vm_ops: invoke vm_ops callbacks in page fault handler • struct vm_operations_struct o fault : Invoked by page fault handler to read the corresponding data into a physical page. o map_pages : Map the page if it is in page cache (check function ‘do_fault_around’: warm/cold page cache). o page_mkwrite: Notification that a previously read-only page is about to become writable.
  • 49. Memory-mapped file: related data structures NULL: anonymous page
  • 50. Page fault handler for userspace address: Memory-mapped file (look at this first)
  • 51. Demand Paging: page fault → do_read_fault(): warm page cache (memory-mapped file, read page fault) [Figure: four-level page table walk for linear address 0x7ffff7ff9000 (PML4E #255 -> PDPTE #511 -> PDE #447 -> PTE #505). The mmap region is a memory-mapped file, so its physical page is the page cache page; the page is already in the page cache (file->f_mapping) and the disk is not read. Steps are numbered 1 and 2. Legend: allocated pages or page table entries; entries that will be allocated if a page fault occurs.]
  • 52. Demand Paging: page fault → do_read_fault(): warm page cache (memory-mapped file, read page fault) [Figure: same diagram as the previous slide.]
  • 53. Demand Paging: page fault → filemap_map_pages(): warm page cache read page fault 1 2 map_pages callback: map a page cache (already available) to process address space (process page table) → warm page cache Memory-mapped file
  • 54. Demand Paging: page fault → call path for warm/cold page cache. do_read_fault: o warm page cache: do_fault_around -> vmf->vma->vm_ops->map_pages -> filemap_map_pages (find the page cache from address_space, the page cache pool) -> alloc_set_pte o cold page cache: __do_fault -> vmf->vma->vm_ops->fault -> ext4_filemap_fault -> filemap_fault -> find_get_page (page in page cache: do_async_mmap_readahead; no page in page cache: do_sync_mmap_readahead), then finish_fault -> alloc_set_pte
  • 55. warm page cache: make sure “page cache = mmap physical page” (1/3) [Figure: four-level page table walk for linear address 0x7ffff7ff9000 (PML4E #255 -> PDPTE #511 -> PDE #447 -> PTE #505); the mmap region is a memory-mapped file whose page is already in the page cache (file->f_mapping).] Page cache verification via page->mapping: • [bit 0] = 1: anonymous page → mapping field = anon_vma descriptor • [bit 0] = 0: page cache → mapping field = address_space descriptor
  • 56. warm page cache: make sure “page cache = mmap physical page” (2/3) [Figure: the same page table walk, with the state after alloc_set_pte(); a value is shifted by << PAGE_SHIFT to obtain the physical address.]
  • 57. warm page cache: make sure “page cache = mmap physical page” (3/3) [Figure: the same page table walk, with the state after alloc_set_pte(); the mapping will generate a write fault on the next write.]
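The bit-0 check on page->mapping can be written out as a tiny decoder. This sketch covers only the bit described on the slide and ignores any other low bits the kernel may use; the pointer values in the usage note are hypothetical.

```python
PAGE_MAPPING_ANON = 0x1  # bit 0 of page->mapping, as described on the slide


def decode_mapping(mapping: int):
    """Return (kind, pointer) for a raw page->mapping value."""
    if mapping & PAGE_MAPPING_ANON:
        # Anonymous page: clear the tag bit to recover the anon_vma pointer.
        return "anon_vma", mapping & ~PAGE_MAPPING_ANON
    # Page cache page: the value is the address_space pointer itself.
    return "address_space", mapping
```

For example, a made-up value such as 0xffff888100000001 decodes as an anon_vma pointer at 0xffff888100000000, while 0xffff888100000000 decodes as an address_space pointer.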
  • 58. write fault (write-protected fault) write fault Memory-mapped file
  • 59. write fault (write-protected fault): do_wp_page (memory-mapped file). do_wp_page: o [MAP_PRIVATE] COW (Copy On Write; the slide quotes `man mmap`): wp_page_copy -> new_page = alloc_page_vma(…) -> cow_user_page -> copy_user_highpage -> maybe_mkwrite o [MAP_SHARED] vma is (VM_WRITE|VM_SHARED): wp_page_shared
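The MAP_PRIVATE branch can be exercised from userspace by reading a private file mapping before writing to it. This is a minimal sketch assuming a Unix-like system; the comments name the kernel paths from the slide as the expected route, which the script itself cannot confirm.

```python
import mmap
import os
import tempfile


def private_cow_after_read():
    """Read-fault a private file mapping, then write to it and compare views."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"file data")
        with mmap.mmap(fd, 9, flags=mmap.MAP_PRIVATE,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as private:
            before = private[0:9]      # read fault: file page mapped read-only
            private[0:4] = b"COPY"     # write-protect fault: page copied first
            private_view = private[0:9]
        file_view = os.pread(fd, 9, 0)  # the file keeps the original bytes
        return before, private_view, file_view
    finally:
        os.close(fd)
        os.unlink(path)
```

The mapping should show the modified bytes while the file still holds the original contents.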
  • 60. write fault (write-protected fault) – call path
  • 61. write fault without previously reading (MAP_SHARED): do_shared_fault write fault Memory-mapped file
  • 62. write fault without previously reading (MAP_SHARED): do_shared_fault. do_shared_fault: o __do_fault -> vmf->vma->vm_ops->fault -> ext4_filemap_fault -> filemap_fault -> find_get_page (page in page cache: do_async_mmap_readahead; no page in page cache: do_sync_mmap_readahead) o do_page_mkwrite o finish_fault -> alloc_set_pte o fault_dirty_shared_page
  • 63. write fault without previously reading (MAP_SHARED): do_shared_fault [Figure: same call path as the previous slide, with the "page in page cache" branch labelled "page in page cache: COW".]
  • 64. write fault without previously reading (MAP_PRIVATE): do_cow_fault write fault Memory-mapped file
  • 65. write fault without previously reading (MAP_PRIVATE): do_cow_fault (the slide quotes `man mmap`). do_cow_fault: o vmf->cow_page = alloc_page_vma(…) o __do_fault -> vmf->vma->vm_ops->fault -> ext4_filemap_fault -> filemap_fault -> find_get_page (page in page cache: do_async_mmap_readahead; no page in page cache: do_sync_mmap_readahead) o copy_user_highpage(vmf->cow_page, vmf->page, …) o finish_fault -> alloc_set_pte: 1. Apply vmf->page if not a COW page 2. Apply vmf->cow_page if a COW page
  • 66. write fault without previously reading (MAP_PRIVATE): do_cow_fault [Figure: the do_cow_fault call path from the previous slide with breakpoints numbered 1 to 4; the "page in page cache" branch is labelled COW.]
  • 67. write fault without previously reading (MAP_PRIVATE): do_cow_fault [Figure: the same call path with breakpoints numbered 1 to 4; this frame carries an extra marker 2.]
  • 68. write fault without previously reading (MAP_PRIVATE): do_cow_fault [Figure: the same call path with breakpoints numbered 1 to 4.]
  • 69. [Recap] Think about this: #1 write fault read fault 1 1 2 2 Memory-mapped file
  • 70. Think about this: #2 fork(): Which function (do_cow_fault or do_wp_page) is called when COW is triggered? Think about this… this one? or, this one?
  • 71. fork(): COW mapping – set ‘write protect’ for src/dst PTEs
  • 72. fork(): COW mapping – set ‘write protect’ for src/dst PTEs
  • 73. fork(): COW Fault Call Path
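The effect of the write-protected PTEs after fork() can be seen with a private anonymous page that is populated before forking. This is a sketch assuming a Unix-like system; `fork_cow_demo` is an invented helper. Because the page is already present and only write-protected after fork(), the child's write is expected to take the write-protect path (do_wp_page) rather than do_cow_fault, though the script cannot observe that directly.

```python
import mmap
import os


def fork_cow_demo():
    """Return (child_saw, child_wrote, parent_sees) for a private anonymous page."""
    mapping = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_PRIVATE,
                        prot=mmap.PROT_READ | mmap.PROT_WRITE)
    mapping[0:6] = b"parent"            # populate the page before fork()
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:                        # child process
        os.close(read_end)
        saw = mapping[0:6]              # still the page inherited from the parent
        mapping[0:6] = b"child!"        # write fault: the child gets its own copy
        os.write(write_end, saw + mapping[0:6])
        os._exit(0)
    os.close(write_end)
    report = os.read(read_end, 12)      # the child's 12-byte report
    os.close(read_end)
    os.waitpid(pid, 0)
    parent_sees = mapping[0:6]
    mapping.close()
    return report[:6], report[6:], parent_sees
```

The child should first see the parent's bytes, then its own, while the parent's view stays unchanged.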
  • 75. vdso & vvar • vsyscall (Virtual System Call) o For some system calls (gettimeofday, time, getcpu), the context switch overhead (user <-> kernel) is greater than the execution time of the function itself; vsyscall is built on top of a fixed-mapped address o Machine code format o Core dump: a debugger cannot provide debugging info because symbols for this area are unavailable • vDSO (Virtual Dynamic Shared Object) o ELF format o A small shared library that the kernel automatically maps into the address space of all userspace applications • VVAR (vDSO Variable)