Linux kernel TLV meetup, 28.02.2016
kerneltlv.com
Kfir Gollan
 What is the Berkeley Packet Filter?
 Writing BPF filters
 Debugging BPF
 Using BPF in user-space applications
 Advanced features of BPF
 BPF at its base is a way to perform fast packet filtering
at the kernel level.
 Filters are defined by user space
 Filters are executed in the kernel
 Invented by Steven McCanne and Van Jacobson in
1990. First published in December 1992.
 Support for BPF in Linux was added by Jay Schulist for
the 2.1.75 development kernel (1997).
 Many features were added later on.
For example JIT for BPF was added for the 3.0 kernel
(2011)
 We want to be able to filter packets at the kernel level
 Discard irrelevant packets at the kernel without copying
them to user space.
 The performance gains when using promiscuous mode
are substantial.
 We want to change filters dynamically without
recompiling the kernel or using a custom kernel
module.
 We want the filters to be architecture independent.
Berkeley Packet Filters
 BPF defines a set of operations that can be performed
on the filtered packet. Each operation gets its own
opcode.
 BPF was designed to be protocol independent; as such, it
treats packets as raw buffers. It is up to the filter
writer to parse the needed packet headers.
 BPF is based on three building blocks:
 A: A 32 bit wide accumulator
 X: A 32 bit wide index register
 M[]: 16 x 32 bit wide misc registers aka “scratch
memory”
 LOAD: copy a value into the accumulator or index
register.
 STORE: copy either the accumulator or index
register to the scratch memory.
 ALU: perform arithmetic or logic operation on
the accumulator register.
 BRANCH: alter the flow of control.
 RETURN: terminate the filter and indicate what
portion of the packet to save.
 MISC: various operations that don't match the
other types.
opcode: 16 | jt: 8 | jf: 8 | k: 32
 Each instruction is represented by 64 bits.
 opcode 16 bits, indicates the instruction type.
 jt 8 bits, offset of the next instruction for true
case (“if” block) – Jump True.
 jf 8 bits, offset of the next instruction for false
case (“else” block) – Jump False.
 k 32 bits, generic data, used for various
purposes.
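The 64-bit layout above can be pictured as a plain C struct. This is only a sketch: the names bpf_insn_sketch and make_insn are made up for illustration and are not kernel identifiers.

```c
#include <stdint.h>

/* Hypothetical mirror of one classic BPF instruction (illustrative names). */
struct bpf_insn_sketch {
    uint16_t opcode; /* instruction type */
    uint8_t  jt;     /* offset to skip when the test is true */
    uint8_t  jf;     /* offset to skip when the test is false */
    uint32_t k;      /* generic data, meaning depends on the opcode */
};

/* 16 + 8 + 8 + 32 bits: the fields pack into exactly 8 bytes. */
_Static_assert(sizeof(struct bpf_insn_sketch) == 8,
               "a BPF instruction must be 64 bits wide");

/* Build one instruction from its four fields. */
static struct bpf_insn_sketch make_insn(uint16_t opcode, uint8_t jt,
                                        uint8_t jf, uint32_t k)
{
    struct bpf_insn_sketch insn = { opcode, jt, jf, k };
    return insn;
}
```

Because jt and jf are only 8 bits wide, a conditional jump can skip at most 255 instructions.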
Instruction   Description
ld            Load word into A
ldh           Load half-word into A
ldb           Load byte into A
ldx           Load word into X
ldxb          Load byte into X
st            Store A into M[]
stx           Store X into M[]
jmp           Jump to label
jeq           Jump on A == k
jgt           Jump on A > k
jge           Jump on A >= k
jset          Jump on A & k
Instruction   Description
add           A + <x>
sub           A - <x>
mul           A * <x>
div           A / <x>
and           A & <x>
or            A | <x>
xor           A ^ <x>
lsh           A << <x>
rsh           A >> <x>
ret           Return
tax           Copy A into X
txa           Copy X into A
Mode              Description
#k                The literal value stored in k.
#len              The length of the packet.
M[k]              The word at offset k in the scratch memory store.
[k]               The byte, half-word or word at byte offset k in the packet.
[x + k]           The byte, half-word or word at byte offset x+k in the packet.
L                 Jump to label L.
#k, Lt, Lf        Jump to Lt if true, otherwise jump to Lf.
x                 The index register.
4 * ([k] & 0xf)   Four times the value of the low four bits of the byte at offset k in the packet.
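The last addressing mode is easiest to read as arithmetic. When k is the offset of the first byte of an IPv4 header, the low four bits of that byte hold the header length in 32-bit words, so the mode yields the header length in bytes. A minimal sketch, with msh_value as a made-up helper name:

```c
#include <stddef.h>
#include <stdint.h>

/* Computes 4 * ([k] & 0xf) for the byte at offset k of the packet.
 * The caller must make sure that k is inside the packet buffer. */
static uint32_t msh_value(const uint8_t *pkt, size_t k)
{
    return 4u * (pkt[k] & 0x0fu);
}
```

For an Ethernet frame the IPv4 header starts at offset 14, so a first header byte of 0x45 (version 4, header length 5 words) gives 20.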
 MAC header
 IP header
Destination MAC   Source MAC   Type
6 bytes           6 bytes      2 bytes
 To catch all the IP packets over MAC we need to check
the type field in the MAC header.
ldh [12]
jeq #ETHERTYPE_IP, L1, L2
L1: ret #-1
L2: ret #0
 Load half-word (2 bytes) from offset 12 of the packet
(the type field) into the A register.
 Check if the A register is equal to #ETHERTYPE_IP
 If true -> return #-1 (keep the whole packet)
 If false -> return #0 (drop the packet)
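To make the control flow concrete, here is a toy interpreter for just the three opcodes this filter uses. It is a sketch for illustration, not the kernel's implementation: struct insn, run_filter and ip_only are made-up names, and ETHERTYPE_IP is written out as 0x0800.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as bpf_asm prints them for the instructions used here. */
enum { OP_LDH_ABS = 0x28, OP_JEQ_K = 0x15, OP_RET_K = 0x06 };

struct insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

/* Runs prog over pkt; returns the RET value, or 0 (drop) on any error. */
static uint32_t run_filter(const struct insn *prog, size_t n,
                           const uint8_t *pkt, size_t len)
{
    uint32_t a = 0;                          /* accumulator A */
    for (size_t pc = 0; pc < n; pc++) {
        const struct insn *i = &prog[pc];
        switch (i->code) {
        case OP_LDH_ABS:                     /* A = big-endian half-word at [k] */
            if ((size_t)i->k + 2 > len)
                return 0;                    /* load past the end of the packet */
            a = ((uint32_t)pkt[i->k] << 8) | pkt[i->k + 1];
            break;
        case OP_JEQ_K:                       /* skip jt or jf instructions */
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case OP_RET_K:                       /* number of bytes to keep */
            return i->k;
        default:
            return 0;                        /* opcode not handled by this sketch */
        }
    }
    return 0;                                /* ran off the end of the program */
}

/* ldh [12]; jeq #0x800, L1, L2; L1: ret #-1; L2: ret #0 */
static const struct insn ip_only[] = {
    { OP_LDH_ABS, 0, 0, 12 },
    { OP_JEQ_K,   0, 1, 0x0800 },
    { OP_RET_K,   0, 0, 0xffffffff },
    { OP_RET_K,   0, 0, 0 },
};
```

Note that jt and jf are counted from the instruction following the jump, which is why label L1 becomes offset 0 and L2 becomes offset 1.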
    ldh [12]                   ; A = ether.type
    jeq #ETHERTYPE_IP, L1, L3  ; A == #ETHERTYPE_IP ?
L1: ld [26]                    ; A = ip.src
    and #0xffffff00            ; A = A & 0xffffff00
    jeq #0x80037000, L2, L3    ; A == 128.3.112.0 ?
L2: ret #-1
L3: ret #0
 Check if the packet type is IP
 “Remove” the lower byte of the src IP
 Check if the src IP matches 128.3.112.X
 bpf_asm: a utility used to create bpf binary code (a bpf
"assembly" compiler).
 Part of the mainline kernel. tools/net/bpf_asm.c
 Supports two output formats
 c style output
{ 0x28, 0, 0, 0x0000000c },
{ 0x15, 0, 1, 0x00000800 },
{ 0x06, 0, 0, 0xffffffff },
{ 0x06, 0, 0, 0000000000 },
 raw output
4,40 0 0 12,21 0 1 2048,6 0 0 4294967295,6 0 0 0,
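The numeric opcodes in the c style output are OR-ed together from a class, a size and a mode. As a sketch, the same four instructions can be written with the BPF_STMT and BPF_JUMP helper macros from <linux/filter.h>; the array name ip_only_filter is made up for this example.

```c
#include <linux/filter.h>

/* The same program as the bpf_asm output, built from opcode parts. */
static const struct sock_filter ip_only_filter[] = {
    /* ldh [12]            -> code 0x28 */
    BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),
    /* jeq #0x800, L1, L2  -> code 0x15, jt 0, jf 1 */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 1),
    /* L1: ret #-1         -> code 0x06 */
    BPF_STMT(BPF_RET | BPF_K, 0xffffffff),
    /* L2: ret #0          -> code 0x06 */
    BPF_STMT(BPF_RET | BPF_K, 0),
};
```

The raw output carries the same fields in decimal: 40 is 0x28, 21 is 0x15, 2048 is 0x800 and 4294967295 is 0xffffffff.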
 bpf_dbg: a utility used to debug bpf filters
 Part of the mainline kernel. tools/net/bpf_dbg.c
 Main features
 pcap files as input for filters.
 bpf-asm raw output format for bpf filter definition.
 single-stepping through filters
 breakpoints
 internal status (A,X,M, PC)
 disassemble raw bpf to bpf-asm
struct sock_filter {    /* Filter block */
    __u16 code;         /* Actual filter code */
    __u8  jt;           /* Jump true */
    __u8  jf;           /* Jump false */
    __u32 k;            /* Generic multiuse field */
};
 A filter block is in fact a single BPF instruction.
 Used to pass filter specifications to the kernel.
struct sock_filter code[] = {
    { 0x28, 0, 0, 0x0000000c },
    { 0x15, 0, 8, 0x000086dd },
    { 0x30, 0, 0, 0x00000014 },
    …
};
struct sock_fprog {                 /* Required for SO_ATTACH_FILTER. */
    unsigned short len;             /* Number of filter blocks */
    struct sock_filter *filter;     /* Filter blocks list */
};
 The setsockopt parameter used to attach a filter
to a socket.
struct sock_fprog bpf = {…};
setsockopt(fd,
           SOL_SOCKET,
           SO_ATTACH_FILTER,
           &bpf,
           sizeof(bpf));
 SO_ATTACH_FILTER
Attach a BPF filter to a socket.
Note: only a single filter can be attached at a given
time.
 SO_DETACH_FILTER
Remove the currently attached filter from the socket.
 SO_LOCK_FILTER
Lock a filter on a socket. This is useful for setting a
filter and then dropping privileges.
 Example: create a raw socket, apply a filter, lock it, drop
CAP_NET_RAW
 Choosing the correct domain and type for the socket is
critical for making the filter work properly.
 The filtered buffer will start in the wanted location in the net
stack.
 Selecting the socket domain
 AF_PACKET – filtering at L2 (e.g. Ethernet)
 AF_INET – IPv4 filtering
 AF_INET6 – IPv6 filtering
 Selecting the socket type
 SOCK_RAW – raw filtering, no headers are handled by the
kernel
 SOCK_STREAM/SOCK_DGRAM etc – headers are handled
by the kernel
 libpcap – Packet CAPture library
 Provides an easy to use api for packet filtering.
 Supports a high level filtering format which is
converted to BPF.
 pcap_compile – create a bpf filter
 pcap_setfilter – attach a filter
 Look at man 7 pcap-filter for a detailed description.
ether dst [mac address]
dst net [ip address]
dst portrange [port1]-[port2]
 tcpdump: a user-space packet sniffing program.
 Uses bpf for kernel level filtering (based on pcap)
 Dump the generated bpf filter
 -d bpf asm format
 -dd c format
 -ddd bpf raw format
$ sudo tcpdump -d "ip and udp"
(000) ldh  [12]
(001) jeq  #0x800   jt 2  jf 5
(002) ldb  [23]
(003) jeq  #0x11    jt 4  jf 5
(004) ret  #65535
(005) ret  #0
 JIT: just-in-time translation of BPF instructions.
 Note that the translation is performed when attaching
the filter via bpf_jit_compile(..).
 BPF instructions are mapped directly to architecture-
dependent instructions.
 BPF registers are mapped to machine physical registers
 Provides a performance gain
 A gain of about 50 ns per packet for simple filters
(E5540 @ 2.53GHz).
The more complex the filter, the larger the performance
gain from JIT.
 Supported on x86, x86-64, powerpc, arm and more.
 Each BPF filter is verified before attaching it to a
socket.
 This is critical, the filters come from userspace!
 The following rules are enforced
 The filter must not contain references or jumps that are
out of range.
 The filter must contain only valid BPF opcodes.
 The filter must end with RET opcode.
 All jumps are forward – loops are not allowed!
 The verification is implemented in the sk_chk_filter
function in net/core/filter.c (until kernel 3.16); it was
modified after seccomp was added.
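The jump and termination rules can be expressed in a few lines. The sketch below is a simplified illustration and not the kernel's sk_chk_filter: it skips the opcode validity check, and the names struct insn and filter_looks_valid are made up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

#define INSN_CLASS(code) ((code) & 0x07) /* low three bits select the class */
#define CLASS_JMP 0x05
#define CLASS_RET 0x06
#define OP_JA     0x05                   /* unconditional jump, target in k */

/* Checks that every jump lands inside the program and that the
 * program ends with a RET instruction. */
static bool filter_looks_valid(const struct insn *prog, size_t n)
{
    if (prog == NULL || n == 0)
        return false;

    for (size_t pc = 0; pc < n; pc++) {
        const struct insn *i = &prog[pc];

        if (INSN_CLASS(i->code) != CLASS_JMP)
            continue;
        if (i->code == OP_JA) {
            /* the target is pc + 1 + k */
            if (i->k >= n - pc - 1)
                return false;
        } else if (pc + 1 + i->jt >= n || pc + 1 + i->jf >= n) {
            /* the targets are pc + 1 + jt and pc + 1 + jf */
            return false;
        }
    }
    return INSN_CLASS(prog[n - 1].code) == CLASS_RET;
}
```

Jump offsets are unsigned, so every target lies after the jump itself. That is what rules out loops and guarantees that a filter terminates.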
 The Linux kernel also has a couple of BPF extensions
that are used along with the class of load instructions.
 The extensions are "overloading" the k argument with
a negative offset + a particular extension offset.
 The result of such a BPF extension is loaded into A.
Extension   Description
len         skb->len
proto       skb->protocol
type        skb->pkt_type
poff        Payload start offset
ifidx       skb->dev->ifindex
mark        skb->mark
queue       skb->queue_mapping
rxhash      skb->hash
rand        prandom_u32()
cpu         Executing CPU id
nla         Netlink attributes
hatype      skb->dev->type
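In C, the "overloaded k" is a load whose offset is SKF_AD_OFF plus one of the SKF_AD_* constants from <linux/filter.h>. A minimal sketch, with load_extension as a made-up helper name:

```c
#include <linux/filter.h>

/* Builds a word load from an ancillary offset, for example
 * SKF_AD_PROTOCOL for "ld proto" or SKF_AD_PKTTYPE for "ld type". */
static struct sock_filter load_extension(int ext)
{
    /* SKF_AD_OFF is negative; stored in the unsigned k field it wraps
     * to a value far beyond the end of any real packet. */
    struct sock_filter insn =
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, (__u32)(SKF_AD_OFF + ext));
    return insn;
}
```

Since these offsets can never be valid packet offsets, the kernel can tell an extension load apart from an ordinary packet load.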
 extended BPF (eBPF) began as an internal mechanism usable
only in kernel context; since kernel 3.18 it is also
available to userspace through the bpf() system call.
 eBPF adds a set of new features:
 Increased number of registers: 10 instead of 2.
 Register width increased to 64 bits
 Conditional jf/jt targets replaced with jt/fall-through
 bpf_call instruction and register passing convention for zero
overhead calls from/to other kernel functions.
 Originally designed to be a “restricted C” language that will
be architecture independent and JITed in kernel context.
 Eventually used mostly for kernel tracing.
 SECure COMPuting, or seccomp, is a security
mechanism available in the Linux kernel.
 It applies BPF filtering to syscalls
 filter the syscall number & its parameters
 It can be used to limit the available syscalls
 For example strict mode allows only read, write, _exit
and sigreturn
 Uses the same BPF filters, but the filtered buffer is
different: a struct seccomp_data describing the syscall
instead of a packet.
 Look at man 2 seccomp for more details.
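As an illustration, a seccomp filter is an ordinary classic BPF program that loads fields of struct seccomp_data. The sketch below makes uname fail with EPERM and allows everything else. The names deny_uname and install_filter are made up, and the architecture check that a real filter should perform first is left out to keep the example short.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* The filtered buffer is struct seccomp_data rather than a packet. */
static struct sock_filter deny_uname[] = {
    /* A = seccomp_data.nr (the syscall number) */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* if A == __NR_uname, fall through; otherwise skip one instruction */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_uname, 0, 1),
    /* make the syscall fail with EPERM */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
    /* let every other syscall through */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Installs the filter for the calling thread; returns 0 on success. */
static int install_filter(void)
{
    struct sock_fprog prog = {
        .len = sizeof(deny_uname) / sizeof(deny_uname[0]),
        .filter = deny_uname,
    };

    /* Required for unprivileged processes before installing a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) < 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

Once installed, a filter cannot be removed and is inherited by child processes, so it is usually set up early in a program.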
Salesforce Aged Complex Org Revitalization Process .pdf
SRINIVASARAO PUSULURI
 
Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]
saniaaftab72555
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025
mu394968
 
Download YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full ActivatedDownload YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full Activated
saniamalik72555
 
Mastering OOP: Understanding the Four Core Pillars
Mastering OOP: Understanding the Four Core PillarsMastering OOP: Understanding the Four Core Pillars
Mastering OOP: Understanding the Four Core Pillars
Marcel David
 
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage DashboardsAdobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
BradBedford3
 
Shift Left using Lean for Agile Software Development
Shift Left using Lean for Agile Software DevelopmentShift Left using Lean for Agile Software Development
Shift Left using Lean for Agile Software Development
SathyaShankar6
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
The Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdfThe Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdf
drewplanas10
 
Adobe Master Collection CC Crack Advance Version 2025
Adobe Master Collection CC Crack Advance Version 2025Adobe Master Collection CC Crack Advance Version 2025
Adobe Master Collection CC Crack Advance Version 2025
kashifyounis067
 
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Eric D. Schabell
 
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software DevelopmentSecure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Shubham Joshi
 
Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
Vibe Coding_ Develop a web application using AI.pdf
Vibe Coding_ Develop a web application using AI.pdfVibe Coding_ Develop a web application using AI.pdf
Vibe Coding_ Develop a web application using AI.pdf
Baiju Muthukadan
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 

Berkeley Packet Filters

  • 1. Linux kernel TLV meetup, 28.02.2016 kerneltlv.com Kfir Gollan
  • 2.
      • What is the Berkeley Packet Filter?
      • Writing BPF filters
      • Debugging BPF
      • Using BPF in user-space applications
      • Advanced features of BPF
  • 3.
      • BPF at its base is a way to perform fast packet filtering at the kernel level.
          • Filters are defined by user space
          • Filters are executed in the kernel
      • Invented by Steven McCanne and Van Jacobson in 1990. First published in December 1992.
      • Support for BPF in Linux was added by Jay Schulist for the 2.5 development kernel.
      • Many features were added later on. For example, a JIT for BPF was added in the 3.0 kernel (2011).
  • 4.
      • We want to be able to filter packets at the kernel level
          • Discard irrelevant packets in the kernel without copying them to user space.
          • The performance gains when using promiscuous mode are substantial.
      • We want to change filters dynamically without recompiling the kernel or using a custom kernel module.
      • We want the filters to be architecture independent.
  • 7.
      • BPF defines a set of operations that can be performed on the filtered packet. Each operation gets its own opcode.
      • BPF was designed to be protocol independent; as such, it treats packets as raw buffers. It is up to the filter writer to parse the needed packet headers.
      • BPF is based on three building blocks:
          • A: a 32-bit wide accumulator
          • X: a 32-bit wide index register
          • M[]: 16 x 32-bit wide misc registers, aka "scratch memory"
  • 8.
      • LOAD: copy a value into the accumulator or index register.
      • STORE: copy either the accumulator or index register to the scratch memory.
      • ALU: perform an arithmetic or logic operation on the accumulator register.
      • BRANCH: alter the flow of control.
      • RETURN: terminate the filter and indicate what portion of the packet to save.
      • MISC: various operations that do not match the other types.
  • 9.
      Instruction layout: opcode: 16 | jt: 8 | jf: 8 | k: 32
      • Each instruction is represented by 64 bits.
          • opcode: 16 bits, indicates the instruction type.
          • jt: 8 bits, offset of the next instruction for the true case ("if" block), Jump True.
          • jf: 8 bits, offset of the next instruction for the false case ("else" block), Jump False.
          • k: 32 bits, generic data, used for various purposes.
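The 64-bit layout on slide 9 can be sketched in C as follows. The names `bpf_insn_sketch`, `pack_insn` and `jeq_ip_example` are hypothetical and exist only for this illustration; the kernel's own definition is `struct sock_filter`, shown on slide 20. The bit order chosen inside the packed 64-bit value is an assumption made for the example, not a kernel format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of one classic BPF instruction: 16 + 8 + 8 + 32 bits. */
struct bpf_insn_sketch {
    uint16_t code; /* opcode: the instruction type */
    uint8_t  jt;   /* jump offset taken when the test is true */
    uint8_t  jf;   /* jump offset taken when the test is false */
    uint32_t k;    /* generic operand */
};

/* Pack the four fields into one 64-bit value, opcode in the top 16 bits.
 * The ordering inside the word is chosen for illustration only. */
static uint64_t pack_insn(struct bpf_insn_sketch in)
{
    return ((uint64_t)in.code << 48) |
           ((uint64_t)in.jt   << 40) |
           ((uint64_t)in.jf   << 32) |
           (uint64_t)in.k;
}

/* Build "jeq #0x800, jt 0, jf 1" (opcode 0x15), as seen in the
 * bpf_asm output on slide 16. */
static struct bpf_insn_sketch jeq_ip_example(void)
{
    struct bpf_insn_sketch in = { 0x15, 0, 1, 0x00000800 };

    /* On common ABIs the fields need no padding, so the struct is 8 bytes. */
    assert(sizeof in == 8);
    return in;
}
```

Because `jt` and `jf` are only 8 bits wide, a conditional jump can skip at most 255 instructions.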
  • 10.
      Instruction   Description
      ld            Load word into A
      ldh           Load half-word into A
      ldb           Load byte into A
      ldx           Load word into X
      ldxb          Load byte into X
      st            Store A into M[]
      stx           Store X into M[]
      jmp           Jump to label
      jeq           Jump on A == k
      jgt           Jump on A > k
      jge           Jump on A >= k
      jset          Jump on A & k
      add           A + <x>
      sub           A - <x>
      mul           A * <x>
      div           A / <x>
      and           A & <x>
      or            A | <x>
      xor           A ^ <x>
      lsh           A << <x>
      rsh           A >> <x>
      ret           Return
      tax           Copy A into X
      txa           Copy X into A
  • 11.
      Mode              Description
      #k                The literal value stored in k.
      #len              The length of the packet.
      M[k]              The word at offset k in the scratch memory store.
      [k]               The byte, half-word or word at byte offset k in the packet.
      [x + k]           The byte, half-word or word at byte offset x+k in the packet.
      L                 Jump to label L.
      #k, Lt, Lf        Jump to Lt if true, otherwise jump to Lf.
      x                 The index register.
      4 * ([k] & 0xf)   Four times the value of the low four bits of the byte at offset k in the packet.
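The last addressing mode on slide 11 is easiest to understand with IPv4 in mind: the low four bits of the first IP header byte hold the header length in 32-bit words, so multiplying by four gives the length in bytes. A minimal sketch, assuming an Ethernet frame whose IP header starts at offset 14; the function name `msh_mode` is hypothetical, and returning 0 for an out-of-range offset is a simplification made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the "4 * ([k] & 0xf)" addressing mode: take the byte at
 * offset k, keep its low four bits, and multiply by four.
 * Returns 0 when k lies outside the packet (simplification). */
static uint32_t msh_mode(const uint8_t *pkt, size_t len, size_t k)
{
    assert(pkt != NULL);
    if (k >= len)
        return 0;
    return 4u * (pkt[k] & 0x0fu);
}
```

For a typical IPv4 header the first byte is 0x45 (version 4, header length 5 words), so `msh_mode(pkt, len, 14)` yields 20, the offset of the transport header relative to the start of the IP header.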
  • 12.
      • MAC header
          Destination MAC (6 bytes) | Source MAC (6 bytes) | Type (2 bytes)
      • IP header
  • 13.
      • To catch all the IP packets over MAC we need to check the type field in the MAC header.

                ldh [12]
                jeq #ETHERTYPE_IP, L1, L2
            L1: ret #-1
            L2: ret #0

      • Load a half-word (2 bytes) from offset 12 of the packet (the type field) into the A register.
      • Check whether the A register is equal to #ETHERTYPE_IP
          • If true -> return #-1
          • If false -> return #0
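To trace the slide 13 filter by hand, a toy interpreter is enough. The sketch below is not the kernel's implementation: it understands only the three opcodes this filter uses, and the names `insn`, `run_filter` and `run_ip_filter` are hypothetical. The opcode values (0x28, 0x15, 0x06) are the ones that appear in the bpf_asm output on slide 16, and ETHERTYPE_IP is 0x0800.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One classic BPF instruction (same field order as on the slides). */
struct insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

/* The three opcodes this sketch understands. */
enum { OP_LDH_ABS = 0x28, OP_JEQ_K = 0x15, OP_RET_K = 0x06 };

/* Run prog over pkt and return the filter verdict (0 means drop). */
static uint32_t run_filter(const struct insn *prog, size_t n,
                           const uint8_t *pkt, size_t len)
{
    uint32_t A = 0; /* accumulator */

    assert(prog != NULL && pkt != NULL);
    for (size_t pc = 0; pc < n; pc++) {
        const struct insn *in = &prog[pc];

        switch (in->code) {
        case OP_LDH_ABS: /* A = big-endian half-word at offset k */
            if (len < 2 || in->k > len - 2)
                return 0; /* load past the end of the packet: drop */
            A = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            break;
        case OP_JEQ_K: /* offsets are relative to the next instruction */
            pc += (A == in->k) ? in->jt : in->jf;
            break;
        case OP_RET_K:
            return in->k;
        default:
            return 0; /* opcode not handled by this sketch: drop */
        }
    }
    return 0; /* fell off the end without a ret */
}

/* The slide 13 filter with labels resolved to relative offsets. */
static uint32_t run_ip_filter(const uint8_t *pkt, size_t len)
{
    static const struct insn prog[] = {
        { OP_LDH_ABS, 0, 0, 12 },         /* ldh [12]                  */
        { OP_JEQ_K,   0, 1, 0x0800 },     /* jeq #ETHERTYPE_IP, L1, L2 */
        { OP_RET_K,   0, 0, 0xffffffff }, /* L1: ret #-1               */
        { OP_RET_K,   0, 0, 0 },          /* L2: ret #0                */
    };
    return run_filter(prog, sizeof prog / sizeof prog[0], pkt, len);
}
```

The return value is the number of bytes of the packet to keep, so `#-1` (0xffffffff as an unsigned value) keeps the whole packet and `#0` drops it.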
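  • 14.

                ldh [12]                    ; A <= ether.type
                jeq #ETHERTYPE_IP, L1, L3   ; A == #ETHERTYPE_IP ?
            L1: ld [26]                     ; A = ip.src
                and #0xffffff00             ; A = A & 0xffffff00
                jeq #0x80037000, L3, L2     ; A == 128.3.112.0 ?
            L2: ret #-1
            L3: ret #0

      • Check if the packet type is IP
      • "Remove" the lower byte of the src IP
      • Check if the src IP matches 128.3.112.X. As written, a match jumps to L3 and the packet is dropped, while IP packets from any other source are accepted; swap L3 and L2 in the last jeq to accept only matching packets.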
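The arithmetic in the slide 14 filter can be checked with a plain C equivalent. This is a sketch that follows the jump targets exactly as the slide writes them (a source inside 128.3.112.0/24 is dropped); the function names are hypothetical, and dropping packets that are too short mirrors what happens when a BPF load reads past the end of the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read big-endian (network order) values from the packet buffer. */
static uint32_t be16(const uint8_t *p) { return ((uint32_t)p[0] << 8) | p[1]; }
static uint32_t be32(const uint8_t *p) { return (be16(p) << 16) | be16(p + 2); }

/* Plain-C rendering of the slide 14 filter. Returns 0xffffffff (ret #-1)
 * to keep the packet and 0 (ret #0) to drop it. */
static uint32_t slide14_verdict(const uint8_t *pkt, size_t len)
{
    uint32_t a;

    assert(pkt != NULL);
    if (len < 14 || be16(pkt + 12) != 0x0800) /* ldh [12]; jeq #ETHERTYPE_IP */
        return 0;                             /* not IP: L3 */
    if (len < 30)                             /* ld [26] needs bytes 26..29 */
        return 0;
    a = be32(pkt + 26);                       /* A = ip.src (14 + 12) */
    a &= 0xffffff00u;                         /* clear the host byte */
    if (a == 0x80037000u)                     /* 128.3.112.0 */
        return 0;                             /* match: L3, dropped */
    return 0xffffffffu;                       /* no match: L2, accepted */
}
```

The constant works out as 128 = 0x80, 3 = 0x03, 112 = 0x70, so 128.3.112.0 is 0x80037000 in network byte order.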
  • 16.
      • A utility used to create BPF binary code (a BPF "assembly" compiler).
      • Part of the mainline kernel: tools/net/bpf_asm.c
      • Supports two output formats
          • C-style output

                { 0x28, 0, 0, 0x0000000c },
                { 0x15, 0, 1, 0x00000800 },
                { 0x06, 0, 0, 0xffffffff },
                { 0x06, 0, 0, 0000000000 },

          • raw output

                4,40 0 0 12,21 0 1 2048,6 0 0 4294967295,6 0 0 0,
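Comparing the two outputs on slide 16 suggests how the raw format is laid out: the instruction count, then one "code jt jf k" group per instruction in decimal, each followed by a comma. The sketch below reproduces that layout from the C-style array. It is inferred from the slide's example, not taken from the bpf_asm sources, and the names `format_raw` and `demo_matches_slide` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

/* Write prog in the raw format into out. Returns the number of characters
 * written, or -1 if the buffer is too small. */
static int format_raw(const struct insn *prog, unsigned n, char *out, size_t cap)
{
    int used;

    assert(prog != NULL && out != NULL);
    used = snprintf(out, cap, "%u,", n);
    if (used < 0 || (size_t)used >= cap)
        return -1;
    for (unsigned i = 0; i < n; i++) {
        size_t room = cap - (size_t)used;
        int w = snprintf(out + used, room, "%u %u %u %lu,",
                         (unsigned)prog[i].code, (unsigned)prog[i].jt,
                         (unsigned)prog[i].jf, (unsigned long)prog[i].k);
        if (w < 0 || (size_t)w >= room)
            return -1;
        used += w;
    }
    return used;
}

/* Returns 1 when the slide's C-style array formats to the slide's raw line. */
static int demo_matches_slide(void)
{
    static const struct insn prog[] = {
        { 0x28, 0, 0, 0x0000000c },
        { 0x15, 0, 1, 0x00000800 },
        { 0x06, 0, 0, 0xffffffff },
        { 0x06, 0, 0, 0x00000000 },
    };
    char buf[128];

    if (format_raw(prog, 4, buf, sizeof buf) < 0)
        return 0;
    return strcmp(buf, "4,40 0 0 12,21 0 1 2048,6 0 0 4294967295,6 0 0 0,") == 0;
}
```

The raw form is the one bpf_dbg accepts as a filter definition, which is why it is convenient to be able to read it back against the C-style array.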
  • 17.
      • A utility used to debug BPF filters
      • Part of the mainline kernel: tools/net/bpf_dbg.c
      • Main features
          • pcap files as input for filters
          • bpf_asm raw output format for the BPF filter definition
          • single-stepping through filters
          • breakpoints
          • internal status (A, X, M, PC)
          • disassemble raw BPF to bpf_asm
  • 20.

            struct sock_filter {    /* Filter block */
                __u16 code;         /* Actual filter code */
                __u8  jt;           /* Jump true */
                __u8  jf;           /* Jump false */
                __u32 k;            /* Generic multiuse field */
            };

      • A filter block is in fact a single BPF instruction.
      • Used to pass a filter specification to the kernel.

            struct sock_filter code[] = {
                { 0x28, 0, 0, 0x0000000c },
                { 0x15, 0, 8, 0x000086dd },
                { 0x30, 0, 0, 0x00000014 },
                …
            };
  • 21.

            struct sock_fprog {             /* Required for SO_ATTACH_FILTER. */
                unsigned short len;         /* Number of filter blocks */
                struct sock_filter *filter; /* Filter blocks list */
            };

      • A parameter for setsockopt that allows attaching a filter to a socket.

            struct sock_fprog bpf = {…};
            setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, &bpf, sizeof(bpf));
  • 22.
      • SO_ATTACH_FILTER: attach a BPF filter to a socket. Note: only a single filter can be attached at a given time.
      • SO_DETACH_FILTER: remove the currently attached filter from the socket.
      • SO_LOCK_FILTER: lock a filter on a socket. This is useful for setting a filter and then dropping privileges.
          • Example: create a raw socket, apply a filter, lock it, drop CAP_NET_RAW
  • 23.
      • Choosing the correct flags for the socket is critical for making the filter work properly.
          • The filtered buffer will start at the wanted location in the net stack.
      • Selecting the socket domain
          • AF_PACKET: filtering at L2 (e.g. Ethernet)
          • AF_INET: IPv4 filtering
          • AF_INET6: IPv6 filtering
      • Selecting the socket type
          • SOCK_RAW: raw filtering, no headers are handled by the kernel
          • SOCK_STREAM/SOCK_DGRAM etc.: headers are handled by the kernel
  • 24.
      • libpcap: Packet CAPture library
      • Provides an easy-to-use API for packet filtering.
      • Supports a high-level filtering format which is converted to BPF.
          • pcap_compile: create a BPF filter
          • pcap_setfilter: attach a filter
      • Look at man 7 pcap-filter for a detailed description.

            ether dst [mac address]
            dst net [ip address]
            dst portrange [port1]-[port2]
  • 25.
      • tcpdump: a user-space packet sniffing program.
      • Uses BPF for kernel-level filtering (based on pcap)
      • Dump the generated BPF filter
          • -d: bpf asm format
          • -dd: C format
          • -ddd: bpf raw format

            $ sudo tcpdump -d "ip and udp"
            (000) ldh [12]
            (001) jeq #0x800 jt 2 jf 5
            (002) ldb [23]
            (003) jeq #0x11 jt 4 jf 5
            (004) ret #65535
            (005) ret #0
  • 27.
      • A just-in-time BPF instruction translation.
          • Note that the translation is performed when attaching the filter, via bpf_jit_compile(..).
      • BPF instructions are mapped directly to architecture-dependent instructions.
      • BPF registers are mapped to physical machine registers.
      • Provides a performance gain
          • About 50ns per packet for simple filters (E5540 @ 2.53GHz). The more complex the filter, the larger the gain from JIT.
      • Supported on x86, x86-64, powerpc, arm and more.
  • 28.
      • Each BPF filter is verified before attaching it to a socket.
          • This is critical, the filters come from userspace!
      • The following rules are enforced
          • The filter must not contain references or jumps that are out of range.
          • The filter must contain only valid BPF opcodes.
          • The filter must end with a RET opcode.
          • All jumps are forward: loops are not allowed!
      • The verification is implemented in the sk_chk_filter function in net/core/filter.c (until kernel 3.16), modified after adding seccomp.
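The rules on slide 28 translate into a single pass over the program. The sketch below is a simplified illustration, not the kernel's sk_chk_filter: it knows only four opcodes, and the names `insn` and `check_filter_sketch` are hypothetical. Note that jump offsets are unsigned and relative to the next instruction, so every jump is forward by construction and only the upper bound needs checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

/* The only opcodes this sketch treats as valid. */
enum { OP_LDH_ABS = 0x28, OP_JA = 0x05, OP_JEQ_K = 0x15, OP_RET_K = 0x06 };

/* Return 1 if prog passes the simplified checks, 0 otherwise. */
static int check_filter_sketch(const struct insn *prog, size_t n)
{
    assert(prog != NULL);
    if (n == 0)
        return 0;
    for (size_t pc = 0; pc < n; pc++) {
        const struct insn *in = &prog[pc];
        size_t max_skip = n - pc - 1; /* instructions after this one */

        switch (in->code) {
        case OP_LDH_ABS:
        case OP_RET_K:
            break;
        case OP_JA: /* target is pc + 1 + k and must exist */
            if (in->k >= max_skip)
                return 0;
            break;
        case OP_JEQ_K: /* both branch targets must exist */
            if (in->jt >= max_skip || in->jf >= max_skip)
                return 0;
            break;
        default:
            return 0; /* not a valid opcode for this sketch */
        }
    }
    /* The last instruction must be a return. */
    return prog[n - 1].code == OP_RET_K;
}
```

With forward-only jumps and a mandatory final return, a filter of n instructions executes at most n steps, which is what makes it safe to run user-supplied filters in the kernel.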
  • 29.
      • The Linux kernel also has a couple of BPF extensions that are used along with the class of load instructions.
      • The extensions "overload" the k argument with a negative offset + a particular extension offset.
      • The result of such a BPF extension is loaded into A.

      Instruction   Description
      len           skb->len
      proto         skb->protocol
      type          skb->pkt_type
      poff          Payload start offset
      ifidx         skb->dev->ifindex
      mark          skb->mark
      queue         skb->queue_mapping
      rxhash        skb->hash
      rand          prandom_u32()
      cpu           Executing cpu id
      nla           Netlink attributes
      hatype        skb->dev->type
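The "negative offset + extension offset" encoding on slide 29 can be made concrete with the constants from `<linux/filter.h>`. This sketch assumes those headers are installed and relies on their `SKF_AD_OFF`, `SKF_AD_*` and `BPF_STMT` definitions; the helper name `load_extension` is hypothetical, and the choice of a word-sized load is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/filter.h>

/* Build a load instruction whose k selects an extension instead of a
 * packet offset: k = SKF_AD_OFF (a negative base) + the extension offset. */
static struct sock_filter load_extension(uint32_t ext)
{
    struct sock_filter insn =
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, (uint32_t)SKF_AD_OFF + ext);

    /* The extension offsets defined by the header stay below SKF_AD_MAX. */
    assert(ext < SKF_AD_MAX);
    return insn;
}
```

For example, `load_extension(SKF_AD_PROTOCOL)` corresponds to the `proto` row of the table (skb->protocol) and `load_extension(SKF_AD_PKTTYPE)` to the `type` row. Seen as an unsigned 32-bit value the negative base lands near the top of the range, far beyond any real packet offset, which is how the kernel can tell the two apart.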
  • 30.
      • Extended BPF (eBPF) is an internal mechanism that can be used only in kernel context (not from userspace!)
      • eBPF adds a set of new features:
          • Increased number of registers: 10 instead of 2.
          • Register width increased to 64 bit
          • Conditional jf/jt targets replaced with jt/fall-through
          • bpf_call instruction and register passing convention for zero-overhead calls from/to other kernel functions.
      • Originally designed to be a "restricted C" language that will be architecture independent and JITed in kernel context.
      • Eventually used mostly for kernel tracing.
  • 32.
      • SECure COMPuting, or seccomp, is a security mechanism available in the Linux kernel.
      • It applies BPF filtering to syscalls
          • Filters the syscall number and its parameters
      • It can be used to limit the available syscalls
          • For example, strict mode allows only read, write, _exit and sigreturn
      • Uses BPF filters, but the filtered buffers are different.
      • Look at man 2 seccomp for more details.

Editor's Notes

  • #4: I used the following LWN articles for this slide:
      BPF: the universal in-kernel virtual machine: https://lwn.net/Articles/599755/
      A JIT for packet filters: http://lwn.net/Articles/437981/
  • #6: Image taken from: http://www.tiger1997.jp/report/activity/securityreport_20131111.html
  • #8: Information taken from: https://www.kernel.org/doc/Documentation/networking/filter.txt
  • #9: Taken from: The BSD Packet Filter: A New Architecture for User-level Packet Capture http://www.tcpdump.org/papers/bpf-usenix93.pdf
  • #11: Note: this is only a subset of the BPF instruction set. Taken from https://www.kernel.org/doc/Documentation/networking/filter.txt
  • #21: https://www.kernel.org/doc/Documentation/networking/filter.txt
  • #22: https://www.kernel.org/doc/Documentation/networking/filter.txt
  • #26: Taken from https://blog.cloudflare.com/bpf-the-forgotten-bytecode/
  • #28: JIT benchmark can be found here: https://lwn.net/Articles/437986/
  • #29: http://lxr.free-electrons.com/source/net/core/filter.c?v=3.16#L1230
  • #31: http://www.brendangregg.com/blog/2015-05-15/ebpf-one-small-step.html