FreeBSD Netmap
Amir Razmjou
arazmjou2014@my.fit.edu
Intelligent Communication and Information Systems
Laboratory
Florida Tech
What’s netmap?
• Achieves line rate on commodity hardware and NICs
(14.88 Mpps on 10G, 1.488 Mpps on 1G).
• Simple API that still lets applications use hardware-specific
features (multi-queue NICs).
• No kernel coding required: netmap applications run in
userland.
• Full control over the host stack.
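The packet rates quoted on this slide follow from Ethernet framing arithmetic. A minimal sketch, assuming minimum-size 64-byte frames and 20 bytes of per-frame overhead on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap); the function name line_rate_pps is illustrative and not part of netmap:

```c
#include <assert.h>

/* Per-frame overhead on the wire, in bytes:
 * 7 (preamble) + 1 (start-of-frame delimiter) + 12 (inter-frame gap). */
#define WIRE_OVERHEAD_BYTES 20

/* Maximum packets per second on a link of link_bps bits per second
 * when every frame is frame_bytes long (frame size includes the FCS). */
double line_rate_pps(double link_bps, int frame_bytes)
{
    assert(link_bps > 0 && frame_bytes > 0);
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8.0);
}
```

Under these assumptions, line_rate_pps(10e9, 64) is about 14.88 million and line_rate_pps(1e9, 64) about 1.488 million packets per second.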
Conventional Operating Systems
Packet Journey in Kernel
/* Abridged; declarations and error handling omitted. */
void ether_input(struct ifnet *ifp, struct ether_header *eh,
    struct mbuf *m) {
    m->m_pkthdr.rcvif = ifp;                   /* record the receiving interface */
    eh = mtod(m, struct ether_header *);
    m->m_data += sizeof(struct ether_header);  /* strip the Ethernet header */
    m->m_len -= sizeof(struct ether_header);
    m->m_pkthdr.len = m->m_len;
    ether_demux(ifp, eh, m);
}
void ether_demux(ifp, eh, m) {
    ether_type = ntohs(eh->ether_type);
    switch (ether_type) {
    case ETHERTYPE_IP:
        schednetisr(NETISR_IP);                /* schedule the IP software interrupt */
        inq = &ipintrq;
        break;
    . . .
    }
    (void) IF_HANDOFF(inq, m, NULL);           /* enqueue the mbuf on the chosen input queue */
}
Packet Journey in Kernel
/* Software interrupt handler (ISR) for IP */
while (1) {
    s = splimp();                 /* block network interrupts while touching the queue */
    IF_DEQUEUE(&ipintrq, m);
    splx(s);
    if (m == 0)
        return;                   /* queue is empty */
    ip_input(m);
}
void ip_input(struct mbuf *m) {
    if (m->m_len < sizeof (struct ip) &&
        (m = m_pullup(m, sizeof (struct ip))) == 0) {
        ipstat.ips_toosmall++;    /* no contiguous IP header available */
        return;
    }
    ip = mtod(m, struct ip *);
    . . .
}
Network I/O costs
• per-byte cost: grows with the amount of traffic (copying,
checksum computation, encryption)
• per-packet cost: comes from the manipulation
of descriptors (allocation and destruction,
metadata management) and the execution of
system calls, interrupts, and device-driver
functions…
• The larger the packets, the smaller the share of per-packet cost for a given amount of traffic.
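The last bullet can be made concrete with a simple linear cost model. This is only a sketch: the model and the function names are illustrative, and any per-packet or per-byte figures passed in are hypothetical rather than measurements.

```c
#include <assert.h>

/* Cost (in ns) of handling one packet: a fixed per-packet part plus
 * a part proportional to the packet size. */
double packet_cost_ns(double per_packet_ns, double per_byte_ns, int bytes)
{
    assert(bytes > 0);
    return per_packet_ns + per_byte_ns * bytes;
}

/* Cost per byte moved: the fixed per-packet part is amortized
 * over the packet size, so it shrinks as packets grow. */
double cost_per_byte_ns(double per_packet_ns, double per_byte_ns, int bytes)
{
    return packet_cost_ns(per_packet_ns, per_byte_ns, bytes) / bytes;
}
```

With a hypothetical 1000 ns per packet and 0.25 ns per byte, a 64-byte packet costs about 15.9 ns per byte, while a 1500-byte packet costs about 0.92 ns per byte.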
Socket Bottlenecks
• User/kernel communication through system calls.
• Parsing raw Ethernet frames into different
data structures at different layers (mbufs,
sk_buffs).
• Memory copies (between layers)
• Memory allocation/deallocation
Design Decisions 30 Years Ago
• memory was a scarce resource;
• links operated at low (by today’s standards)
speeds;
• parallel processing was an advanced research
topic; and
• the ability to work at line rate in all possible
conditions was compromised by
hardware limitations in the NIC.
Packet Handling Costs
[Figure: packet handling costs. Labels: UDP, TCP, System Call, mbuf, Memory Copy, Hardware]
What’s netmap
• “Netmap is a novel framework that employs
some known techniques to reduce packet-
processing costs. Its key feature, apart from
performance, is that it integrates smoothly
with existing operating-system internals and
applications. This makes it possible to achieve
great speedups with just a limited amount of
new code, while building a robust and
maintainable system.”
Netmap goals
• Reduce system calls
• Remove allocation costs
• Eliminate data copies.
The data structure design is inspired by the way
network interfaces store their buffers.
Netmap Data Structure
Safe slots: [cur … cur + avail – 1] (indices wrap around the ring)
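The safe range can be walked with wrap-around arithmetic. The struct below is a simplified stand-in for a netmap ring, not the real struct netmap_ring; the field names cur and avail follow the API slide, and toy_ring, ring_next, and safe_slots are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a netmap ring (NOT the real struct netmap_ring). */
struct toy_ring {
    uint32_t num_slots; /* total slots in the ring */
    uint32_t cur;       /* first slot owned by the application */
    uint32_t avail;     /* number of slots owned by the application */
};

/* Index following i, wrapping at the end of the ring. */
static uint32_t ring_next(const struct toy_ring *r, uint32_t i)
{
    return (i + 1 == r->num_slots) ? 0 : i + 1;
}

/* Write the indices of the safe slots, cur .. cur + avail - 1
 * (modulo num_slots), into out[]; return how many were written
 * (never more than max). */
uint32_t safe_slots(const struct toy_ring *r, uint32_t *out, uint32_t max)
{
    uint32_t i = r->cur, n = 0;

    assert(r->cur < r->num_slots && r->avail <= r->num_slots);
    while (n < r->avail && n < max) {
        out[n++] = i;
        i = ring_next(r, i);
    }
    return n;
}
```

For an 8-slot ring with cur = 6 and avail = 4, the safe slots are 6, 7, 0, and 1; every other slot still belongs to the kernel and must not be touched.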
netmap API
• Access
– open("/dev/netmap", O_RDWR);
– ioctl(fd, NIOCREGIF, arg) binds the descriptor to an interface.
– mmap(…, fd, 0) maps buffers and rings.
• Transmit
– Fill up to avail buffers, starting from slot cur.
– ioctl(fd, NIOCTXSYNC)
• Receive
– ioctl(fd, NIOCRXSYNC)
– Process up to avail slots, starting from slot cur.
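The transmit steps above can be sketched without a real NIC. The code below is a toy simulation, not netmap itself: toy_tx_ring stands in for the memory that mmap() would expose, toy_txsync stands in for the NIOCTXSYNC ioctl, and it assumes (unrealistically) that every queued packet is sent the moment the sync is issued. All names and sizes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLOTS    4
#define TOY_BUF_SIZE 2048

/* Toy transmit ring; stand-in for the shared memory a real mmap() exposes. */
struct toy_tx_ring {
    uint32_t cur;     /* next slot the application may fill */
    uint32_t avail;   /* slots the application may still fill */
    uint32_t pending; /* slots filled but not yet handed to the "kernel" */
    uint16_t len[TOY_SLOTS];
    char     buf[TOY_SLOTS][TOY_BUF_SIZE];
};

void toy_ring_init(struct toy_tx_ring *r)
{
    memset(r, 0, sizeof(*r));
    r->avail = TOY_SLOTS;
}

/* Step 1: place one payload in slot cur and advance cur.
 * Returns 0 if queued, -1 if the ring is full or the payload is too big. */
int toy_tx_put(struct toy_tx_ring *r, const void *data, uint16_t len)
{
    assert(r->cur < TOY_SLOTS);
    if (r->avail == 0 || len > TOY_BUF_SIZE)
        return -1;
    memcpy(r->buf[r->cur], data, len);
    r->len[r->cur] = len;
    r->cur = (r->cur + 1) % TOY_SLOTS;
    r->avail--;
    r->pending++;
    return 0;
}

/* Step 2: simulated NIOCTXSYNC. Pretends all pending slots were sent
 * and gives them back to the application. Returns the number "sent". */
uint32_t toy_txsync(struct toy_tx_ring *r)
{
    uint32_t sent = r->pending;

    r->avail += sent;
    r->pending = 0;
    return sent;
}
```

The point of the design is visible even in the toy: many packets are queued with plain memory writes, and a single sync call then covers the whole batch, so the system-call cost is paid once per batch rather than once per packet.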
Example
Multi-Queue NIC
• Multiple Descriptor Queues
– Queues can be associated with specific processor
cores.
• Receive-Side Scaling (RSS)
– Determines which receive queue to use for each incoming
packet.
– The flow a packet belongs to is identified by a
hash value computed from header fields.
• Extended Message-Signaled Interrupts (MSI-X)
– Multiple interrupt vectors
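The RSS idea of hashing flow fields to pick a queue can be sketched as follows. This is an illustration only: real NICs typically use a keyed Toeplitz hash and an indirection table, whereas this sketch substitutes the simpler FNV-1a hash and a plain modulo. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fields that identify a flow (an IPv4 4-tuple). */
struct flow {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Mix one 32-bit value into an FNV-1a hash, byte by byte. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= 16777619u; /* FNV prime */
    }
    return h;
}

/* Pick a receive queue for a flow: hash the flow fields, then reduce
 * the hash modulo the number of queues. Stand-in for real RSS. */
uint32_t rss_queue(const struct flow *f, uint32_t num_queues)
{
    uint32_t h = 2166136261u; /* FNV offset basis */

    assert(num_queues > 0);
    h = fnv1a_mix(h, f->src_ip);
    h = fnv1a_mix(h, f->dst_ip);
    h = fnv1a_mix(h, ((uint32_t)f->src_port << 16) | f->dst_port);
    return h % num_queues;
}
```

Because the queue depends only on the flow fields, all packets of one flow land on the same queue, and therefore on the same core when queues are bound to cores.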
Host Stack Access
• In netmap mode the host stack is
disconnected from the NIC.
• But the operating system can still use the NIC.
• There are two additional rings for the incoming
and outgoing traffic of the host stack.
• Packets destined for the host stack are
queued in a netmap ring and then passed to
the host stack, and vice versa.
• This approach is well suited to
traffic filters, packet manipulators, etc.
Lightning-fast packet forwarding
...
src = &src_nifp->slot[i]; /* locate src and dst slots */
dst = &dst_nifp->slot[j];
/* swap the buffers */
tmp = dst->buf_index;
dst->buf_index = src->buf_index;
src->buf_index = tmp;
/* update length and flags */
dst->len = src->len;
/* tell kernel to update addresses in the NIC rings */
dst->flags = src->flags = BUF_CHANGED;
...
/ conditional part
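The snippet above forwards a packet by exchanging buffer indices instead of copying payload bytes. A self-contained sketch of the same swap, using a toy slot type rather than the real netmap slot; TOY_BUF_CHANGED is a stand-in for the slide's BUF_CHANGED flag, and forward_slot is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BUF_CHANGED 0x1 /* stand-in for the slide's BUF_CHANGED flag */

/* Toy slot with the three fields the forwarding snippet touches. */
struct toy_slot {
    uint32_t buf_index; /* which packet buffer this slot points at */
    uint16_t len;       /* length of the packet in that buffer */
    uint16_t flags;
};

/* Forward a packet from src to dst without copying it: exchange the
 * buffer indices, carry the length over, and flag both slots so the
 * buffer addresses in the NIC rings get refreshed. */
void forward_slot(struct toy_slot *src, struct toy_slot *dst)
{
    uint32_t tmp;

    assert(src != dst);
    tmp = dst->buf_index;
    dst->buf_index = src->buf_index;
    src->buf_index = tmp;
    dst->len = src->len;
    dst->flags = src->flags = TOY_BUF_CHANGED;
}
```

The work per packet is a handful of integer assignments, independent of packet size, which is why the swap is cheaper than a memory copy.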
Comparison


Editor's Notes

  • #2: Thank you for watching my videocast on FreeBSD netmap. For those of you who don’t know me, my name is Amir Razmjou and I’m a computer science graduate student at Florida Tech. Today I’m going to talk about netmap, a new packet-capturing framework. This will be a brief overview of the technology. Hopefully we will have a seminar and a hands-on workshop later, and I hope we will have more advanced videocasts on this topic in the future as well. If you have any questions, you can email me at the address shown here; I’ll be happy to answer them.
  • #3: These packet rates saturate the maximum bandwidth the network interface can handle, even with 64-byte frames. For a long time, the bottleneck for packet processing on commodity hardware was the network interface. In the early 90s, CPU, memory, and the rest of the hardware were fast enough to cope with the packet rates of the time. With the advent of modern 10G network interfaces, the rules of the game have changed a lot. Memory is not a scarce resource anymore. Network stacks were designed to serve many different types of applications and were very generalized; today we usually use network stacks for special purposes such as content servers, data planes, etc.
  • #4: In conventional operating systems, the packet finally handed to a userland application is the result of complicated interactions between the network interface and the operating system. In this picture, the hardware side representing the network interface is basically a ring data structure that holds the physical addresses of frames written to the machine’s main memory, usually through direct memory access to memory shared between the operating system and the network interface. As we can see, these buffers cannot be used directly by applications; the network stack does a lot of processing on each of them to make them usable by userland applications. For example, when the network interface captures a packet on the wire, it writes it directly into the machine’s physical memory and fires a hardware interrupt to notify the operating system that a new packet has been received at a given physical address. The operating system usually defers this work to a later time so as not to disturb other tasks, but in general it makes at least one copy of the buffer to fit it into a data structure suitable for both the userland application and the operating system. These structures are called mbufs in FreeBSD and sk_buffs in Linux. Although these structures are preallocated, they have a fixed payload size, so the operating system usually has to use a chain of them to represent a single packet.
  • #8: A system call happens when a userland application requests a service from the operating system. It is costly because the processor has to switch context: it must save its current state in a safe place and execute the requested service in the operating system’s context.
  • #10: Why is mbuf handling so time consuming? In the uniprocessor systems on the scene 30 years ago, memory allocators and reference counting were relatively inexpensive, especially in the presence of fixed-size objects. These days things are different: in the presence of multiple cores, allocations contend for global locks, and reference counts operate on shared memory variables; both operations may easily result in uncached DRAM accesses, which take up to 50-100 nanoseconds.
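The chained buffers described in note #4 can be sketched as a linked list of fixed-size buffers. The struct below is a toy loosely modeled on the idea of an mbuf, not the real struct mbuf; the names toy_mbuf, chain_len, and bufs_needed are hypothetical, and any buffer size passed in is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Toy buffer descriptor: part of a packet plus a link to the next
 * part. NOT the real struct mbuf. */
struct toy_mbuf {
    struct toy_mbuf *next; /* next buffer of the same packet, or NULL */
    size_t len;            /* bytes of packet data held in this buffer */
};

/* Total packet length: the sum of the lengths along the chain. */
size_t chain_len(const struct toy_mbuf *m)
{
    size_t total = 0;

    for (; m != NULL; m = m->next)
        total += m->len;
    return total;
}

/* Number of fixed-size buffers needed for a packet of pkt_len bytes
 * (ceiling division). */
size_t bufs_needed(size_t pkt_len, size_t buf_size)
{
    assert(buf_size > 0);
    return (pkt_len + buf_size - 1) / buf_size;
}
```

With a hypothetical 256-byte buffer, a 1500-byte packet needs six chained buffers, and each of them is an object the stack has to allocate, link, traverse, and free.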