RADCOM - Efficient On-Demand Network Tapping and Smart Probing For 5G
1. Introduction
2. Virtual tapping options and performance
   2a. vSwitch mirroring
       Mirroring a VF on the same host vs. an external host
   2b. vSwitch acceleration with hardware offloading
   2c. SR-IOV mirroring on the NIC
   2d. TAP on Top of Rack (TOR) switch
   2e. Intra-VM tapping
3. Network tapping challenges in an OpenStack environment
4. Integrating smart, on-demand tapping and probing into the NFVI
5. Conclusion
1. Introduction
Before Network Function Virtualization (NFV), tapping a network for monitoring or security
purposes was relatively simple: the operator accessed the data through a physical tap connected to a
physical link, and physical links always gave the operator complete network visibility. However, the
virtualization used in a 5G-ready cloud creates new network blind spots and makes gaining full
network visibility more challenging:
• In a virtual network, a substantial percentage of the traffic never hits a physical link. Multiple
virtual network functions (VNFs) are installed on one server, so most of the virtual machine (VM)
to VM communication is hidden from physical taps. This traffic, referred to as east-west traffic,
can make up a substantial part of the virtual traffic and so creates significant blind spots for
the operator. This lack of network visibility will only increase as 5G continues to roll out and more
and more functions are virtualized.
• Making copies of virtual traffic for 5G monitoring and security purposes consumes compute
resources, so virtual tapping needs to fit in with the operators’ goal of maintaining network
efficiency and streamlining network performance. Duplicating and forwarding packet data for
continuous monitoring of all the traffic, all the time, is neither practical nor efficient.
• Large-scale operators are deploying distributed networks with multi-tenant cloud environments.
To streamline operations, ensure end-to-end service quality and remain scalable, these
geographically separate environments (e.g., regional or edge deployments) will need to be
managed centrally.
While the importance of security and monitoring has not changed, the ability to capture and
aggregate the traffic has. Operators therefore need to rethink their approaches to end-to-end
visibility in an increasingly complex virtualized world. Ultimately, the network tapping and
monitoring tactics used in a physical environment will not work in a virtual one. With more and more
operators transitioning to a virtualized network, it is important that operators deploy, from day one,
an efficient tapping solution that is tightly integrated with their probing solution, so that they gain
full network visibility and can adjust it on demand. A fully integrated solution will also enable
operators to scale, manage and optimize their cloud networks much more effectively. With this
critical foundation in place, operators can assure their migration to 5G more efficiently and
implement a closed-loop approach to managing the customer experience and troubleshooting their
services.
2. Virtual tapping options and performance
Option | Tapping method | Overview
a | vSwitch Port Mirroring | vSwitch port mirroring sends a copy of all packets seen on one switch port (or an entire VLAN) to another switch port (e.g., for assurance purposes). This mirroring copies the traffic of one or more VMs to a single port and provides high-performance packet acquisition from a vNIC with a minimal number of CPU cycles. In promiscuous mode, a virtual interface connected to a vSwitch port group can capture traffic from any other virtual interface connected to the vSwitch.
b | vSwitch Acceleration with Hardware Offloading | This method provides enhanced performance for VNFs that need maximum throughput and zero packet loss (such as virtual probes) by utilizing a SmartNIC for vSwitch acceleration. Like SR-IOV (described below) in concept and performance, this method is based on standard mechanisms, does not require changes to the VNF and provides the ability to migrate VNFs quickly and easily. The main challenge is that per-packet processing in software can be inefficient.
d | TAP on Top of Rack (TOR) Switch | A TOR switch handles Layer 2 and Layer 3 frame and packet forwarding, data center bridging and the transport of Fibre Channel over Ethernet for the racks of servers connected to it. Traffic mirroring is performed by programming the TOR switch to mirror specific traffic to a target destination, such as a specific IP address or VIP, associated with a VF (Virtual Function). Specific VLANs can also be used to mirror only part of the traffic.
2a. vSwitch mirroring
Port mirroring (also known as Switched Port Analyzer, or SPAN) is the most common method used to
extract data from the VM and send it to service assurance probes. It is a software feature built into
the vSwitch that creates a copy (mirror) of selected packets passing through the VM and sends them
to a designated mirror port, where the packets can be analyzed.
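As one concrete illustration, assume the vSwitch is Open vSwitch (the text above does not name a specific vSwitch). There, a mirror is a record in the bridge's Mirror table that selects source traffic and names an output port. The sketch below only builds the ovs-vsctl argument list and does not run it; the bridge, port and mirror names are placeholders, not values from this paper.

```python
from typing import List


def build_ovs_mirror_cmd(bridge: str, src_port: str, out_port: str,
                         mirror_name: str = "tap0") -> List[str]:
    """Build (but do not run) an ovs-vsctl call that mirrors src_port to out_port."""
    return [
        "ovs-vsctl",
        # Attach the new Mirror record (referenced as @m) to the bridge.
        "--", "set", "Bridge", bridge, "mirrors=@m",
        # Look up the monitored port and the mirror (output) port.
        "--", "--id=@src", "get", "Port", src_port,
        "--", "--id=@out", "get", "Port", out_port,
        # Select packets both received on and sent to the monitored port.
        "--", "--id=@m", "create", "Mirror", f"name={mirror_name}",
        "select-src-port=@src", "select-dst-port=@src", "output-port=@out",
    ]


if __name__ == "__main__":
    # Placeholder names: bridge "br-int", VNF port "vnet0", probe port "vprobe0".
    print(" ".join(build_ovs_mirror_cmd("br-int", "vnet0", "vprobe0")))
```

Keeping the command as a list of arguments, rather than a shell string, makes it straightforward for an orchestrator to log, review or pass to a process launcher without quoting issues.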
[Figure 1 - Port mirroring with filtering and load balancing on the monitored VM host. Diagram labels: Server; VNFs; VNF traffic; vSwitch; vFilter; filtered and load balanced traffic; Network Forensics & Analytics; Service Assurance; Security.]
Another tapping option sends the mirrored traffic to an external filtering and load-balancing
component, which then forwards it to the service assurance probes for processing.
[Figure 2 - Port mirroring with filtering and load balancing outside the monitored VM host. Diagram labels: two Servers, each with a vSwitch; VNFs; vFilter; filtered and load balanced traffic; Network Forensics & Analytics; Service Assurance; Security.]
The amount of traffic going through the vSwitch that needs to be processed governs the specific
option to be used, as outlined below.
2b. vSwitch acceleration with hardware offloading
Using a SmartNIC provides an alternative for data acquisition in virtual environments and shows that
it is no longer necessary to compromise on either performance or agility; it is possible to achieve
both. The solution also keeps the number of CPU cores required to a bare minimum. This reduces
CAPEX and OPEX server costs while also providing the opportunity to consolidate virtual functions
on as few servers as possible, further reducing OPEX.
Pros
• Mirroring, using the SmartNIC, to bypass the vSwitch, is still controlled by the vSwitch but
requires much less packet processing resources from the vSwitch as opposed to the resources
necessary when the SmartNIC is not available/utilized
• The vSwitch must interact with the SmartNIC and utilize specific APIs to start/stop traffic
mirroring
Cons
• There is a need to keep tracking the tapped traffic (network, port, trunk, VLANs) that needs to
be tapped and mirrored to a specific VF-ID. This must always be visible to and orchestrated by
the NFVO. The NFVO is responsible for matching the tapped/mirrored traffic and the mirroring VF
• For automation, solutions such as Tap as a Service (TaaS) must be used to keep track of the
mirrored-pairs
2c. SR-IOV mirroring on the NIC
SR-IOV enables network traffic to bypass the software switch layer of the hypervisor virtualization
stack. Because the VF is assigned to a child partition, the network traffic flows directly between the
VF and the child partition. As a result, the overhead of the software emulation layer is reduced, and
network performance is nearly the same as in non-virtualized environments.
Modern SmartNIC/Intelligent-NIC adaptors are becoming a commodity. They are placed on NFVI COTS
servers, and especially on the 5G MEC at the network edge and at strategic endpoints, and they
support offloading different tasks from virtual functions, including rule-based packet processing,
packet filtering and mirroring. The latest intelligent NICs include a built-in switch that controls
the traffic flow between endpoints. Traffic mirroring can be programmed into the SmartNIC through
the latest SR-IOV hypervisor sysfs (see footnote 1) management interface, with enhancements such as
VLAN mirror, ingress mirror, egress mirror and more. These options are available with SmartNIC
adaptors from the major vendors, such as Mellanox (ConnectX-5 and higher), Intel (Fortville XXV710,
etc.) and others.
The following drawing illustrates SR-IOV VLANs mirroring, where specific incoming and outgoing
traffic on different VLANs between two virtual functions is mirrored to a third VF by programming
the SmartNIC onboard switch to mirror the traffic associated with these VLANs to a specific VF-ID.
The specific ID associated with a particular VF can dynamically change along the VF lifecycle. As
a result, the NFV entity responsible for traffic mapping on the NFVI (usually the NFVO), must keep
track of which traffic goes where and reprogram the switch to mirror the traffic to the newly created
VF with its new ID. The latest update to the Tap as a Service (TaaS) OpenStack project supports
the automation of such mirroring so that traffic tapping along the VF lifecycle is continuous and
manageable by the NFVO.
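The bookkeeping described above, remembering which VLANs are mirrored to which probe and reprogramming the switch when the probe's VF-ID changes, can be sketched as follows. This is an illustration only and not the TaaS or NFVO implementation: MirrorTracker, the probe names and the program_mirror callback are all hypothetical, and the callback stands in for whatever mechanism actually programs the SmartNIC onboard switch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List

# Hypothetical callback: (mirrored VLANs, target VF-ID) -> None.
# A real deployment would program the SmartNIC onboard switch here.
ProgramMirror = Callable[[FrozenSet[int], int], None]


@dataclass
class MirrorTracker:
    """Keeps each VLAN mirror pointed at the current VF-ID of its probe."""
    program_mirror: ProgramMirror
    _vf_id: Dict[str, int] = field(default_factory=dict)
    _vlans: Dict[str, FrozenSet[int]] = field(default_factory=dict)

    def add_mirror(self, probe: str, vlans: List[int], vf_id: int) -> None:
        """Register a mirror and program it for the probe's current VF-ID."""
        self._vlans[probe] = frozenset(vlans)
        self._vf_id[probe] = vf_id
        self.program_mirror(self._vlans[probe], vf_id)

    def on_vf_id_changed(self, probe: str, new_vf_id: int) -> None:
        """Re-point the mirror when the probe's VF is recreated with a new ID."""
        if probe not in self._vlans:
            raise KeyError(f"no mirror registered for probe {probe!r}")
        if self._vf_id[probe] == new_vf_id:
            return  # nothing changed, so do not reprogram the switch
        self._vf_id[probe] = new_vf_id
        self.program_mirror(self._vlans[probe], new_vf_id)


if __name__ == "__main__":
    tracker = MirrorTracker(lambda vlans, vf: print(sorted(vlans), "-> VF", vf))
    tracker.add_mirror("vProbe", [55, 56], vf_id=2)
    tracker.on_vf_id_changed("vProbe", 5)  # the VF was recreated with a new ID
```

Skipping the reprogramming call when the ID has not changed keeps repeated lifecycle notifications from causing unnecessary switch updates.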
[Figure 3 - SR-IOV VLAN mirroring. Diagram labels: CORE; SmartNIC onboard switch; PF; VLANs 55, 56, 57, 58; mirrored VLANs; VF ‘A’, VF ‘B’, VF ‘C’; vFilter; NIC; vProbe; HOST-A; HOST-B; incoming and outgoing VLAN traffic on different VLANs; NIC mirrored traffic.]
1 A pseudo file system provided by the Linux kernel that exports information about various kernel subsystems,
hardware devices, and associated device drivers from the kernel’s device model to user space through virtual files.
Pros
• Makes a single PCI hardware device appear as multiple virtual PCI devices for VMs, enabling
direct communication with the hardware NIC
• Packets bypass the Host Hypervisor and vSwitch to deliver near wireline performance. This is
highly suited to high volume user plane traffic capture
• Low latency: the hypervisor software process is bypassed since the VM is directly attached to a
hardware component
• Scalability of the host is improved: by directly attaching VMs to VFs on the PCI, the CPU is
bypassed, enhancing the CPU available to VMs
• Trunk mirroring, VLANs mirroring, ingress mirroring and egress mirroring are now feasible and
controlled via the HOST NIC driver, given the right access permissions
• TaaS can be used to track and automate the mirroring process throughout the VF lifecycle
Cons
• SR-IOV ties PCI VFs on the physical NIC to VMs and VFs, which can be a departure from the
decoupling of hardware from software
• Features such as firewall filtering are still not available when using SR-IOV with OpenStack
• If TaaS and an NFVO are not available, it would be difficult to automate, control and follow the
VF-ID changes throughout the VF lifecycle
2d. TAP on Top of Rack (TOR) switch
Placing physical taps on the TOR switch to send the information to non-cloud-native, bare-metal,
non-NFV physical probing systems deployed outside of the cloud environment is not the right
approach. This method is not automatically scalable via the cloud orchestration and will require
probe vendors to deliver more physical boxes when traffic fluctuates. The operator will be required to
deploy tapping on all the outputs of the TOR switch and to use a costly deployment of network packet
brokers at every TOR switch to collect the data coming from the VNFs and distribute it to the physical
probing system. With 80% of data center traffic being east-west, tapping on a TOR switch leaves the
operator with blind spots in the network and misses the inter-VM and intra-VM traffic that never
reaches the TOR switch.
[Figure: End-of-Row architecture compared with Top-of-Rack architecture]
Pros
• For a legacy, non-NFV tapping solution this method can use a legacy tapping solution on TOR
• This provides one tapping location for all the traffic that traverses the TOR switch but does not
include inter VNF intra Host traffic
Cons
• In an NFV environment, inter VNF east-west traffic is not visible, and so does not appear on a
TOR switch
• If traffic is mirrored to the TOR switch for tapping, traffic trombone will waste vital network
resource
2e. Intra-VM tapping
This option makes the monitored VM and VNF part of the method for copying packets moving
through the virtual network interface controller (vNIC) from/to the virtualized components (VNFCs)
of the network element (NE). A tap agent is deployed into the VM on each monitored NE. The agent
then captures the network data moving through the VM. The advantages of this method are:
• The tapping agent is collocated with the monitored VNF (on the same VM)
• The agent can be combined with lightweight virtual network packet broker features for
filtering, sampling, load-balancing, etc.
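The packet broker features listed above can be sketched as a small pipeline: filter first, then sample, then pick a probe by hashing the flow so that both directions of a conversation reach the same probe. This is a minimal illustration under assumed names (Packet, tap_pipeline and all parameters are hypothetical), not the agent's actual design. Simple 1-in-N packet sampling is used here only because it is the easiest form to show.

```python
import zlib
from typing import Callable, Iterable, Iterator, NamedTuple, Tuple


class Packet(NamedTuple):
    src: str
    dst: str
    proto: str
    payload: bytes = b""


def tap_pipeline(packets: Iterable[Packet],
                 keep: Callable[[Packet], bool],
                 sample_every: int,
                 n_probes: int) -> Iterator[Tuple[int, Packet]]:
    """Filter, keep 1 packet in every `sample_every`, then load-balance by flow."""
    kept = 0
    for pkt in packets:
        if not keep(pkt):
            continue  # filtering: drop traffic nobody asked for
        kept += 1
        if (kept - 1) % sample_every != 0:
            continue  # sampling: forward only every N-th matching packet
        # Sort the endpoints so both directions of a flow hash identically.
        flow = "|".join(sorted((pkt.src, pkt.dst)) + [pkt.proto]).encode()
        yield zlib.crc32(flow) % n_probes, pkt


if __name__ == "__main__":
    traffic = [
        Packet("10.0.0.1", "10.0.0.2", "sctp"),
        Packet("10.0.0.2", "10.0.0.1", "sctp"),
        Packet("10.0.0.3", "10.0.0.4", "udp"),
    ]
    for probe, pkt in tap_pipeline(traffic, lambda p: p.proto == "sctp", 1, 4):
        print(f"probe {probe}: {pkt.src} -> {pkt.dst}")
```

Filtering before sampling means the sampling rate applies only to traffic of interest, which keeps the forwarded volume predictable.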
To tap a VNF that runs on a specific VM, there are two main options available, neither of which
requires integration with the tapped-VNF software. The first option taps the vNIC ring buffer using
libpcap, while the second option bypasses the Linux kernel using the DPDK-pdump technique, part of
the Data Plane Development Kit (DPDK), which runs as a DPDK secondary process and can enable
packet capture on DPDK ports.
To improve the packet processing performance, DPDK can be used on both the tapped VNF and the
virtual probe side. DPDK can accelerate the overall packet processing operations needed in the vTAP.
The following diagram outlines the two options, where the tapping VNF is orchestrated to run on
the same VM where the tapped VNF runs. The tapping-VNF deployment can be achieved using
instantiation during onboarding or by incorporating the tapping-VNF into the VNF base-image as
part of the standard deployment package.
[Figure: the two intra-VM tapping options, each showing VM user space and kernel space, with DPDK used to bypass the kernel]
Pros
• Full visibility of all east-west VNF traffic traversing vSwitch
• Easy to implement (by activating an existing vSwitch feature)
• VNF agnostic and provides one tapping solution for all VNFs (DPDK/non-DPDK)
• No integration with VNF vendor is required
Cons
• Need to allocate vCPUs for the tapping VNF in the tapped VNF VM
• The NFV orchestrator (NFVO) needs to make sure that the tapping VM is instantiated on the
same VM where the tapped-VNF runs
• Custom agent/API is required for a non-Linux VNF
3. Network tapping challenges in an OpenStack environment
Many telecom operators are using OpenStack as their NFV platform of choice. One of the
challenges in deploying OpenStack is the need to capture both north-south and east-west
(intra-VM) traffic, and there is currently no built-in solution for efficient virtual tapping in an
OpenStack environment. Current tapping options are:
• SR-IOV mirroring
• TOR tapping
The OpenStack community is working on Tap as a Service (TaaS). However, there are currently
challenges in using this methodology:
• Orchestrating the tapping is not standardized and, if not executed correctly, can lead to
traffic being forwarded to destinations that are not up and running, which will flood the
network
• There is only manual provisioning or non-standard SDN control, and it requires tight
integration with the NFVO
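The first challenge suggests a simple guard: check that the tap destination is up before creating the mirror, and hold back any request whose destination is not. The sketch below illustrates that ordering only. TapRequest, provision_taps and both callbacks are hypothetical names, not part of the TaaS API; in practice the callbacks would be backed by the orchestrator or the VIM.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TapRequest:
    source_port: str  # port whose traffic is mirrored
    dest_port: str    # port of the probe that receives the copy


def provision_taps(requests: Iterable[TapRequest],
                   is_port_up: Callable[[str], bool],
                   create_tap: Callable[[TapRequest], None]) -> List[TapRequest]:
    """Create taps only toward live destinations; return the requests held back."""
    skipped: List[TapRequest] = []
    for req in requests:
        if is_port_up(req.dest_port):
            create_tap(req)
        else:
            # Mirroring to a dead destination would flood the network.
            skipped.append(req)
    return skipped


if __name__ == "__main__":
    live_ports = {"probe-1"}
    pending = provision_taps(
        [TapRequest("vnf-a", "probe-1"), TapRequest("vnf-b", "probe-2")],
        is_port_up=live_ports.__contains__,
        create_tap=lambda r: print("tap", r.source_port, "->", r.dest_port),
    )
    print("held back:", [r.source_port for r in pending])
```

Returning the held-back requests lets the caller retry them once the destination probe has been instantiated.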
[Figure: TaaS deployment components. Diagram labels: Etcd; Neutron Server; TaaS Plugin; VPP - Etcd; VPP.]
4. Integrating smart, on-demand tapping and probing into the NFVI
Designing a cloud service that leverages the underlying virtualized infrastructure to ensure
scale-out elasticity and multi-tenancy is essential. With cloud-native solutions, an operator can
instantiate the virtual probing environment (virtual load balancers and probes) on the target
monitored VNF and NFVI environment (vEPC, vIMS, etc.), or instruct a specific network element to
extract the traffic and mirror it to the probing environment, and then scale out.
A more effective solution is to deploy on-demand visibility and assurance, with smart sampling and
filtering of the traffic and intelligent load balancing, that lets the operator zoom into certain
data sets (such as a specific service, subscriber group or protocol), troubleshoot and move on to the
next high-priority task. Bare-metal-based visibility and probes cannot serve probing on demand, nor
are they dynamic enough for scaling.
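One way to picture "zooming in" on demand is a time-boxed tap session that matches only the requested service, protocol or subscriber group and stops matching once it expires. The sketch below is purely illustrative: TapSession and every field name are assumptions and do not describe a RADCOM interface.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TapSession:
    """A time-boxed zoom-in on one slice of traffic (all names are illustrative)."""
    service: Optional[str] = None              # None means "any service"
    protocol: Optional[str] = None             # None means "any protocol"
    subscribers: FrozenSet[str] = frozenset()  # empty means "any subscriber"
    start: float = 0.0                         # session start, in seconds
    duration_s: float = 300.0                  # session length, in seconds

    def wants(self, now: float, service: str, protocol: str,
              subscriber: str) -> bool:
        """Return True if a packet with these attributes should be tapped now."""
        if not (self.start <= now < self.start + self.duration_s):
            return False  # the session has not started yet or has expired
        if self.service is not None and service != self.service:
            return False
        if self.protocol is not None and protocol != self.protocol:
            return False
        if self.subscribers and subscriber not in self.subscribers:
            return False
        return True


if __name__ == "__main__":
    session = TapSession(service="VoLTE", subscribers=frozenset({"sub-42"}),
                         start=100.0, duration_s=60.0)
    print(session.wants(120.0, "VoLTE", "sip", "sub-42"))  # inside the window
    print(session.wants(200.0, "VoLTE", "sip", "sub-42"))  # after expiry
```

Giving every session an expiry means a forgotten troubleshooting task cannot keep consuming tapping resources indefinitely.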
[Figure: RADCOM solution overview. The NFVO and the RADCOM VNF Manager (S-VNFM) enable a closed-loop automated solution and 5G on-demand assurance. RADCOM Network Insights delivers rich, actionable service and customer experience insights for SOC, CEM, customer care, marketing and engineering, with external feeds (alarms, CRM, events, legacy probes). RADCOM Network Visibility (vNPB) collects, processes and distributes traffic across multiple domains, with vFilters shown next to the VNFs in the VMs.]
• Probing is executed next to the monitoring point, so there is no need to send massive amounts
of data out of the cloud
• Utilizes techniques to offload the Open vSwitch (OVS) using SR-IOV, or using direct SR-IOV traffic
mirroring with SmartNIC, thus not overloading the NFVI when probing
• Gain dynamic scalability in/out as part of the orchestrated life cycle of the service or VNF being
monitored
• Deploy tapping agents inside the VM of the monitored-target VNF to tap on the virtual interface
and filter network traffic at the tapping point
By deploying an on-demand, smarter visibility layer, operators can troubleshoot their network more
efficiently, closer to the tapping point, thus moving some of this functionality out of the service
assurance solution. Troubleshooting network issues at the tapping point enables operators to perform
in-depth protocol analysis at the raw data level, drill down into any network element, protocol or
message type, and smartly capture filtered packets currently being transmitted through the network
or examine historical data. The service assurance solution can still perform the heavy lifting, but
troubleshooting at the tapping point is highly resource-efficient and makes for a highly optimized
network.
5. Conclusion
For telecom operators, eradicating network blind spots in the age of virtualization and 5G brings
new challenges. Choosing the best and most cloud-efficient tapping and probing strategy depends on
multiple parameters, such as the operator’s cloud environment, the service being monitored, the
Virtualized Infrastructure Manager (VIM) software implemented and the operator’s long-term goals.
Furthermore, as described, tapping performance varies between the different tapping options, and no
single option provides operators with the ultimate way to extract traffic on a 5G-ready cloud
network. However, deciding on the right network visibility solution and tapping methodology is
critical to ensure a smooth transition to a 5G-ready cloud and to enable operators to move to a
dynamic, closed-loop approach to customer experience management.
As a market leader in virtualized service assurance and network visibility, with multiple large-scale
production systems already implemented, RADCOM offers a fully virtualized, end-to-end solution from
virtual tapping to business insights as well as deep cloud expertise specifically for telecom operators.
RADCOM’s team of experts can help operators choose the most suitable tapping and probing strategy
to help achieve their business goals.