
TRELLIS PLATFORM BRIEF

ONF's Trellis is the leading open-source multi-purpose leaf-spine fabric supporting distributed access networks, NFV and edge cloud applications. Built using white-box switches, merchant silicon and SDN principles, Trellis is scalable, flexible, cost-effective and production-ready. It is deployed in multiple geographies by a Tier-1 US network operator.

BENEFITS

• Production-ready fabric optimized for Service Provider networks

• Flexible, easily extensible and configurable using classic SDN methods

• Capex and Opex savings with white-box hardware and open source software

• Offers horizontal scaling and redundancy

• Future proof: P4 and Stratum integration to unlock advanced capabilities

KEY USE CASES

• Distributed Fabric for SP Access/Edge Networking

• Disaggregated BNG in SEBA Using P4

• Enterprise Datacenter Fabrics

Overview

Trellis is an open-source multi-purpose L2/L3 leaf-spine switching fabric. The development of Trellis over the last four years has been influenced by three core trends in the networking industry:

• First, Trellis is built using bare-metal switches with merchant-silicon ASICs. Instead of using OEM networking hardware, Trellis uses hardware directly from ODMs. The trend of using bare-metal (white-box) switches is unmistakable in the networking industry today, spurred by the massive bandwidth-density and growing sophistication of merchant silicon ASICs. Production-quality Trellis today is based on EdgeCore switches with Broadcom Trident2, Tomahawk and Qumran switch ASICs. The Trellis team continues to work towards including more ODMs and merchant silicon vendors.

• Second, Trellis is based on SDN principles, to provide simpler, more flexible and easily customizable networks. By externalizing the network's control, management functions and policy decisions in the ONOS SDN controller, Trellis provides network operators with a number of SDN benefits compared to traditional box-embedded network control. These include centralized configuration, automation, operation and troubleshooting.

• Third, Trellis is open-source. The networking industry has seen an explosion of open-source projects, and network operators have been eager to embrace open-source solutions. Trellis allows operators unparalleled ability to customize Trellis for their application, integrate it with the rest of their systems, add features and APIs themselves, and not be beholden to a traditional vendor's timelines and prices. An absence of commercial licenses lowers the bar for anyone to try out Trellis.

Together, all three attributes of Trellis considerably lower the Total Cost of Ownership (TCO) for operators who plan to run it in production.
© 2019 Open Networking Foundation

[Figure: SDN-based, bare-metal, open-source leaf-spine fabric. Trellis apps on an ONOS cluster (SDN controller) control Trellis-compliant bare-metal hardware: spine switches and leaf (ToR) switches connecting access nodes, compute nodes and upstream routers.]

Architecture
Classic-SDN. Trellis operates as a hybrid L2/L3 fabric. As a pure (or classic) SDN solution, Trellis does not use any of the traditional control protocols typically found in networking, a non-exhaustive list of which includes: STP, MSTP, RSTP, LACP, MLAG, PIM, IGMP, OSPF, IS-IS, Trill, RSVP, LDP and BGP. Instead, Trellis uses an SDN Controller (ONOS) decoupled from the data-plane hardware to directly program ASIC forwarding tables using OpenFlow and with OF-DPA, an open API from Broadcom running on the switches. In this design, a set of applications running on ONOS implement all the fabric functionality and features, such as Ethernet switching, IP routing, multicast, DHCP Relay, pseudowires and more.

Bridging and Routing. The main fabric-control application running on ONOS programs L2 forwarding (bridging) within a server-rack, and L3 forwarding (routing) across racks. The use of L3 down to the Top-of-Rack switches (ToRs) is a well-known concept in leaf-spine fabrics, mainly due to the better scalability of L3 over L2. In such cases, the ToRs (leaves) route traffic by hashing IP flows to different spines using ECMP forwarding: essentially every ToR is 2 hops away from every other ToR in a 2-stage Clos, and there are multiple such equal-cost paths. Trellis supports routing both IPv4 and IPv6 traffic using ECMP groups.

The need for supporting bridging and access/trunk VLANs comes out of the use of Trellis in NFV infrastructures, where VMs (VNFs) designed for legacy networks often expect to communicate with each other at L2, and also send and receive multiple VLAN tagged packets. When necessary, such VMs can be allocated on servers in the same bridging domain. In other cases, Trellis can also be configured to operate completely in L3 mode.

MPLS Segment Routing. The Trellis architecture internally uses Segment Routing (SR) concepts like globally significant MPLS labels that are assigned to each leaf and spine switch. This minimizes label state in the network compared to traditional MPLS networks, where locally significant labels have to be swapped at each node. In Trellis, the leaf switches push an MPLS label designating the destination ToR (leaf) onto the IPv4 or IPv6 traffic, before hashing the flows to the spines. In turn the spines forward the traffic solely on the basis of the MPLS labels.
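The forwarding decision described above (push the egress ToR's label, then ECMP-hash the flow to a spine) can be sketched in a few lines of Python. This is an illustrative model only: the label plan, switch names and hash function are placeholders, not the actual Trellis implementation, which programs this logic into switch ASICs.

```python
import hashlib

# Hypothetical label plan: each leaf gets a globally significant MPLS label.
# In real Trellis these assignments come from the fabric configuration.
LEAF_LABELS = {"leaf1": 101, "leaf2": 102, "leaf3": 103, "leaf4": 104}
SPINES = ["spine1", "spine2"]

def ecmp_spine(src_ip, dst_ip, proto, sport, dport, spines=SPINES):
    """Hash a flow's 5-tuple to pick one of the equal-cost spines,
    as an ingress leaf would when forwarding across the fabric."""
    key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return spines[digest % len(spines)]

def ingress_leaf_forward(pkt, dst_leaf):
    """Push the MPLS label of the destination ToR, then ECMP-hash to a spine.
    Spines later forward on the label alone, keeping their state tiny."""
    label = LEAF_LABELS[dst_leaf]
    spine = ecmp_spine(pkt["src_ip"], pkt["dst_ip"], pkt["proto"],
                       pkt["sport"], pkt["dport"])
    return {"mpls_label": label, "next_hop": spine, "inner": pkt}

pkt = {"src_ip": "10.0.1.5", "dst_ip": "10.0.3.7",
       "proto": 6, "sport": 43210, "dport": 443}
out = ingress_leaf_forward(pkt, dst_leaf="leaf3")
# The same flow always hashes to the same spine; the label names the egress ToR.
```

Because the hash is computed over the 5-tuple, packets of one flow stay on one path (no reordering), while different flows spread across all spines.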


This design concept, popular in IP/MPLS WAN networks, has significant advantages. Since the spines only maintain label state, it leads to significantly less programming burden and better scale. For example, in one use case, the leaf switches may each hold 100k+ IPv4/v6 routes, while the spine switches need to be programmed with only 10s of labels! As a result, completely different ASICs can be used for the leaf and spine switches: the leaves can have bigger routing tables and deeper buffers while sacrificing switching capacity (for example, Broadcom Qumrans), while the spines can have smaller tables with high switching capacity (for example, Broadcom Tomahawks).

External Connectivity with vRouter. Trellis assumes that the fabric connects to the public Internet (or other networks) via traditional routers. This external connectivity is handled by a feature known as the Trellis vRouter. In contrast to a software virtual router (VNF), the Trellis vRouter performs only control-plane functionality in software. It uses a routing protocol stack (like Quagga or FRR) to speak BGP or OSPF with the external router, while the data-plane functionality of the router is performed by the fabric itself. In other words, the entire leaf-spine fabric behaves as a single (distributed) router data-plane and is viewed as such by the external world. The Trellis vRouter is a big multi-terabit-per-second distributed router, where the leaf switches act like line-cards and the spine switches act like backplanes of a traditional big-box chassis router, with obvious cost and performance advantages over chassis-based routers and software router VNFs respectively.

Redundancy. Trellis supports redundancy at every level. A leaf-spine fabric is redundant by design in the spine layer, with the use of ECMP hashing and multiple spines. In addition, Trellis supports leaf-pairs, where end-hosts can be dual-homed to two ToRs in an active-active configuration. In the eventuality that a ToR dies, its pair continues to support traffic flow. In certain configurations, operators may choose to single-home connected equipment to only one leaf/ToR; this is typical in NFV infrastructure where access nodes (R-PHYs, OLTs, eNodeBs) have a single wired connection to the operator's facility. Trellis supports both single and dual homing, combinations of which can be used in the same infrastructure. Additionally, Trellis also supports dual-homing of the external connectivity: Trellis can appear to external routers as two BGP-speaking routers in well-known redundant configurations.

Many SDN solutions use single-instance controllers, which are single points of failure. Other solutions use two controllers in active-backup mode, which is redundant but may lack scale, as all the work is still being done by one instance at any time and scale can never exceed the capacity of one server. In contrast, Trellis is based on ONOS, an SDN controller that offers N-way redundancy and scale. In an ONOS cluster of 3 or 5 instances, all instances are active nodes doing work simultaneously, and failure handling is fully automated and completely handled by the ONOS platform.

SALIENT FEATURES

• Classic-SDN control with ONOS to directly program ASIC forwarding tables in bare-metal switches with merchant silicon

• L2 forwarding (bridging) within server-racks and L3 forwarding (routing) across racks

• MPLS Segment Routing for better scale and reduced programming

• Control-plane functionality with Trellis vRouter for external connectivity

• N-way redundancy and tier-1 telecom operator scale

Topologies. Trellis supports a number of different topological variants. In its simplest instantiation, one could use a single leaf or a leaf-pair to connect servers, external routers and other equipment like access nodes or physical appliances (PNFs). Such a deployment can also be scaled horizontally into a leaf-and-spine fabric (2-level folded Clos), by adding 2 or 4 spines and up to 10 leaves in single or paired configurations.

In a special configuration, a Trellis fabric can itself be distributed across geographical regions, with spine switches in a primary central location connected to other spines in multiple secondary (remote) locations using WDM links. Such 4-level topologies (leaf-spine-spine-leaf) can be used for backhaul in operator networks, where the secondary locations are deeper in the network and closer to the end-user. In these configurations, the spines in the secondary locations serve as aggregation devices that backhaul traffic from the access nodes to the primary location, which typically has the facilities for compute and storage for NFV applications.

Network Configuration & Troubleshooting. A final point worth noting about Trellis operation: all Trellis network configuration and troubleshooting happens entirely in ONOS. In contrast to traditional embedded networking, there is no need at all for network operators to go through the error-prone process of configuring individual leaf and spine switches. Similarly, the SDN controller is a single pane of glass for monitoring and troubleshooting network state. Trellis provides in-built tools for statistics gathering and connectivity diagnosis, which can further be integrated with the operator's backend systems.
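As a minimal sketch of this centralized style of configuration, a Trellis-like fabric can be described in a single JSON document and pushed to the controller's network-configuration REST endpoint, rather than configured switch by switch. The device IDs, SIDs and addresses below are illustrative placeholders; the exact schema should be checked against the Trellis documentation.

```python
import json

# Hypothetical fabric description in the style of ONOS/Trellis netcfg:
# one document for the whole fabric, held by the controller.
netcfg = {
    "devices": {
        "of:0000000000000201": {            # an edge (leaf/ToR) switch
            "segmentrouting": {
                "name": "leaf1",
                "ipv4NodeSid": 101,         # globally significant MPLS label
                "ipv4Loopback": "192.168.0.201",
                "routerMac": "00:00:02:00:02:01",
                "isEdgeRouter": True,
                "adjacencySids": [],
            }
        }
    },
    "ports": {
        "of:0000000000000201/1": {          # a server-facing leaf port
            "interfaces": [
                {"ips": ["10.0.1.254/24"], "vlan-untagged": 10}
            ]
        }
    },
}

payload = json.dumps(netcfg, indent=2)
# A single POST then configures the whole fabric through the controller, e.g.:
#   curl -X POST -H 'Content-Type: application/json' -d @netcfg.json \
#        http://<onos>:8181/onos/v1/network/configuration
```

Note the contrast with box-by-box operation: adding a leaf means adding one stanza to this document, not logging into a new switch CLI.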


System Components

Open Network Operating System (ONOS). Trellis uses Open Network Operating System (ONOS) as the SDN controller. ONOS is designed as a distributed system, composed of multiple instances operating in a cluster, with all instances actively operating on the network while being functionally identical. This unique capability of ONOS simultaneously affords high availability and horizontal scaling of the control plane. ONOS interacts with the network devices by means of pluggable southbound interfaces. In particular, Trellis leverages the OpenFlow 1.3 protocol for programming, and NETCONF for configuring certain features (like MacSec) in the fabric switches. Like other SDN controllers, ONOS provides several core services like topology discovery and end-point discovery (hosts, routers, etc. attached to the fabric). Unlike any other open-source SDN controller, ONOS provides these core services in a distributed way over the entire cluster, such that applications running in any instance of the controller have the same view and information.

ONOS Applications. Trellis uses a collection of applications that run on ONOS to provide the fabric features and services. The main application responsible for fabric operation handles connectivity features according to the Trellis architecture, while other apps like dhcprelay, aaa, multicast, etc. handle more specialized features. Importantly, Trellis uses the ONOS Flow-Objectives API, which allows applications to program switching devices in a pipeline-agnostic way. By using Flow-Objectives, applications can be written without worrying about low-level pipeline details of various switching chips. The API is implemented by specific device-drivers that are aware of the pipelines they serve, and can thus convert the application's API calls to device-specific rules. This way the application can be written once, and adapted to pipelines from different ASIC vendors.

[Figure: Trellis software stack. Trellis apps on an ONOS cluster program an OCP bare-metal switch that runs an Indigo OF agent, Broadcom OF-DPA and the Broadcom ASIC SDK on ONL, installed via ONIE, on top of the Broadcom ASIC.]

Leaf & Spine Switch Hardware. In a typical configuration, the leaf and spine hardware used in Trellis are Open Compute Project (OCP) certified switches from a selection of different ODM vendors. The port configurations and ASICs used in these switches are dependent on operator needs. For example, one such application uses EdgeCore 5912 switches with Broadcom StrataDNX ASICs (Qumran) as leaf switches with 48 x 10G ports and 6 x 100G ports, and EdgeCore 7712 switches with Broadcom StrataXGS ASICs (Tomahawk) as spine switches with 32 x 100G ports. However, other use cases and operators have used Tomahawks or Barefoot Tofinos as leaf switches, or even Trident2-based 1G or 10G switches.

OF-DPA & Indigo. In production, Trellis leverages software from Broadcom's OpenFlow Data Plane Abstraction (OF-DPA) project. OF-DPA implements a key adaptation role between OpenFlow (OF) 1.3 protocol constructs and Broadcom's proprietary ASIC SDKs. In particular, OF-DPA presents an abstraction of the switch ASIC pipeline and hides the differences between the actual ASIC pipelines (Broadcom StrataXGS or StrataDNX) underneath, so that an open-source OF protocol agent like Indigo can be layered on top of the API. With this layering, an external SDN controller (like ONOS) can access, program and leverage the full capabilities of today's modern ASICs, a fact that proves crucial in providing dataplane scale and performance. For trials, ONF provides a free downloadable binary for the community version of OF-DPA. In production, a commercial version of OF-DPA is available from Broadcom.
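The pipeline-agnostic idea behind Flow-Objectives can be illustrated with a small model, written here in Python for brevity (real ONOS applications and drivers are Java). The class and table names are invented for illustration and are not the actual ONOS API: the point is that the application states what to forward, and a pipeline-aware driver decides which chip tables implement it.

```python
from dataclasses import dataclass

@dataclass
class ForwardingObjective:
    """Pipeline-agnostic intent: send this prefix to this next-hop port."""
    ip_prefix: str
    next_hop_port: int

class MultiTableDriver:
    """Maps an objective onto a fixed multi-table pipeline
    (in the spirit of OF-DPA's routing and group tables)."""
    def apply(self, obj):
        return [("unicast_routing_table", obj.ip_prefix),
                ("group_table", obj.next_hop_port)]

class SingleTableDriver:
    """Maps the very same objective onto a simple single-table pipeline."""
    def apply(self, obj):
        return [("table_0", (obj.ip_prefix, obj.next_hop_port))]

# The application code is identical for both chips; only the driver differs.
objective = ForwardingObjective("10.0.3.0/24", next_hop_port=12)
multitable_rules = MultiTableDriver().apply(objective)
simple_rules = SingleTableDriver().apply(objective)
```

This separation is why the same Trellis application can target different ASIC vendors: swapping the driver, not the application, adapts the intent to each pipeline.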


ONL & ONIE. The Trellis switch software stack includes Open Network Linux (ONL) and the Open Network Install Environment (ONIE) from OCP. The switches are shipped with ONIE, a boot loader that enables the installation of the target OS as part of the provisioning process. ONL, a Linux distribution for bare-metal switches, is used as the base operating system. It ships with a number of additional drivers for bare-metal switch hardware elements (e.g. LEDs, SFPs) that are typically unavailable in normal Linux distributions for bare-metal servers (e.g. Ubuntu).

Stratum (optional). Trellis also supports switch software from the ONF Stratum project. Stratum is an open-source, silicon-independent switch operating system. While the integration of Stratum in Trellis is preliminary and not ready for production, Stratum is envisioned as a key software component of SDN solutions of the future. Stratum implements the latest SDN-centric northbound interfaces, including P4, P4Runtime, gNMI/OpenConfig and gNOI, thereby enabling interchangeability of forwarding devices and programmability of forwarding behaviors. We discuss Trellis leveraging Stratum and P4 in more detail in the Use Cases section.

Kubernetes & Kafka (optional). While ONOS instances can be deployed natively on bare-metal servers, there are advantages in deploying ONOS instances as containers and using a container management system like Kubernetes (K8s). In particular, K8s can monitor and automatically reboot lost controller instances (container-pods), which then rejoin the ONOS cluster seamlessly. Additionally, a next-gen version of ONOS known as µONOS (pronounced "micro-ONOS") is composed of several micro-services which can be optimally deployed and managed by K8s. Finally, the use of a messaging bus like Kafka can help integration with the operator's backend systems, where ONOS and Trellis applications publish statistics and events onto Kafka for post-processing by operator OSS/BSS systems.

Benefits

Optimized for Service Provider Applications. Service provider networks tend to use more complicated header and tunneling formats than the typical enterprise datacenter. For example, residential subscriber traffic in Fiber-to-the-Home (FTTH) networks is often double VLAN tagged (QinQ) with PPPoE encapsulation, whereas mobile subscriber traffic from eNodeBs is GTP tunnel encapsulated. Enterprise subscriber traffic could be VXLAN encapsulated or pure IP, whereas cable network traffic can be L2TP encapsulated. Trellis includes a number of features designed to handle these different types of traffic and encapsulations.

Furthermore, Trellis includes the capability for the same fabric to be geographically distributed, where subscriber traffic can be aggregated from multiple secondary locations and delivered for processing by virtual applications (VNFs) at the primary location: a backhaul service. And it supports multiple different types of ASICs for different applications: both typical datacenter-designed ASICs like Trident/Tomahawk and Tofino, but also ASICs with bigger tables, deeper buffers and many QoS capabilities like Qumran.

Trellis is a hardened and production-tested platform, delivering broadband to thousands of paying subscribers of a large tier-1 operator.

These features, topologies and some ASICs are not typically found in fabric offerings in the industry, which tend to be localized, pure L2 or L3 fabrics meant only for enterprise datacenters. While Trellis can also be used as a pure L3 Clos, these additional capabilities make it ideal in NFV infrastructures, for access and edge network backhaul, and for Service Providers that aim to modernize their infrastructure with edge-cloud capabilities. This domain is the central idea behind ONF's flagship umbrella project, Central Office Re-architected as a Datacenter (CORD). Indeed, Trellis is central to the CORD network infrastructure.

Offers All SDN Benefits. As Trellis is based on classic SDN, it comes with a number of benefits purely from this design choice.

• SDN simplifies the development of a number of features compared to traditional networking. For example, when the Trellis vRouter is integrated with overlay networks in cloud applications, it allows any VM or container to directly communicate with the outside world, without having its traffic hair-pinned through some other gateway VM, as is prevalent in many other overlay-only networking solutions. This is an example of Distributed Virtual Routing (DVR). Another example of a feature simplified by SDN is the handling of IPv4 multicast for IPTV service to residential subscribers. When users flip channels on their TV service, the home set-top box sends an IGMP join message upstream requesting the IP multicast stream corresponding to the channel selected. In response, the multicast app directly programs the fabric to replicate and deliver the stream to the subscribers that have requested the same channel. Such SDN-optimized handling of multicast is far simpler operationally than the successive IGMP/PIM filtering and multicast-tree formation performed by independent Ethernet switches and IP routers running embedded multicast protocols in traditional networks.

• SDN control provides flexibility in the implementation of features by optimizing for a deployment use case. For example, in one deployment some leaf switches could be programmed with 100k+ routes and others with only a few thousand routes. By contrast, if the fabric were using a distributed routing protocol like OSPF, all leaves would get all the routes. In general, SDN allows faster development of features (apps) and greater flexibility for operators to deploy only what they need and optimize the features the way they want.

• SDN allows the centralized configuration of all network functionality, and allows network monitoring and troubleshooting to be centralized as well. Both are significant benefits over traditional box-by-box networking, enabling faster deployments, simplified operations and empowering tier-1/tier-2 support when debugging production issues.

[Figure: Trellis redundancy. ECMP groups across Spine 1 and Spine 2; paired leaves (dual ToRs, Leaf 1 through Leaf 4) joined by pair-links; dual-homed servers using active-active Linux bonding; dual-homed external routers.]

CapEx Reduction. The use of white-box (bare-metal) switching hardware from ODMs significantly reduces CapEx costs for network operators when compared to products from OEM vendors. By some accounts, the cost savings can be as high as 60%. This is typically due to the OEM vendors (70% gross margin) amortizing the cost of developing embedded switch/router software into the price of their hardware.

Benefits of Open Source. Further reduction of CapEx costs can be achieved due to Trellis' use of completely open-source, production-ready controller software that is Apache licensed. But more importantly, open source allows the network operator unparalleled control over their network, the features it includes, various customizations and optimizations of those features, and how they integrate with the operator's backend systems. For example, in one deployment the network operator has implemented a full feature (multi-source multicast) themselves and contributed that work back to open source. In other instances, they have been able to add REST APIs and event notifications on Kafka for integration with their back-end systems. Such unfettered ability to control their own timelines, features and costs makes open-source Trellis a huge selling point for some network operators.

Horizontal Scaling & Redundancy. Trellis allows horizontal scaling of the fabric. Operators can start off with simply a single switch or two (for redundancy). As needs grow, they can add more spines and more leaves to their infrastructure seamlessly. This supports both scale and redundancy built into both spine and leaf layers, as well as the control plane, which is based on a high-performance, scalable and resilient ONOS cluster.

Benefits of P4. Trellis' use of P4 and Stratum, while not yet production ready, offers tantalizing possibilities in the near future. In many ways the holy grail of SDN has been not just the ability to program forwarding rules from a decoupled control plane, but to be able to define and create the forwarding pipeline in the dataplane hardware. P4-enabled switches give this power to SDN platforms like Trellis. As a result, instead of operators depending on vendor-proprietary custom ASICs for specialized functionality, they could leverage P4-defined functionalities in the bare-metal whitebox hardware used in Trellis, an example of which is described in the next section.
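The IGMP-driven IPTV multicast handling described in the benefits above can be sketched as a toy controller app: joins and leaves update per-group receiver sets, and the app re-programs the fabric's replication state directly, with no distributed IGMP/PIM machinery. The data model below is illustrative, not the real ONOS multicast application.

```python
class McastApp:
    """Toy SDN multicast app: tracks receivers per group and emits the
    replication rule the fabric would be programmed with."""
    def __init__(self):
        self.groups = {}            # multicast group -> set of receiver ports

    def igmp_join(self, group, port):
        # A set-top box tuned to this channel: add its port as a receiver.
        self.groups.setdefault(group, set()).add(port)
        return self._program(group)

    def igmp_leave(self, group, port):
        # Channel changed or box gone: stop replicating to that port.
        self.groups.get(group, set()).discard(port)
        return self._program(group)

    def _program(self, group):
        # Conceptually one rule per group: replicate the stream to every
        # port whose subscriber has currently requested that channel.
        return {"match_group": group,
                "replicate_to": sorted(self.groups.get(group, set()))}

app = McastApp()
app.igmp_join("232.1.1.1", port=5)          # first viewer tunes in
rule = app.igmp_join("232.1.1.1", port=7)   # second viewer, same channel
# rule == {'match_group': '232.1.1.1', 'replicate_to': [5, 7]}
```

The whole control loop is a dictionary update plus a rule push, which is the operational simplification the brief contrasts with box-by-box IGMP/PIM tree building.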


[Figure: Distributed fabric for access/edge networking. Trellis apps (IPv4/v6 unicast/multicast, VLANs, MPLS SR, vRouter and more) on an ONOS cluster control a leaf-spine-spine-leaf fabric that spans field offices (base stations, R-PHY, R-OLT) and a central office connected to metro/core routers and the Internet.]

Use Cases

Distributed Fabric for Access/Edge Networking. As detailed in this whitepaper, Trellis is designed for NFV and Access/Edge cloud applications. As such, Trellis is central to the CORD infrastructure. CORD is an architecture for Telco Central Offices or Cableco Head-Ends that aims to revolutionize the way they are built and operated. CORD brings together principles of SDN, NFV and Cloud to build cost-effective, common, manageable and agile infrastructure to enable rapid service creation at the network edge. By extending the agility of the cloud to access networks for residential, enterprise and mobile subscribers, CORD aims to enable Edge Computing for next-gen services.

In one instantiation of this idea, Trellis is deployed in production in a Tier-1 US Service Provider's network, in multiple geographies, with thousands of live subscribers. In this case, Trellis is deployed as a PPOD (physical pod) with up to 10 switches (2 spines, 4 access-leafs and 2 pairs of server-leafs) that can serve up to 20k subscribers. The 4 access-leafs are single-homed to up to 192 remote access nodes, both optical and RF devices, which further connect to residential and enterprise customers. The server-leafs are dual-homed to compute nodes (servers) that host all the VNFs intended for subscriber traffic tunnel-termination, network processing and forwarding. One of the leaf-pairs is also connected to two upstream routers that provide access to the SP's metro/core network.

Trellis apps are responsible for all network connectivity functions. They include authentication of the access nodes via 802.1x and MacSec, DHCP-based IP address assignment to subscriber set-top boxes, routing unicast IPv4 and IPv6 traffic, IPTV support via v4 and v6 multicast, advertising subscriber routes to the upstream routers, blackholing routes, DHCPv6 prefix delegation, VLAN cross-connects for enterprise subscribers and more.

As of this writing, Trellis scales in production to roughly 10k subscribers in a single PPOD, with around 50k IPv4 and IPv6 routes resulting in roughly 110k flows in the whole system. In lab settings, nearly double the scale deployed in production has been achieved. We continue to work with the service provider and our ecosystem partners to increase maximum capacity to support even larger production PPOD configurations.
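For intuition on how such a leaf-spine pod scales, endpoint count and oversubscription in a 2-level folded Clos can be estimated with back-of-the-envelope arithmetic. The port counts and speeds below are illustrative examples, not a statement about any production PPOD.

```python
def clos_capacity(n_leaves, n_spines, down_ports, down_gbps, up_gbps):
    """Estimate endpoint count and per-leaf oversubscription ratio for a
    leaf-spine fabric where each leaf has `down_ports` endpoint-facing
    ports and one `up_gbps` uplink to every spine."""
    endpoints = n_leaves * down_ports
    downlink_bw = down_ports * down_gbps   # per leaf, toward endpoints
    uplink_bw = n_spines * up_gbps         # per leaf, toward the spines
    return endpoints, downlink_bw / uplink_bw

# E.g. 8 leaves, each with 48 x 10G down-ports and uplinks to 2 x 100G spines:
endpoints, oversub = clos_capacity(8, 2, 48, 10, 100)
# endpoints == 384, oversub == 2.4 (480G down vs 200G up per leaf)
```

Adding spines lowers the oversubscription ratio while adding leaves grows the endpoint count, which is exactly the horizontal-scaling property the brief describes.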


[Figure: Trellis enhanced with an embedded and disaggregated BNG using P4, supporting ONF's SDN Enabled Broadband Access (SEBA) platform. A SEBA POD: the Network Edge Mediator (NEM), with VOLTHA apps, Trellis apps and BNG-control on the ONOS SDN controller (deployed via Docker, K8s and Helm), manages ONUs and an OLT via VOLTHA, and a Stratum-based P4 switch acting as the embedded BNG via P4Runtime, gNMI and gNOI, connecting servers to upstream routers.]
Disaggregated BNG in SEBA Using P4. SDN Enabled Broadband Access (SEBA) is another instantiation of CORD. Indeed, the SEBA exemplar implementation today is built on the CORD software platform, and the project itself is a variant of an earlier version known as R-CORD (or Residential CORD). SEBA today leverages disaggregation of broadband equipment using SDN techniques, rather than relying on containers or VNFs for processing subscriber data-plane traffic. As a result, SEBA 'fastpaths' all subscriber traffic from the access node to the external (upstream) backbone straight through hardware, thus achieving much greater scale.

In SEBA today, headend equipment in Fiber-to-the-Home (FTTH) networks called OLTs is disaggregated using software from ONF's VOLTHA project. Vendor-proprietary chassis-based OLT systems are replaced by simpler white-box OLTs, managed by VOLTHA and a set of applications running on ONOS. In principle, it is similar to how Trellis manages a fabric of whitebox switches with OF-DPA and a set of applications running on ONOS.

In SEBA, dataplane traffic from such white-box OLTs is aggregated upstream by a Trellis leaf switch (or leaf-pair) and VLAN cross-connected (L2-only) to upstream external Broadband Network Gateways (BNGs). As such, Trellis is used in its simplest form in SEBA.

However, SEBA intends to go further by continuing the disaggregation story and doing the same to the BNG. A vendor-proprietary BNG today supports multiple functions like subscriber traffic termination (QinQ, PPPoE), H-QoS, routing, multicast, lawful intercept, accounting and more. Inspired by prior work from Deutsche Telekom, these functions can also be disaggregated in SEBA in an SDN way by implementing the dataplane features in a P4-enabled switch, and the control-plane features as applications running on ONOS. In particular, Trellis already supports routing, multicast and QinQ termination functionality. Trellis is then enhanced by additional apps that program a P4-defined pipeline to perform PPPoE tunnel termination and control traffic handling, anti-spoofing, per-subscriber accounting, policing, subscriber traffic mirroring and more.


SEBA in its current form, with an external BNG, is being trailed Chassis Routers. As we described in the Architecture section,
by several operators, and expected to go into production in Trellis with its vRouter feature behaves just like a traditional
2020. SEBA with a disaggregated BNG embedded in Trellis chassis based router. Instead of spreading the leaf and spine
using P4 switches is currently under development by ONF and switches over multiple server-racks in a datacenter, housing the
the SEBA community. We expect this use-case to gain significant switches and controller-servers in the same rack essentially mimics
momentum next year. a chassis-router. The leaf switches are the line-cards, the spine
switches are the backplane, and the SDN controller servers are
redundant route processor (RP) cards (although RPs are typically
active-backup, while ONOS instances in each server are all active).

Technically, this approach is similar to what Google did in its B4
work. And if the goal is similar to B4 in its deployment in a private
internal network—B4 is a private network connecting Google's
datacenters—then Trellis with its current feature set is likely
sufficient to replace chassis-based routers. However, if the goal
is to replace a chassis router in a public network, Trellis vRouter
would need to support more features and protocols, like EVPNs
or L3VPNs. Depending on the architecture of the public
network, there may also be a need to carry the full Internet
routing table in addition to VPN routes. Trellis does not currently
support hardware capable of carrying the global Internet
routing table, although techniques exist that propose carrying
only a subset of the full table in hardware FIBs, similar
to 'working sets' in Operating Systems. We encourage the
open-source community to leverage Trellis as a foundation
for such explorations.

Enterprise Datacenter Fabrics. While Trellis was targeted at
Service Provider access & edge applications, there is nothing
fundamental in its architecture that precludes it from being used
as an Enterprise datacenter fabric. Indeed, Trellis can be used as
a pure L3 fabric, or a hybrid L2/L3 Clos fabric, without any of the
additional features and characteristics meant for carrier access
network backends.

Enterprise datacenters, similar to carrier datacenters, typically
have some form of virtualization/orchestration system, either
open-source (K8s, OpenStack) or commercial (VMware, Red Hat).
These VM or container orchestration systems typically have
needs from the network expressed as a set of API calls (for
example, OpenStack Neutron or K8s CNI) that are then implemented
by plugins from several vendors for their proprietary or
open-source overlay networking solutions.

When working with overlay networks, Trellis can be treated as an
underlay fabric. Underlay and overlay networks/fabrics are
independently useful, and can be used independently. However,
in certain cases it is beneficial for the underlay and overlay to be
aware of each other and work together to deliver the goals of
the orchestration system's network API. Trellis' northbound APIs
can be enhanced and integrated with various overlay networking
plugins to enable such cooperation. In another instantiation, the
overlay network can be eliminated altogether, with Trellis as the
sole provider of the orchestration system's networking needs.
To enable the latter, Trellis would need to implement the plugin
into the orchestration system's APIs and develop capabilities for
network virtualization. We encourage the open-source community
to leverage Trellis as a foundation for such explorations.
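To make the integration point concrete: ONOS, which hosts the Trellis control applications, exposes a REST API that an orchestration-system plugin could consume for underlay state. The sketch below is illustrative only — the controller hostname is invented, and onos/rocks are the historical ONOS default credentials (change them in any real deployment):

```python
import base64
import json
import urllib.request

# Hypothetical controller address; ONOS serves its REST API on port 8181.
ONOS_URL = "http://onos-controller:8181/onos/v1"

def onos_request(path, user="onos", password="rocks"):
    """Build a basic-auth request for an ONOS REST endpoint.

    onos/rocks are the historical ONOS defaults, used here purely
    for illustration.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(ONOS_URL + path)
    req.add_header("Authorization", "Basic " + token)
    req.add_header("Accept", "application/json")
    return req

def fabric_devices():
    """Fetch the switch inventory an overlay plugin could correlate
    with its own view of the network (requires a live controller)."""
    with urllib.request.urlopen(onos_request("/devices")) as resp:
        return json.load(resp)["devices"]

# Example against a live controller:
#   for dev in fabric_devices():
#       print(dev["id"], dev.get("available"))
```

A real plugin (e.g., a Neutron ML2 driver or CNI binary) would layer its own API on top of calls like these; the sketch only shows the underlay-facing half.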

© 2019 Open Networking Foundation 9


TRELLIS PLATFORM BRIEF

Specifications
FEATURES — DESCRIPTION
SDN Features • ONOS cluster of all-active N instances affording N-way redundancy and scale, where N = 3 or N = 5
• Unified operations interface (GUI/REST/CLI)
• Centralized configuration – all configuration is done on controller instead of each individual switch
• Centralized role-based access control (RBAC)
• Automatic host (end-point) discovery – attached hosts, access-devices, appliances (PNFs), routers, etc.
based on ARP, DHCP, NDP, etc.
• Automatic switch, link and topology discovery and maintenance (keep-alives, failure recovery)
L2 Features Various L2 connectivity and tunneling support
• VLAN-based bridging
  – Access, Trunk and Native VLAN support
• VLAN cross connect
  – Forward traffic based on outer VLAN id
  – Forward traffic based on outer and inner VLAN id (QinQ)
• Pseudowire
  – L2 tunneling across the L3 fabric
  – Support tunneling based on double-tagged and single-tagged traffic
  – Support VLAN translation of outer tag
L3 Features IP connectivity
• IPv4 and IPv6 unicast routing (internal use of MPLS Segment Routing)
• Subnetting configuration on all non-spine facing leaf ports; no configuration required on any spine port
• IPv6 router advertisement
• ARP, NDP, IGMP handling
• Number of flows in spines greatly simplified by MPLS Segment Routing
• Further reduction of per-leaf flows with route optimization logic
DHCP Relay DHCP L3 relay
• DHCPv4 and DHCPv6
• DHCP server either directly attached to fabric leaves, or indirectly connected via upstream router
• DHCP client either directly attached to fabric leaves, or indirectly connected via LDRA
• Multiple DHCP servers for HA
vRouter vRouter presents the entire Trellis fabric as a single router (or dual-routers for HA), with disaggregated
control/data plane
• Uses open-source protocol implementations like Quagga (or FRR)
• BGPv4 and BGPv6
• Static routes
• Route blackholing
• ACLs based on port, L2, L3 and L4 headers
Multicast Centralized multicast tree computation, programming and management
• Support both IPv4 and IPv6 multicast
• Dual-homed multicast sinks for HA
• Multiple multicast sources for HA
Troubleshooting & Diagnostics • Troubleshooting tool – T3: Trellis Troubleshooting Tool
• Diagnostics one-click collection tool `onos-diags`
Topology • Single leaf (ToR) or dual-ToR (dual-homing)
• Supports typical leaf-spine topology, 2 to 4 spines, up to 10 leaves
• Multi-stage leaf-spine fabric (leaf-spine-spine-leaf)
• Can start at the smallest scale (single leaf) and grow horizontally
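The L3 Features row above notes that MPLS Segment Routing "greatly simplifies" the number of flows in spines. The intuition: spines match only per-leaf node SIDs rather than individual IP routes, so spine state scales with leaf count, not route count. A toy model (the numbers are illustrative, not measured):

```python
def spine_flow_estimate(num_leaves, routes_per_leaf, use_sr):
    """Rough count of forwarding entries one spine must hold.

    Toy model: without segment routing the spine needs an entry per
    IP route it must steer; with segment routing it needs roughly one
    MPLS entry per leaf node-SID, independent of route count.
    """
    if use_sr:
        return num_leaves
    return num_leaves * routes_per_leaf

# e.g. 10 leaves, each the attachment point for 5,000 routes
print(spine_flow_estimate(10, 5000, use_sr=False))  # prints 50000
print(spine_flow_estimate(10, 5000, use_sr=True))   # prints 10
```

Real spine tables also carry ECMP groups and adjacency entries, so this sketch only captures the scaling trend, not exact counts.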

Resiliency Provides HA in the following scenarios
• Controller instance failure (requires 3 or 5 node ONOS cluster)
• Link failures
• Spine failure
Further HA support in the following failure scenarios with dual-homing enabled
• Leaf failure
• Upstream router failure
• Host NIC failure
Scalability • (in production) Up to 50k routes, 110k flows, 8 leaves, 2 spines, with route optimization enabled
• (in pre-production) Up to 120k routes, 250k flows, 8 leaves, 2 spines, with route optimization enabled
Security • TLS-secured connection between controllers and switches (premium feature)
• AAA 802.1x authentication
• MACsec (L2 encapsulation)
P4-ready • Support for Stratum, P4Runtime, gNMI and P4 programs
• Innovative services enabled by programmable pipeline
BNG – PPPoE, anti-spoofing, accounting and more
GTP encap/decap
Overlay Support Can be used/integrated with 3rd party overlay networks (e.g. OpenStack Neutron, Kubernetes CNI)
Orchestrator Support Can be integrated with an external orchestrator, logging, telemetry and alarm services via REST APIs
and Kafka events
Controller Recommended (per ONOS instance)
Server Specs • CPU: 32 Cores
• RAM: 128GB, with 65GB dedicated to the ONOS JVM heap (based on 50K routes)
Whitebox Switch • Multi-vendor: Edgecore, QCT, Delta, Inventec
Hardware • Multi-chipset
Broadcom Tomahawk, Trident2, Qumran
Barefoot Tofino
• 1/10G, 25G, 40G to 100G
• Refer to docs.trellisfabric.org/supported-hardware.html for the most up-to-date hardware list
Whitebox Switch • Open source ONL, ONIE and Indigo OF client
Software • (in production) OF-DPA software commercial version – contact Broadcom
• (in labs/trials) OF-DPA software community version available from ONF (for switch models based
on Trident and Tomahawk, not Qumran)
• (in labs/trials) Stratum available from ONF
Documentation docs.trellisfabric.org
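As an illustration of the "centralized configuration" entry in the SDN Features row, Trellis is driven by a single JSON network configuration pushed to the controller rather than per-switch CLI. The fragment below is a hedged sketch in the style of the Trellis netcfg; all identifiers (device ID, MAC, IPs, SIDs) are invented, and field names should be verified against docs.trellisfabric.org:

```python
import json

# Illustrative netcfg fragment in the style used by the Trellis
# segmentrouting app. Every identifier below is made up.
NETCFG = {
    "devices": {
        "of:0000000000000101": {
            "segmentrouting": {
                "name": "leaf-1",
                "ipv4NodeSid": 101,
                "ipv4Loopback": "192.168.0.101",
                "routerMac": "00:00:00:00:01:01",
                "isEdgeRouter": True,
                "adjacencySids": [],
            }
        }
    },
    "ports": {
        "of:0000000000000101/1": {
            "interfaces": [
                {"ips": ["10.0.1.254/24"], "vlan-untagged": 10}
            ]
        }
    },
}

def edge_leaves(netcfg):
    """Return the device IDs configured as edge (ToR) routers."""
    return [
        dev_id
        for dev_id, cfg in netcfg["devices"].items()
        if cfg.get("segmentrouting", {}).get("isEdgeRouter")
    ]

print(json.dumps(NETCFG, indent=2))
print(edge_leaves(NETCFG))
```

In a deployment, a document like this would be POSTed to the controller's network-configuration endpoint once, and every switch would be programmed from it — the point of the "centralized configuration" feature.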

ONF is an operator-led consortium transforming networks into agile platforms for service delivery.

For more technical information and tutorials: opennetworking.org/trellis
To learn of the Trellis commercial ecosystem: [email protected]
