draft-ietf-bess-evpn-inter-subnet-forwarding-08
INTERNET-DRAFT S. Salam
Intended Status: Standards Track S. Thoria
Cisco
J. Drake
Juniper
J. Rabadan
Nokia
Abstract
Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved.
Table of Contents
1 Introduction
2 EVPN PE Model for IRB Operation
3 Symmetric and Asymmetric IRB
3.1 IRB Interface and its MAC & IP addresses
3.2 Symmetric IRB Procedures
3.2.1 Control Plane - Advertising PE
3.2.2 Control Plane - Receiving PE
3.2.3 Data Plane - Ingress PE
3.2.4 Data Plane - Egress PE
3.3 Asymmetric IRB Procedures
3.3.1 Control Plane - Advertising PE
3.3.2 Control Plane - Receiving PE
3.3.3 Data Plane - Ingress PE
3.3.4 Data Plane - Egress PE
4 Mobility Procedure
4.1 Initiating an ARP Request upon a Move
4.2 Sending Data Traffic without an ARP Request
4.3 Silent Host
5 BGP Encoding
5.1 Router’s MAC Extended Community
6 Operational Models for Symmetric Inter-Subnet Forwarding
6.1 IRB forwarding on NVEs for Tenant Systems
6.1.1 Control Plane Operation
6.1.2 Data Plane Operation
6.2 IRB forwarding on NVEs for Subnets behind Tenant Systems
6.2.1 Control Plane Operation
6.2.2 Data Plane Operation
7 Acknowledgements
8 Security Considerations
9 IANA Considerations
10 References
10.1 Normative References
10.2 Informative References
Authors’ Addresses
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
EVI: EVPN Instance spanning the NVE/PE devices that are participating
on that EVPN, as per [RFC7432].
SBD: Supplementary Broadcast Domain. A BD that does not have any ACs,
only IRB interfaces, and it is used to provide connectivity among all
the IP-VRFs of the tenant. The SBD is only required in
IP-VRF-to-IP-VRF use-cases (see Section 4.4).
SN: Subnet.
1 Introduction
R1: The solution must allow for both inter-subnet and intra-subnet
traffic belonging to the same tenant to be locally routed and bridged
respectively. The solution must provide IP routing for inter-subnet
traffic and Ethernet Bridging for intra-subnet traffic.
+-------------------------------------------------------------+
| |
| +------------------+ IRB PE |
| Attachment | +------------------+ |
| Circuit(AC1) | | +----------+ | MPLS/NVO tnl
----------------------*Bridge | | +-----
| | | |Table(BT1)| | +-----------+ / \ \
| | | | *---------* |<--> |Eth|
| | | | VLAN x | |IRB1| | \ / /
| | | +----------+ | | | +-----
| | | ... | | IP-VRF1 | |
| | | +----------+ | | RD2/RT2 |MPLS/NVO tnl
| | | |Bridge | | | | +-----
| | | |Table(BT2)| |IRB2| | / \ \
| | | | *---------* |<--> |IP |
----------------------* VLAN y | | +-----------+ \ / /
| AC2 | | +----------+ | +-----
| | | MAC-VRF1 | |
| +-+ RD1/RT1 | |
| +------------------+ |
| |
| |
+-------------------------------------------------------------+
       Ingress PE                   Egress PE
  +-------------------+        +------------------+
  |                   |        |                  |
  |    +-> IP-VRF ----|---->---|-----> IP-VRF -+  |
  |    |              |        |               |  |
  |   BT1        BT2  |        |  BT3         BT2 |
  |    |              |        |               |  |
  |    ^              |        |               v  |
  |    |              |        |               |  |
  +-------------------+        +------------------+
       ^                                       |
       |                                       |
 TS1->-+                                       +->-TS2
              Figure 2: Symmetric IRB
       Ingress PE                   Egress PE
  +-------------------+        +------------------+
  |                   |        |                  |
  |    +-> IP-VRF ->  |        |      IP-VRF      |
  |    |           |  |        |                  |
  |   BT1        BT2  |        |  BT3         BT2 |
  |    |           |  |        |              | | |
  |    |           +--|--->----|--------------+ | |
  |    |              |        |                v |
  +-------------------+        +----------------|-+
       ^                                        |
       |                                        |
 TS1->-+                                        +->-TS2
              Figure 3: Asymmetric IRB
PE 1 +---------+
+-------------+ | |
TS1-----| MACx| | | PE2
(IP1/M1) |(BT1) | | | +-------------+
TS5-----| \ | | MPLS/ | |MACy (BT3) |-----TS3
(IP5/M5) |Mx/IPx \ | | VxLAN/ | | / | (IP3/M3)
| (IP-VRF1)|----| NVGRE |---|(IP-VRF1) |
| / | | | | \ |
TS2-----|(BT2) / | | | | (BT1) |-----TS4
(IP2/M2) | | | | | | (IP4/M4)
+-------------+ | | +-------------+
| |
+---------+
1. All the PEs for a given tenant subnet use the same anycast default
gateway IP and MAC addresses. On each PE, these default gateway IP
and MAC addresses correspond to the IRB interface connecting the BT
associated with the tenant’s VLAN to the corresponding tenant’s
IP-VRF.
2. Each PE for a given tenant subnet uses the same anycast default
gateway IP address but its own MAC address. These MAC addresses are
aliased to the same anycast default gateway IP address through the
use of the Default Gateway extended community as specified in
[RFC7432], which is carried in the EVPN MAC/IP Advertisement routes.
On each PE, this default gateway IP address along with its associated
MAC addresses correspond to the IRB interface connecting the BT
associated with the tenant’s VLAN to the corresponding tenant’s IP-
VRF.
that TS.
address and thus when it sends an ARP request for IPx (anycast IP
address of the IRB interface for BT1), PE1 sends an ARP reply
with MACx, which is the anycast MAC address of that IRB interface.
Traffic routed from IP-VRF1 to TS1 SHOULD use the anycast MAC address
as source MAC address.
- The Length field of the BGP EVPN NLRI for an EVPN MAC/IP
Advertisement route MUST be either 40 (if IPv4 address is carried) or
52 (if IPv6 address is carried).
Just as in [RFC7432], the RD, Ethernet Tag ID, MAC Address Length,
MAC Address, IP Address Length, and IP Address fields are part of
the route key used by BGP to compare routes. The rest of the fields
are not part of the route key.
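The Length values above follow from summing the fixed field sizes of the MAC/IP Advertisement route NLRI. The following is a minimal sketch of that arithmetic, assuming the field layout of [RFC7432] with an optional second MPLS label; the function and constant names are illustrative only and are not defined by this document.

```python
# Field sizes in octets for the EVPN MAC/IP Advertisement route NLRI
# (layout assumed from RFC 7432; names here are illustrative only).
RD = 8            # Route Distinguisher
ESI = 10          # Ethernet Segment Identifier
ETH_TAG = 4       # Ethernet Tag ID
MAC_LEN = 1       # MAC Address Length field
MAC = 6           # MAC Address
IP_LEN = 1        # IP Address Length field
LABEL = 3         # one MPLS label field


def mac_ip_nlri_length(ip_version, with_label2):
    """Return the NLRI Length value for a MAC/IP Advertisement route."""
    ip = {4: 4, 6: 16}[ip_version]           # IPv4 or IPv6 address size
    labels = LABEL * (2 if with_label2 else 1)
    return RD + ESI + ETH_TAG + MAC_LEN + MAC + IP_LEN + ip + labels


for version in (4, 6):
    print(version, mac_ip_nlri_length(version, with_label2=True))
```

With MPLS label2 present the sum is 40 (IPv4) or 52 (IPv6); without it the same sum gives 37 or 49.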
For symmetric IRB mode, Router’s MAC EC is needed to carry the PE’s
overlay MAC address (e.g., inner MAC address in NVO encapsulation)
which is used for IP-VRF to IP-VRF communications with Ethernet NVO
tunnel. If MPLS or IP-only NVO tunnel is used, then there is no need
to send Router’s MAC Extended Community along with this route.
If the receiving PE receives this route with both the MAC-VRF and IP-
VRF route targets but the MAC/IP Advertisement route does not include
MPLS label2 field and if the receiving PE supports asymmetric IRB
mode, then the receiving PE installs the MAC address in the
corresponding MAC-VRF and <IP, MAC> association in the ARP table for
that tenant (identified by the corresponding IP-VRF route target).
If the receiving PE receives this route with both the MAC-VRF and IP-
VRF route targets but the MAC/IP Advertisement route does not include
MPLS label2 field and if the receiving PE does not support either
asymmetric or symmetric IRB modes, then if it has the corresponding
MAC-VRF, it only imports the MAC address; otherwise, if it doesn’t
have the corresponding MAC-VRF, it MUST apply the treat-as-withdraw
approach of [RFC7606] and log an error message.
If the receiving PE receives this route with both the MAC-VRF and IP-
VRF route targets and the MAC/IP Advertisement route includes MPLS
label2 field but the receiving PE only supports asymmetric IRB mode,
then the receiving PE MUST ignore MPLS label2 field and install the
MAC address in the corresponding MAC-VRF and <IP, MAC> association in
the ARP table for that tenant (identified by the corresponding IP-VRF
route target).
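The three receiving-PE cases above can be summarized as a small decision function. This is only a sketch: it assumes the route already carries both the MAC-VRF and IP-VRF route targets, the action names are hypothetical labels rather than protocol elements, and combinations not covered by the preceding paragraphs return None.

```python
def handle_mac_ip_route(has_label2, supports_symmetric,
                        supports_asymmetric, has_mac_vrf):
    """Return the actions a receiving PE takes for a MAC/IP route
    that carries both the MAC-VRF and IP-VRF route targets.

    Action names are illustrative labels, not protocol elements.
    """
    if not has_label2:
        if supports_asymmetric:
            # Install MAC in the MAC-VRF and <IP, MAC> in the ARP table.
            return {"install_mac", "install_arp"}
        if not supports_symmetric:
            # No IRB support at all: MAC-only import if possible.
            if has_mac_vrf:
                return {"install_mac"}
            return {"treat_as_withdraw", "log_error"}
        return None  # case not described in the text above
    if supports_asymmetric and not supports_symmetric:
        # Label2 is present but cannot be used; ignore it.
        return {"ignore_label2", "install_mac", "install_arp"}
    return None  # symmetric-capable handling is described elsewhere


print(handle_mac_ip_route(True, False, True, True))
```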
If the tunnel type is that of MPLS or IP-only NVO tunnel, then TS’s
IP packet is sent over the tunnel without any Ethernet header.
However, if the tunnel type is that of Ethernet NVO tunnel, then an
Ethernet header needs to be added to the TS’s IP packet. The source
MAC address of this inner Ethernet header is set to the ingress PE’s
router MAC address and the destination MAC address of this inner
Ethernet header is set to the egress PE’s router MAC address. The
MPLS VPN label or VNI fields are set accordingly and the packet is
forwarded to the egress PE.
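As an illustration of the ingress encapsulation choice described above, the following sketch builds the payload handed to the tunnel. It assumes an untagged inner Ethernet header and the standard EtherType values for IPv4 (0x0800) and IPv6 (0x86DD); the function names and the example MAC addresses are hypothetical, and the outer tunnel header (MPLS label or VNI) is out of scope here.

```python
def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw octets."""
    return bytes.fromhex(mac.replace(":", ""))


def inner_payload(ip_packet, ethernet_nvo,
                  ingress_router_mac=None, egress_router_mac=None):
    """Return what the ingress PE places inside the tunnel.

    For MPLS or IP-only NVO tunnels the IP packet is sent as is.
    For an Ethernet NVO tunnel an inner Ethernet header is prepended:
    destination = egress PE router MAC, source = ingress PE router MAC.
    """
    if not ethernet_nvo:
        return ip_packet
    version = ip_packet[0] >> 4              # IP version nibble
    ethertype = {4: 0x0800, 6: 0x86DD}[version]
    header = (mac_bytes(egress_router_mac)
              + mac_bytes(ingress_router_mac)
              + ethertype.to_bytes(2, "big"))
    return header + ip_packet


# Example with a dummy 20-octet IPv4 header and documentation MACs.
packet = bytes([0x45]) + bytes(19)
frame = inner_payload(packet, True, "00:00:5e:00:53:01", "00:00:5e:00:53:02")
print(frame[:14].hex())
```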
The egress PE gets the destination TS’s MAC address for that TS’s IP
address from its ARP table. It then encapsulates the packet with that
destination MAC address and a source MAC address corresponding to
that IRB interface, and sends the packet to its destination subnet
MAC-VRF/BT.
sent on.
- The Length field of the BGP EVPN NLRI for an EVPN MAC/IP
Advertisement route MUST be either 37 (if IPv4 address is carried) or
49 (if IPv6 address is carried).
Just as in [RFC7432], the RD, Ethernet Tag ID, MAC Address Length,
MAC Address, IP Address Length, and IP Address fields are part of
the route key used by BGP to compare routes. The rest of the fields
are not part of the route key.
This route MUST always be advertised with the MAC-VRF route target.
It MAY also be advertised with a second route target corresponding to
the IP-VRF. If only MAC-VRF route target is used, then the receiving
PE uses the MAC-VRF route target to identify the corresponding IP-VRF
- i.e., many MAC-VRF route targets map to the same IP-VRF for a given
tenant. Since in this asymmetric IRB mode, each PE is configured with
all VLANs of a tenant, the MAC-VRF route target has the same
reachability as the IP-VRF route target and that is why the use of
IP-VRF route target is optional for this IRB mode.
MAC address. If the MAC address corresponds to its IRB Interface MAC
address, the ingress PE deduces that the packet must be inter-subnet
routed. Hence, the ingress PE performs an IP lookup in the associated
IP-VRF table. The lookup identifies a local adjacency to the IRB
interface associated with the egress subnet’s MAC-VRF/BT.
The ingress PE gets the destination TS’s MAC address for that TS’s IP
address from its ARP table. It then encapsulates the packet with that
destination MAC address and a source MAC address corresponding to
that IRB interface, and sends the packet to its destination subnet
MAC-VRF/BT.
4 Mobility Procedure
When a TS moves from one NVE (aka source NVE) to another NVE (aka
target NVE), it is important that the MAC mobility procedures are
properly executed and the corresponding MAC-VRF and IP-VRF tables on
all participating NVEs are updated. [RFC7432] describes the MAC
mobility procedures for L2-only services for both single-homed TS and
multi-homed TS. This section describes the incremental procedures and
BGP Extended Communities needed to handle the MAC mobility for IRB.
In order to place the emphasis on the differences between L2-only and
IRB use cases, the incremental procedure is described for single-
homed TS with the expectation that the reader can easily extrapolate
multi-homed TS based on the procedures described in section 15 of
[RFC7432]. This section describes mobility procedures for both
symmetric and asymmetric IRB. Although the language used in this
section is for IPv4 ARP, it equally applies to IPv6 ND.
The target NVE, upon receiving this ARP request, updates its MAC-VRF,
IP-VRF, and ARP table with the host MAC, IP, and local adjacency
information (e.g., local interface).
Since this NVE has previously learned the same MAC and IP addresses
from the source NVE, it recognizes that there has been a MAC move and
All other remote NVE devices, upon receiving the MAC/IP advertisement
route with the MAC Mobility extended community, compare the sequence
number in this advertisement with the one previously received. If the
new sequence number is greater than the old one, then they update the
MAC/IP addresses of the TS in their corresponding MAC-VRF and IP-VRF
tables to point to the target NVE. Furthermore, upon receiving the
MAC/IP withdraw for the TS from the source NVE, these remote PEs
clean up their BGP tables.
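The sequence number comparison performed by a remote NVE can be sketched as follows. This is a simplified model under stated assumptions: a single hypothetical dictionary stands in for the MAC-VRF and IP-VRF entries of a TS, and neither sequence number wrap-around nor the tie-breaking rule for equal sequence numbers is modelled.

```python
def on_mac_ip_advertisement(table, mac, ip, nve, seq):
    """Update the forwarding entry for (mac, ip) if the advertisement
    carries a higher MAC Mobility sequence number than the one held.

    `table` maps (mac, ip) -> (nve, seq) and is a hypothetical stand-in
    for the MAC-VRF and IP-VRF entries of the TS.
    Returns True when the entry now points to `nve`.
    """
    current = table.get((mac, ip))
    if current is None or seq > current[1]:
        table[(mac, ip)] = (nve, seq)
        return True
    return False  # stale or equal sequence number: keep existing entry


table = {}
on_mac_ip_advertisement(table, "00:00:5e:00:53:10", "192.0.2.10", "source-nve", 0)
on_mac_ip_advertisement(table, "00:00:5e:00:53:10", "192.0.2.10", "target-nve", 1)
print(table)
```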
Upon receiving the first data packet, the target NVE learns the
MAC address of the TS in the data plane and updates its MAC-VRF table
with the MAC address and the local adjacency information (e.g., local
interface) accordingly. The target NVE realizes that there has been a
MAC move because the same MAC address has been learned remotely from
the source NVE.
- The target NVE passes the ARP request to its locally attached TSes
and when it receives the ARP response, it updates its IP-VRF and ARP
table with the host <MAC, IP> information. It also sends an EVPN
MAC/IP advertisement route with both the MAC and IP addresses filled
in along with MAC Mobility Extended Community with the sequence
number set to the same value as the one for MAC-only advertisement
route it sent previously.
On the source NVE, an age-out timer (for the silent host that has
moved) is used to trigger an ARP probe. This age-out timer can be
either ARP timer or MAC age-out timer and this is an implementation
choice. The ARP request is sent both locally to all the attached
TSes on that subnet and to all the remote NVEs (including the target
NVE) participating in that subnet. The source NVE also withdraws the
EVPN MAC/IP route with only the MAC address (if it has previously
advertised such a route).
The target NVE passes the ARP request to its locally attached TSes
and when it receives the ARP response, it updates its MAC-VRF, IP-
VRF, and ARP table with the host <MAC, IP> and local adjacency
information (e.g., local interface). It also sends an EVPN MAC/IP
advertisement route with both the MAC and IP address fields filled in
along with MAC Mobility Extended Community with the sequence number
incremented by one.
All other remote NVE devices upon receiving the MAC/IP advertisement
route with MAC Mobility extended community compare the sequence
number in this advertisement with the one previously received. If the
new sequence number is greater than the old one, then they update the
MAC/IP addresses of the TS in their corresponding MAC-VRF and IP-VRF
tables to point to the new NVE. Furthermore, upon receiving the
MAC/IP withdraw for the TS from the old NVE, these remote PEs perform
5 BGP Encoding
This document defines one new BGP Extended Community for EVPN.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type=0x06 | Sub-Type=0x03 | Router’s MAC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Router’s MAC Cont’d |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This extended community is used to carry the PE’s MAC address for
symmetric IRB scenarios and it is sent with RT-2.
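The 8-octet layout shown above can be encoded and decoded with a few lines. This is a minimal sketch that follows the Type, Sub-Type, and 6-octet MAC fields of the figure; the function names are illustrative and the MAC address in the example is a documentation value.

```python
EC_TYPE = 0x06       # Type field from the figure
EC_SUBTYPE = 0x03    # Sub-Type field from the figure


def encode_router_mac_ec(mac):
    """Encode 'aa:bb:cc:dd:ee:ff' as an 8-octet extended community."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("MAC address must be 6 octets")
    return bytes([EC_TYPE, EC_SUBTYPE]) + octets


def decode_router_mac_ec(ec):
    """Return the MAC address carried in a Router's MAC extended community."""
    if len(ec) != 8 or ec[0] != EC_TYPE or ec[1] != EC_SUBTYPE:
        raise ValueError("not a Router's MAC extended community")
    return ":".join(f"{b:02x}" for b in ec[2:])


ec = encode_router_mac_ec("00:00:5e:00:53:01")
print(ec.hex(), decode_router_mac_ec(ec))
```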
This section covers the symmetric IRB procedures for the scenario
where each Tenant System (TS) is attached to one or more NVEs and its
host IP and MAC addresses are learned by the attached NVEs and are
Each NVE MUST support QoS, Security, and OAM policies per IP-VRF
to/from the core network. This is not to be confused with the QoS,
Security, and OAM policies per Attachment Circuit (AC) to/from the
Tenant Systems. How this requirement is met is an implementation
choice and it is outside the scope of this document.
NVE1 +---------+
+-------------+ | |
TS1-----| MACx| | | NVE2
(IP1/M1) |(MAC- | | | +-------------+
TS5-----| VRF1)\ | | MPLS/ | |MACy (MAC- |-----TS3
(IP5/M5) | \ | | VxLAN/ | | / VRF3) | (IP3/M3)
| (IP-VRF1)|----| NVGRE |---|(IP-VRF1) |
| / | | | | \ |
TS2-----|(MAC- / | | | | (MAC- |-----TS4
(IP2/M2) | VRF2) | | | | VRF1) | (IP4/M4)
+-------------+ | | +-------------+
| |
+---------+
Each NVE advertises an RT-2 route with two Route Targets (one
corresponding to its MAC-VRF and the other corresponding to its
IP-VRF). Furthermore, the RT-2 is advertised with two BGP Extended
Communities. The first BGP Extended Community identifies the tunnel
type per section 4.5 of [TUNNEL-ENCAP] and the second BGP Extended
Community includes the MAC address of the NVE (e.g., MACx for NVE1 or
MACy for NVE2) as defined in section 5.1. This second Extended
Community (for the MAC address of NVE) is only required when Ethernet
NVO tunnel type is used. If IP NVO tunnel type is used, then there is
no need to send this second Extended Community. It should be noted
that IP NVO tunnel type is only applicable to symmetric IRB
procedures.
- It imports the MAC address from MAC/IP Advertisement route into the
MAC-VRF with BGP Next Hop address as underlay tunnel destination
address (e.g., VTEP DA for VxLAN encapsulation) and Label-1 as VNI
for VxLAN encapsulation or EVPN label for MPLS encapsulation.
- If the route carries the new Router’s MAC Extended Community, and
if the receiving NVE is using Ethernet NVO tunnel, then the receiving
NVE imports the IP address into IP-VRF with NVE’s MAC address (from
the new Router’s MAC Extended Community) as inner MAC DA, BGP Next
Hop address as underlay tunnel destination address (e.g., VTEP DA for
VxLAN encapsulation), and Label-2 as IP-VPN VNI for VxLAN
encapsulation.
If the receiving NVE receives an RT-2 with only Label-1 and only a
single Route Target corresponding to IP-VRF, or if it receives an RT-2
with only a single Route Target corresponding to MAC-VRF but with
both Label-1 and Label-2, or if it receives an RT-2 with MAC Address
Length of zero, then it MUST apply the treat-as-withdraw approach of
[RFC7606] and log an error message.
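The three error conditions in the preceding paragraph can be expressed as a single predicate. The sketch below is illustrative only: the parameter names are hypothetical, and a real implementation would derive them from the received NLRI and from matching the route targets against local MAC-VRF and IP-VRF import policy.

```python
def rt2_is_malformed(has_label1, has_label2,
                     has_mac_vrf_rt, has_ip_vrf_rt, mac_addr_len):
    """Return True if the RT-2 must be handled as treat-as-withdraw."""
    only_label1 = has_label1 and not has_label2
    only_ip_vrf_rt = has_ip_vrf_rt and not has_mac_vrf_rt
    only_mac_vrf_rt = has_mac_vrf_rt and not has_ip_vrf_rt
    return (
        (only_label1 and only_ip_vrf_rt)                      # case 1
        or (only_mac_vrf_rt and has_label1 and has_label2)    # case 2
        or mac_addr_len == 0                                  # case 3
    )


# A route with both labels and both route targets is acceptable.
print(rt2_is_malformed(True, True, True, True, 48))
```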
- Upon receiving the packet, NVE1 uses the VLAN tag to identify
MAC-VRF1. It then looks up the MAC DA and forwards the frame to its
IRB interface.
VPN-ID as the VNI. The inner MAC SA and VTEP SA are set to NVE’s MAC
and IP addresses respectively. If it is an MPLS encapsulation, then
corresponding EVPN and LSP labels are added to the packet. The packet
is then forwarded to the egress NVE.
This section covers the symmetric IRB procedures for the scenario
where some Tenant Systems (TSes) support one or more subnets and
these TSes are associated with one or more NVEs. Therefore, besides
the advertisement of MAC/IP addresses for each TS which can be multi-
homed with All-Active redundancy mode, the associated NVE needs to
also advertise the subnets statically configured on each TS.
The main difference between this solution and the previous one is the
additional advertisement corresponding to each subnet. These subnet
advertisements are accomplished using EVPN IP Prefix route defined in
[EVPN-PREFIX]. These subnet prefixes are advertised with the IP
address of their associated TS (which is in overlay address space) as
their next hop. The receiving NVEs perform recursive route resolution
to resolve the subnet prefix with its associated ingress NVE so that
they know which NVE to forward the packets to when they are destined
for that subnet prefix.
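The recursive resolution described above can be sketched as a two-step lookup: a longest-prefix match on the subnet prefixes, followed by a host lookup on the overlay next hop. The two dictionaries below are hypothetical stand-ins for state learned from EVPN IP Prefix routes and MAC/IP Advertisement routes respectively, and the addresses are documentation values.

```python
import ipaddress


def resolve_nve(destination, prefix_routes, host_routes):
    """Resolve a destination IP to the NVE that should receive the packet.

    prefix_routes: {"prefix/len": overlay next hop (TS IP address)}
    host_routes:   {TS IP address: NVE that advertised it}
    Returns None when either step of the resolution fails.
    """
    addr = ipaddress.ip_address(destination)
    matches = [
        (net, next_hop)
        for net, next_hop in (
            (ipaddress.ip_network(p), nh) for p, nh in prefix_routes.items()
        )
        if addr in net
    ]
    if not matches:
        return None
    # Longest-prefix match selects the most specific subnet route.
    _, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    # Second step: resolve the overlay next hop to its NVE.
    return host_routes.get(next_hop)


prefix_routes = {"192.0.2.0/24": "198.51.100.1"}
host_routes = {"198.51.100.1": "NVE1"}
print(resolve_nve("192.0.2.7", prefix_routes, host_routes))
```

If the host route for the overlay next hop is missing, the prefix remains unresolved and the function returns None.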
NVE1 +----------+
SN1--+ +-------------+ | |
|--TS1-----|(MAC- \ | | |
SN2--+ IP1/M1 | VRF1) \ | | |
| (IP-VRF)|---| |
| / | | |
TS2-----|(MAC- / | | MPLS/ |
IP2/M2 | VRF2) | | VxLAN/ |
+-------------+ | NVGRE |
+-------------+ | |
SN3--+--TS3-----|(MAC-\ | | |
IP3/M3 | VRF3)\ | | |
| (IP-VRF)|---| |
| / | | |
TS4-----|(MAC- / | | |
IP4/M4 | VRF1) | | |
+-------------+ +----------+
NVE2
This RT-5 is advertised with one or more Route Targets that have been
configured as "export route targets" of the IP-VRF from which the
route is originated.
Upon receiving the RT-5 advertisement, the receiving NVE performs the
following:
- The Ethernet header of the packet is stripped and the packet is fed
to the IP-VRF, where an IP lookup is performed on the destination
address. This lookup yields the fields needed for VxLAN encapsulation
with NVE2’s MAC address as the inner MAC DA, NVE2’s IP address as the
VTEP DA, and the VNI. MAC SA is set to NVE1’s MAC address and VTEP SA
is set to NVE1’s IP address.
7 Acknowledgements
8 Security Considerations
In this document, the RT-5 is used for certain scenarios. This route
uses an Overlay Index that requires a recursive resolution to a
different EVPN route (an RT-2). Because of this, it is worth noting
that any action that ends up filtering or modifying the RT-2 route
used to convey the Overlay Indexes will modify the resolution of the
RT-5 and therefore the forwarding of packets to the remote subnet.
9 IANA Considerations
10 References
[RFC7432] Sajassi et al., "BGP MPLS-Based Ethernet VPN", RFC 7432,
February 2015.
Authors’ Addresses
Samer Salam
Cisco
Email: [email protected]
Samir Thoria
Cisco
Email: [email protected]
John E. Drake
Juniper
Email: [email protected]
Jorge Rabadan
Nokia
Email: [email protected]