Internet-Draft EVPN/VXLAN E-Tree IMET Route Filtering March 2023
Bamberger, et al. Expires 2 September 2023
Workgroup: BGP Enabled Services
Internet-Draft: draft-bamberger-bess-imet-filter-evpn-etree-vxlan-00
Published: March 2023
Intended Status: Standards Track
Expires: 2 September 2023
Authors:
A. Bamberger, Arista Networks
A. Shashidhar, Arista Networks
S. Kolobov, Arista Networks
A. Appachi Gounder, Google

IMET Route Filtering for Ethernet VPN (EVPN) Ethernet Tree (E-Tree) with VXLAN Encapsulation

Abstract

RFC 8317 defines EVPN Ethernet Tree (E-Tree) and associated filtering rules for both known unicast and broadcast, unknown unicast, and multicast (BUM) traffic, using Multiprotocol Label Switching (MPLS) encapsulation. The processes and protocols specified in RFC 8317 for performing E-Tree filtering on known unicast traffic are implemented entirely with EVPN routes, and are thus also applicable to networks using Virtual Extensible LAN (VXLAN) encapsulation. However, E-Tree filtering for BUM traffic is accomplished using specific features of MPLS encapsulation, and is thus not applicable for networks using VXLAN encapsulation.

In networks where E-Tree root/leaf role classification is done per provider edge (PE) device, or per attachment circuit (AC) on each PE device, an extension to EVPN type-3 inclusive multicast (IMET) routes can be added to allow E-Tree filtering for BUM traffic in networks using VXLAN encapsulation. Additionally, this proposal specifies filtering BUM traffic on ingress, as opposed to the egress filtering specified by RFC 8317. Ingress filtering is more efficient because it reduces the amount of unnecessary BUM traffic transmitted over the network.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 2 September 2023.

1. Introduction

[RFC8317] defines EVPN Ethernet Tree (E-Tree), a segmentation technology, for networks using EVPN with multiprotocol label switching (MPLS) encapsulation. In a network using E-Tree segmentation, hosts are assigned one of two different classifications, root or leaf. Root hosts are allowed to communicate with all other hosts in the network. Leaf hosts are only allowed to communicate with root hosts; they are blocked from communicating with other leaf hosts. [RFC8317] defines two different filtering methods to achieve the required segmentation:

  1. Ingress filtering, applicable to known unicast traffic
  2. Egress filtering, applicable to broadcast, unknown unicast, and multicast (BUM) traffic

Ingress filtering is the filtering of leaf-to-leaf traffic on the ingress side of an overlay tunnel, before it is encapsulated and sent to a destination provider edge (PE) device. Egress filtering is the filtering of leaf-to-leaf traffic on the egress side of an overlay tunnel, after it has been received on the destination PE device.

Ingress filtering as defined by [RFC8317] for known unicast traffic relies solely on additional metadata attached to the EVPN type-2 mac-ip routes advertised by PE devices for known hosts, and will thus work unmodified for networks using virtual extensible LAN (VXLAN) encapsulation. However, egress filtering for BUM traffic relies on a feature particular to MPLS encapsulation: the ability to attach multiple labels to each data packet. Therefore, egress filtering of BUM traffic as defined by [RFC8317] does not work unmodified for networks using VXLAN encapsulation.

[RFC8317] defines three primary methods for classifying hosts as roots and leaves:

  1. Each PE site (VTEP) contains only root or only leaf hosts
  2. Each attachment circuit (VLAN) contains only root or only leaf hosts
  3. Each host (MAC) can be individually classified as a root or a leaf

This document will define an approach for performing ingress filtering for BUM traffic, in addition to known unicast traffic, for networks using VXLAN encapsulation and E-Tree role classification using either method 1 or method 2 defined above. E-Tree filtering of BUM traffic for VXLAN networks using method 3 for E-Tree role classification (which is also not covered in [RFC8317]) is outside the scope of this document.

2. Terminology

2.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2.2. Terms and Abbreviations

BUM
Broadcast, Unknown Unicast, and Multicast traffic
IMET Route
Inclusive Multicast Ethernet Tag Route. An EVPN route type used to advertise the remote destinations to which BUM traffic is sent.
EVPN
Ethernet Virtual Private Network. A BGP-based protocol for building VPNs and network overlays. Defined in [RFC7432] for use with MPLS encapsulation, and extended in [RFC8365] for use with alternative encapsulations, including VXLAN
PE
Provider Edge device
VLAN
Virtual LAN. A single bridging domain local to a device.
VNI
VXLAN Network Identifier. An identifier of a bridging domain across a VXLAN overlay network.
VTEP
VXLAN Tunnel Endpoint
VXLAN
Virtual Extensible LAN. An overlay protocol used to build L2 overlay networks across a standard UDP/IP L3 underlay network. Defined in [RFC7348]

3. IMET Route Filtering for BUM Traffic

[RFC8317] defines how ingress filtering for known unicast traffic is performed, by attaching an E-Tree extended community to each EVPN type-2 mac-ip route that is advertised for a host with leaf classification. Any PE that imports this route is able to classify the remote host identified by the received route as a root or a leaf, based on the presence or absence of the attached E-Tree extended community. Any traffic locally originated from an attached leaf host, and destined to a remote leaf host, can then be dropped prior to encapsulation, preventing any local leaf hosts from communicating with known remote leaf hosts.
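
The known unicast rule described above can be sketched as a small decision function. This is a minimal illustration in Python, not part of the specification; the `MacIpRoute` type, its field names, and `should_drop_known_unicast` are hypothetical names chosen for this example, and the MAC addresses are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacIpRoute:
    """Hypothetical model of an imported EVPN type-2 mac-ip route."""
    mac: str
    remote_pe: str
    has_leaf_community: bool  # E-Tree extended community attached (leaf host)


def should_drop_known_unicast(source_is_leaf: bool, dest_route: MacIpRoute) -> bool:
    """Return True when traffic must be dropped before encapsulation.

    Only leaf-to-leaf traffic is dropped; any flow involving a root is allowed.
    """
    return source_is_leaf and dest_route.has_leaf_community


remote_leaf = MacIpRoute("00:00:5e:00:53:05", "PE-C", has_leaf_community=True)
remote_root = MacIpRoute("00:00:5e:00:53:03", "PE-B", has_leaf_community=False)

print(should_drop_known_unicast(True, remote_leaf))   # leaf to leaf
print(should_drop_known_unicast(True, remote_root))   # leaf to root
print(should_drop_known_unicast(False, remote_leaf))  # root to leaf
```

Only the first call reports a drop, since it is the only leaf-to-leaf flow.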

A bridging domain in a VXLAN network is identified by a VXLAN network identifier (VNI). Each PE will have a mapping of VNI to local bridging domain (VLAN). In addition to advertising a type-2 mac-ip route per known local host, each PE will advertise a type-3 IMET route per-VNI. This is used by receiving PEs to construct a per-VLAN floodset of remote PEs. When BUM traffic is flooded locally in a VLAN, this traffic can also be replicated to any remote PEs contained in the floodset for that VLAN. This ensures that all BUM traffic in a bridging domain will reach all hosts, remote and local, in that bridging domain.

Because each PE advertises an IMET route per VNI, there is a one-to-one mapping from VLAN to VNI on each PE, and all hosts in a given local VLAN must be classified as either roots or leaves (role classification methods 1 or 2 from [RFC8317]), it becomes possible for each advertised IMET route to be classified as either a "root" or "leaf" IMET route.

On a PE with VLANs configured as leaves, for any IMET route advertised by that PE for a VNI that maps to a local VLAN configured as a leaf, the PE MUST attach to the advertised IMET route the E-Tree extended community defined in [RFC8317], with the leaf flag set.

On any PE which receives an IMET route with the E-Tree extended community with leaf flag set attached, if the local VLAN which maps to the VNI from the received route is classified as a leaf, the PE MUST NOT install the remote PE identified in the received IMET route into the local floodset for the indicated VLAN.

By ensuring that both known remote hosts and remote flood targets are identified as being classified as leaves in their respective EVPN routes, it is possible for an ingress PE to prevent sending any locally sourced leaf-classified traffic to remote leaf hosts, ensuring E-Tree segmentation rules are followed.

When attached to a type-3 IMET route, the E-Tree extended community is used in the same way as when attached to a type-2 mac-ip route as per [RFC8317]. Namely, when the E-Tree extended community is advertised along with an IMET route, the Leaf-Indication flag MUST be set to one and the Leaf label SHOULD be set to zero. The receiving PE MUST ignore the Leaf label and only process the Leaf-Indication flag. A value of zero for the Leaf-Indication flag is invalid when sent along with an IMET route, and an error should be logged.
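
The sender and receiver handling described above might look roughly like the following sketch. The octet layout used here (type 0x06, sub-type 0x05, one Flags octet whose low-order bit is the Leaf-Indication flag, two reserved octets, and a three-octet Leaf label) is an assumption based on a reading of [RFC8317] and should be checked against that document. The function names are hypothetical.

```python
import logging
import struct

ETREE_TYPE = 0x06       # EVPN extended community type (assumed, see RFC 8317)
ETREE_SUBTYPE = 0x05    # E-Tree sub-type (assumed, see RFC 8317)
LEAF_INDICATION = 0x01  # low-order bit of the Flags octet (assumed)


def encode_etree_community_for_imet() -> bytes:
    """Build the 8-octet community for an IMET route: leaf flag 1, Leaf label 0."""
    flags = LEAF_INDICATION
    reserved = 0
    leaf_label = (0).to_bytes(3, "big")
    return struct.pack("!BBBH", ETREE_TYPE, ETREE_SUBTYPE, flags, reserved) + leaf_label


def imet_route_is_leaf(community: bytes) -> bool:
    """Process an E-Tree community received with an IMET route.

    The Leaf label octets are ignored. A Leaf-Indication flag of zero is
    invalid on an IMET route, so an error is logged and False is returned.
    """
    if len(community) != 8:
        raise ValueError("extended community must be 8 octets")
    ctype, subtype, flags, _reserved = struct.unpack("!BBBH", community[:5])
    if (ctype, subtype) != (ETREE_TYPE, ETREE_SUBTYPE):
        raise ValueError("not an E-Tree extended community")
    if not flags & LEAF_INDICATION:
        logging.error("E-Tree community on IMET route has Leaf-Indication flag 0")
        return False
    return True


encoded = encode_etree_community_for_imet()
print(encoded.hex())
print(imet_route_is_leaf(encoded))
```

A community with a non-zero Leaf label is still treated as a leaf indication, because the receiver only inspects the flag.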

3.1. IMET Route Filtering with Ingress Replication

The following examples will refer to the network topology described in Figure 1.


                  +--------+    +--------+
                  | Host 3 |    | Host 4 |
                  +---+----+    +---+----+
                      |             |
                +-----+-------------+------+
                |     |    PE B     |      |
                |     |             |      |
                | +---+-----+  +----+----+ |
                | | VLAN 10 |  | VLAN 20 | |
                | |  Root   |  |  Root   | |
                | +---------+  +---------+ |
                |                          |
                +------------+-------------+
                             |
                             |
                             |
                             |
                 +-----------+------------+
                 |                        |
             +---+    Underlay Network    +---+
             |   |                        |   |
             |   +------------------------+   |
             |                                |
             |                                |
             |                                |
             |                                |
             |                                |
             |                                |
+------------+-------------+     +------------+-------------+
|          PE A            |     |          PE C            |
|                          |     |                          |
| +---------+  +---------+ |     | +---------+  +---------+ |
| | VLAN 10 |  | VLAN 20 | |     | | VLAN 10 |  | VLAN 20 | |
| |  Leaf   |  |  Leaf   | |     | |  Leaf   |  |  Root   | |
| +---+-----+  +----+----+ |     | +---+-----+  +----+----+ |
|     |             |      |     |     |             |      |
+-----+-------------+------+     +-----+-------------+------+
      |             |                  |             |
  +---+----+    +---+----+         +---+----+    +---+----+
  | Host 1 |    | Host 2 |         | Host 5 |    | Host 6 |
  +--------+    +--------+         +--------+    +--------+

Figure 1: Example E-Tree Topology

The topology shown in Figure 1 contains three PE devices, each with two VLANs locally configured, with one host directly connected to each PE in each local VLAN. All of the PE devices are connected together through an underlay network, and all peer with each other over EVPN. The VLAN to VNI mappings are configured identically on each PE device, as shown in Table 1.

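Table 1: VLAN to VNI Mappings on Each PE in Figure 1

+------+------------+-------+
| PE   | Local VLAN | VNI   |
+------+------------+-------+
| PE-A | 10         | 10000 |
| PE-A | 20         | 20000 |
| PE-B | 10         | 10000 |
| PE-B | 20         | 20000 |
| PE-C | 10         | 10000 |
| PE-C | 20         | 20000 |
+------+------------+-------+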

PE-A has two local VLANs configured. Each of these VLANs is classified as a leaf VLAN. PE-A advertises two EVPN type-3 IMET routes, one for VNI 10000 (mapping to local VLAN 10), and one for VNI 20000 (mapping to local VLAN 20). Because both of the local VLANs are configured as leaves, PE-A advertises both IMET routes with the E-Tree extended community with leaf flag set attached.

PE-B also has two local VLANs configured, also mapping to VNIs 10000 and 20000, but on PE-B, the local VLANs are both classified as root VLANs. Because of this, neither of the IMET routes that PE-B advertises have the E-Tree extended community attached.

PE-C also has two local VLANs configured, also mapping to VNIs 10000 and 20000. On PE-C, VLAN 10 is classified as a leaf VLAN, and VLAN 20 is classified as a root VLAN. Because of this, the IMET route that PE-C advertises for VNI 10000 has the E-Tree extended community with leaf flag set attached, while the IMET route that PE-C advertises for VNI 20000 does not have the E-Tree extended community attached.

PE-A receives four IMET routes, 2 each from PE-B and PE-C. Neither IMET route received from PE-B has the E-Tree extended community attached, so PE-A imports both routes into its local floodsets for VLANs 10 and 20. The IMET route received from PE-C for VNI 20000 does not have the E-Tree extended community attached, so PE-A imports PE-C into the local floodset for VLAN 20, but because the received IMET route from PE-C for VNI 10000 does have the E-Tree extended community attached, and the local VLAN that maps to VNI 10000 (VLAN 10) is configured as a leaf VLAN, PE-A does not import PE-C into the local floodset for VLAN 10.

PE-B receives four IMET routes, 2 each from PE-A and PE-C. Because both local VLANs on PE-B are configured as root VLANs, all received IMET routes are imported, regardless of the presence or absence of the E-Tree extended community.

PE-C receives four IMET routes, 2 each from PE-A and PE-B. Both IMET routes received from PE-B do not have the E-Tree extended community attached, so PE-B is imported into the local floodsets of both VLAN 10 and VLAN 20. Both IMET routes received from PE-A do have the E-Tree extended community with leaf flag set attached, but because the local VLAN that maps to VNI 20000 (VLAN 20) is configured as a root VLAN, PE-A is still imported into the local floodset for VLAN 20. However, because the local VLAN that maps to VNI 10000 (VLAN 10) is configured as a leaf VLAN, PE-A is not imported into the local floodset for VLAN 10.

The contents of each local floodset for each VLAN on each PE, after all of the peer IMET routes are received, are shown in Table 2.

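Table 2: Floodset Contents of Each PE in Figure 1

+------+------+-------------------+
| PE   | VLAN | Floodset Contents |
+------+------+-------------------+
| PE-A | 10   | PE-B              |
| PE-A | 20   | PE-B, PE-C        |
| PE-B | 10   | PE-A, PE-C        |
| PE-B | 20   | PE-A, PE-C        |
| PE-C | 10   | PE-B              |
| PE-C | 20   | PE-A, PE-B        |
+------+------+-------------------+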
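
The floodset contents for the Figure 1 topology can be reproduced with a short simulation of the advertise and receive rules from Section 3. This is an illustrative sketch only; the data structures and function names are hypothetical, and BGP route exchange is reduced to a direct function call.

```python
# VLAN roles per PE from Figure 1; VLAN 10 maps to VNI 10000, VLAN 20 to VNI 20000.
VLAN_TO_VNI = {10: 10000, 20: 20000}
ROLES = {
    "PE-A": {10: "leaf", 20: "leaf"},
    "PE-B": {10: "root", 20: "root"},
    "PE-C": {10: "leaf", 20: "root"},
}


def advertise_imet(pe):
    """Return (origin PE, VNI, leaf flag) tuples, one IMET route per VNI."""
    return [(pe, VLAN_TO_VNI[vlan], role == "leaf") for vlan, role in ROLES[pe].items()]


def build_floodsets(local_pe):
    """Apply the receive rule: skip leaf-flagged routes for local leaf VLANs."""
    vni_to_vlan = {vni: vlan for vlan, vni in VLAN_TO_VNI.items()}
    floodsets = {vlan: [] for vlan in ROLES[local_pe]}
    for remote_pe in sorted(ROLES):
        if remote_pe == local_pe:
            continue
        for origin, vni, leaf_flag in advertise_imet(remote_pe):
            vlan = vni_to_vlan[vni]
            if leaf_flag and ROLES[local_pe][vlan] == "leaf":
                continue  # MUST NOT install a remote leaf into a local leaf floodset
            floodsets[vlan].append(origin)
    return floodsets


for pe in sorted(ROLES):
    for vlan, members in build_floodsets(pe).items():
        print(pe, vlan, ", ".join(members))
```

The only entries filtered out are PE-C from the VLAN 10 floodset on PE-A, and PE-A from the VLAN 10 floodset on PE-C, which are the two leaf-to-leaf pairings in VNI 10000.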

3.1.1. Leaf to Leaf BUM Traffic

Assume that Host 1 sends traffic that is classified as BUM on PE-A. The only remote host classified as a leaf in the same bridging domain (VLAN 10/VNI 10000) is Host 5 attached to PE-C. To maintain the E-Tree segmentation rules, this BUM traffic from Host 1 must not reach Host 5. PE-A will replicate the BUM traffic to all remote PEs in the local floodset for VLAN 10, which because of the filtered IMET route only contains PE-B. Therefore, BUM traffic sourced from leaf Host 1 is never replicated to PE-C on VNI 10000, and never reaches leaf Host 5.

3.1.2. Leaf to Root BUM Traffic

Using the same scenario from the previous section, Host 1 sends BUM traffic. This traffic must reach all root-classified hosts in the same bridging domain, which in this case is Host 3 attached to PE-B. Because the IMET route from PE-B wasn't filtered on PE-A (due to no attached E-Tree extended community), PE-B is in the local floodset for VLAN 10 on PE-A. Therefore, BUM traffic sourced from leaf Host 1 is replicated to PE-B and reaches root Host 3.

3.1.3. Root to Leaf and Root to Root BUM Traffic

Assume that Host 6 sends traffic that is classified as BUM on PE-C. Because Host 6 is classified as a root on PE-C, this traffic must reach both remote root and leaf hosts. Because any received IMET routes are imported into a local root VLAN regardless of whether the E-Tree extended community is attached, PE-C will import the VNI 20000 IMET routes from both PE-A and PE-B. Therefore, when Host 6 sends BUM traffic, it will be replicated to both PE-A and PE-B. It will then be received by both leaf Host 2 and root Host 4.

3.2. IMET Route Filtering with Multicast Replication

When multicast replication is used in the underlay instead of ingress replication, PEs do not manage a local floodset of remote PEs to which BUM traffic is replicated. Instead, each PE defines one or more multicast groups that it advertises via IMET routes, and remote PEs join those groups if they want to receive BUM traffic for the VNI described by the IMET route.

All of the details specified in Section 3.1 still apply when using multicast replication, with one important limitation: each PE MUST define and advertise at minimum a separate multicast group for each E-Tree classification present on the PE. This means that PEs whose locally configured VLANs are only roots or only leaves may configure a single multicast group, but PEs that have a mix of local root and leaf VLANs must configure at least two multicast groups, a "root" and "leaf" group. The PE will still advertise one IMET route per VNI, but the IMET routes advertised for any VNI that maps to a local VLAN configured as a root will contain the root multicast group, and not attach the E-Tree extended community, while the IMET routes advertised for any VNI that maps to a local VLAN configured as a leaf will contain the leaf multicast group, and will attach the E-Tree extended community with leaf flag set.

In the worst case, where every PE has both root and leaf VLANs configured locally, the minimum number of required multicast groups is twice the number of PEs. PEs that have only root VLANs configured, or only leaf VLANs configured, are able to configure a single multicast group (because all of the IMET routes that they advertise will either have or not have the E-Tree extended community attached), but PEs that have both root and leaf VLANs configured must configure at least two multicast groups. One, the "root" multicast group, will be advertised in each IMET route that corresponds to a root VNI (without the E-Tree extended community attached), and the other, the "leaf" multicast group, will be advertised in each IMET route that corresponds to a leaf VNI (with the E-Tree extended community attached). More than two multicast groups can be configured per PE (up to and including one multicast group per VNI). The only restriction is that there must be at least one multicast group configured for each E-Tree classification present on a PE.
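
As a small worked example of this counting rule, the sketch below computes the minimum number of multicast groups for the Figure 1 topology. The function name and data layout are hypothetical and chosen only for illustration.

```python
# VLAN roles per PE, taken from Figure 1.
ROLES = {
    "PE-A": {10: "leaf", 20: "leaf"},
    "PE-B": {10: "root", 20: "root"},
    "PE-C": {10: "leaf", 20: "root"},
}


def min_multicast_groups(vlan_roles):
    """Minimum groups a PE must define: one per E-Tree classification present."""
    return len(set(vlan_roles.values()))


per_pe = {pe: min_multicast_groups(roles) for pe, roles in ROLES.items()}
total = sum(per_pe.values())
print(per_pe)
print(total)
```

PE-A and PE-B each need one group and PE-C needs two, for a minimum of four groups in total. A deployment may still choose to configure more, up to one group per VNI on each PE.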

Aside from the aforementioned limitation, IMET route filtering with multicast replication functions very similarly to IMET route filtering with ingress replication. Instead of managing a local per-VLAN floodset, PEs that receive IMET routes manage their multicast group memberships, to ensure that no remote leaf traffic is received for a local leaf VLAN.

When a PE receives an IMET route with the E-Tree extended community with leaf flag set attached, it knows that any traffic received on the multicast group contained in the IMET route is leaf traffic. Therefore, if the local VLAN that maps to the VNI from the IMET route is configured as a leaf VLAN, the receiving PE MUST NOT join the multicast group contained in the received IMET route. Table 3 shows the multicast groups advertised by each PE, and which remote PEs join those groups.

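Table 3: Multicast Groups for BUM Traffic when using Multicast Replication in Figure 1

+------+------+-----------------+-------------------------+
| PE   | VLAN | Multicast Group | Multicast Group Members |
+------+------+-----------------+-------------------------+
| PE-A | 10   | Group A1        | PE-B                    |
| PE-A | 20   | Group A2        | PE-B, PE-C              |
| PE-B | 10   | Group B1        | PE-A, PE-C              |
| PE-B | 20   | Group B2        | PE-A, PE-C              |
| PE-C | 10   | Group C1        | PE-B                    |
| PE-C | 20   | Group C2        | PE-A, PE-B              |
+------+------+-----------------+-------------------------+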
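
The group memberships for the Figure 1 topology can be derived from the join rule with a short sketch. It assumes one multicast group per VNI on each PE, named as in Table 3, and identical VLAN to VNI mappings on every PE, so VLAN IDs are used directly as the lookup key. All names in the code are hypothetical.

```python
# VLAN roles per PE, taken from Figure 1.
ROLES = {
    "PE-A": {10: "leaf", 20: "leaf"},
    "PE-B": {10: "root", 20: "root"},
    "PE-C": {10: "leaf", 20: "root"},
}
# One group per (PE, VLAN), named as in Table 3.
GROUPS = {
    ("PE-A", 10): "Group A1", ("PE-A", 20): "Group A2",
    ("PE-B", 10): "Group B1", ("PE-B", 20): "Group B2",
    ("PE-C", 10): "Group C1", ("PE-C", 20): "Group C2",
}


def group_members(advertising_pe, vlan):
    """Remote PEs that join the group advertised for this VLAN's VNI."""
    leaf_flag = ROLES[advertising_pe][vlan] == "leaf"
    members = []
    for pe in sorted(ROLES):
        if pe == advertising_pe:
            continue
        if leaf_flag and ROLES[pe][vlan] == "leaf":
            continue  # a local leaf VLAN MUST NOT join a leaf-flagged group
        members.append(pe)
    return members


membership = {GROUPS[key]: group_members(*key) for key in GROUPS}
for group, members in membership.items():
    print(group, "->", ", ".join(members))
```

PE-C stays out of Group A1 and PE-A stays out of Group C1, so leaf BUM traffic in VNI 10000 never crosses between the two leaf VLANs.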

3.2.1. Leaf to Leaf BUM Traffic

Assume that Host 1 sends traffic that is classified as BUM on PE-A. The only remote host classified as a leaf in the same bridging domain (VLAN 10/VNI 10000) is Host 5 attached to PE-C. To maintain the E-Tree segmentation rules, this BUM traffic from Host 1 must not reach Host 5. Because the IMET route advertised by PE-A for VNI 10000 will have the E-Tree extended community with leaf flag set attached, PE-C will not join multicast group A1 advertised in this IMET route. Therefore, when PE-A sends any VLAN 10 BUM traffic to multicast group A1, it will not reach PE-C, and not reach remote leaf Host 5.

3.2.2. Leaf to Root BUM Traffic

Using the same scenario from the previous section, Host 1 sends BUM traffic. This traffic must reach all root-classified hosts in the same bridging domain, which in this case is Host 3 attached to PE-B. A PE whose local VLAN that maps to the VNI of a received IMET route is a root will join the advertised multicast group regardless of whether the IMET route has the E-Tree extended community attached. Therefore, PE-B will join multicast group A1, and will receive all leaf BUM traffic sourced from PE-A, including from leaf Host 1.

3.2.3. Root to Leaf and Root to Root BUM Traffic

Assume that Host 6 sends traffic that is classified as BUM on PE-C. Because Host 6 is classified as a root on PE-C, this traffic must reach both remote root and leaf hosts. Because the IMET route that PE-C sends for VNI 20000 does not have the E-Tree extended community attached, both PE-A and PE-B will join multicast group C2 advertised in this IMET route. Therefore, when root Host 6 sends BUM traffic, it will be received by both PE-A and PE-B, and in turn by leaf Host 2 and root Host 4.

3.3. Multihoming Considerations

The proposals in this document are fully compatible with EVPN multihoming as described in [RFC7432], with one caveat. For the BUM ingress filtering approach described in this document to work properly, all PE devices that share an Ethernet segment (ES) must configure their root/leaf VLAN classifications identically. In other words, any VLANs configured as leaves on one PE must also be configured as leaves on any other PEs in the same ES, and any VLANs configured as roots on one PE must be configured as roots on any other PEs in the same ES. If a VLAN is configured as a root on one PE and a leaf on another PE in the same ES, the ingress filtering behavior for BUM traffic will depend on the VLAN classification on the elected designated forwarder (DF) (which may change during normal network operation), leading to instability in the network. Furthermore, a host multihomed to two or more PEs with different root/leaf classifications for the same VLAN will effectively have a different E-Tree classification depending on which PE its traffic gets load-balanced to, breaking the E-Tree segmentation rules.
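
A configuration check for this requirement could be sketched as follows. The PE names, the VLAN roles, and the `find_role_mismatches` helper are hypothetical and are not taken from Figure 1. The sketch only compares VLANs that are configured on more than one PE in the ES.

```python
def find_role_mismatches(es_members, roles):
    """Return VLANs whose root/leaf role differs among PEs sharing an ES.

    es_members: list of PE names attached to the same Ethernet segment.
    roles: mapping of PE name to {VLAN ID: "root" or "leaf"}.
    """
    mismatches = {}
    vlans = set().union(*(roles[pe].keys() for pe in es_members))
    for vlan in sorted(vlans):
        seen = {pe: roles[pe][vlan] for pe in es_members if vlan in roles[pe]}
        if len(set(seen.values())) > 1:
            mismatches[vlan] = seen
    return mismatches


# Hypothetical pair of PEs sharing one Ethernet segment.
roles = {
    "PE-X": {10: "leaf", 20: "root"},
    "PE-Y": {10: "leaf", 20: "leaf"},
}
print(find_role_mismatches(["PE-X", "PE-Y"], roles))
```

Here VLAN 20 is reported because it is a root on PE-X and a leaf on PE-Y, which is the misconfiguration described above.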

3.4. Optional Tradeoffs in PE Hardware Resource Utilization vs. Network Traffic

One additional advantage of performing ingress filtering for BUM traffic instead of egress filtering as defined by [RFC8317] is that it allows for a more compelling tradeoff between ingress filtering for known unicast traffic and converting known unicast traffic to BUM traffic for the purpose of potentially saving PE hardware resources.

While performing ingress filtering for known unicast leaf-to-leaf traffic is optimal in terms of network traffic (prohibited traffic is never allowed to leave the PE on which it originated), it is potentially suboptimal with respect to PE hardware resources and scaling. Depending on hardware implementation, ingress filtering for known-unicast traffic may require per-host resource utilization (in the form of drop routes or similar), which may be prohibitive depending on host scale. To avoid this per-host resource utilization, a PE could simply not install a local unicast route for a known remote leaf if the local VLAN is also configured as a leaf. This would have the effect of converting all remote-destined known-unicast traffic to BUM traffic, and when combined with E-Tree BUM filtering rules, would still ensure that E-Tree segmentation rules are respected. However, if only egress filtering is used for BUM traffic, this could result in a prohibitive amount of new BUM traffic, making this tradeoff unappealing.

By performing ingress filtering for BUM traffic (as defined in this proposal), a much smaller amount of extra BUM traffic is generated when treating all prohibited leaf-to-leaf known unicast traffic as BUM traffic. While all remote root PEs will still receive unnecessary flooded traffic for this "known unicast as BUM" traffic, no remote leaf PEs will receive a copy of the traffic and have to perform egress filtering. In network topologies with few root PEs (or topologies in which there is expected to be only a small amount of leaf-to-leaf traffic that must be blocked), the tradeoff between local hardware resources for known-unicast ingress filtering vs. extra flooded traffic due to treating prohibited leaf-to-leaf traffic as BUM traffic becomes much more tractable when ingress filtering is used for BUM traffic as opposed to egress filtering.
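
The difference in flooded traffic can be illustrated with simple arithmetic. The sketch below assumes ingress replication (one copy per remote PE in the floodset), a frame sourced from a leaf VLAN, and a purely hypothetical topology of 50 PEs in the VNI, of which 2 are roots. The numbers are illustrative and are not measurements.

```python
def flood_copies(remote_roles, ingress_filtering):
    """Copies of one leaf-sourced BUM frame sent into the underlay.

    remote_roles: role of the matching VLAN on each remote PE in the VNI.
    With ingress filtering only remote roots are in the floodset; with
    egress filtering every remote PE receives a copy and leaves drop it.
    """
    if ingress_filtering:
        return sum(1 for role in remote_roles if role == "root")
    return len(remote_roles)


# Hypothetical view from one leaf PE: 2 remote root PEs and 47 remote leaf PEs.
remote_roles = ["root"] * 2 + ["leaf"] * 47

print(flood_copies(remote_roles, ingress_filtering=False))
print(flood_copies(remote_roles, ingress_filtering=True))
```

Under these assumptions, egress filtering sends 49 copies per frame while ingress filtering sends 2, which is why flooding prohibited known unicast traffic is less costly in topologies with few root PEs.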

4. IANA Considerations

This memo includes no request to IANA.

5. Security Considerations

This document builds upon the EVPN E-Tree constructs defined in [RFC8317]; therefore, the security considerations in that document also apply here. While the E-Tree segmentation guarantees defined in [RFC8317] are achieved in a different manner (specifically, using ingress filtering for both known unicast and BUM traffic), all of the same additional security functionality (and associated caveats) are provided.

6. References

6.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8317]
Sajassi, A., Ed., Salam, S., Drake, J., Uttaro, J., Boutros, S., and J. Rabadan, "Ethernet-Tree (E-Tree) Support in Ethernet VPN (EVPN) and Provider Backbone Bridging EVPN (PBB-EVPN)", RFC 8317, DOI 10.17487/RFC8317, <https://www.rfc-editor.org/info/rfc8317>.
[RFC7348]
Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, L., Sridhar, T., Bursell, M., and C. Wright, "Virtual eXtensible Local Area Network (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348, <https://www.rfc-editor.org/info/rfc7348>.
[RFC7432]
Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, <https://www.rfc-editor.org/info/rfc7432>.
[RFC8365]
Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R., Uttaro, J., and W. Henderickx, "A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365, DOI 10.17487/RFC8365, <https://www.rfc-editor.org/info/rfc8365>.

Authors' Addresses

Aaron Bamberger
Arista Networks
Akhil Shashidhar
Arista Networks
Sergey Kolobov
Arista Networks
Arivudainambi Appachi Gounder
Google