idnits 2.17.1

draft-ietf-nvo3-evpn-applicability-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (21 June 2022) is 668 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-12) exists of
     draft-ietf-nvo3-encap-08

  == Outdated reference: A later version (-11) exists of
     draft-ietf-bess-evpn-lsp-ping-07

  == Outdated reference: A later version (-02) exists of
     draft-skr-bess-evpn-pim-proxy-01

  == Outdated reference: A later version (-13) exists of
     draft-ietf-bess-evpn-pref-df-08

  == Outdated reference: A later version (-11) exists of
     draft-ietf-bess-evpn-irb-mcast-06

  == Outdated reference: A later version (-10) exists of
     draft-ietf-bess-evpn-ipvpn-interworking-06

  == Outdated reference: A later version (-06) exists of
     draft-ietf-bess-evpn-geneve-04

  == Outdated reference: A later version (-06) exists of
     draft-ietf-bess-evpn-mvpn-seamless-interop-03

  == Outdated reference: A later version (-06) exists of
     draft-sajassi-bess-secure-evpn-05

  == Outdated reference: A later version (-08) exists of
     draft-ietf-bess-rfc7432bis-04

     Summary: 0 errors (**), 0 flaws (~~), 11 warnings (==), 1 comment (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

NVO3 Workgroup                                           J. Rabadan, Ed.
Internet-Draft                                                  M. Bocci
Intended status: Informational                                     Nokia
Expires: 23 December 2022                                     S. Boutros
                                                                   Ciena
                                                              A. Sajassi
                                                                   Cisco
                                                            21 June 2022

                 Applicability of EVPN to NVO3 Networks
                 draft-ietf-nvo3-evpn-applicability-04

Abstract

   In NVO3 networks, Network Virtualization Edge (NVE) devices sit at
   the edge of the underlay network and provide Layer-2 and Layer-3
   connectivity among Tenant Systems (TSes) of the same tenant.  The
   NVEs need to build and maintain mapping tables so that they can
   deliver encapsulated packets to their intended destination NVE(s).
   While there are different options to create and disseminate the
   mapping table entries, NVEs may exchange that information directly
   among themselves via a control-plane protocol, such as Ethernet
   Virtual Private Network (EVPN).  EVPN provides an efficient, flexible
   and unified control-plane option that can be used for Layer-2 and
   Layer-3 Virtual Network (VN) service connectivity.  This document
   describes the applicability of EVPN to NVO3 networks and how EVPN
   solves the challenges in those networks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 23 December 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  EVPN and NVO3 Terminology
   3.  Why is EVPN Needed in NVO3 Networks?
   4.  Applicability of EVPN to NVO3 Networks
     4.1.  EVPN Route Types Used in NVO3 Networks
     4.2.  EVPN Basic Applicability for Layer-2 Services
       4.2.1.  Auto-Discovery and Auto-Provisioning
       4.2.2.  Remote NVE Auto-Discovery
       4.2.3.  Distribution of Tenant MAC and IP Information
     4.3.  EVPN Basic Applicability for Layer-3 Services
     4.4.  EVPN as Control Plane for NVO3 Encapsulations and GENEVE
     4.5.  EVPN OAM and Application to NVO3
     4.6.  EVPN as the Control Plane for NVO3 Security
     4.7.  Advanced EVPN Features for NVO3 Networks
       4.7.1.  Virtual Machine (VM) Mobility
       4.7.2.  MAC Protection, Duplication Detection and Loop
               Protection
       4.7.3.  Reduction/Optimization of BUM Traffic in Layer-2
               Services
       4.7.4.  Ingress Replication (IR) Optimization for BUM Traffic
       4.7.5.  EVPN Multi-Homing
       4.7.6.  EVPN Recursive Resolution for Inter-Subnet Unicast
               Forwarding
       4.7.7.  EVPN Optimized Inter-Subnet Multicast Forwarding
       4.7.8.  Data Center Interconnect (DCI)
   5.  Conclusion
   6.  Conventions Used in this Document
   7.  Security Considerations
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Acknowledgments
   Appendix B.  Contributors
   Appendix C.  Authors' Addresses
   Authors' Addresses

1.  Introduction

   In NVO3 networks, Network Virtualization Edge (NVE) devices sit at
   the edge of the underlay network and provide Layer-2 and Layer-3
   connectivity among Tenant Systems (TSes) of the same tenant.  The
   NVEs need to build and maintain mapping tables so that they can
   deliver encapsulated packets to their intended destination NVE(s).
   While there are different options to create and disseminate the
   mapping table entries, NVEs may exchange that information directly
   among themselves via a control-plane protocol, such as EVPN.
   EVPN provides an efficient, flexible and unified control-plane option
   that can be used for Layer-2 and Layer-3 Virtual Network (VN) service
   connectivity.

   In this document, we assume that the EVPN control-plane module
   resides in the NVEs.  The NVEs can be virtual switches in
   hypervisors, TOR/Leaf switches or Data Center Gateways.  As described
   in [RFC7365], Network Virtualization Authorities (NVAs) may be used
   to provide the forwarding information to the NVEs, and in that case,
   EVPN could be used to disseminate the information across multiple
   federated NVAs.  The applicability of EVPN would then be similar to
   the one described in this document.  However, for simplicity, the
   description assumes control-plane communication among NVE(s).

2.  EVPN and NVO3 Terminology

   *  AC: Attachment Circuit or logical interface associated with a
      given BT.  To determine the AC on which a packet arrived, the NVE
      will examine the physical/logical port and/or VLAN tags (where the
      VLAN tags can be individual c-tags, s-tags or ranges of both).

   *  ARP and ND: Address Resolution Protocol and Neighbor Discovery
      protocol.

   *  BD: Broadcast Domain.  It corresponds to a tenant IP subnet.  If
      no suppression techniques are used, a BUM frame that is injected
      in a BD will reach all the NVEs that are attached to that BD.  An
      EVI may contain one or multiple BDs depending on the service model
      [RFC7432].  This document will use the term BD to refer to a
      tenant subnet.

   *  BT: a Bridge Table, as defined in [RFC7432].  A BT is the
      instantiation of a BD in an NVE.  When there is a single BD on a
      given EVI, the MAC-VRF is equivalent to the BT on that NVE.

   *  BUM: Broadcast, Unknown unicast and Multicast frames.

   *  CLOS: a multistage network topology described in [CLOS1953], where
      all the edge switches (or Leafs) are connected to all the core
      switches (or Spines).
      Typically used in Data Centers nowadays.

   *  DF and NDF: Designated Forwarder and Non-Designated Forwarder,
      which are the roles that a given PE can have in a given ES.

   *  ECMP: Equal Cost Multi-Path.

   *  EVPN: Ethernet Virtual Private Networks, as described in
      [RFC7432].

   *  EVPN VLAN-based service model: one of the three service models
      defined in [RFC7432].  It is characterized as a BD that uses a
      single VLAN per physical access port to attach tenant traffic to
      the BD.  In this service model, there is only one BD per EVI.

   *  EVPN VLAN-bundle service model: similar to VLAN-based but uses a
      bundle of VLANs per physical port to attach tenant traffic to the
      BD.  As in VLAN-based, in this model there is a single BD per EVI.

   *  EVPN VLAN-aware bundle service model: similar to the VLAN-bundle
      model but each individual VLAN value is mapped to a different BD.
      In this model there are multiple BDs per EVI for a given tenant.
      Each BD is identified by an "Ethernet Tag", which is a control-
      plane value that identifies the routes for the BD within the EVI.

   *  ES: Ethernet Segment.  When a Tenant System (TS) is connected to
      one or more NVEs via a set of Ethernet links, then that set of
      links is referred to as an 'Ethernet segment'.  Each ES is
      represented by a unique Ethernet Segment Identifier (ESI) in the
      NVO3 network and the ESI is used in EVPN routes that are specific
      to that ES.

   *  Ethernet Tag: Used to represent a BD that is configured on a given
      ES for the purpose of DF election.  Note that any of the following
      may be used to represent a BD: VIDs (including Q-in-Q tags),
      configured IDs, VNIs (Virtual Extensible Local Area Network
      (VXLAN) Network Identifiers), normalized VIDs, I-SIDs (Service
      Instance Identifiers), etc., as long as the representation of the
      BDs is configured consistently across the multihomed PEs attached
      to that ES.
      The Ethernet Tag value MUST be different from zero.

   *  EVI: EVPN Instance.  It is a Layer-2 Virtual Network that uses an
      EVPN control-plane to exchange reachability information among the
      member NVEs.  It corresponds to a set of MAC-VRFs of the same
      tenant.  See MAC-VRF in this section.

   *  GENEVE: Generic Network Virtualization Encapsulation, an NVO3
      encapsulation defined in [RFC8926].

   *  IP-VRF: an IP Virtual Routing and Forwarding table, as defined in
      [RFC4364].  It stores IP Prefixes that are part of the tenant's IP
      space, and are distributed among NVEs of the same tenant by EVPN.
      Route Distinguisher (RD) and Route Target(s) (RTs) are required
      properties of an IP-VRF.  An IP-VRF is instantiated in an NVE for
      a given tenant, if the NVE is attached to multiple subnets of the
      tenant and local inter-subnet-forwarding is required across those
      subnets.

   *  IRB: Integrated Routing and Bridging interface.  It refers to the
      logical interface that connects a BD instance (or a BT) to an IP-
      VRF and allows packets to be forwarded to destinations in a
      different subnet.

   *  MAC-VRF: a MAC Virtual Routing and Forwarding table, as defined in
      [RFC7432].  The instantiation of an EVI (EVPN Instance) in an NVE.
      Route Distinguisher (RD) and Route Target(s) (RTs) are required
      properties of a MAC-VRF and they are normally different from the
      ones defined in the associated IP-VRF (if the MAC-VRF has an IRB
      interface).

   *  NVE: Network Virtualization Edge is a network entity that sits at
      the edge of an underlay network and implements L2 and/or L3
      network virtualization functions.  The network-facing side of the
      NVE uses the underlying L3 network to tunnel tenant frames to and
      from other NVEs.  The tenant-facing side of the NVE sends and
      receives Ethernet frames to and from individual Tenant Systems.
      In this document, an NVE could be implemented as a virtual switch
      within a hypervisor, a switch or a router, and runs EVPN in the
      control-plane.

   *  NVO3 or Overlay tunnels: Network Virtualization Over Layer-3
      tunnels.  In this document, NVO3 tunnels or simply Overlay tunnels
      will be used interchangeably.  Both terms refer to a way to
      encapsulate tenant frames or packets into IP packets whose IP
      Source Addresses (SA) or Destination Addresses (DA) belong to the
      underlay IP address space, and identify NVEs connected to the same
      underlay network.  Examples of NVO3 tunnel encapsulations are
      VXLAN [RFC7348], GENEVE [RFC8926] or MPLSoUDP [RFC7510].

   *  PE: Provider Edge router.

   *  PTA: Provider Multicast Service Interface Tunnel Attribute.

   *  RT and RD: Route Target and Route Distinguisher.

   *  RT-1, RT-2, RT-3, etc.: they refer to Route Type followed by the
      type number as defined in the IANA registry for EVPN route types.

   *  SA and DA: Source Address and Destination Address.  They are used
      along with MAC or IP, e.g. IP SA or MAC DA.

   *  SBD: Supplementary Broadcast Domain.  Defined in [RFC9136], it is
      a BD that does not have any ACs, only IRB interfaces, and provides
      connectivity among all the IP-VRFs of a tenant in the Interface-
      ful IP-VRF-to-IP-VRF models.

   *  TS: Tenant System.

   *  VNI: Virtual Network Identifier.  Irrespective of the NVO3
      encapsulation, the tunnel header always includes a VNI that is
      added at the ingress NVE (based on the mapping table lookup) and
      identifies the BT at the egress NVE.  This VNI is called VNI in
      VXLAN or GENEVE, VSID in NVGRE or Label in MPLSoGRE or MPLSoUDP.
      This document will refer to VNI as a generic Virtual Network
      Identifier for any NVO3 encapsulation.

   *  VXLAN: Virtual eXtensible Local Area Network, an NVO3
      encapsulation defined in [RFC7348].

3.  Why is EVPN Needed in NVO3 Networks?

   Data Centers have adopted NVO3 architectures mostly due to the issues
   discussed in [RFC7364].  The architecture of a Data Center is
   nowadays based on a CLOS design, where every Leaf is connected to a
   layer of Spines, and there are a number of ECMP paths between any two
   Leaf nodes.  All the links between Leaf and Spine nodes are routed
   links, forming what is also known as an underlay IP Fabric.  The
   underlay IP Fabric does not have issues with loops or flooding (like
   old Spanning Tree Data Center designs did), convergence is fast and
   ECMP provides a fairly optimal bandwidth utilization on all the
   links.

   On this architecture, and as discussed in [RFC7364], multi-tenant
   intra-subnet and inter-subnet connectivity services are provided by
   NVO3 tunnels, with VXLAN [RFC7348] and GENEVE [RFC8926] being two
   examples of such tunnels.

   Why is a control-plane protocol along with NVO3 tunnels required?
   There are three main reasons:

   a.  Auto-discovery of the remote NVEs that are attached to the same
       VPN instance (Layer-2 and/or Layer-3) as the ingress NVE.

   b.  Dissemination of the MAC/IP host information so that mapping
       tables can be populated on the remote NVEs.

   c.  Advanced features such as MAC Mobility, MAC Protection, BUM and
       ARP/ND traffic reduction/suppression, Multi-homing, Prefix
       Independent Convergence (PIC) like functionality, Fast
       Convergence, etc.

   A possible approach to achieve points (a) and (b) above for
   multipoint Ethernet services is "flood and learn".  "Flood and
   learn" refers to not using a specific control-plane on the NVEs, but
   rather flooding BUM traffic from the ingress NVE to all the egress
   NVEs attached to the same BD.  The egress NVEs may then use data path
   MAC SA "learning" on the frames received over the NVO3 tunnels.
   When the destination host replies and the frames arrive at the NVE
   that initially flooded BUM frames, the NVE will also "learn" the MAC
   SA of the frame encapsulated on the NVO3 tunnel.  This approach has
   the following drawbacks:

   *  In order to flood a given BUM frame, the ingress NVE must know the
      IP addresses of the remote NVEs attached to the same BD.  This may
      be done as follows:

      -  The remote tunnel IP addresses can be statically provisioned on
         the ingress NVE.  If the ingress NVE receives a BUM frame for
         the BD on an ingress AC, it will do ingress replication and
         will send the frame to all the configured egress NVE IP DAs in
         the BD.

      -  All the NVEs attached to the same BD can subscribe to an
         underlay IP Multicast Group that is dedicated to that BD.  When
         an ingress NVE receives a BUM frame on an ingress AC, it will
         send a single copy of the frame encapsulated into an NVO3
         tunnel, using the multicast address as IP DA of the tunnel.
         This solution requires PIM in the underlay network and the
         association of individual BDs to underlay IP multicast groups.

   *  "Flood and learn" solves the issues of auto-discovery and learning
      of the MAC to VNI/tunnel IP mapping on the NVEs for a given BD.
      However, it does not provide a solution for advanced features and
      it does not scale well (mostly due to the need for constant
      flooding and the underlay PIM states that need to be maintained).

   EVPN provides a unified control-plane that solves the NVE auto-
   discovery, tenant MAC/IP dissemination and advanced features in a
   scalable way, while keeping the independence of the underlay IP
   Fabric; i.e., there is no need to enable PIM in the underlay network
   and maintain multicast states for tenant BDs.

   Section 4 describes how EVPN can be used to meet the control-plane
   requirements in an NVO3 network.

4.  Applicability of EVPN to NVO3 Networks

   This section discusses the applicability of EVPN to NVO3 networks.
   The intent is not to provide a comprehensive explanation of the
   protocol itself, but to give an introduction and point to the
   corresponding reference documents, so that the reader can easily
   find more details if needed.

4.1.  EVPN Route Types Used in NVO3 Networks

   EVPN supports multiple Route Types and each type has a different
   function.  For convenience, Table 1 shows a summary of all the
   existing EVPN route types and their usage.  We will refer to these
   route types as RT-x routes throughout the rest of the document, where
   x is the type number included in the first column of Table 1.

   +======+=====================+======================================+
   | Type | Description         | Usage                                |
   +======+=====================+======================================+
   | 1    | Ethernet Auto-      | Multi-homing: Per-ES: Mass           |
   |      | Discovery           | withdrawal, Per-EVI: aliasing/backup |
   +------+---------------------+--------------------------------------+
   | 2    | MAC/IP              | Host MAC/IP dissemination, supports  |
   |      | Advertisement       | MAC mobility and protection          |
   +------+---------------------+--------------------------------------+
   | 3    | Inclusive           | NVE discovery and BUM flooding tree  |
   |      | Multicast           | setup                                |
   |      | Ethernet Tag        |                                      |
   +------+---------------------+--------------------------------------+
   | 4    | Ethernet            | Multi-homing: ES auto-discovery and  |
   |      | Segment             | DF Election                          |
   +------+---------------------+--------------------------------------+
   | 5    | IP Prefix           | IP Prefix dissemination              |
   +------+---------------------+--------------------------------------+
   | 6    | Selective           | Indicate interest for a multicast    |
   |      | Multicast           | S,G or *,G                           |
   |      | Ethernet Tag        |                                      |
   +------+---------------------+--------------------------------------+
   | 7    | Multicast Join      | Multi-homing: S,G or *,G state synch |
   |      | Synch               |                                      |
   +------+---------------------+--------------------------------------+
   | 8    | Multicast           | Multi-homing: S,G or *,G leave synch |
   |      | Leave Synch         |                                      |
   +------+---------------------+--------------------------------------+
   | 9    | Per-Region          | BUM tree creation across regions     |
   |      | I-PMSI A-D          |                                      |
   +------+---------------------+--------------------------------------+
   | 10   | S-PMSI A-D          | Multicast tree for S,G or *,G states |
   +------+---------------------+--------------------------------------+
   | 11   | Leaf A-D            | Used for responses to explicit       |
   |      |                     | tracking                             |
   +------+---------------------+--------------------------------------+

                        Table 1: EVPN route types

4.2.  EVPN Basic Applicability for Layer-2 Services

   Although the applicability of EVPN to NVO3 networks spans multiple
   documents, EVPN's baseline specification is [RFC7432].  [RFC7432]
   allows multipoint layer-2 VPNs to be operated as [RFC4364] IP-VPNs,
   where MACs and the information to set up flooding trees are
   distributed by MP-BGP [RFC4760].  Based on [RFC7432], [RFC8365]
   describes how to use EVPN to deliver Layer-2 services specifically in
   NVO3 Networks.

   Figure 1 represents a Layer-2 service deployed with an EVPN BD in an
   NVO3 network.

                                    +--TS2---+
                                    *        | Single-Active
                                    *        |   ESI-1
                                  +----+  +----+
                                  |BD1 |  |BD1 |
                    +-------------|    |--|    |-----------+
                    |             +----+  +----+           |
                    |              NVE2    NVE3          NVE4
                    |           EVPN NVO3 Network       +----+
               NVE1(IP-A)                               | BD1|-----+
             +-------------+      RT-2                  |    |     |
             |             |     +-------+              +----+     |
             |     +----+  |     |MAC1   |               NVE5     TS3
        TS1--------|BD1 |  |     |IP1    |              +----+     |
        MAC1 |     +----+  |     |Label L|--->          | BD1|-----+
        IP1  |             |     |NH IP-A|              |    | All-Active
             |  Hypervisor |     +-------+              +----+   ESI-2
             +-------------+                               |
                    +--------------------------------------+

           Figure 1: EVPN for L2 in an NVO3 Network - example

   In a simple NVO3 network, such as the example of Figure 1, these are
   the basic constructs that EVPN uses for Layer-2 services (or Layer-2
   Virtual Networks):

   *  BD1 is an EVPN Broadcast Domain for a given tenant and TS1, TS2
      and TS3 are connected to it.  The five represented NVEs are
      attached to BD1 and are connected to the same underlay IP network.
      That is, each NVE learns the remote NVEs' loopback addresses via
      an underlay routing protocol.

   *  NVE1 is deployed as a virtual switch in a Hypervisor with IP-A as
      underlay loopback IP address.  The rest of the NVEs in Figure 1
      are physical switches and TS2/TS3 are multi-homed to them.  TS1 is
      a virtual machine, identified by MAC1 and IP1.  TS2 and TS3 are
      physically dual-connected to NVEs, hence they are normally not
      considered virtual machines.

4.2.1.  Auto-Discovery and Auto-Provisioning

   Auto-discovery is one of the basic capabilities of EVPN.  The
   provisioning of EVPN components in NVEs is significantly automated,
   simplifying the deployment of services and minimizing manual
   operations that are prone to human error.

   These are some of the Auto-Discovery and Auto-Provisioning
   capabilities available in EVPN:

   *  Automation on Ethernet Segments (ES): an ES is defined as a group
      of NVEs that are attached to the same TS or network.
      An ES is identified by an Ethernet Segment Identifier (ESI) in the
      control plane, but neither the ESI nor the NVEs that share the
      same ES are required to be manually provisioned in the local NVE:

      -  If the multi-homed TS or network is running protocols such as
         LACP (Link Aggregation Control Protocol) [IEEE.802.1AX_2014],
         MSTP (Multiple-instance Spanning Tree Protocol), G.8032, etc.
         and all the NVEs in the ES can listen to the protocol PDUs to
         uniquely identify the multi-homed TS/network, then the ESI can
         be "auto-sensed" or "auto-provisioned" following the guidelines
         in [RFC7432] section 5.  The ESI can also be auto-derived from
         other parameters that are common to all NVEs attached to the
         same ES.

      -  As described in [RFC7432], EVPN can also auto-derive the BGP
         parameters required to advertise the presence of a local ES in
         the control plane (RT and RD).  Local ESes are advertised using
         RT-4 routes and the ESI-import Route-Target used by RT-4 routes
         can be auto-derived based on the procedures of [RFC7432],
         section 7.6.

      -  By listening to other RT-4 routes that match the local ESI and
         import RT, an NVE can also auto-discover the other NVEs
         participating in the multi-homing for the ES.

      -  Once the NVE has auto-discovered all the NVEs attached to the
         same ES, the NVE can automatically perform the DF Election
         algorithm (which determines the NVE that will forward traffic
         to the multi-homed TS/network).  EVPN guarantees that all the
         NVEs in the ES have a consistent DF Election.

   *  Auto-provisioning of services: when deploying a Layer-2 Service
      for a tenant in an NVO3 network, all the NVEs attached to the same
      subnet must be configured with a MAC-VRF and the BD for the
      subnet, as well as certain parameters for them.
      Note that, if the EVPN service model is VLAN-based or VLAN-bundle,
      implementations do not normally have a specific provisioning for
      the BD (since it is in that case the same construct as the
      MAC-VRF).  EVPN allows auto-deriving as many MAC-VRF parameters as
      possible.  As an example, the MAC-VRF's RT and RD for the EVPN
      routes may be auto-derived.  Section 5.1.2.1 in [RFC8365]
      specifies how to auto-derive a MAC-VRF's RT as long as the
      VLAN-based service model is implemented.  [RFC7432] specifies how
      to auto-derive the RD.

4.2.2.  Remote NVE Auto-Discovery

   Auto-discovery via MP-BGP [RFC4760] is used to discover the remote
   NVEs attached to a given BD, the NVEs participating in a given
   redundancy group, the tunnel encapsulation types supported by an NVE,
   etc.

   In particular, when a new MAC-VRF and BD are enabled, the NVE will
   advertise a new RT-3 route.  Besides other fields, the RT-3 route
   will encode the IP address of the advertising NVE, the Ethernet Tag
   (which is zero in case of VLAN-based and VLAN-bundle models) and also
   a PMSI Tunnel Attribute (PTA) that indicates the information about
   the intended way to deliver BUM traffic for the BD.

   In the example of Figure 1, when BD1 is enabled, NVE1 will send an
   RT-3 route including its own IP address, Ethernet-Tag for BD1 and the
   PTA to the remote NVEs.  Assuming Ingress Replication (IR) is used,
   the RT-3 route will include an identification for IR in the PTA and
   the VNI that the other NVEs in the BD must use to send BUM traffic to
   the advertising NVE.  The other NVEs in the BD will import the RT-3
   route and will add NVE1's IP address to the flooding list for BD1.
   Note that the RT-3 route is also sent with a BGP encapsulation
   attribute [RFC9012] that indicates what NVO3 encapsulation the remote
   NVEs should use when sending BUM traffic to NVE1.
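
   As a rough illustration of the RT-3 handling described above, the
   following Python sketch models how a remote NVE might build its
   Ingress Replication flooding list for a BD from imported RT-3 routes.
   It is not taken from any specification or implementation: the class
   names (Rt3Route, BroadcastDomain), the field names, the string form
   of the Route Target and all addresses and VNI values are assumptions
   made for this example only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Rt3Route:
    """Simplified, hypothetical view of an RT-3 route (not a wire format)."""
    originator_ip: str     # IP address of the advertising NVE
    ethernet_tag: int      # zero for VLAN-based and VLAN-bundle models
    route_target: str      # RT of the advertising MAC-VRF (assumed string form)
    pta_tunnel_type: str   # how BUM traffic is delivered, from the PTA
    vni: int               # VNI to use when sending BUM to the originator
    encapsulation: str     # from the BGP encapsulation attribute


@dataclass
class BroadcastDomain:
    name: str
    import_rt: str
    # Flooding list: remote NVE IP -> (VNI, encapsulation)
    flood_list: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def import_rt3(self, route: Rt3Route) -> bool:
        """Add the advertising NVE to the flooding list if the route matches."""
        if route.route_target != self.import_rt:
            return False   # route belongs to a different MAC-VRF
        if route.pta_tunnel_type != "ingress-replication":
            return False   # other PTA tunnel types are outside this sketch
        self.flood_list[route.originator_ip] = (route.vni, route.encapsulation)
        return True

    def replicate_bum(self, frame: bytes) -> List[Tuple[str, int, str, bytes]]:
        """Return one (tunnel IP DA, VNI, encapsulation, frame) copy per NVE."""
        return [(ip, vni, encap, frame)
                for ip, (vni, encap) in sorted(self.flood_list.items())]


# BD1 as seen by a remote NVE (for instance NVE4); 192.0.2.1 stands for IP-A.
bd1_on_nve4 = BroadcastDomain(name="BD1", import_rt="65000:1")
rt3_from_nve1 = Rt3Route("192.0.2.1", 0, "65000:1",
                         "ingress-replication", 1001, "vxlan")
rt3_other_tenant = Rt3Route("192.0.2.9", 0, "65000:99",
                            "ingress-replication", 9999, "vxlan")
bd1_on_nve4.import_rt3(rt3_from_nve1)      # imported, NVE1 joins the list
bd1_on_nve4.import_rt3(rt3_other_tenant)   # ignored, Route Target mismatch
copies = bd1_on_nve4.replicate_bum(b"arp-request")
```

   In this sketch, a BUM frame received on a local AC yields one
   encapsulated copy per entry in the flooding list, which mirrors the
   ingress replication behavior without requiring any multicast state
   in the underlay.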

   Refer to [RFC7432] for more information about the RT-3 route and
   forwarding of BUM traffic, and to [RFC8365] for its considerations on
   NVO3 networks.

4.2.3.  Distribution of Tenant MAC and IP Information

   Tenant MAC/IP information is advertised to remote NVEs using RT-2
   routes.  Following the example of Figure 1:

   *  In a given EVPN BD, TSes' MAC addresses are first learned at the
      NVE they are attached to, via data path or management plane
      learning.  In Figure 1 we assume NVE1 learns MAC1/IP1 in the
      management plane (for instance, via a Cloud Management System)
      since the NVE is a virtual switch.  NVE2, NVE3, NVE4 and NVE5 are
      TOR/Leaf switches and they normally learn MAC addresses via data
      path.

   *  Once NVE1's BD1 learns MAC1/IP1, NVE1 advertises that information
      along with a VNI and Next Hop IP-A in an RT-2 route.  The EVPN
      routes are advertised using the RD/RTs of the MAC-VRF to which the
      BD belongs.  All the NVEs in BD1 learn local MAC/IP addresses and
      advertise them in RT-2 routes in a similar way.

   *  The remote NVEs can then add MAC1 to their mapping table for BD1
      (BT).  For instance, when TS3 sends frames to NVE4 with MAC DA =
      MAC1, NVE4 does a MAC lookup on the BT that yields IP-A and Label
      L.  NVE4 can then encapsulate the frame into an NVO3 tunnel with
      IP-A as the tunnel IP DA and L as the Virtual Network Identifier.
      Note that the RT-2 route may also contain the host's IP address
      (as in the example of Figure 1).  While the MAC of the received
      RT-2 route is installed in the BT, the IP address may be installed
      in the Proxy-ARP/ND table (if enabled) or in the ARP/IP-VRF tables
      if the BD has an IRB.  See Section 4.7.3 for more information
      about Proxy-ARP/ND and Section 4.3 for more details about IRB and
      Layer-3 services.

   Refer to [RFC7432] and [RFC8365] for more information about the RT-2
   route and forwarding of known unicast traffic.

4.3.  EVPN Basic Applicability for Layer-3 Services

   [RFC9136] and [RFC9135] are the reference documents that describe how
   EVPN can be used for Layer-3 services.  Inter Subnet Forwarding in
   EVPN networks is implemented via IRB interfaces between BDs and IP-
   VRFs.  An EVPN BD corresponds to an IP subnet.  When IP packets
   generated in a BD are destined to a different subnet (different BD)
   of the same tenant, the packets are sent to the IRB attached to the
   local BD in the source NVE.  As discussed in [RFC9135], depending on
   how the IP packets are forwarded between the ingress NVE and the
   egress NVE, there are two forwarding models: the Asymmetric model and
   the Symmetric model.

   The Asymmetric model is illustrated in the example of Figure 2 and it
   requires the configuration of all the BDs of the tenant in all the
   NVEs attached to the same tenant.  In that way, there is no need to
   advertise IP Prefixes between NVEs since all the NVEs are attached to
   all the subnets.  It is called Asymmetric because the ingress and
   egress NVEs do not perform the same number of lookups in the data
   plane.  In Figure 2, if TS1 and TS2 are in different subnets, and TS1
   sends IP packets to TS2, the following lookups are required in the
   data path: a MAC lookup (on BD1's table), an IP lookup (on the IP-
   VRF) and a MAC lookup (on BD2's table) at the ingress NVE1 and then
   only a MAC lookup at the egress NVE.  The two IP-VRFs in Figure 2 are
   not connected by tunnels and all the connectivity between the NVEs is
   done based on tunnels between the BDs.
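
   The lookup sequence of the Asymmetric model can be traced with a
   small Python sketch.  The table layout, the IRB and TS2 MAC
   addresses, the TS2 IP address and the function name are all
   hypothetical, and the ARP/ND resolution step is collapsed into the
   IP-VRF entry for brevity; the sketch only counts the lookups that the
   text above attributes to each NVE.

```python
IRB_MAC = "00:00:5e:00:53:01"   # hypothetical MAC of the IRB on NVE1's BD1
TS2_MAC = "00:00:5e:00:53:22"   # hypothetical MAC address of TS2
TS2_IP = "198.51.100.2"         # hypothetical IP address of TS2 (BD2 subnet)

# NVE1 (ingress) is attached to both BDs of the tenant.
nve1_tables = {
    "BD1": {IRB_MAC: ("irb", "IP-VRF")},
    "IP-VRF": {TS2_IP: ("BD2", TS2_MAC)},   # route plus ARP/ND entry, collapsed
    "BD2": {TS2_MAC: ("tunnel", "NVE2")},
}
# NVE2 (egress) only needs its BD2 table for this flow.
nve2_tables = {"BD2": {TS2_MAC: ("ac", "TS2")}}


def forward_asymmetric(dst_mac, dst_ip):
    """Trace the data path lookups for a packet sent by TS1 towards TS2."""
    trace = []

    # Ingress NVE1: MAC lookup in BD1 hits the IRB interface.
    kind, vrf = nve1_tables["BD1"][dst_mac]
    trace.append("NVE1: MAC lookup in BD1")
    if kind != "irb":
        raise ValueError("frame is not addressed to the IRB")

    # Ingress NVE1: IP lookup in the IP-VRF selects the destination BD.
    out_bd, next_mac = nve1_tables[vrf][dst_ip]
    trace.append("NVE1: IP lookup in IP-VRF")

    # Ingress NVE1: MAC lookup in the destination BD selects the tunnel.
    kind, remote_nve = nve1_tables[out_bd][next_mac]
    trace.append("NVE1: MAC lookup in " + out_bd)

    # Egress NVE2: a single MAC lookup in the same BD selects the AC.
    kind, attachment = nve2_tables[out_bd][next_mac]
    trace.append(remote_nve + ": MAC lookup in " + out_bd)
    return attachment, trace


delivered_to, lookups = forward_asymmetric(IRB_MAC, TS2_IP)
```

   The trace holds three entries for NVE1 and one for NVE2, which is the
   asymmetry that gives the model its name.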
600 +-------------------------------------+ 601 | EVPN NVO3 | 602 | | 603 NVE1 NVE2 604 +--------------------+ +--------------------+ 605 | +---+IRB +------+ | | +------+IRB +---+ | 606 TS1-----|BD1|----|IP-VRF| | | |IP-VRF|----|BD1| | 607 | +---+ | | | | | | +---+ | 608 | +---+ | | | | | | +---+ | 609 | |BD2|----| | | | | |----|BD2|----TS2 610 | +---+IRB +------+ | | +------+IRB +---+ | 611 +--------------------+ +--------------------+ 612 | | 613 +-------------------------------------+ 615 Figure 2: EVPN for L3 in an NVO3 Network - Asymmetric model 617 In the Symmetric model, depicted in Figure 3, the same number of data 618 path lookups is needed at the ingress and egress NVEs. For example, 619 if TS1 sends IP packets to TS3, the following data path lookups are 620 required: a MAC lookup at NVE1's BD1 table, an IP lookup at NVE1's 621 IP-VRF and then IP lookup and MAC lookup at NVE2's IP-VRF and BD3 622 respectively. In the Symmetric model, the Inter Subnet connectivity 623 between NVEs is done based on tunnels between the IP-VRFs. 625 +-------------------------------------+ 626 | EVPN NVO3 | 627 | | 628 NVE1 NVE2 629 +--------------------+ +--------------------+ 630 | +---+IRB +------+ | | +------+IRB +---+ | 631 TS1-----|BD1|----|IP-VRF| | | |IP-VRF|----|BD3|-----TS3 632 | +---+ | | | | | | +---+ | 633 | +---+IRB | | | | +------+ | 634 TS2-----|BD2|----| | | +--------------------+ 635 | +---+ +------+ | | 636 +--------------------+ | 637 | | 638 +-------------------------------------+ 640 Figure 3: EVPN for L3 in an NVO3 Network - Symmetric model 642 The Symmetric model scales better than the Asymmetric model because 643 it does not require the NVEs to be attached to all the tenant's 644 subnets. However, it requires the use of NVO3 tunnels on the IP-VRFs 645 and the exchange of IP Prefixes between the NVEs in the control 646 plane. 
EVPN uses RT-2 and RT-5 routes for the exchange of host IP 647 routes, and RT-5 routes for the exchange of IP Prefixes 648 of any length. As an example, in Figure 3, NVE2 needs 649 to advertise TS3's host route and/or TS3's subnet, so that the IP 650 lookup on NVE1's IP-VRF succeeds. 652 [RFC9135] specifies the use of RT-2 routes for the advertisement of 653 host routes. Section 4.4.1 in [RFC9136] specifies the use of RT-5 654 routes for the advertisement of IP Prefixes in an "Interface-less IP- 655 VRF-to-IP-VRF Model". The Symmetric model for host routes can be 656 implemented following either approach: 658 a. [RFC9135] uses RT-2 routes to convey the information to populate 659 L2, ARP/ND and L3 FIB tables in the remote NVE. For instance, in 660 Figure 3, NVE2 would advertise an RT-2 route with TS3's IP and MAC 661 addresses, including two labels/VNIs: a label-3/VNI-3 that 662 identifies BD3 for MAC lookup (that would be used for L2 traffic 663 in case NVE1 was attached to BD3 too) and a label-1/VNI-1 that 664 identifies the IP-VRF for IP lookup (and will be used for L3 665 traffic). NVE1 imports the RT-2 route and installs TS3's IP in 666 the IP-VRF route table with label-1/VNI-1. Traffic from, e.g., 667 TS2 to TS3 will be encapsulated with label-1/VNI-1 and forwarded 668 to NVE2. 670 b. [RFC9136] uses RT-2 routes to convey the information to populate 671 the L2 FIB and ARP/ND tables, and RT-5 routes to populate the IP- 672 VRF L3 FIB table. For instance, in Figure 3, NVE2 would 673 advertise an RT-2 route including TS3's MAC and IP addresses with 674 a single label-3/VNI-3. In this example, this RT-2 route 675 wouldn't be imported by NVE1 because NVE1 is not attached to BD3. 676 In addition, NVE2 would advertise an RT-5 route with TS3's IP 677 address and label-1/VNI-1. This RT-5 route would be imported by 678 NVE1's IP-VRF and the host route installed in the L3 FIB 679 associated with label-1/VNI-1.
Traffic from TS2 to TS3 would be 680 encapsulated with label-1/VNI-1. 682 4.4. EVPN as Control Plane for NVO3 Encapsulations and GENEVE 684 [RFC8365] describes how to use EVPN for NVO3 encapsulations, such as 685 VXLAN, NVGRE or MPLSoGRE. The procedures are easily applicable to 686 any other NVO3 encapsulation, in particular GENEVE. 688 The Generic Network Virtualization Encapsulation [RFC8926] has been 689 recommended to be the proposed standard for NVO3 Encapsulation. The 690 EVPN control plane can signal the GENEVE encapsulation type in the 691 BGP Tunnel Encapsulation Extended Community (see [RFC9012]). 693 The NVO3 encapsulation design team has made a recommendation in 694 [I-D.ietf-nvo3-encap] for a control plane to: 696 1. Negotiate a subset of GENEVE option TLVs that can be carried on a 697 GENEVE tunnel, 699 2. Enforce an order for GENEVE option TLVs, and 701 3. Limit the total number of options that could be carried on a 702 GENEVE tunnel. 704 The EVPN control plane can easily extend the BGP Tunnel Encapsulation 705 Attribute sub-TLV [RFC9012] to specify the GENEVE tunnel options that 706 can be received or transmitted over a GENEVE tunnel by a given NVE. 707 [I-D.ietf-bess-evpn-geneve] describes the EVPN control plane 708 extensions to support GENEVE. 710 4.5. EVPN OAM and Application to NVO3 712 EVPN OAM (as in [I-D.ietf-bess-evpn-lsp-ping]) defines mechanisms to 713 detect data plane failures in an EVPN deployment over an MPLS 714 network. These mechanisms detect failures related to P2P and P2MP 715 connectivity, for multi-tenant unicast and multicast L2 traffic, 716 between multi-tenant access nodes connected to EVPN PE(s), and in a 717 single-homed, single-active or all-active redundancy model. 719 In general, EVPN OAM mechanisms defined for EVPN deployed in MPLS 720 networks are equally applicable to EVPN in NVO3 networks. 722 4.6.
EVPN as the Control Plane for NVO3 Security 724 EVPN can be used to signal the security protection capabilities of a 725 sender NVE, as well as what portion of an NVO3 packet (taking a 726 GENEVE packet as an example) can be protected by the sender NVE, to 727 ensure the privacy and integrity of tenant traffic carried over the 728 NVO3 tunnels [I-D.sajassi-bess-secure-evpn]. 730 4.7. Advanced EVPN Features for NVO3 Networks 732 This section describes how EVPN can be used to deliver advanced 733 capabilities in NVO3 networks. 735 4.7.1. Virtual Machine (VM) Mobility 737 [RFC7432] replaces the traditional Ethernet Flood-and-Learn behavior 738 among NVEs with BGP-based MAC learning, which in turn provides more 739 control over the location of MAC addresses in the BD and consequently 740 advanced features, such as MAC Mobility. If we assume that VM 741 Mobility means the VM's MAC and IP addresses move with the VM, EVPN's 742 MAC Mobility is the required procedure that facilitates VM Mobility. 743 According to [RFC7432] section 15, when a MAC is advertised for the 744 first time in a BD, all the NVEs attached to the BD will store 745 Sequence Number zero for that MAC. When the MAC "moves" within the 746 same BD but to a remote NVE, the NVE that has just learned the MAC 747 locally increases the Sequence Number in the RT-2 route's MAC Mobility 748 extended community to indicate that it owns the MAC now. That makes 749 all the NVEs in the BD update their tables immediately with no need to 750 wait for any aging timer. EVPN guarantees fast MAC Mobility 751 without flooding or black-holes in the BD. 753 4.7.2. MAC Protection, Duplication Detection and Loop Protection 755 The advertisement of MACs in the control plane allows advanced 756 features such as MAC Protection, Duplication Detection and Loop 757 Protection. 759 [RFC7432] MAC Protection refers to EVPN's ability to indicate - in an 760 RT-2 route - that a MAC must be protected by the NVE receiving the 761 route.
The Protection is indicated in the "Sticky bit" of the MAC 762 Mobility extended community sent along with the RT-2 route for a MAC. 763 NVEs' ACs that are connected to servers or VMs that need protection 764 may set the Sticky bit on the RT-2 routes sent for the MACs 765 associated with the ACs. Also, statically configured MAC addresses 766 should be advertised as Protected MAC addresses, since they are not 767 subject to MAC Mobility procedures. 769 [RFC7432] MAC Duplication Detection refers to EVPN's ability to 770 detect duplicate MAC addresses. A "MAC move" is a relearn event that 771 happens at an access AC or through an RT-2 route with a Sequence 772 Number that is higher than the stored one for the MAC. When a MAC 773 moves N times within an M-second window between two NVEs, 774 the MAC is declared as Duplicate and the detecting NVE does not re- 775 advertise the MAC anymore. 777 [RFC7432] provides MAC Duplication Detection and, with an extension 778 based on the same Sequence Number principle, it can also protect the 779 BD against loops created by backdoor links between NVEs. 780 When a MAC is detected as 781 duplicate, the NVE may install it as a black-hole MAC and drop 782 received frames with MAC SA and MAC DA matching that duplicate MAC. 783 The MAC Duplication extension to support Loop Protection is described 784 in [I-D.ietf-bess-rfc7432bis]. 786 4.7.3. Reduction/Optimization of BUM Traffic in Layer-2 Services 788 In BDs with a significant amount of flooding due to Unknown unicast 789 and Broadcast frames, EVPN may help reduce and sometimes even 790 suppress the flooding. 792 In BDs where most of the Broadcast traffic is caused by ARP (Address 793 Resolution Protocol) and ND (Neighbor Discovery) protocols on the 794 TSes, EVPN's Proxy-ARP and Proxy-ND capabilities may reduce the 795 flooding drastically. The use of Proxy-ARP/ND is specified in 796 [RFC9161].
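The way Proxy-ARP reduces Broadcast flooding can be pictured with a minimal sketch. It assumes a per-BD table populated from the IP and MAC carried in received RT-2 routes, and it assumes that a miss falls back to flooding; the class and method names are hypothetical, and the actual procedures and configurable behaviors are those of [RFC9161].

```python
class ProxyArpTable:
    """Hypothetical per-BD Proxy-ARP table kept by an NVE."""

    def __init__(self):
        self._entries = {}  # tenant IP address -> MAC address

    def learn_from_rt2(self, ip, mac):
        # An RT-2 route that carries an IP address adds or refreshes an entry.
        self._entries[ip] = mac

    def withdraw_rt2(self, ip):
        # Withdrawing the RT-2 route removes the entry.
        self._entries.pop(ip, None)

    def handle_arp_request(self, target_ip):
        # Hit: the NVE answers locally, so the request is not flooded.
        mac = self._entries.get(target_ip)
        if mac is not None:
            return ("reply", mac)
        # Miss: assumed here to fall back to flooding in the BD.
        return ("flood", None)
```

With an entry learned for a given IP address, an ARP Request from a local TS for that address is answered by the NVE itself, and only requests for unknown addresses reach the remote NVEs.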
798 Proxy-ARP/ND procedures along with the assumption that TSes always 799 issue a GARP (Gratuitous ARP) or an unsolicited Neighbor 800 Advertisement message when they come up in the BD, may drastically 801 reduce the unknown unicast flooding in the BD. 803 The flooding caused by TSes' IGMP/MLD or PIM messages in the BD may 804 also be suppressed by the use of IGMP/MLD and PIM Proxy functions, as 805 specified in [I-D.ietf-bess-evpn-igmp-mld-proxy] and 806 [I-D.skr-bess-evpn-pim-proxy]. These two documents also specify how 807 to forward IP multicast traffic efficiently within the same BD, 808 translate soft state IGMP/MLD/PIM messages into hard state BGP routes 809 and provide fast-convergence redundancy for IP Multicast on multi- 810 homed Ethernet Segments (ESes). 812 4.7.4. Ingress Replication (IR) Optimization for BUM Traffic 814 When an NVE attached to a given BD needs to send BUM traffic for the 815 BD to the remote NVEs attached to the same BD, Ingress Replication is 816 a very common option in NVO3 networks, since it is completely 817 independent of the multicast capabilities of the underlay network. 818 Also, if the optimization procedures to reduce/suppress the flooding 819 in the BD are enabled (Section 4.7.3), in spite of creating multiple 820 copies of the same frame at the ingress NVE, Ingress Replication may 821 be good enough. However, in BDs where Multicast (or Broadcast) 822 traffic is significant, Ingress Replication may be very inefficient 823 and cause performance issues on virtual-switch-based NVEs. 825 [I-D.ietf-bess-evpn-optimized-ir] specifies the use of AR (Assisted 826 Replication) NVO3 tunnels in EVPN BDs. AR retains the independence 827 of the underlay network while providing a way to forward Broadcast 828 and Multicast traffic efficiently. AR uses AR-REPLICATORs that can 829 replicate the Broadcast/Multicast traffic on behalf of the AR-LEAF 830 NVEs. 
The AR-LEAF NVEs are typically virtual-switches or NVEs with 831 limited replication capabilities. AR can work in a single-stage 832 replication mode (Non-Selective Mode) or in a dual-stage replication 833 mode (Selective Mode). Both modes are detailed in 834 [I-D.ietf-bess-evpn-optimized-ir]. 836 In addition, [I-D.ietf-bess-evpn-optimized-ir] describes a 837 procedure to avoid sending Broadcast, Multicast or Unknown unicast to 838 certain NVEs that do not need that type of traffic. This is done by 839 enabling PFL (Pruned Flood Lists) on a given BD. For instance, a 840 virtual-switch NVE that learns all its local MAC addresses for a BD 841 via a Cloud Management System does not need to receive the BD's 842 Unknown unicast traffic. Pruned Flood Lists help optimize the BUM 843 flooding in the BD. 845 4.7.5. EVPN Multi-Homing 847 Another fundamental concept in EVPN is multi-homing. A given TS can 848 be multi-homed to two or more NVEs for a given BD, and the set of 849 links connected to the same TS is defined as an Ethernet Segment (ES). 850 EVPN supports single-active and all-active multi-homing. In single- 851 active multi-homing only one link in the ES is active. In all-active 852 multi-homing all the links in the ES are active for unicast traffic. 853 Both modes support load-balancing: 855 * Single-active multi-homing means per-service load-balancing to/ 856 from the TS. For example, in Figure 1, for BD1, only one of the 857 NVEs can forward traffic from/to TS2. For a different BD, the 858 other NVE may forward traffic. 860 * All-active multi-homing means per-flow load-balancing for unicast 861 frames to/from the TS. That is, in Figure 1 and for BD1, both 862 NVE4 and NVE5 can forward known unicast traffic to/from TS3. For 863 BUM traffic only one of the two NVEs can forward traffic to TS3, 864 and both can forward traffic from TS3.
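The forwarding rules in the two bullets above can be condensed into a single predicate. This is a sketch under assumptions: the function name, its string arguments and the "is_selected" flag (true on the one NVE chosen for the BD on that ES) are all hypothetical and only restate the text.

```python
def may_forward(mode, is_selected, traffic, direction):
    """Return True if this NVE may forward a frame on the Ethernet Segment.

    mode:        "single-active" or "all-active"
    is_selected: True on the one NVE chosen for this BD on the ES
    traffic:     "unicast" (known unicast) or "bum"
    direction:   "to-ts" or "from-ts"
    """
    if mode == "single-active":
        # Per-service load-balancing: one NVE carries all traffic for the BD.
        return is_selected
    if mode == "all-active":
        # Known unicast uses every link; BUM from the TS is accepted anywhere.
        if traffic == "unicast" or direction == "from-ts":
            return True
        # BUM towards the TS is forwarded by the selected NVE only.
        return is_selected
    raise ValueError(f"unknown multi-homing mode: {mode}")
```

For example, in all-active mode a non-selected NVE returns True for known unicast in both directions but False for BUM frames towards the TS.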
866 There are two key aspects in the EVPN multi-homing procedures: 868 * DF (Designated Forwarder) election: the DF is the NVE that 869 forwards the traffic to the ES in single-active mode. In case of 870 all-active, the DF is the NVE that forwards the BUM traffic to the 871 ES. 873 * Split-horizon function: prevents the TS from receiving echoed BUM 874 frames that the TS itself sent to the ES. This is especially 875 relevant in all-active ESes, where the TS may forward BUM frames 876 to a non-DF NVE that can flood the BUM frames back to the DF NVE 877 and then the TS. As an example, in Figure 1, assuming NVE4 is the 878 DF for ES-2 in BD1, BUM frames sent from TS3 to NVE5 will be 879 received at NVE4 and, since NVE4 is the DF for BD1, it would 880 forward them back to TS3 in the absence of Split-horizon. Split-horizon allows NVE4 (and any 881 multi-homed NVE for that matter) to identify if an EVPN BUM frame 882 is coming from the same ES or from a different one, and if the frame belongs 883 to the same ES-2, NVE4 will not forward the BUM frame to TS3, in 884 spite of being the DF. 886 While [RFC7432] describes the default algorithm for the DF Election, 887 [RFC8584] and [I-D.ietf-bess-evpn-pref-df] specify other algorithms 888 and procedures that optimize the DF Election. 890 The Split-horizon function is specified in [RFC7432] and it is 891 carried out by using a special ESI-label that identifies, in the 892 data path, all the BUM frames originated from a given NVE and 893 ES. Since the ESI-label is an MPLS label, it cannot be used in all 894 the non-MPLS NVO3 encapsulations. Therefore, [RFC8365] defines a 895 modified Split-horizon procedure, known as "Local-Bias", that is based on the IP SA of the 896 NVO3 tunnel. It is worth noting 897 that Local-Bias only works for all-active multi-homing, and not for 898 single-active multi-homing. 900 4.7.6.
EVPN Recursive Resolution for Inter-Subnet Unicast Forwarding 902 Section 4.3 describes how EVPN can be used for Inter Subnet 903 Forwarding among subnets of the same tenant. RT-2 routes and RT-5 904 routes allow the advertisement of host routes and IP Prefixes (RT-5 905 route) of any length. The procedures outlined by Section 4.3 are 906 similar to the ones in [RFC4364], only for NVO3 tunnels. However, 907 [RFC9136] also defines advanced Inter Subnet Forwarding procedures 908 that allow the resolution of RT-5 routes to not only BGP next-hops 909 but also "overlay indexes" that can be a MAC, a GW IP or an ESI, all 910 of them in the tenant space. 912 Figure 4 illustrates an example that uses Recursive Resolution to a 913 GW-IP as per [RFC9136] section 4.4.2. In this example, IP-VRFs in 914 NVE1 and NVE2 are connected by a SBD (Supplementary BD). An SBD is a 915 BD that connects all the IP-VRFs of the same tenant, via IRB, and has 916 no ACs. NVE1 advertises the host route TS2-IP/L (IP address and 917 Prefix Length of TS2) in an RT-5 route with overlay index GWIP=IP1. 918 Also, IP1 is advertised in an RT-2 route associated to M1, VNI-S and 919 BGP next-hop NVE1. Upon importing the two routes, NVE2 installs TS2- 920 IP/L in the IP-VRF with a next-hop that is the GWIP IP1. NVE2 also 921 installs M1 in the SBD, with VNI-S and NVE1 as next-hop. If TS3 922 sends a packet with IP DA=TS2, NVE2 will perform a Recursive 923 Resolution of the RT-5 route prefix information to the forwarding 924 information of the correlated RT-2 route. The RT-5 route's Recursive 925 Resolution has several advantages such as better convergence in 926 scaled networks (since multiple RT-5 routes can be invalidated with a 927 single withdrawal of the overlay index route) or the ability to 928 advertise multiple RT-5 routes from an overlay index that can move or 929 change dynamically. [RFC9136] describes a few use-cases. 
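The Recursive Resolution described above can be sketched with three small tables on NVE2. The class, its method names and the IP addresses used in the usage note are assumptions for illustration only; "M1", "VNI-S" and "NVE1" follow the labels of the example, and the authoritative procedure is the one in [RFC9136] section 4.4.2.

```python
import ipaddress


class Nve2Tables:
    """Hypothetical view of the tables involved in Recursive Resolution."""

    def __init__(self):
        self.ip_vrf = {}  # prefix -> GW IP overlay index (from RT-5 routes)
        self.gw_arp = {}  # GW IP -> MAC (from the IP/MAC pair of the RT-2 route)
        self.sbd = {}     # MAC -> (VNI, BGP next hop) (from the same RT-2 route)

    def import_rt5(self, prefix, gw_ip):
        self.ip_vrf[ipaddress.ip_network(prefix)] = gw_ip

    def import_rt2(self, mac, ip, vni, next_hop):
        self.gw_arp[ip] = mac
        self.sbd[mac] = (vni, next_hop)

    def withdraw_rt2(self, mac, ip):
        self.gw_arp.pop(ip, None)
        self.sbd.pop(mac, None)

    def resolve(self, dst_ip):
        """Return (VNI, next hop) for dst_ip, or None if unresolved."""
        addr = ipaddress.ip_address(dst_ip)
        matches = [p for p in self.ip_vrf if addr in p]
        if not matches:
            return None
        best = max(matches, key=lambda p: p.prefixlen)  # longest prefix match
        mac = self.gw_arp.get(self.ip_vrf[best])        # overlay index -> MAC
        if mac is None:
            return None  # overlay index route missing: the RT-5 is unusable
        return self.sbd.get(mac)                        # MAC -> tunnel information
```

If two RT-5 routes, for instance 192.0.2.2/32 and 192.0.2.128/25, both point to the same GW IP, a single withdraw_rt2() call makes resolve() return None for both, which is the convergence benefit mentioned in the text.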
931 +-------------------------------------+ 932 | EVPN NVO3 | 933 | + 934 NVE1 NVE2 935 +--------------------+ +--------------------+ 936 | +---+IRB +------+ | | +------+IRB +---+ | 937 TS1-----|BD1|----|IP-VRF| | | |IP-VRF|----|BD3|-----TS3 938 | +---+ | |-(SBD)------(SBD)-| | +---+ | 939 | +---+IRB | |IRB(IP1/M1) IRB+------+ | 940 TS2-----|BD2|----| | | +-----------+--------+ 941 | +---+ +------+ | | 942 +--------------------+ | 943 | RT-2(M1,IP1,VNI-S,NVE1)--> | 944 | RT-5(TS2-IP/L,GWIP=IP1)--> | 945 +-------------------------------------+ 947 Figure 4: EVPN for L3 - Recursive Resolution example 949 4.7.7. EVPN Optimized Inter-Subnet Multicast Forwarding 951 The concept of the SBD described in Section 4.7.6 is also used in 952 [I-D.ietf-bess-evpn-irb-mcast] for the procedures related to Inter 953 Subnet Multicast Forwarding across BDs of the same tenant. For 954 instance, [I-D.ietf-bess-evpn-irb-mcast] allows the efficient 955 forwarding of IP multicast traffic from any BD to any other BD (or 956 even to the same BD where the Source resides). The 957 [I-D.ietf-bess-evpn-irb-mcast] procedures are supported along with 958 EVPN multi-homing, and for any tree allowed on NVO3 networks, 959 including IR or AR. [I-D.ietf-bess-evpn-irb-mcast] also describes 960 the interoperability between EVPN and other multicast technologies 961 such as MVPN (Multicast VPN) and PIM for inter-subnet multicast. 963 [I-D.ietf-bess-evpn-mvpn-seamless-interop] describes another 964 potential solution to support EVPN to MVPN interoperability. 966 4.7.8. Data Center Interconnect (DCI) 968 Tenant Layer-2 and Layer-3 services deployed on NVO3 networks must be 969 extended to remote NVO3 networks that are connected via non-NVO3 WAN 970 networks (mostly MPLS-based WAN networks). [RFC9014] defines some 971 architectural models that can be used to interconnect NVO3 networks 972 via MPLS WAN networks.
974 When NVO3 networks are connected by MPLS WAN networks, [RFC9014] 975 specifies how EVPN can be used end-to-end, in spite of using a 976 different encapsulation in the WAN. [RFC9014] also supports the use 977 of NVO3 or Segment Routing (encoding 32-bit or 128-bit Segment 978 Identifiers into labels or IPv6 addresses respectively) transport 979 tunnels in the WAN. 981 Even if EVPN can also be used in the WAN for Layer-2 and Layer-3 982 services, there may be a need to provide a Gateway function between 983 EVPN for NVO3 encapsulations and IPVPN for MPLS tunnels, if the 984 operator uses IPVPN in the WAN. 985 [I-D.ietf-bess-evpn-ipvpn-interworking] specifies the interworking 986 function between EVPN and IPVPN for unicast Inter Subnet Forwarding. 987 If Inter Subnet Multicast Forwarding is also needed across an IPVPN 988 WAN, [I-D.ietf-bess-evpn-irb-mcast] describes the required 989 interworking between EVPN and MVPN (Multicast Virtual Private 990 Networks). 992 5. Conclusion 994 EVPN provides a unified control-plane that addresses NVE auto- 995 discovery, tenant MAC/IP dissemination and the advanced features required 996 by NVO3 networks, in a scalable way and while keeping the independence of 997 the underlay IP Fabric, i.e., there is no need to enable PIM in the 998 underlay network and maintain multicast states for tenant BDs. 1000 This document justifies the use of EVPN for NVO3 networks and discusses 1001 its applicability to basic Layer-2 and Layer-3 connectivity 1002 requirements, as well as to advanced features such as MAC Mobility, MAC 1003 Protection and Loop Protection, multi-homing and DCI. 1005 6. Conventions Used in this Document 1007 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 1008 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 1009 "OPTIONAL" in this document are to be interpreted as described in BCP 1010 14 [RFC2119] [RFC8174] when, and only when, they appear in all 1011 capitals, as shown here. 1013 7.
Security Considerations 1015 This document does not introduce any new procedure or additional 1016 signaling in EVPN, and relies on the security considerations of the 1017 individual specifications used as a reference throughout the 1018 document. In particular, and as mentioned in [RFC7432], control 1019 plane and forwarding path protection are aspects to secure in any 1020 EVPN domain, when applied to NVO3 networks. 1022 The security techniques mentioned in [RFC7432], such as those discussed in 1023 [RFC5925] to authenticate BGP messages and those included in 1024 [RFC4271], [RFC4272] and [RFC6952] to secure BGP, are relevant for 1025 EVPN in NVO3 networks as well. 1027 8. IANA Considerations 1029 None. 1031 9. References 1033 9.1. Normative References 1035 [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., 1036 Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based 1037 Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 1038 2015, . 1040 [RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y. 1041 Rekhter, "Framework for Data Center (DC) Network 1042 Virtualization", RFC 7365, DOI 10.17487/RFC7365, October 1043 2014, . 1045 [RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L., 1046 Kreeger, L., and M. Napierala, "Problem Statement: 1047 Overlays for Network Virtualization", RFC 7364, 1048 DOI 10.17487/RFC7364, October 2014, 1049 . 1051 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1052 Requirement Levels", BCP 14, RFC 2119, 1053 DOI 10.17487/RFC2119, March 1997, 1054 . 1056 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1057 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1058 May 2017, . 1060 9.2. Informative References 1062 [RFC9136] Rabadan, J., Ed., Henderickx, W., Drake, J., Lin, W., and 1063 A. Sajassi, "IP Prefix Advertisement in Ethernet VPN 1064 (EVPN)", RFC 9136, DOI 10.17487/RFC9136, October 2021, 1065 . 1067 [RFC9135] Sajassi, A., Salam, S., Thoria, S., Drake, J., and J.
1068 Rabadan, "Integrated Routing and Bridging in Ethernet VPN 1069 (EVPN)", RFC 9135, DOI 10.17487/RFC9135, October 2021, 1070 . 1072 [RFC8365] Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R., 1073 Uttaro, J., and W. Henderickx, "A Network Virtualization 1074 Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365, 1075 DOI 10.17487/RFC8365, March 2018, 1076 . 1078 [RFC8926] Gross, J., Ed., Ganga, I., Ed., and T. Sridhar, Ed., 1079 "Geneve: Generic Network Virtualization Encapsulation", 1080 RFC 8926, DOI 10.17487/RFC8926, November 2020, 1081 . 1083 [I-D.ietf-nvo3-encap] 1084 Boutros, S. and D. E. Eastlake, "Network Virtualization 1085 Overlays (NVO3) Encapsulation Considerations", Work in 1086 Progress, Internet-Draft, draft-ietf-nvo3-encap-08, 30 1087 April 2022, . 1090 [RFC9012] Patel, K., Van de Velde, G., Sangli, S., and J. Scudder, 1091 "The BGP Tunnel Encapsulation Attribute", RFC 9012, 1092 DOI 10.17487/RFC9012, April 2021, 1093 . 1095 [I-D.ietf-bess-evpn-lsp-ping] 1096 Jain, P., Salam, S., Sajassi, A., Boutros, S., and G. 1097 Mirsky, "LSP-Ping Mechanisms for EVPN and PBB-EVPN", Work 1098 in Progress, Internet-Draft, draft-ietf-bess-evpn-lsp- 1099 ping-07, 10 February 2022, 1100 . 1103 [RFC9161] Rabadan, J., Ed., Sathappan, S., Nagaraj, K., Hankins, G., 1104 and T. King, "Operational Aspects of Proxy ARP/ND in 1105 Ethernet Virtual Private Networks", RFC 9161, 1106 DOI 10.17487/RFC9161, January 2022, 1107 . 1109 [I-D.ietf-bess-evpn-igmp-mld-proxy] 1110 Sajassi, A., Thoria, S., Mishra, M., Drake, J., and W. 1111 Lin, "IGMP and MLD Proxy for EVPN", Work in Progress, 1112 Internet-Draft, draft-ietf-bess-evpn-igmp-mld-proxy-21, 22 1113 March 2022, . 1116 [I-D.skr-bess-evpn-pim-proxy] 1117 Rabadan, J., Kotalwar, J., Sathappan, S., Zhang, Z., and 1118 A. Sajassi, "PIM Proxy in EVPN Networks", Work in 1119 Progress, Internet-Draft, draft-skr-bess-evpn-pim-proxy- 1120 01, 30 October 2017, . 
1123 [I-D.ietf-bess-evpn-optimized-ir] 1124 Rabadan, J., Sathappan, S., Lin, W., Katiyar, M., and A. 1125 Sajassi, "Optimized Ingress Replication Solution for 1126 Ethernet VPN (EVPN)", Work in Progress, Internet-Draft, 1127 draft-ietf-bess-evpn-optimized-ir-12, 25 January 2022, 1128 . 1131 [RFC8584] Rabadan, J., Ed., Mohanty, S., Ed., Sajassi, A., Drake, 1132 J., Nagaraj, K., and S. Sathappan, "Framework for Ethernet 1133 VPN Designated Forwarder Election Extensibility", 1134 RFC 8584, DOI 10.17487/RFC8584, April 2019, 1135 . 1137 [I-D.ietf-bess-evpn-pref-df] 1138 Rabadan, J., Sathappan, S., Przygienda, T., Lin, W., 1139 Drake, J., Sajassi, A., and S. Mohanty, "Preference-based 1140 EVPN DF Election", Work in Progress, Internet-Draft, 1141 draft-ietf-bess-evpn-pref-df-08, 23 September 2021, 1142 . 1145 [I-D.ietf-bess-evpn-irb-mcast] 1146 Lin, W., Zhang, Z., Drake, J., Rosen, E. C., Rabadan, J., 1147 and A. Sajassi, "EVPN Optimized Inter-Subnet Multicast 1148 (OISM) Forwarding", Work in Progress, Internet-Draft, 1149 draft-ietf-bess-evpn-irb-mcast-06, 24 May 2021, 1150 . 1153 [RFC9014] Rabadan, J., Ed., Sathappan, S., Henderickx, W., Sajassi, 1154 A., and J. Drake, "Interconnect Solution for Ethernet VPN 1155 (EVPN) Overlay Networks", RFC 9014, DOI 10.17487/RFC9014, 1156 May 2021, . 1158 [I-D.ietf-bess-evpn-ipvpn-interworking] 1159 Rabadan, J., Sajassi, A., Rosen, E., Drake, J., Lin, W., 1160 Uttaro, J., and A. Simpson, "EVPN Interworking with 1161 IPVPN", Work in Progress, Internet-Draft, draft-ietf-bess- 1162 evpn-ipvpn-interworking-06, 22 September 2021, 1163 . 1166 [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, 1167 L., Sridhar, T., Bursell, M., and C. Wright, "Virtual 1168 eXtensible Local Area Network (VXLAN): A Framework for 1169 Overlaying Virtualized Layer 2 Networks over Layer 3 1170 Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014, 1171 . 1173 [RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. 
Black, 1174 "Encapsulating MPLS in UDP", RFC 7510, 1175 DOI 10.17487/RFC7510, April 2015, 1176 . 1178 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 1179 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 1180 2006, . 1182 [CLOS1953] Clos, C., "A Study of Non-Blocking Switching Networks", 1183 March 1953. 1185 [I-D.ietf-bess-evpn-geneve] 1186 Boutros, S., Sajassi, A., Drake, J., Rabadan, J., and S. 1187 Aldrin, "EVPN control plane for Geneve", Work in Progress, 1188 Internet-Draft, draft-ietf-bess-evpn-geneve-04, 23 May 1189 2022, . 1192 [I-D.ietf-bess-evpn-mvpn-seamless-interop] 1193 Sajassi, A., Thiruvenkatasamy, K., Thoria, S., Gupta, A., 1194 and L. Jalil, "Seamless Multicast Interoperability between 1195 EVPN and MVPN PEs", Work in Progress, Internet-Draft, 1196 draft-ietf-bess-evpn-mvpn-seamless-interop-03, 25 October 1197 2021, . 1200 [I-D.sajassi-bess-secure-evpn] 1201 Sajassi, A., Banerjee, A., Thoria, S., Carrel, D., Weis, 1202 B., and J. Drake, "Secure EVPN", Work in Progress, 1203 Internet-Draft, draft-sajassi-bess-secure-evpn-05, 25 1204 October 2021, . 1207 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1208 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 1209 June 2010, . 1211 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 1212 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 1213 DOI 10.17487/RFC4271, January 2006, 1214 . 1216 [RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis", 1217 RFC 4272, DOI 10.17487/RFC4272, January 2006, 1218 . 1220 [RFC6952] Jethanandani, M., Patel, K., and L. Zheng, "Analysis of 1221 BGP, LDP, PCEP, and MSDP Issues According to the Keying 1222 and Authentication for Routing Protocols (KARP) Design 1223 Guide", RFC 6952, DOI 10.17487/RFC6952, May 2013, 1224 . 1226 [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, 1227 "Multiprotocol Extensions for BGP-4", RFC 4760, 1228 DOI 10.17487/RFC4760, January 2007, 1229 . 
1231 [I-D.ietf-bess-rfc7432bis] 1232 Sajassi, A., Burdet, L. A., Drake, J., and J. Rabadan, 1233 "BGP MPLS-Based Ethernet VPN", Work in Progress, Internet- 1234 Draft, draft-ietf-bess-rfc7432bis-04, 7 March 2022, 1235 . 1238 [IEEE.802.1AX_2014] 1239 IEEE, "IEEE Standard for Local and metropolitan area 1240 networks -- Link Aggregation", 24 December 2014. 1242 Appendix A. Acknowledgments 1244 The authors want to thank Aldrin Isaac for his comments. 1246 Appendix B. Contributors 1248 Appendix C. Authors' Addresses 1250 Authors' Addresses 1251 Jorge Rabadan (editor) 1252 Nokia 1253 520 Almanor Ave 1254 Sunnyvale, CA 94085 1255 United States of America 1256 Email: jorge.rabadan@nokia.com 1258 Matthew Bocci 1259 Nokia 1260 Email: matthew.bocci@nokia.com 1262 Sami Boutros 1263 Ciena 1264 Email: sboutros@ciena.com 1266 Ali Sajassi 1267 Cisco Systems, Inc. 1268 Email: sajassi@cisco.com