NVO3 Workgroup                                           J. Rabadan, Ed.
Internet Draft                                                  M. Bocci
Intended status: Informational                                     Nokia

                                                              S. Boutros
                                                                  VMware

                                                              A. Sajassi
                                                                   Cisco

Expires: January 9, 2020                                    July 8, 2019

                 Applicability of EVPN to NVO3 Networks
                 draft-ietf-nvo3-evpn-applicability-02

Abstract

   In NVO3 networks, Network Virtualization Edge (NVE) devices sit at
   the edge of the underlay network and provide Layer-2 and Layer-3
   connectivity among Tenant Systems (TSes) of the same tenant. The NVEs
   need to build and maintain mapping tables so that they can deliver
   encapsulated packets to their intended destination NVE(s). While
   there are different options to create and disseminate the mapping
   table entries, NVEs may exchange that information directly among
   themselves via a control-plane protocol, such as EVPN. EVPN provides
   an efficient, flexible and unified control-plane option that can be
   used for Layer-2 and Layer-3 Virtual Network (VN) service
   connectivity. This document describes the applicability of EVPN to
   NVO3 networks and how EVPN solves the challenges in those networks.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 9, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. EVPN and NVO3 Terminology
   3. Why Is EVPN Needed In NVO3 Networks?
   4. Applicability of EVPN to NVO3 Networks
     4.1. EVPN Route Types used in NVO3 Networks
     4.2. EVPN Basic Applicability For Layer-2 Services
       4.2.1. Auto-Discovery and Auto-Provisioning of ES,
              Multi-Homing PEs and NVE services
       4.2.2. Remote NVE Auto-Discovery
       4.2.3. Distribution Of Tenant MAC and IP Information
     4.3. EVPN Basic Applicability for Layer-3 Services
     4.4. EVPN as a Control Plane for NVO3 Encapsulations and
          GENEVE
     4.5. EVPN OAM and application to NVO3
     4.6. EVPN as the control plane for NVO3 security
     4.7. Advanced EVPN Features For NVO3 Networks
       4.7.1. Virtual Machine (VM) Mobility
       4.7.2. MAC Protection, Duplication Detection and Loop
              Protection
       4.7.3. Reduction/Optimization of BUM Traffic In Layer-2
              Services
       4.7.4. Ingress Replication (IR) Optimization For BUM Traffic
       4.7.5. EVPN Multi-homing
       4.7.6. EVPN Recursive Resolution for Inter-Subnet Unicast
              Forwarding
       4.7.7. EVPN Optimized Inter-Subnet Multicast Forwarding
       4.7.8. Data Center Interconnect (DCI)
   5. Conclusion
   6. Conventions used in this document
   7. Security Considerations
   8. IANA Considerations
   9. References
     9.1 Normative References
     9.2 Informative References
   10. Acknowledgments
   11. Contributors
   12. Authors' Addresses

1. Introduction

   In NVO3 networks, Network Virtualization Edge (NVE) devices sit at
   the edge of the underlay network and provide Layer-2 and Layer-3
   connectivity among Tenant Systems (TSes) of the same tenant. The NVEs
   need to build and maintain mapping tables so that they can deliver
   encapsulated packets to their intended destination NVE(s). While
   there are different options to create and disseminate the mapping
   table entries, NVEs may exchange that information directly among
   themselves via a control-plane protocol, such as EVPN. EVPN provides
   an efficient, flexible and unified control-plane option that can be
   used for Layer-2 and Layer-3 Virtual Network (VN) service
   connectivity.

   In this document, we assume that the EVPN control-plane module
   resides in the NVEs. The NVEs can be virtual switches in hypervisors,
   TOR/Leaf switches or Data Center Gateways. As described in [RFC7365],
   Network Virtualization Authorities (NVAs) may be used to provide the
   forwarding information to the NVEs, and in that case, EVPN could be
   used to disseminate the information across multiple federated NVAs.
   The applicability of EVPN would then be similar to the one described
   in this document. However, for simplicity, the description assumes
   control-plane communication among NVE(s).

2. EVPN and NVO3 Terminology

   o EVPN: Ethernet Virtual Private Networks, as described in [RFC7432].

   o PE: Provider Edge router.

   o NVO3 or Overlay tunnels: Network Virtualization Over Layer-3
     tunnels. In this document, NVO3 tunnels or simply Overlay tunnels
     will be used interchangeably. Both terms refer to a way to
     encapsulate tenant frames or packets into IP packets whose IP
     Source Addresses (SA) or Destination Addresses (DA) belong to the
     underlay IP address space, and identify NVEs connected to the same
     underlay network.
     Examples of NVO3 tunnel encapsulations are VXLAN [RFC7348],
     [GENEVE] or MPLSoUDP [RFC7510].

   o VXLAN: Virtual eXtensible Local Area Network, an NVO3 encapsulation
     defined in [RFC7348].

   o GENEVE: Generic Network Virtualization Encapsulation, an NVO3
     encapsulation defined in [GENEVE].

   o CLOS: a multistage network topology described in [CLOS1953], where
     all the edge switches (or Leafs) are connected to all the core
     switches (or Spines). It is typically used in Data Centers
     nowadays.

   o ECMP: Equal Cost Multi-Path.

   o NVE: Network Virtualization Edge. It is a network entity that sits
     at the edge of an underlay network and implements L2 and/or L3
     network virtualization functions. The network-facing side of the
     NVE uses the underlying L3 network to tunnel tenant frames to and
     from other NVEs. The tenant-facing side of the NVE sends and
     receives Ethernet frames to and from individual Tenant Systems. In
     this document, an NVE could be implemented as a virtual switch
     within a hypervisor, a switch or a router, and runs EVPN in the
     control-plane.

   o EVI: EVPN Instance. It is a Layer-2 Virtual Network that uses an
     EVPN control-plane to exchange reachability information among the
     member NVEs. It corresponds to a set of MAC-VRFs of the same
     tenant. See MAC-VRF in this section.

   o BD: Broadcast Domain. It corresponds to a tenant IP subnet. If no
     suppression techniques are used, a BUM frame that is injected in
     a BD will reach all the NVEs that are attached to that BD. An EVI
     may contain one or multiple BDs depending on the service model
     [RFC7432]. This document will use the term BD to refer to a tenant
     subnet.

   o EVPN VLAN-based service model: it refers to one of the three
     service models defined in [RFC7432]. It is characterized as a BD
     that uses a single VLAN per physical access port to attach tenant
     traffic to the BD.
     In this service model, there is only one BD per EVI.

   o EVPN VLAN-bundle service model: similar to VLAN-based but uses a
     bundle of VLANs per physical port to attach tenant traffic to the
     BD. As in VLAN-based, in this model there is a single BD per EVI.

   o EVPN VLAN-aware bundle service model: similar to the VLAN-bundle
     model but each individual VLAN value is mapped to a different BD.
     In this model there are multiple BDs per EVI for a given tenant.
     Each BD is identified by an "Ethernet Tag", which is a control-
     plane value that identifies the routes for the BD within the EVI.

   o IP-VRF: an IP Virtual Routing and Forwarding table, as defined in
     [RFC4364]. It stores IP Prefixes that are part of the tenant's IP
     space, and are distributed among NVEs of the same tenant by EVPN.
     A Route Distinguisher (RD) and one or more Route Targets (RTs) are
     required properties of an IP-VRF. An IP-VRF is instantiated in an
     NVE for a given tenant, if the NVE is attached to multiple subnets
     of the tenant and local inter-subnet-forwarding is required across
     those subnets.

   o MAC-VRF: a MAC Virtual Routing and Forwarding table, as defined in
     [RFC7432]. It is the instantiation of an EVI (EVPN Instance) in an
     NVE. A Route Distinguisher (RD) and one or more Route Targets (RTs)
     are required properties of a MAC-VRF and they are normally
     different from the ones defined in the associated IP-VRF (if the
     MAC-VRF has an IRB interface).

   o BT: a Bridge Table, as defined in [RFC7432]. A BT is the
     instantiation of a BD in an NVE. When there is a single BD on a
     given EVI, the MAC-VRF is equivalent to the BT on that NVE.

   o AC: Attachment Circuit or logical interface associated to a given
     BT. To determine the AC on which a packet arrived, the NVE will
     examine the physical/logical port and/or VLAN tags (where the VLAN
     tags can be individual c-tags, s-tags or ranges of both).

   o IRB: Integrated Routing and Bridging interface. It refers to the
     logical interface that connects a BD instance (or a BT) to an IP-
     VRF and allows packets destined to a different subnet to be
     forwarded.

   o ES: Ethernet Segment. When a Tenant System (TS) is connected to one
     or more NVEs via a set of Ethernet links, then that set of links is
     referred to as an 'Ethernet segment'. Each ES is represented by a
     unique Ethernet Segment Identifier (ESI) in the NVO3 network and
     the ESI is used in EVPN routes that are specific to that ES.

   o DF and NDF: they refer to Designated Forwarder and Non-Designated
     Forwarder, which are the roles that a given PE can have in a given
     ES.

   o VNI: Virtual Network Identifier. Irrespective of the NVO3
     encapsulation, the tunnel header always includes a VNI that is
     added at the ingress NVE (based on the mapping table lookup) and
     identifies the BT at the egress NVE. This VNI is called VNI in
     VXLAN or GENEVE, VSID in NVGRE or Label in MPLSoGRE or MPLSoUDP.
     This document will refer to VNI as a generic Virtual Network
     Identifier for any NVO3 encapsulation.

   o BUM: Broadcast, Unknown unicast and Multicast frames.

   o SA and DA: they refer to Source Address and Destination Address.
     They are used along with MAC or IP, e.g. IP SA or MAC DA.

   o RT and RD: they refer to Route Target and Route Distinguisher.

   o PTA: Provider Multicast Service Interface Tunnel Attribute.

   o RT-1, RT-2, RT-3, etc.: they refer to Route Type followed by the
     type number as defined in the IANA registry for EVPN route types.

   o TS: Tenant System.

   o ARP and ND: they refer to Address Resolution Protocol and Neighbor
     Discovery protocol.

   o Ethernet Tag: Used to represent a BD that is configured on a given
     ES for the purpose of DF election.
     Note that any of the following may be used to represent a BD:
     VIDs (including Q-in-Q tags), configured IDs, VNIs (Virtual
     Extensible Local Area Network (VXLAN) Network Identifiers),
     normalized VIDs, I-SIDs (Service Instance Identifiers), etc., as
     long as the representation of the BDs is configured consistently
     across the multihomed PEs attached to that ES. The Ethernet Tag
     value MUST be different from zero.

3. Why Is EVPN Needed In NVO3 Networks?

   Data Centers have adopted NVO3 architectures mostly due to the issues
   discussed in [RFC7364]. The architecture of a Data Center is nowadays
   based on a CLOS design, where every Leaf is connected to a layer of
   Spines, and there are a number of ECMP paths between any two Leaf
   nodes. All the links between Leaf and Spine nodes are routed links,
   forming what we also know as an underlay IP Fabric. The underlay IP
   Fabric does not have issues with loops or flooding (like old Spanning
   Tree Data Center designs did), convergence is fast and ECMP provides
   a fairly optimal bandwidth utilization on all the links.

   On this architecture, and as discussed in [RFC7364], multi-tenant
   intra-subnet and inter-subnet connectivity services are provided by
   NVO3 tunnels. VXLAN [RFC7348] and [GENEVE] are two examples of such
   tunnels.

   Why is a control-plane protocol along with NVO3 tunnels required?
   There are three main reasons:

   a) Auto-discovery of the remote NVEs that are attached to the same
      VPN instance (Layer-2 and/or Layer-3) as the ingress NVE.

   b) Dissemination of the MAC/IP host information so that mapping
      tables can be populated on the remote NVEs.

   c) Advanced features such as MAC Mobility, MAC Protection, BUM and
      ARP/ND traffic reduction/suppression, Multi-homing, Prefix
      Independent Convergence (PIC) like functionality, Fast
      Convergence, etc.

   A possible approach to achieve points (a) and (b) above for
   multipoint Ethernet services is "Flood and Learn". "Flood and Learn"
   refers to not using a specific control-plane on the NVEs, but rather
   "Flooding" BUM traffic from the ingress NVE to all the egress NVEs
   attached to the same BD. The egress NVEs may then use data path MAC
   SA "Learning" on the frames received over the NVO3 tunnels. When the
   destination host replies and the frames arrive at the NVE that
   initially flooded BUM frames, that NVE will also "Learn" the MAC SA
   of the frame encapsulated on the NVO3 tunnel. This approach has the
   following drawbacks:

   o In order to Flood a given BUM frame, the ingress NVE must know the
     IP addresses of the remote NVEs attached to the same BD. This may
     be done as follows:

     - The remote tunnel IP addresses can be statically provisioned on
       the ingress NVE. If the ingress NVE receives a BUM frame for the
       BD on an ingress AC, it will do ingress replication and will send
       the frame to all the configured egress NVE IP DAs in the BD.

     - All the NVEs attached to the same BD can subscribe to an underlay
       IP Multicast Group that is dedicated to that BD. When an ingress
       NVE receives a BUM frame on an ingress AC, it will send a single
       copy of the frame encapsulated into an NVO3 tunnel, using the
       multicast address as IP DA of the tunnel. This solution requires
       PIM in the underlay network and the association of individual BDs
       to underlay IP multicast groups.

   o "Flood and Learn" solves the issues of auto-discovery and learning
     of the MAC to VNI/tunnel IP mapping on the NVEs for a given BD.
     However, it does not provide a solution for advanced features and
     it does not scale well.
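
   The "Flood and Learn" behavior described above can be illustrated
   with a minimal Python sketch. It assumes the first option (statically
   provisioned remote tunnel IP addresses with ingress replication). The
   class FloodAndLearnBD, its methods and the addresses used with it are
   hypothetical names chosen for illustration only; they do not come
   from any specification or implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

LOCAL = "local"  # marker for MACs learned on a local Attachment Circuit


@dataclass
class FloodAndLearnBD:
    """One Broadcast Domain on one NVE, with no control plane (hypothetical)."""

    remote_nves: List[str]  # statically provisioned tunnel IP DAs for the BD
    mac_table: Dict[str, str] = field(default_factory=dict)  # MAC -> LOCAL or tunnel IP

    def from_access(self, mac_sa: str, mac_da: str) -> List[str]:
        """Frame received on a local AC; return the tunnel IP DAs to send it to."""
        self.mac_table[mac_sa] = LOCAL
        target = self.mac_table.get(mac_da)
        if target is None:
            # Unknown destination (BUM addresses are never learned as a
            # source, so they also land here): flood one copy per remote NVE.
            return list(self.remote_nves)
        return [] if target == LOCAL else [target]

    def from_tunnel(self, tunnel_ip_sa: str, mac_sa: str) -> None:
        """Frame received over an NVO3 tunnel; learn the MAC SA in the data path."""
        self.mac_table[mac_sa] = tunnel_ip_sa
```

   In this sketch every NVE needs the full list of remote tunnel IP
   addresses before it can flood, and a MAC only becomes known after
   traffic from it has been seen. These are the two gaps that points
   (a) and (b) above ask a control-plane to fill.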

   EVPN provides a unified control-plane that solves the NVE auto-
   discovery, tenant MAC/IP dissemination and advanced features in a
   scalable way, while keeping the independence of the underlay IP
   Fabric, i.e., there is no need to enable PIM in the underlay network
   and maintain multicast states for tenant BDs.

   Section 4 describes how to apply EVPN to meet the control-plane
   requirements in an NVO3 network.

4. Applicability of EVPN to NVO3 Networks

   This section discusses the applicability of EVPN to NVO3 networks.
   The intent is not to provide a comprehensive explanation of the
   protocol itself but to give an introduction and point at the
   corresponding reference document, so that the reader can easily find
   more details if needed.

4.1. EVPN Route Types used in NVO3 Networks

   EVPN supports multiple Route Types and each type has a different
   function. For convenience, Table 1 shows a summary of all the
   existing EVPN route types and their usage. We will refer to these
   route types as RT-x throughout the rest of the document, where x is
   the type number included in the first column of Table 1.

   +----+------------------------+-------------------------------------+
   |Type|Description             |Usage                                |
   +----+------------------------+-------------------------------------+
   |1   |Ethernet Auto-Discovery |Multi-homing:                        |
   |    |                        | Per-ES: Mass withdrawal             |
   |    |                        | Per-EVI: aliasing/backup            |
   +----+------------------------+-------------------------------------+
   |2   |MAC/IP Advertisement    |Host MAC/IP dissemination            |
   |    |                        |Supports MAC mobility and protection |
   +----+------------------------+-------------------------------------+
   |3   |Inclusive Multicast     |NVE discovery and BUM flooding tree  |
   |    |Ethernet Tag            |setup                                |
   +----+------------------------+-------------------------------------+
   |4   |Ethernet Segment        |Multi-homing: ES auto-discovery and  |
   |    |                        |DF Election                          |
   +----+------------------------+-------------------------------------+
   |5   |IP Prefix               |IP Prefix dissemination              |
   +----+------------------------+-------------------------------------+
   |6   |Selective Multicast     |Indicate interest for a multicast    |
   |    |Ethernet Tag            |S,G or *,G                           |
   +----+------------------------+-------------------------------------+
   |7   |Multicast Join Synch    |Multi-homing: S,G or *,G state synch |
   +----+------------------------+-------------------------------------+
   |8   |Multicast Leave Synch   |Multi-homing: S,G or *,G leave synch |
   +----+------------------------+-------------------------------------+
   |9   |Per-Region I-PMSI A-D   |BUM tree creation across regions     |
   +----+------------------------+-------------------------------------+
   |10  |S-PMSI A-D              |Multicast tree for S,G or *,G states |
   +----+------------------------+-------------------------------------+
   |11  |Leaf A-D                |Used for responses to explicit       |
   |    |                        |tracking                             |
   +----+------------------------+-------------------------------------+

                       Table 1 EVPN route types

4.2. EVPN Basic Applicability For Layer-2 Services

   Although the applicability of EVPN to NVO3 networks spans multiple
   documents, EVPN's baseline specification is [RFC7432]. [RFC7432]
   allows multipoint layer-2 VPNs to be operated as [RFC4364] IP-VPNs,
   where MACs and the information to setup flooding trees are
   distributed by MP-BGP. Based on [RFC7432], [RFC8365] describes how to
   use EVPN to deliver Layer-2 services specifically in NVO3 Networks.

   Figure 1 represents a Layer-2 service deployed with an EVPN BD in an
   NVO3 network.

                                 +--TS2---+
                                 *        | Single-Active
                                 *        |   ESI-1
                               +----+  +----+
                               |BD1 |  |BD1 |
                 +-------------|    |--|    |-----------+
                 |             +----+  +----+           |
                 |              NVE2    NVE3          NVE4
                 |           EVPN NVO3 Network       +----+
            NVE1(IP-A)                               | BD1|=====+
          +-------------+       RT-2                 |    |     |
          | +-MAC-VRF1+ |     +-------+              +----+     |
          | | +----+  | |     |MAC1   |               NVE5     TS3
   TS1--------|BD1 |  | |     |IP1    |              +----+     |
   MAC1   | | +----+  | |     |Label L|--->          | BD1|=====+
   IP1    | +---------+ |     |NH IP-A|              |    | All-Active
          | Hypervisor  |     +-------+              +----+   ESI-2
          +-------------+                               |
                 +--------------------------------------+

          Figure 1 EVPN for L2 in an NVO3 Network - example

   In a simple NVO3 network, such as the example of Figure 1, these are
   the basic constructs that EVPN uses for Layer-2 services (or Layer-2
   Virtual Networks):

   o BD1 is an EVPN Broadcast Domain for a given tenant and TS1, TS2 and
     TS3 are connected to it. The five represented NVEs are attached to
     BD1 and are connected to the same underlay IP network. That is,
     each NVE learns the remote NVEs' loopback addresses via underlay
     routing protocol.

   o NVE1 is deployed as a virtual switch in a Hypervisor with IP-A as
     underlay loopback IP address. The rest of the NVEs in Figure 1 are
     physical switches and TS2/TS3 are multi-homed to them. TS1 is a
     virtual machine, identified by MAC1 and IP1.
     TS2 and TS3 are physically dual-connected to NVEs, hence they are
     normally not considered virtual machines.

4.2.1. Auto-Discovery and Auto-Provisioning of ES, Multi-Homing PEs and
       NVE services

   Auto-discovery is one of the basic capabilities of EVPN. The
   provisioning of EVPN components in NVEs is significantly automated,
   simplifying the deployment of services and minimizing manual
   operations that are prone to human error.

   These are some of the Auto-Discovery and Auto-Provisioning
   capabilities available in EVPN:

   o Automation on Ethernet Segments (ES): an ES is defined as a group
     of NVEs that are attached to the same TS or network. An ES is
     identified by an Ethernet Segment Identifier (ESI) in the control
     plane, but neither the ESI nor the NVEs that share the same ES are
     required to be manually provisioned in the local NVE:

     - If the multi-homed TS or network is running protocols such as
       LACP (Link Aggregation Control Protocol), MSTP (Multiple
       Spanning Tree Protocol), G.8032, etc. and all the NVEs in the ES
       can listen to the protocol PDUs to uniquely identify the multi-
       homed TS/network, then the ESI can be "auto-sensed" or "auto-
       provisioned" following the guidelines in [RFC7432] section 5.

     - As described in [RFC7432], EVPN can also auto-derive the BGP
       parameters required to advertise the presence of a local ES in
       the control plane (RT and RD). Local ESes are advertised using
       RT-4s and the ESI-import Route-Target used by RT-4s can be auto-
       derived based on the procedures of [RFC7432], section 7.6.

     - By listening to other RT-4s that match the local ESI and import
       RT, an NVE can also auto-discover the other NVEs participating in
       the multi-homing for the ES.

     - Once the NVE has auto-discovered all the NVEs attached to the
       same ES, the NVE can automatically perform the DF Election
       algorithm (which determines the NVE that will forward traffic to
       the multi-homed TS/network). EVPN guarantees that all the NVEs in
       the ES have a consistent DF Election.

   o Auto-provisioning of services: when deploying a Layer-2 Service for
     a tenant in an NVO3 network, all the NVEs attached to the same
     subnet must be configured with a MAC-VRF and the BD for the subnet,
     as well as certain parameters for them. Note that, if the EVPN
     service model is VLAN-based or VLAN-bundle, implementations do not
     normally have a specific provisioning for the BD (since it is in
     that case the same construct as the MAC-VRF). EVPN allows auto-
     deriving as many MAC-VRF parameters as possible. As an example, the
     MAC-VRF's RT and RD for the EVPN routes may be auto-derived.
     Section 5.1.2.1 in [RFC8365] specifies how to auto-derive a MAC-
     VRF's RT as long as the VLAN-based service model is implemented.
     [RFC7432] specifies how to auto-derive the RD.

4.2.2. Remote NVE Auto-Discovery

   Auto-discovery via MP-BGP is used to discover the remote NVEs
   attached to a given BD, NVEs participating in a given redundancy
   group, the tunnel encapsulation types supported by an NVE, etc.

   In particular, when a new MAC-VRF and BD are enabled, the NVE will
   advertise a new RT-3. Besides other fields, the RT-3 will encode the
   IP address of the advertising NVE, the Ethernet Tag (which is zero in
   case of VLAN-based and VLAN-bundle models) and also a PMSI Tunnel
   Attribute (PTA) that indicates the information about the intended way
   to deliver BUM traffic for the BD.

   In the example of Figure 1, when MAC-VRF1/BD1 are enabled, NVE1 will
   send an RT-3 including its own IP address, Ethernet-Tag for BD1 and
   the PTA.
   Assuming Ingress Replication (IR), the RT-3 will include an
   identification for IR in the PTA and the VNI the NVEs must use to
   send BUM traffic to the advertising NVE. The other NVEs in the BD
   will import the RT-3 and will add NVE1's IP address to the flooding
   list for BD1. Note that the RT-3 is also sent with a BGP
   encapsulation attribute [TUNNEL-ENCAP] that indicates what NVO3
   encapsulation the remote NVEs should use when sending BUM traffic to
   NVE1.

   Refer to [RFC7432] for more information about the RT-3 and forwarding
   of BUM traffic, and to [RFC8365] for its considerations on NVO3
   networks.

4.2.3. Distribution Of Tenant MAC and IP Information

   Tenant MAC/IP information is advertised to remote NVEs using RT-2s.
   Following the example of Figure 1:

   o In a given EVPN BD, TSes' MAC addresses are first learned at the
     NVE they are attached to, via data path or management plane
     learning. In Figure 1 we assume NVE1 learns MAC1/IP1 in the
     management plane (for instance, via a Cloud Management System)
     since the NVE is a virtual switch. NVE2, NVE3, NVE4 and NVE5 are
     TOR/Leaf switches and they normally learn MAC addresses via data
     path.

   o Once NVE1's BD1 learns MAC1/IP1, NVE1 advertises that information
     along with a VNI and Next Hop IP-A in an RT-2. The EVPN routes are
     advertised using the RD/RTs of the MAC-VRF where the BD belongs.
     All the NVEs in BD1 learn local MAC/IP addresses and advertise them
     in RT-2 routes in a similar way.

   o The remote NVEs can then add MAC1 to their mapping table for BD1
     (BT). For instance, when TS3 sends frames to NVE4 with MAC DA =
     MAC1, NVE4 does a MAC lookup on the BT that yields IP-A and Label
     L. NVE4 can then encapsulate the frame into an NVO3 tunnel with IP-
     A as the tunnel IP DA and L as the Virtual Network Identifier. Note
     that the RT-2 may also contain the host's IP address (as in the
     example of Figure 1).
     While the MAC of the received RT-2 is installed in the BT, the IP
     address may be installed in the Proxy-ARP/ND table (if enabled) or
     in the ARP/IP-VRF tables if the BD has an IRB. See Section 4.7.3
     for more information about Proxy-ARP/ND and Section 4.3 for more
     details about IRB and Layer-3 services.

   Refer to [RFC7432] and [RFC8365] for more information about the RT-2
   and forwarding of known unicast traffic.

4.3. EVPN Basic Applicability for Layer-3 Services

   [IP-PREFIX] and [INTER-SUBNET] are the reference documents that
   describe how EVPN can be used for Layer-3 services. Inter Subnet
   Forwarding in EVPN networks is implemented via IRB interfaces between
   BDs and IP-VRFs. As discussed, an EVPN BD corresponds to an IP
   subnet. When IP packets generated in a BD are destined to a different
   subnet (different BD) of the same tenant, the packets are sent to the
   IRB attached to the local BD in the source NVE. As discussed in
   [INTER-SUBNET], depending on how the IP packets are forwarded between
   the ingress NVE and the egress NVE, there are two forwarding models:
   Asymmetric and Symmetric.

   The Asymmetric model is illustrated in the example of Figure 2 and it
   requires the configuration of all the BDs of the tenant in all the
   NVEs attached to the same tenant. In that way, there is no need to
   advertise IP Prefixes between NVEs since all the NVEs are attached to
   all the subnets. It is called Asymmetric because the ingress and
   egress NVEs do not perform the same number of lookups in the data
   plane. In Figure 2, if TS1 and TS2 are in different subnets, and TS1
   sends IP packets to TS2, the following lookups are required in the
   data path: a MAC lookup (on BD1's table), an IP lookup (on the IP-
   VRF) and a MAC lookup (on BD2's table) at the ingress NVE1 and then
   only a MAC lookup at the egress NVE.
The two IP-VRFs in Figure 2 are 586 not connected by tunnels and all the connectivity between the NVEs is 587 done based on tunnels between the BDs. 589 +-------------------------------------+ 590 | EVPN NVO3 | 591 | | 592 NVE1 NVE2 593 +--------------------+ +--------------------+ 594 | +---+IRB +------+ | | +------+IRB +---+ | 595 TS1-----|BD1|----|IP-VRF| | | |IP-VRF|----|BD1| | 596 | +---+ | | | | | | +---+ | 597 | +---+ | | | | | | +---+ | 598 | |BD2|----| | | | | |----|BD2|----TS2 599 | +---+IRB +------+ | | +------+IRB +---+ | 600 +--------------------+ +--------------------+ 601 | | 602 +-------------------------------------+ 604 Figure 2 EVPN for L3 in an NVO3 Network - Asymmetric model 606 In the Symmetric model, depicted in Figure 3, the same number of data 607 path lookups is needed at the ingress and egress NVEs. For example, 608 if TS1 sends IP packets to TS3, the following data path lookups are 609 required: a MAC lookup at NVE1's BD1 table, an IP lookup at NVE1's 610 IP-VRF and then IP lookup and MAC lookup at NVE2's IP-VRF and BD3 611 respectively. In the Symmetric model, the Inter Subnet connectivity 612 between NVEs is done based on tunnels between the IP-VRFs. 614 +-------------------------------------+ 615 | EVPN NVO3 | 616 | | 617 NVE1 NVE2 618 +--------------------+ +--------------------+ 619 | +---+IRB +------+ | | +------+IRB +---+ | 620 TS1-----|BD1|----|IP-VRF| | | |IP-VRF|----|BD3|-----TS3 621 | +---+ | | | | | | +---+ | 622 | +---+IRB | | | | +------+ | 623 TS2-----|BD2|----| | | +--------------------+ 624 | +---+ +------+ | | 625 +--------------------+ | 626 | | 627 +-------------------------------------+ 629 Figure 3 EVPN for L3 in an NVO3 Network - Symmetric model 631 The Symmetric model scales better than the Asymmetric model because 632 it does not require the NVEs to be attached to all the tenant's 633 subnets. 
However, it requires the use of NVO3 tunnels on the IP-VRFs 634 and the exchange of IP Prefixes between the NVEs in the control 635 plane. EVPN uses RT-2 and RT-5 routes for the exchange of host IP 636 routes (in the case of RT-2 and RT-5) and IP Prefixes (RT-5s) of any 637 length. As an example, in Figure 3, NVE2 needs to advertise TS3's 638 host route and/or TS3's subnet, so that the IP lookup on NVE1's IP- 639 VRF succeeds. 641 [INTER-SUBNET] specifies the use of RT-2s for the advertisement of 642 host routes. Section 4.4.1 in [IP-PREFIX] specifies the use of RT-5s 643 for the advertisement of IP Prefixes in an "Interface-less IP-VRF-to- 644 IP-VRF Model". The Symmetric model for host routes can be implemented 645 following either approach: 647 a. [INTER-SUBNET] uses RT-2s to convey the information to populate 648 L2, ARP/ND and L3 FIB tables in the remote NVE. For instance, in 649 Figure 3, NVE2 would advertise a RT-2 with TS3's IP and MAC 650 addresses, and including two labels/VNIs: a label-3/VNI-3 that 651 identifies BD3 for MAC lookup (that would be used for L2 traffic 652 in case NVE1 was attached to BD3 too) and a label-1/VNI-1 that 653 identifies the IP-VRF for IP lookup (and will be used for L3 654 traffic). NVE1 imports the RT-2 and installs TS3's IP in the IP- 655 VRF route table with label-1/VNI-1. Traffic from e.g., TS2 to TS3, 656 will be encapsulated with label-1/VNI-1 and forwarded to NVE2. 658 b. [IP-PREFIX] uses RT-2s to convey the information to populate the 659 L2 FIB and ARP/ND tables, and RT-5s to populate the IP-VRF L3 FIB 660 table. For instance, in Figure 3, NVE2 would advertise a RT-2 661 including TS3's MAC and IP addresses with a single label-3/VNI-3. 662 In this example, this RT-2 wouldn't be imported by NVE1 because 663 NVE1 is not attached to BD3. In addition, NVE2 would advertise a 664 RT-5 with TS3's IP address and label-1/VNI-1. 
This RT-5 would be 665 imported by NVE1's IP-VRF and the host route installed in the L3 666 FIB associated to label-1/VNI-1. Traffic from TS2 to TS3 would be 667 encapsulated with label-1/VNI-1. 669 4.4. EVPN as a Control Plane for NVO3 Encapsulations and GENEVE 671 [RFC8365] describes how to use EVPN for NVO3 encapsulations, such as 672 VXLAN, nvGRE or MPLSoGRE. The procedures are easily applicable to 673 any other NVO3 encapsulation, in particular GENEVE. 675 The NVO3 working group has been working on different data plane 676 encapsulations. The Generic Network Virtualization Encapsulation 677 [GENEVE] has been recommended to be the proposed standard for NVO3 678 Encapsulation. The EVPN control plane can signal the GENEVE 679 encapsulation type in the BGP Tunnel Encapsulation Extended Community 680 (see [TUNNEL-ENCAP]). 682 The NVO3 encapsulation design team has made a recommendation in 683 [NVO3-ENCAP] for a control plane to: 685 1- Negotiate a subset of GENEVE option TLVs that can be carried on a 686 GENEVE tunnel 688 2- Enforce an order for GENEVE option TLVs and 690 3- Limit the total number of options that could be carried on a 691 GENEVE tunnel. 693 The EVPN control plane can easily extend the BGP Tunnel Encapsulation 694 Attribute sub-TLV [TUNNEL-ENCAP] to specify the GENEVE tunnel options 695 that can be received or transmitted over a GENEVE tunnel by a given 696 NVE. [EVPN-GENEVE] describes the EVPN control plane extensions to 697 support GENEVE. 699 4.5. EVPN OAM and application to NVO3 701 EVPN OAM (as in [EVPN-LSP-PING]) defines mechanisms to detect data 702 plane failures in an EVPN deployment over an MPLS network. These 703 mechanisms detect failures related to P2P and P2MP connectivity, for 704 multi-tenant unicast and multicast L2 traffic, between multi-tenant 705 access nodes connected to EVPN PE(s), and in a single-homed, single- 706 active or all-active redundancy model.
708 In general, EVPN OAM mechanisms defined for EVPN deployed in MPLS 709 networks are equally applicable for EVPN in NVO3 networks. 711 4.6. EVPN as the control plane for NVO3 security 713 EVPN can be used to signal the security protection capabilities of a 714 sender NVE, as well as what portion of an NVO3 packet (taking a 715 GENEVE packet as an example) can be protected by the sender NVE, to 716 ensure the privacy and integrity of tenant traffic carried over the 717 NVO3 tunnels. 719 4.7. Advanced EVPN Features For NVO3 Networks 721 This section describes how EVPN can be used to deliver advanced 722 capabilities in NVO3 networks. 724 4.7.1. Virtual Machine (VM) Mobility 726 [RFC7432] replaces the traditional Ethernet Flood-and-Learn behavior 727 among NVEs with BGP-based MAC learning, which in turn provides more 728 control over the location of MAC addresses in the BD and consequently 729 advanced features, such as MAC Mobility. If we assume that VM 730 Mobility means the VM's MAC and IP addresses move with the VM, EVPN's 731 MAC Mobility is the required procedure that facilitates VM Mobility. 732 According to [RFC7432] section 15, when a MAC is advertised for the 733 first time in a BD, all the NVEs attached to the BD will store 734 Sequence Number zero for that MAC. When the MAC "moves" within the 735 same BD but to a remote NVE, the NVE that just learned the MAC 736 locally increases the Sequence Number in the RT-2's MAC Mobility 737 extended community to indicate that it owns the MAC now. That makes 738 all the NVEs in the BD change their tables immediately with no need to 739 wait for any aging timer. EVPN guarantees a fast MAC Mobility without 740 flooding or black-holes in the BD. 742 4.7.2. MAC Protection, Duplication Detection and Loop Protection 744 The advertisement of MACs in the control plane allows advanced 745 features such as MAC protection, Duplication Detection and Loop 746 Protection.
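As an illustration of the Sequence Number handling described in section 4.7.1, the following minimal sketch models how an NVE might track MAC ownership in a BD. It is not taken from [RFC7432]; the class BridgeTable and its methods learn_local and receive_rt2 are hypothetical names used only for this example, and tie-breaking between equal Sequence Numbers is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class BridgeTable:
    """Hypothetical per-BD MAC table of one NVE: MAC -> (owner NVE, sequence)."""
    nve: str
    entries: Dict[str, Tuple[str, int]] = field(default_factory=dict)

    def learn_local(self, mac: str) -> Tuple[str, str, int]:
        """Learn a MAC on a local AC; return the (mac, nve, seq) to advertise."""
        known = self.entries.get(mac)
        if known is not None and known[0] == self.nve:
            # Already owned locally: nothing moved, keep the same sequence.
            return (mac, self.nve, known[1])
        # First advertisement uses Sequence Number zero; a move away from a
        # remote NVE increases the stored Sequence Number.
        seq = 0 if known is None else known[1] + 1
        self.entries[mac] = (self.nve, seq)
        return (mac, self.nve, seq)

    def receive_rt2(self, mac: str, owner: str, seq: int) -> bool:
        """Install a remote RT-2 if it is new or carries a higher sequence."""
        known = self.entries.get(mac)
        if known is None or seq > known[1]:
            self.entries[mac] = (owner, seq)
            return True
        return False


nve1, nve3, nve4 = BridgeTable("NVE1"), BridgeTable("NVE3"), BridgeTable("NVE4")

first = nve1.learn_local("MAC1")      # MAC1 first appears behind NVE1
for remote in (nve3, nve4):
    remote.receive_rt2(*first)

move = nve3.learn_local("MAC1")       # MAC1 moves to NVE3
for remote in (nve1, nve4):
    remote.receive_rt2(*move)

print(first, move, nve1.entries["MAC1"])
```

In this sketch the tables of NVE1 and NVE4 are updated as soon as the RT-2 with the higher Sequence Number is received, which mirrors the "no need to wait for any aging timer" behavior described above.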
748 [RFC7432] MAC Protection refers to EVPN's ability to indicate - in an 749 RT-2 - that a MAC must be protected by the NVE receiving the route. 750 The Protection is indicated in the "Sticky bit" of the MAC Mobility 751 extended community sent along the RT-2 for a MAC. NVEs' ACs that are 752 connected to subject-to-be-protected servers or VMs may set the 753 Sticky bit on the RT-2s sent for the MACs associated to the ACs. Also 754 statically configured MAC addresses should be advertised as Protected 755 MAC addresses, since they are not subject to MAC Mobility procedures. 757 [RFC7432] MAC Duplication Detection refers to EVPN's ability to 758 detect duplicate MAC addresses. A "MAC move" is a relearn event that 759 happens at an access AC or through an RT-2 with a Sequence Number 760 that is higher than the stored one for the MAC. When a MAC moves a 761 number of times N within an M-second window between two NVEs, the MAC 762 is declared as Duplicate and the detecting NVE does not re-advertise 763 the MAC anymore. 765 While [RFC7432] provides MAC Duplication Detection, it does not 766 protect the BD against loops created by backdoor links between NVEs. 767 However, the same principle (based on the Sequence Number) may be 768 extended to protect the BD against loops. When a MAC is detected as 769 duplicate, the NVE may install it as a black-hole MAC and drop 770 received frames with MAC SA and MAC DA matching that duplicate MAC. 771 Loop Protection is described in [LOOP]. 773 4.7.3. Reduction/Optimization of BUM Traffic In Layer-2 Services 774 In BDs with a significant amount of flooding due to Unknown unicast 775 and Broadcast frames, EVPN may help reduce and sometimes even 776 suppress the flooding. 778 In BDs where most of the Broadcast traffic is caused by ARP (Address 779 Resolution Protocol) and ND (Neighbor Discovery) protocols on the 780 TSes, EVPN's Proxy-ARP and Proxy-ND capabilities may reduce the 781 flooding drastically. 
The use of Proxy-ARP/ND is specified in [PROXY- 782 ARP-ND]. 784 Proxy-ARP/ND procedures along with the assumption that TSes always 785 issue a GARP (Gratuitous ARP) or an unsolicited Neighbor 786 Advertisement message when they come up in the BD, may drastically 787 reduce the unknown unicast flooding in the BD. 789 The flooding caused by TSes' IGMP/MLD or PIM messages in the BD may 790 also be suppressed by the use of IGMP/MLD and PIM Proxy functions, as 791 specified in [IGMP-MLD-PROXY] and [PIM-PROXY]. These two documents 792 also specify how to forward IP multicast traffic efficiently within 793 the same BD, translate soft state IGMP/MLD/PIM messages into hard 794 state BGP routes and provide fast-convergence redundancy for IP 795 Multicast on multi-homed Ethernet Segments (ESes). 797 4.7.4. Ingress Replication (IR) Optimization For BUM Traffic 799 When an NVE attached to a given BD needs to send BUM traffic for the 800 BD to the remote NVEs attached to the same BD, IR is a very common 801 option in NVO3 networks, since it is completely independent of the 802 multicast capabilities of the underlay network. Also, if the 803 optimization procedures to reduce/suppress the flooding in the BD are 804 enabled (section 4.7.3), in spite of creating multiple copies of the 805 same frame at the ingress NVE, IR may be good enough. However, in BDs 806 where Multicast (or Broadcast) traffic is significant, IR may be very 807 inefficient and cause performance issues on virtual-switch-based 808 NVEs. 810 [OPT-IR] specifies the use of AR (Assisted Replication) NVO3 tunnels 811 in EVPN BDs. AR retains the independence of the underlay network 812 while providing a way to forward Broadcast and Multicast traffic 813 efficiently. AR uses AR-REPLICATORs that can replicate the 814 Broadcast/Multicast traffic on behalf of the AR-LEAF NVEs. The AR- 815 LEAF NVEs are typically virtual-switches or NVEs with limited 816 replication capabilities. 
AR can work in a single-stage replication 817 mode (Non-Selective Mode) or in a dual-stage replication mode 818 (Selective Mode). Both modes are detailed in [OPT-IR]. 820 In addition, [OPT-IR] also describes a procedure to avoid sending 821 Broadcast, Multicast or Unknown unicast to certain NVEs that don't 822 need that type of traffic. This is done by enabling PFL (Pruned Flood 823 Lists) on a given BD. For instance, a virtual-switch NVE that learns 824 all its local MAC addresses for a BD via a Cloud Management System, 825 does not need to receive the BD's Unknown unicast traffic. PFLs help 826 optimize the BUM flooding in the BD. 828 4.7.5. EVPN Multi-homing 830 Another fundamental concept in EVPN is multi-homing. A given TS can 831 be multi-homed to two or more NVEs for a given BD, and the set of 832 links connected to the same TS is defined as an Ethernet Segment (ES). 833 EVPN supports single-active and all-active multi-homing. In single- 834 active multi-homing only one link in the ES is active. In all-active 835 multi-homing all the links in the ES are active for unicast traffic. 836 Both modes support load-balancing: 838 o Single-active multi-homing means per-service load-balancing 839 to/from the TS, for example, in Figure 1, for BD1 only one of the 840 NVEs can forward traffic from/to TS2. For a different BD, the 841 other NVE may forward traffic. 843 o All-active multi-homing means per-flow load-balancing for unicast 844 frames to/from the TS. That is, in Figure 1 and for BD1, both 845 NVE4 and NVE5 can forward known unicast traffic to/from TS3. For 846 BUM traffic only one of the two NVEs can forward traffic to TS3, 847 and both can forward traffic from TS3. 849 There are two key aspects of EVPN multi-homing: 851 o DF (Designated Forwarder) election: the DF is the NVE that 852 forwards the traffic to the ES in single-active mode. In case of 853 all-active, the DF is the NVE that forwards the BUM traffic to 854 the ES.
856 o Split-horizon function: prevents the TS from receiving echoed BUM 857 frames that the TS itself sent to the ES. This is especially 858 relevant in all-active ESes, where the TS may forward BUM frames 859 to a non-DF NVE that can flood the BUM frames back to the DF NVE 860 and then the TS. As an example, in Figure 1, assuming NVE4 is the 861 DF for ES-2 in BD1, BUM frames sent from TS3 to NVE5 will be 862 received at NVE4 and, since NVE4 is the DF for BD1, it will 863 forward them back to TS3. Split-horizon allows NVE4 (and any 864 multi-homed NVE for that matter) to identify if an EVPN BUM frame 865 is coming from the same ES or a different one, and if the frame belongs 866 to the same ES-2, NVE4 will not forward the BUM frame to TS3, in 867 spite of being the DF. 869 While [RFC7432] describes the default algorithm for the DF Election, 870 [RFC8584] and [PREF-DF] specify other algorithms and procedures that 871 optimize the DF Election. 873 The Split-horizon function is specified in [RFC7432] and it is 874 carried out by using a special ESI-label that identifies, in the 875 data path, all the BUM frames originated from a given NVE and 876 ES. Since the ESI-label is an MPLS label, it cannot be used in all 877 the non-MPLS NVO3 encapsulations, therefore [RFC8365] defines a 878 modified Split-horizon procedure that is based on the IP SA of the 879 NVO3 tunnel, known as "Local-Bias". It is worth noting that Local- 880 Bias only works for all-active multi-homing, and not for single- 881 active multi-homing. 883 4.7.6. EVPN Recursive Resolution for Inter-Subnet Unicast Forwarding 885 Section 4.3. describes how EVPN can be used for Inter Subnet 886 Forwarding among subnets of the same tenant. RT-2s and RT-5s allow 887 the advertisement of host routes and IP Prefixes (RT-5) of any 888 length. The procedures outlined by section 4.3. are similar to the 889 ones in [RFC4364], only for NVO3 tunnels.
However, [IP-PREFIX] also 890 defines advanced Inter Subnet Forwarding procedures that allow the 891 resolution of RT-5s to not only BGP next-hops but also "overlay 892 indexes" that can be a MAC, a GW IP or an ESI, all of them in the 893 tenant space. 895 Figure 4 illustrates an example that uses Recursive Resolution to a 896 GWIP as per [IP-PREFIX] section 4.4.2. In this example, IP-VRFs in 897 NVE1 and NVE2 are connected by an SBD (Supplementary BD). An SBD is a 898 BD that connects all the IP-VRFs of the same tenant, via IRB, and has 899 no ACs. NVE1 advertises the host route TS2-IP/L (IP address and 900 Prefix Length of TS2) in an RT-5 with overlay index GWIP=IP1. Also, 901 IP1 is advertised in an RT-2 associated to M1, VNI-S and BGP next-hop 902 NVE1. Upon importing the two routes, NVE2 installs TS2-IP/L in the 903 IP-VRF with a next-hop that is the GWIP IP1. NVE2 also installs M1 in 904 the SBD, with VNI-S and NVE1 as next-hop. If TS3 sends a packet with 905 IP DA=TS2, NVE2 will perform a Recursive Resolution of the RT-5 906 prefix information to the forwarding information of the correlated 907 RT-2. The RT-5's Recursive Resolution has several advantages such as 908 better convergence in scaled networks (since multiple RT-5s can be 909 invalidated with a single withdrawal of the overlay index route) or 910 the ability to advertise multiple RT-5s from an overlay index that 911 can move or change dynamically. [IP-PREFIX] describes a few use- 912 cases.
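The Recursive Resolution of an RT-5 to a GW IP overlay index, as described in the paragraph above, can be sketched as a two-step lookup. This is only an illustrative model and not a procedure taken from [IP-PREFIX]: the class RecursiveFib, its method names, the IPv4 addresses and the MAC value are all hypothetical, and the RT-2 state is keyed directly by GW IP for brevity.

```python
import ipaddress
from typing import Dict, NamedTuple, Optional


class Rt2(NamedTuple):
    """Forwarding information carried by the RT-2 for the overlay index."""
    mac: str
    vni: str
    next_hop: str


class RecursiveFib:
    """Hypothetical model of the state NVE2 keeps for Recursive Resolution."""

    def __init__(self) -> None:
        self.rt5: Dict[ipaddress.IPv4Network, ipaddress.IPv4Address] = {}
        self.rt2: Dict[ipaddress.IPv4Address, Rt2] = {}

    def add_rt5(self, prefix: str, gw_ip: str) -> None:
        # RT-5: IP Prefix -> overlay index (GW IP).
        self.rt5[ipaddress.ip_network(prefix)] = ipaddress.ip_address(gw_ip)

    def add_rt2(self, ip: str, mac: str, vni: str, next_hop: str) -> None:
        # RT-2 for the GW IP: MAC, VNI of the SBD and BGP next-hop.
        self.rt2[ipaddress.ip_address(ip)] = Rt2(mac, vni, next_hop)

    def withdraw_rt2(self, ip: str) -> None:
        self.rt2.pop(ipaddress.ip_address(ip), None)

    def resolve(self, dst: str) -> Optional[Rt2]:
        """Longest-prefix match on the RT-5s, then resolve the GW IP."""
        addr = ipaddress.ip_address(dst)
        matches = [p for p in self.rt5 if addr in p]
        if not matches:
            return None
        best = max(matches, key=lambda p: p.prefixlen)
        # If the RT-2 for the overlay index is missing, the RT-5 is unusable.
        return self.rt2.get(self.rt5[best])


fib = RecursiveFib()
fib.add_rt5("10.0.2.2/32", "192.0.2.1")       # TS2-IP/L with GWIP=IP1
fib.add_rt5("10.0.2.0/24", "192.0.2.1")       # a second RT-5, same GW IP
fib.add_rt2("192.0.2.1", "00:00:5e:00:53:01", "VNI-S", "NVE1")

print(fib.resolve("10.0.2.2"))
fib.withdraw_rt2("192.0.2.1")                 # one withdrawal ...
print(fib.resolve("10.0.2.2"), fib.resolve("10.0.2.7"))  # ... affects both RT-5s
```

The last two lines illustrate the convergence property mentioned above: withdrawing the single overlay index route leaves every RT-5 that points to it without forwarding information.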
914 +-------------------------------------+ 915 | EVPN NVO3 | 916 | + 917 NVE1 NVE2 918 +--------------------+ +--------------------+ 919 | +---+IRB +------+ | | +------+IRB +---+ | 920 TS1-----|BD1|----|IP-VRF| | | |IP-VRF|----|BD3|-----TS3 921 | +---+ | |-(SBD)------(SBD)-| | +---+ | 922 | +---+IRB | |IRB(IP1/M1) IRB+------+ | 923 TS2-----|BD2|----| | | +-----------+--------+ 924 | +---+ +------+ | | 925 +--------------------+ | 926 | RT-2(M1,IP1,VNI-S,NVE1)--> | 927 | RT-5(TS2-IP/L,GWIP=IP1)--> | 928 +-------------------------------------+ 930 Figure 4 EVPN for L3 - Recursive Resolution example 932 4.7.7. EVPN Optimized Inter-Subnet Multicast Forwarding 934 The concept of the SBD described in section 4.7.6 is also used in 935 [OISM] for the procedures related to Inter Subnet Multicast 936 Forwarding across BDs of the same tenant. For instance, [OISM] allows 937 the efficient forwarding of IP multicast traffic from any BD to any 938 other BD (or even to the same BD where the Source resides). The 939 [OISM] procedures are supported along with EVPN multi-homing, and for 940 any tree allowed on NVO3 networks, including IR or AR. [OISM] also 941 describes the interoperability between EVPN and other multicast 942 technologies such as MVPN (Multicast VPN) and PIM for inter-subnet 943 multicast. 945 [EVPN-MVPN] describes another potential solution to support EVPN to 946 MVPN interoperability. 948 4.7.8. Data Center Interconnect (DCI) 950 Tenant Layer-2 and Layer-3 services deployed on NVO3 networks must be 951 extended to remote NVO3 networks that are connected via non-NVO3 WAN 952 networks (mostly MPLS based WAN networks). [EVPN-DCI] defines some 953 architectural models that can be used to interconnect NVO3 networks 954 via MPLS WAN networks. 956 When NVO3 networks are connected by MPLS WAN networks, [EVPN-DCI] 957 specifies how EVPN can be used end-to-end, in spite of using a 958 different encapsulation in the WAN.
960 Even if EVPN can also be used in the WAN for Layer-2 and Layer-3 961 services, there may be a need to provide a Gateway function between 962 EVPN for NVO3 encapsulations and IPVPN for MPLS tunnels. [EVPN-IPVPN] 963 specifies the interworking function between EVPN and IPVPN for 964 unicast Inter Subnet Forwarding. If Inter Subnet Multicast Forwarding 965 is also needed across an IPVPN WAN, [OISM] describes the required 966 interworking between EVPN and MVPN. 968 5. Conclusion 970 EVPN provides a unified control-plane that solves the NVE auto- 971 discovery, tenant MAC/IP dissemination and advanced features required 972 by NVO3 networks, in a scalable way and keeping the independence of 973 the underlay IP Fabric, i.e. there is no need to enable PIM in the 974 underlay network and maintain multicast states for tenant BDs. 976 This document justifies the use of EVPN for NVO3 networks, discusses 977 its applicability to basic Layer-2 and Layer-3 connectivity 978 requirements, as well as advanced features such as MAC-mobility, MAC 979 Protection and Loop Protection, multi-homing, DCI and much more. 981 6. Conventions used in this document 983 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 984 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 985 "OPTIONAL" in this document are to be interpreted as described in BCP 986 14 [RFC2119] [RFC8174] when, and only when, they appear in all 987 capitals, as shown here. 989 7. Security Considerations 991 This document does not introduce any new procedure or additional 992 signaling in EVPN, and relies on the security considerations of the 993 individual specifications used as a reference throughout the 994 document. In particular, and as mentioned in [RFC7432], control plane 995 and forwarding path protection are aspects to secure in any EVPN 996 domain, when applied to NVO3 networks.
998 The security techniques mentioned in [RFC7432], such as those 999 discussed in [RFC5925] to authenticate BGP messages and those 1000 included in [RFC4271], [RFC4272] and [RFC6952] to secure BGP, are 1001 relevant for EVPN in NVO3 networks as well. 1003 8. IANA Considerations 1005 None. 1007 9. References 1009 9.1 Normative References 1011 [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., 1012 Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet 1013 VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, . 1016 [RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y. 1017 Rekhter, "Framework for Data Center (DC) Network Virtualization", 1018 RFC 7365, DOI 10.17487/RFC7365, October 2014, . 1021 [RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L., 1022 Kreeger, L., and M. Napierala, "Problem Statement: Overlays for 1023 Network Virtualization", RFC 7364, DOI 10.17487/RFC7364, October 1024 2014, . 1026 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1027 Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1028 1997, . 1030 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in 1031 RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 1032 2017, . 1034 9.2 Informative References 1036 [IP-PREFIX] Rabadan et al., "IP Prefix Advertisement in EVPN", 1037 draft-ietf-bess-evpn-prefix-advertisement-11, work in progress, May, 1038 2018. 1040 [INTER-SUBNET] Sajassi et al., "IP Inter-Subnet Forwarding in EVPN", 1041 draft-ietf-bess-evpn-inter-subnet-forwarding-08, work in progress, 1042 March, 2019. 1044 [RFC8365] Sajassi-Drake et al., "A Network Virtualization Overlay 1045 Solution using EVPN", RFC 8365, March 2017, . 1048 [GENEVE] Gross et al., "Geneve: Generic Network Virtualization 1049 Encapsulation", draft-ietf-nvo3-geneve-13, work in progress, March 1050 2019. 1052 [NVO3-ENCAP] Boutros et al., "NVO3 Encapsulation Considerations", 1053 draft-ietf-nvo3-encap-03, work in progress, July 2019.
1055 [TUNNEL-ENCAP] Rosen et al., "The BGP Tunnel Encapsulation 1056 Attribute", draft-ietf-idr-tunnel-encaps-12, work in progress, May 1057 2019. 1059 [EVPN-LSP-PING] Jain et al., "LSP-Ping Mechanisms for EVPN and PBB- 1060 EVPN", draft-ietf-bess-evpn-lsp-ping-00, work in progress, May 2019. 1062 [LOOP] Rabadan et al., "Loop Protection in EVPN networks", draft- 1063 snr-bess-evpn-loop-protect-02, work in progress, August 2018. 1065 [PROXY-ARP-ND] Rabadan et al., "Operational Aspects of Proxy-ARP/ND 1066 in EVPN Networks", draft-ietf-bess-evpn-proxy-arp-nd-07, work in 1067 progress, July 2019. 1069 [IGMP-MLD-PROXY] Sajassi et al., "IGMP and MLD Proxy for EVPN", 1070 draft-ietf-bess-evpn-igmp-mld-proxy-03, work in progress, June 2019. 1072 [PIM-PROXY] Rabadan et al., "PIM Proxy in EVPN Networks", draft-skr- 1073 bess-evpn-pim-proxy-01, work in progress, October 2017. 1075 [OPT-IR] Rabadan et al., "Optimized Ingress Replication solution for 1076 EVPN", draft-ietf-bess-evpn-optimized-ir-06, work in progress, 1077 October 2018. 1079 [RFC8584] Rabadan-Mohanty et al., "Framework for EVPN Designated 1080 Forwarder Election Extensibility", , April 2019. 1083 [PREF-DF] Rabadan et al., "Preference-based EVPN DF Election", 1084 draft-ietf-bess-evpn-pref-df-04, work in progress, June 2019. 1086 [OISM] Lin et al., "EVPN Optimized Inter-Subnet Multicast (OISM) 1087 Forwarding", draft-ietf-bess-evpn-irb-mcast-02, work in progress, 1088 January 2019. 1090 [EVPN-DCI] Rabadan et al., "Interconnect Solution for EVPN Overlay 1091 networks", draft-ietf-bess-dci-evpn-overlay-10, work in progress, 1092 March 2018. 1094 [BUM-UPDATE] Zhang et al., "Updates on EVPN BUM Procedures", draft- 1095 ietf-bess-evpn-bum-procedure-updates-06, work in progress, June 2019. 1097 [EVPN-IPVPN] Rabadan-Sajassi et al., "EVPN Interworking with IPVPN", 1098 draft-ietf-sajassi-bess-evpn-ipvpn-interworking-01, work in progress, 1099 March 2019.
1101 [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, 1102 L., Sridhar, T., Bursell, M., and C. Wright, "Virtual eXtensible 1103 Local Area Network (VXLAN): A Framework for Overlaying Virtualized 1104 Layer 2 Networks over Layer 3 Networks", RFC 7348, DOI 1105 10.17487/RFC7348, August 2014, . 1108 [RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, 1109 "Encapsulating MPLS in UDP", RFC 7510, DOI 10.17487/RFC7510, April 1110 2015, . 1112 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 1113 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006, 1114 . 1116 [CLOS1953] Clos, C., "A Study of Non-Blocking Switching Networks", 1117 The Bell System Technical Journal, Vol. 32(2), DOI 10.1002/j.1538- 1118 7305.1953.tb01433.x, March 1953. 1120 [EVPN-GENEVE] Boutros et al., "EVPN control plane for Geneve", 1121 draft-boutros-bess-evpn-geneve-04, work in progress, March 2019. 1123 [EVPN-MVPN] Sajassi et al., "Seamless Multicast Interoperability 1124 between EVPN and MVPN PEs", draft-sajassi-bess-evpn-mvpn-seamless- 1125 interop-04, work in progress, July 2019. 1127 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1128 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, June 2010, 1129 . 1131 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 1132 Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, 1133 January 2006, . 1135 [RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis", RFC 1136 4272, DOI 10.17487/RFC4272, January 2006, . 1139 [RFC6952] Jethanandani, M., Patel, K., and L. Zheng, "Analysis of 1140 BGP, LDP, PCEP, and MSDP Issues According to the Keying and 1141 Authentication for Routing Protocols (KARP) Design Guide", RFC 6952, 1142 DOI 10.17487/RFC6952, May 2013, . 1145 10. Acknowledgments 1147 The authors want to thank Aldrin Isaac for his comments. 1149 11. Contributors 1151 12. Authors' Addresses 1153 Jorge Rabadan (Editor) 1154 Nokia 1155 777 E. 
Middlefield Road 1156 Mountain View, CA 94043 USA 1157 Email: jorge.rabadan@nokia.com 1159 Sami Boutros 1160 VMware 1161 Email: boutross@vmware.com 1163 Matthew Bocci 1164 Nokia 1165 Email: matthew.bocci@nokia.com 1167 Ali Sajassi 1168 Cisco 1169 Email: sajassi@cisco.com