BESS Working Group                                             A. Sajassi
Internet Draft                                                  S. Thoria
Category: Standards Track                                           Cisco
                                                                 A. Gupta
                                                              Avi Networks
                                                                  L. Jalil
                                                                   Verizon

Expires: May 22, 2019                                     October 22, 2018


     Seamless Multicast Interoperability between EVPN and MVPN PEs
           draft-sajassi-bess-evpn-mvpn-seamless-interop-03

Abstract

   The Ethernet Virtual Private Network (EVPN) solution is becoming
   pervasive for Network Virtualization Overlay (NVO) services in data
   center (DC) networks and as the next-generation VPN service in
   service provider (SP) networks.

   As service providers transform the networks in their central offices
   (COs) toward next-generation data centers with Software Defined
   Networking (SDN) based fabrics and Network Function Virtualization
   (NFV), they want to be able to maintain their offered services,
   including the Multicast VPN (MVPN) service, between their existing
   networks and their new Service Provider Data Center (SPDC) networks
   seamlessly, without the use of gateway devices. They want such
   seamless interoperability between their new SPDCs and their existing
   networks in order to a) reduce cost, b) achieve optimum forwarding,
   and c) reduce provisioning. This document describes a unified
   solution based on RFC 6513 and RFC 6514 for seamless
   interoperability of Multicast VPN between EVPN and MVPN PEs.
   Furthermore, it describes how the proposed solution can be used as a
   routed multicast solution in data centers with only EVPN PEs.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Requirements Language
   3. Terminology
   4. Requirements
   4.1. Optimum Forwarding
   4.2. Optimum Replication
   4.3. All-Active and Single-Active Multi-Homing
   4.4. Inter-AS Tree Stitching
   4.5. EVPN Service Interfaces
   4.6. Distributed Anycast Gateway
   4.7. Selective & Aggregate Selective Tunnels
   4.8. Tenants' (S,G) or (*,G) States
   4.9. Zero Disruption upon BD/Subnet Addition
   4.10. No Changes to Existing EVPN Service Interface Models
   5. IRB Unicast versus IRB Multicast
   5.1. Emulated Virtual LAN Service
   6. Solution Overview
   6.1. Operational Model for EVPN IRB PEs
   6.2. Unicast Route Advertisements for IP Multicast Source
   6.3. Multi-homing of IP Multicast Source and Receivers
   6.3.1. Single-Active Multi-Homing
   6.3.2. All-Active Multi-Homing
   6.4. Mobility for Tenant's Sources and Receivers
   6.5. Intra-Subnet BUM Traffic Handling
   7. Control Plane Operation
   7.1. Intra-ES IP Multicast Tunnel
   7.2. Intra-Subnet BUM Tunnel
   7.3. Inter-Subnet IP Multicast Tunnel
   7.4. IGMP Hosts as TSes
   7.5. TS PIM Routers
   8. Data Plane Operation
   8.1. Intra-Subnet L2 Switching
   8.2. Inter-Subnet L3 Routing
   9. DCs with only EVPN PEs
   9.1. Setup of overlay multicast delivery
   9.2. Handling of different encapsulations
   9.2.1. MPLS Encapsulation
   9.2.2. VxLAN Encapsulation
   9.2.3. Other Encapsulation
   10. DCI with MPLS in WAN and VxLAN in DCs
   10.1. Control plane inter-connect
   10.2. Data plane inter-connect
   11. IANA Considerations
   12. Security Considerations
   13. Acknowledgements
   14. References
   14.1. Normative References
   14.2. Informative References
   15. Authors' Addresses
   Appendix A. Use Cases
   A.1. DCs with only IGMP/MLD hosts w/o tenant router
   A.2. DCs with a mix of IGMP/MLD hosts & multicast routers
        running PIM-SSM
   A.3. DCs with a mix of IGMP/MLD hosts & multicast routers
        running PIM-ASM
   A.4. DCs with a mix of IGMP/MLD hosts & multicast routers
        running PIM-Bidir
1. Introduction

   The Ethernet Virtual Private Network (EVPN) solution is becoming
   pervasive for Network Virtualization Overlay (NVO) services in data
   center (DC) networks and as the next-generation VPN service in
   service provider (SP) networks.

   As service providers transform the networks in their COs toward
   next-generation data centers with Software Defined Networking (SDN)
   based fabrics and Network Function Virtualization (NFV), they want
   to be able to maintain their offered services, including the
   Multicast VPN (MVPN) service, between their existing networks and
   their new SPDC networks seamlessly, without the use of gateway
   devices. There are several reasons for having such seamless
   interoperability between their new DCs and their existing networks:

   - Lower Cost: Gateway devices need very high scalability to handle
   VPN services for their DCs and as such need to handle a large number
   of VPN instances (in the tens or hundreds of thousands) and a very
   large number of routes (e.g., in the tens of millions). For the same
   speeds and feeds, these high-scale gateway boxes are much more
   expensive than the edge devices (e.g., PEs and TORs), which support
   a much smaller number of routes and VPN instances.

   - Optimum Forwarding: In a given CO, both EVPN PEs and MVPN PEs can
   be connected to the same fabric/network (e.g., the same IGP domain).
   In such scenarios, service providers want optimum forwarding among
   these PE devices without the use of gateway devices, because if
   gateway devices are used, the IP multicast traffic between EVPN and
   MVPN PEs can no longer be forwarded optimally and, in some cases,
   may even get tromboned. Furthermore, when an SPDC network spans
   multiple LATAs (multiple geographic areas) and gateways are used
   between EVPN and MVPN PEs, then with respect to IP multicast
   traffic, only one gateway can be the designated forwarder (DF)
   between the EVPN and MVPN PEs. Such scenarios not only result in
   non-optimum forwarding but can also result in tromboning of IP
   multicast traffic between the two LATAs when both the source and
   destination PEs are in the same LATA while the DF gateway is elected
   in a different LATA.

   - Less Provisioning: If gateways are used, then the operator needs
   to configure per-tenant information on the gateways. In other words,
   for each tenant that is configured, one (or maybe two) additional
   touch points are needed.

   This document describes a unified solution based on [RFC6513] and
   [RFC6514] for seamless interoperability of multicast VPN between
   EVPN and MVPN PEs. Furthermore, it describes how the proposed
   solution can be used as a routed multicast solution in data centers
   with only EVPN PEs (e.g., routed multicast VPN only among EVPN PEs).

2. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to
   be interpreted as described in [RFC2119] only when they appear in
   all upper case. They may also appear in lower or mixed case as
   English words, without any normative meaning.
3. Terminology

   Most of the terminology used in this document comes from [RFC8365].

   Broadcast Domain: In a bridged network, the broadcast domain
   corresponds to a Virtual LAN (VLAN), where a VLAN is typically
   represented by a single VLAN ID (VID) but can be represented by
   several VIDs where Shared VLAN Learning (SVL) is used per [802.1Q].

   Bridge Table: An instantiation of a broadcast domain on a MAC-VRF.

   VXLAN: Virtual Extensible LAN

   POD: Point of Delivery

   NV: Network Virtualization

   NVO: Network Virtualization Overlay

   NVE: Network Virtualization Edge

   VNI: Virtual Network Identifier (for VXLAN)

   EVPN: Ethernet VPN

   EVI: An EVPN instance spanning the Provider Edge (PE) devices
   participating in that EVPN

   MAC-VRF: A Virtual Routing and Forwarding table for Media Access
   Control (MAC) addresses on a PE

   IP-VRF: A Virtual Routing and Forwarding table for Internet Protocol
   (IP) addresses on a PE

   Ethernet Segment (ES): When a customer site (device or network) is
   connected to one or more PEs via a set of Ethernet links, that set
   of links is referred to as an 'Ethernet segment'.

   Ethernet Segment Identifier (ESI): A unique non-zero identifier that
   identifies an Ethernet segment is called an 'Ethernet Segment
   Identifier'.

   Ethernet Tag: An Ethernet tag identifies a particular broadcast
   domain, e.g., a VLAN. An EVPN instance consists of one or more
   broadcast domains.

   PE: Provider Edge device.

   Single-Active Redundancy Mode: When only a single PE, among all the
   PEs attached to an Ethernet segment, is allowed to forward traffic
   to/from that Ethernet segment for a given VLAN, the Ethernet segment
   is defined to be operating in Single-Active redundancy mode.

   All-Active Redundancy Mode: When all PEs attached to an Ethernet
   segment are allowed to forward known unicast traffic to/from that
   Ethernet segment for a given VLAN, the Ethernet segment is defined
   to be operating in All-Active redundancy mode.

   PIM-SM: Protocol Independent Multicast - Sparse-Mode

   PIM-SSM: Protocol Independent Multicast - Source-Specific Multicast

   Bidir PIM: Bidirectional PIM

   CO: Central Office of a service provider

   SPDC: Service Provider Data Center

4. Requirements

   This section describes the requirements specific to providing
   seamless multicast VPN service between MVPN- and EVPN-capable
   networks.

4.1. Optimum Forwarding

   The solution SHALL support optimum multicast forwarding between EVPN
   and MVPN PEs within a network. The network can be confined to a CO
   or it can span multiple LATAs. The solution SHALL support optimum
   multicast forwarding with both ingress replication tunnels and P2MP
   tunnels.

4.2. Optimum Replication

   For EVPN PEs with IRB capability, the solution SHALL use only a
   single multicast tunnel among EVPN and MVPN PEs for IP multicast
   traffic. Multicast tunnels can be either ingress replication tunnels
   or P2MP tunnels. The solution MUST support optimum replication for
   both intra-subnet and inter-subnet IP multicast traffic:

   - Non-IP traffic SHALL be forwarded per the EVPN baseline [RFC7432]
   or [RFC8365].

   - If a Multicast VPN spans both intra- and inter-subnet traffic,
   then for ingress replication, regardless of whether the traffic is
   intra- or inter-subnet, only a single copy of the IP multicast
   traffic SHALL be sent from the source PE to the destination PE.
   - If a Multicast VPN spans both intra- and inter-subnet traffic,
   then for P2MP tunnels, regardless of whether the traffic is intra-
   or inter-subnet, only a single copy of the multicast data SHALL be
   transmitted by the source PE. The source PE can be either an EVPN or
   an MVPN PE, and the receiving PEs can be a mix of EVPN and MVPN PEs
   - i.e., a multicast VPN can be spread across both EVPN and MVPN PEs.

4.3. All-Active and Single-Active Multi-Homing

   The solution MUST support multi-homing of source devices and
   receivers that sit in the same subnet (e.g., VLAN) and are
   multi-homed to EVPN PEs. The solution SHALL allow for both
   Single-Active and All-Active multi-homing. The solution MUST prevent
   loops during steady and transient states, just like the EVPN
   baseline solution [RFC7432] and [RFC8365], for all multi-homing
   types.

4.4. Inter-AS Tree Stitching

   The solution SHALL support multicast tree stitching when the tree
   spans multiple Autonomous Systems.

4.5. EVPN Service Interfaces

   The solution MUST support all EVPN service interfaces listed in
   section 6 of [RFC7432]:

   - VLAN-based service interface
   - VLAN-bundle service interface
   - VLAN-aware bundle service interface

4.6. Distributed Anycast Gateway

   The solution SHALL support distributed anycast gateways for tenant
   workloads on NVE devices operating in EVPN-IRB mode.

4.7. Selective & Aggregate Selective Tunnels

   The solution SHALL support selective and aggregate selective
   P-tunnels as well as inclusive and aggregate inclusive P-tunnels.
   When selective tunnels are used, multicast traffic SHOULD only be
   forwarded to the remote PEs that have receivers - i.e., if there are
   no receivers at a remote PE, the multicast traffic SHOULD NOT be
   forwarded to that PE, and if there are no receivers on any remote
   PEs, the multicast traffic SHOULD NOT be forwarded to the core.

4.8. Tenants' (S,G) or (*,G) States

   The solution SHOULD store (C-S,C-G) and (C-*,C-G) states only on PE
   devices that have interest in such states - i.e., PE devices that
   have sources and/or receivers interested in such multicast groups -
   hence reducing memory and processing requirements.

4.9. Zero Disruption upon BD/Subnet Addition

   In DC environments, various Bridge Domains are provisioned and
   removed on a regular basis due to host mobility and policy and
   tenant changes. Such a change in BD configuration should not affect
   existing flows within the same BD or any other BD in the network.

4.10. No Changes to Existing EVPN Service Interface Models

   VLAN-aware bundle service as defined in [RFC7432] typically does not
   require any VLAN ID translation from one tenant site to another -
   i.e., the same set of VLAN IDs is configured consistently on all
   tenant segments. In such scenarios, the EVPN-IRB multicast service
   MUST maintain the same mode of operation and SHALL NOT require any
   VLAN ID translation.

5. IRB Unicast versus IRB Multicast

   [EVPN-IRB] describes the operation of EVPN PEs in IRB mode for
   unicast traffic. The same IRB model used for unicast traffic in
   [EVPN-IRB], where an IP-VRF in an EVPN PE is attached to one or more
   bridge tables (BTs) via virtual IRB interfaces, is also applicable
   to multicast traffic.
   However, there are some noticeable differences between the IRB
   operation for unicast traffic described in [EVPN-IRB] and that for
   multicast traffic described in this document. For unicast traffic,
   the intra-subnet traffic is bridged within the MAC-VRF associated
   with that subnet (i.e., a lookup based on MAC-DA is performed),
   whereas the inter-subnet traffic is routed in the corresponding
   IP-VRF (i.e., a lookup based on IP-DA is performed). A given tenant
   can have one or more IP-VRFs; however, without loss of generality,
   this document assumes one IP-VRF per tenant. In the context of a
   given tenant's multicast traffic, the intra-subnet traffic is
   bridged for non-IP traffic and Layer-2 switched for IP traffic,
   whereas the tenant's inter-subnet multicast traffic is always routed
   in the corresponding IP-VRF. The difference between bridging and L2
   switching for multicast traffic is that the former uses a MAC-DA
   lookup for forwarding the multicast traffic, whereas the latter uses
   an IP-DA lookup for such forwarding, where the forwarding states are
   built in the MAC-VRF using IGMP/MLD or PIM snooping.

5.1. Emulated Virtual LAN Service

   EVPN does not provide a Virtual LAN (VLAN) service per [IEEE802.1Q]
   but rather an emulated VLAN service. This VLAN service emulation is
   not only done for unicast traffic but is also extended to
   intra-subnet multicast traffic as described in [EVPN-IGMP-PROXY] and
   [EVPN-PIM-PROXY]. For intra-subnet multicast, an EVPN PE builds
   multicast forwarding states in its bridge table (BT) based on
   snooping of IGMP/MLD and/or PIM messages, and forwarding is
   performed based on the destination IP multicast address of the
   Ethernet frame rather than the destination MAC address, as noted
   above. In order to enable seamless integration of EVPN and MVPN PEs,
   this document extends the concept of an emulated VLAN service to
   multicast IRB applications such that intra-subnet IP multicast
   traffic is treated the same as inter-subnet IP multicast traffic.
   This means that intra-subnet IP multicast traffic destined to remote
   PEs gets routed instead of being L2-switched - i.e., the TTL value
   gets decremented and the Ethernet header of the L2 frame is
   decapsulated and encapsulated at both the ingress and egress PEs. It
   should be noted that non-IP multicast and L2 broadcast traffic still
   gets bridged, and frames get forwarded based on their destination
   MAC addresses.

6. Solution Overview

   This section describes a multicast VPN solution based on [RFC6513]
   and [RFC6514] for EVPN PEs operating in IRB mode that want to
   perform seamless interoperability with their MVPN PE counterparts.

6.1. Operational Model for EVPN IRB PEs

   Without loss of generality, this section assumes that all EVPN PEs
   have IRB capability and operate in IRB mode for both unicast and
   multicast traffic (e.g., all EVPN PEs are homogeneous in terms of
   their capabilities and operational modes). As will be seen later, an
   EVPN network can consist of a mix of PEs, where some are capable of
   multicast IRB and some are not, and the multicast operation of such
   a heterogeneous EVPN network is an extension of that of a
   homogeneous EVPN network. Therefore, we start with the multicast IRB
   solution description for the homogeneous EVPN network.
   The EVPN PEs terminate IGMP/MLD messages from tenant host devices
   and PIM messages from tenant routers on their IRB interfaces, thus
   avoiding sending these messages over the MPLS/IP core. A tenant
   virtual/physical router (e.g., CE) attached to an EVPN PE becomes a
   multicast routing adjacency of that PE. Furthermore, the PE uses the
   MVPN BGP protocol and procedures per [RFC6513] and [RFC6514]. With
   respect to the multicast routing protocol between a tenant's
   virtual/physical router and the PE to which it is attached, any of
   the following PIM protocols is supported per [RFC6513]: PIM-SM with
   Any Source Multicast (ASM) mode, PIM-SM with Source Specific
   Multicast (SSM) mode, and PIM Bidirectional (BIDIR) mode. Support of
   PIM-DM (Dense Mode) is excluded in this document per [RFC6513].

   The EVPN PEs use the MVPN BGP routes defined in [RFC6514] to convey
   tenant (S,G) or (*,G) states to other MVPN or EVPN PEs and to set up
   overlay trees (inclusive or selective) for a given MVPN instance.
   The root or a leaf of such an overlay tree is terminated on an EVPN
   or MVPN PE. Furthermore, this inclusive or selective overlay tree is
   terminated on a single IP-VRF of the EVPN or MVPN PE. In the case of
   an EVPN PE, these overlay trees never get terminated on the MAC-VRFs
   of that PE. Overlay trees are instantiated by underlay provider
   tunnels (P-tunnels) - e.g., P2MP, MP2MP, or unicast tunnels per
   [RFC6513]. When there are several overlay trees mapped to a single
   underlay P-tunnel, the tunnel is referred to as an aggregate tunnel.

   Figure-1 below depicts a scenario where a tenant's MVPN spans both
   EVPN and MVPN PEs, and all EVPN PEs have multicast IRB capability.
   An EVPN PE (with multicast IRB capability) can be modeled as an MVPN
   PE where the virtual IRB interface of an EVPN PE (the virtual
   interface between a BT and an IP-VRF) can be considered a routed
   interface for the MVPN PE.

                 EVPN PE1
              +------------+
     Src1 +---|(MAC-VRF1)  |                  MVPN PE3
     Rcvr1 +--|      \     |  +---------+    +--------+
              |    (IP-VRF)|--|         |----|(IP-VRF)|--- Rcvr5
              |      /     |  |         |    +--------+
     Rcvr2 +--|(MAC-VRF2)  |  |         |
              +------------+  |         |
                              |  MPLS/  |
                 EVPN PE2     |   IP    |
              +------------+  |         |
     Rcvr3 +--|(MAC-VRF1)  |  |         |     MVPN PE4
              |      \     |  |         |    +--------+
              |    (IP-VRF)|--|         |----|(IP-VRF)|--- Rcvr6
              |      /     |  +---------+    +--------+
     Rcvr4 +--|(MAC-VRF3)  |
              +------------+

            Figure-1: EVPN & MVPN PEs Seamless Interop

   Figure-2 depicts the modeling of EVPN PEs based on MVPN PEs, where
   an EVPN PE can be modeled as a PE that consists of an MVPN PE whose
   routed interfaces (e.g., attachment circuits) are replaced with IRB
   interfaces connecting each IP-VRF of the MVPN PE to a set of BTs.
   Similar to an MVPN PE, where an attachment circuit serves as a
   routed multicast interface for the IP-VRF associated with an MVPN
   instance, an IRB interface serves as a routed multicast interface
   for the IP-VRF associated with the MVPN instance. Since EVPN PEs run
   MVPN protocols (e.g., [RFC6513] and [RFC6514]), for all practical
   purposes they look just like MVPN PEs to other PE devices. Such
   modeling of EVPN PEs transforms the multicast VPN operation of EVPN
   PEs to that of MVPN and thus simplifies the interoperability between
   EVPN and MVPN PEs to that of running a single unified solution based
   on MVPN.
                 EVPN PE1
              +------------+
     Src1 +---|(MAC-VRF1)  |
              |      \     |
     Rcvr1 +--|  +--------+|  +---------+    +--------+
              |  |MVPN PE1||--|         |----|MVPN PE3|--- Rcvr5
              |  +--------+|  |         |    +--------+
              |      /     |  |         |
     Rcvr2 +--|(MAC-VRF2)  |  |         |
              +------------+  |         |
                              |  MPLS/  |
                 EVPN PE2     |   IP    |
              +------------+  |         |
     Rcvr3 +--|(MAC-VRF1)  |  |         |
              |      \     |  |         |
              |  +--------+|  |         |    +--------+
              |  |MVPN PE2||--|         |----|MVPN PE4|--- Rcvr6
              |  +--------+|  |         |    +--------+
              |      /     |  +---------+
     Rcvr4 +--|(MAC-VRF3)  |
              +------------+

            Figure-2: Modeling EVPN PEs as MVPN PEs

   Although modeling an EVPN PE as an MVPN PE conceptually simplifies
   the operation to that of a solution based on MVPN, the following
   operational aspects of EVPN need to be factored in when considering
   seamless integration between EVPN and MVPN PEs:

   1) Unicast route advertisements for IP multicast sources
   2) Multi-homing of IP multicast sources and receivers
   3) Mobility for tenant's sources and receivers
   4) Non-IP multicast traffic handling

6.2. Unicast Route Advertisements for IP Multicast Source

   When an IP multicast source is attached to an EVPN PE, the unicast
   route for that IP multicast source needs to be advertised. When the
   source is attached to a Single-Active multi-homed ES, the EVPN DF PE
   is the PE that advertises a unicast route corresponding to the
   source IP address with the VRF Route Import extended community,
   which in turn is used as the Route Target for (S,G) join messages
   sent toward the source PE by the remote PEs. The EVPN PE advertises
   this unicast route using EVPN route type 2 (or 5) and an IP-VPN
   unicast route, along with the VRF Route Import extended community.
   The EVPN route type 2 (or 5) is advertised with the Route Targets
   corresponding to both the IP-VRF and the MAC-VRF/BT, whereas the
   IP-VPN unicast route is advertised with the RT corresponding to the
   IP-VRF. When unicast routes are advertised by MVPN PEs, they are
   advertised as IP-VPN unicast routes along with the VRF Route Import
   extended community per [RFC6514].

   When the source is attached to an All-Active multi-homed ES, the PE
   that learns the source advertises the unicast route for that source
   using EVPN route type 2 (or 5) and an IP-VPN unicast route, along
   with the VRF Route Import extended community. The EVPN route type 2
   (or 5) is advertised with the Route Targets corresponding to both
   the IP-VRF and the MAC-VRF/BT, whereas the IP-VPN unicast route is
   advertised with the RT corresponding to the IP-VRF. When the other
   multi-homing EVPN PEs for that ES receive this unicast EVPN route,
   they import the route and check whether they have learned the route
   locally for that ES; if they have, they do nothing. If they have
   not, they add the IP and MAC addresses to their IP-VRF and
   MAC-VRF/BT tables, respectively, with the local interface
   corresponding to that ES as the corresponding route adjacency.
   Furthermore, these PEs advertise an IP-VPN unicast route, along with
   the VRF Route Import extended community and the Route Target
   corresponding to the IP-VRF, to the other remote PEs for that MVPN.
   Therefore, the remote PEs learn the unicast route corresponding to
   the source from all multi-homing PEs associated with that All-Active
   Ethernet Segment, even though only one of the multi-homing PEs may
   have directly learned the IP address of the source.
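   The advertisement logic above can be summarized with a short sketch.
   This is a minimal illustration only, using hypothetical stand-ins
   for a BGP implementation's internal route structures; it is not a
   definitive implementation.

      from dataclasses import dataclass, field
      from typing import List

      @dataclass
      class VRFRouteImport:
          # VRF Route Import EC (RFC 6514): identifies the advertising
          # PE (Global Administrator) and its IP-VRF (Local Admin).
          pe_address: str
          ip_vrf_id: int

      @dataclass
      class Route:
          kind: str            # "evpn-rt2", "evpn-rt5", or "ip-vpn"
          prefix: str
          route_targets: List[str] = field(default_factory=list)
          vri: VRFRouteImport = None

      def advertise_source(src_ip, ip_vrf_rt, mac_vrf_rt, pe_ip, vrf_id):
          # Per section 6.2: advertise a locally learned multicast
          # source as an EVPN RT-2 (or RT-5) carrying both the IP-VRF
          # and MAC-VRF/BT RTs, plus an IP-VPN route carrying the
          # IP-VRF RT; both carry the VRI extended community.
          vri = VRFRouteImport(pe_ip, vrf_id)
          return [
              Route("evpn-rt2", src_ip, [ip_vrf_rt, mac_vrf_rt], vri),
              Route("ip-vpn", src_ip, [ip_vrf_rt], vri),
          ]

      for r in advertise_source("10.0.1.5/32", "RT:100:1",
                                "RT:100:1001", "192.0.2.1", 1):
          print(r)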
6.3. Multi-homing of IP Multicast Source and Receivers

   EVPN [RFC7432] has extensive multi-homing capabilities that allow
   TSes to be multi-homed to two or more EVPN PEs in Single-Active or
   All-Active mode. In Single-Active mode, only one of the multi-homing
   EVPN PEs can receive/transmit traffic for a given subnet (a given
   BD) on that multi-homed Ethernet Segment (ES). In All-Active mode,
   any of the multi-homing EVPN PEs can receive/transmit unicast
   traffic, but only one of them (the DF PE) can send BUM traffic to
   the multi-homed ES for a given subnet.

   The multi-homing mode (Single-Active versus All-Active) of a TS
   source can impact the MVPN procedures, as described below.

6.3.1. Single-Active Multi-Homing

   When a TS source resides on an ES that is multi-homed to two or more
   EVPN PEs operating in Single-Active mode, only one of the EVPN PEs
   can be active for the source subnet on that ES. Therefore, only one
   of the multi-homing PEs learns the unicast route of the TS source
   and advertises it, using EVPN and IP-VPN routes, to the other PEs as
   described previously.

   A downstream PE that receives a Join/Prune message from a TS
   host/router selects an Upstream Multicast Hop (UMH), which, in the
   case of Single-Active multi-homing, is the upstream PE that receives
   the IP multicast flow. An IP multicast flow belongs either to a
   source-specific tree (S,G) or to a shared tree (*,G). We use the
   notation (X,G) to refer to either (S,G) or (*,G), where X refers to
   S in the case of (S,G) and to the Rendezvous Point (RP) for G in the
   case of (*,G). Since the active PE (which is also the UMH PE) has
   advertised the unicast route for X along with the VRF Route Import
   EC, the downstream PEs select the UMH without any ambiguity based on
   the MVPN procedures described in section 5.1 of [RFC6513]. Any of
   the three algorithms described in that section works fine.

   The multi-homing PE that receives the IP multicast flow on its local
   AC performs the following tasks:

   - L2-switches the multicast traffic in the BT associated with the
   local AC over which it received the flow, if there are any
   interested receivers for that subnet.

   - L3-routes the multicast traffic to other BTs for other subnets, if
   there are any interested receivers for those subnets.

   - L3-routes the multicast traffic to other PEs per MVPN procedures.

   The multicast traffic can be sent on an Inclusive, Selective, or
   Aggregate-Selective tree. Regardless of what type of tree is used,
   only a single copy of the multicast traffic is received by the
   downstream PEs, and the multicast traffic is forwarded optimally
   from the upstream PE to the downstream PEs.

6.3.2. All-Active Multi-Homing

   When a TS source resides on an ES that is multi-homed to two or more
   EVPN PEs operating in All-Active mode, any of the multi-homing PEs
   can learn the TS source's unicast route; however, that PE may not be
   the same PE that receives the IP multicast flow. Therefore, the
   procedures for Single-Active multi-homing need to be augmented for
   the All-Active scenario as follows.

   The multi-homing EVPN PE that receives the IP multicast flow on its
   local AC needs to perform the following task in addition to the ones
   listed in the previous section for Single-Active multi-homing: L2
   switch the multicast traffic to the other multi-homing EVPN PEs for
   that ES via a multicast tunnel, called the intra-ES tunnel.
   There will be a dedicated tunnel for this purpose, which is
   different from the inter-subnet overlay tunnel set up by MVPN
   procedures. When the multi-homing EVPN PEs receive the IP multicast
   flow via this tunnel, they treat it as if they had received the flow
   via their local ACs and thus perform the tasks mentioned in the
   previous section for Single-Active multi-homing. The tunnel type for
   this intra-ES tunnel can be any of the supported tunnel types, such
   as ingress replication, P2MP tunnel, BIER, and Assisted Replication;
   however, given that the vast majority of multi-homing ESes are just
   dual-homing, a simple ingress replication tunnel can serve well. For
   a given ES, since multicast traffic that is locally received by one
   multi-homing PE is sent to the other multi-homing PEs via this
   intra-ES tunnel, there is no need to send the multicast traffic via
   an MVPN tunnel to these multi-homing PEs - i.e., MVPN multicast
   tunnels are used only for remote EVPN and MVPN PEs. Multicast
   traffic sent over this intra-ES tunnel to the other multi-homing PEs
   (only one other in the case of dual-homing) for a given ES can be
   sent either unconditionally or on demand. If on demand, then one of
   the other multi-homing PEs that is selected as a UMH, upon receiving
   a join message from a downstream PE, sends a request to receive this
   multicast flow from the source multi-homing PE over the special
   intra-ES tunnel.

   By feeding the IP multicast flow received on one of the EVPN
   multi-homing PEs to the interested EVPN PEs in the same multi-homing
   group, we have essentially enabled all the EVPN PEs in the
   multi-homing group to serve as the UMH for that IP multicast flow.
   Each of these UMH PEs advertises a unicast route for X in (X,G)
   along with the VRF Route Import EC to all PEs for that MVPN
   instance. The downstream PEs build a candidate UMH set based on the
   procedures described in section 5.1 of [RFC6513] and pick a UMH from
   the set. It should be noted that both the default UMH selection
   procedure based on the highest UMH PE IP address and the UMH
   selection algorithm based on the hash function specified in section
   5.1.3 of [RFC6513] (which is also a MUST-implement algorithm) result
   in the same UMH PE being selected by all downstream PEs running the
   same algorithm. However, in order to allow a form of "equal-cost
   load balancing", the hash algorithm is recommended to be used among
   all EVPN and MVPN PEs. This hash algorithm distributes UMH selection
   for different IP multicast flows among the multi-homing PEs for a
   given ES, as illustrated in the sketch below.
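   The following sketch illustrates the flavor of the hash-based UMH
   selection for IPv4 addresses. The byte-summing shown here is an
   illustrative reading of the algorithm; an implementation must follow
   section 5.1.3 of [RFC6513] exactly so that all downstream PEs
   compute the same answer.

      import ipaddress

      def umh_by_hash(candidate_pes, c_root, c_group):
          # Order the candidate UMH set by PE IP address and index the
          # candidates 0..n-1; a hash of (C-Root, C-G) modulo n then
          # picks the same UMH on every downstream PE.
          ordered = sorted(candidate_pes,
                           key=lambda a: int(ipaddress.ip_address(a)))
          byte_sum = sum(ipaddress.ip_address(c_root).packed) + \
                     sum(ipaddress.ip_address(c_group).packed)
          return ordered[byte_sum % len(ordered)]

      # Same inputs -> same UMH on all downstream PEs, while different
      # (C-S,C-G) flows spread across the multi-homing PEs of the ES.
      pes = ["192.0.2.1", "192.0.2.2"]
      print(umh_by_hash(pes, "10.0.1.5", "232.1.1.1"))
      print(umh_by_hash(pes, "10.0.1.5", "232.1.1.2"))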
   Since all downstream PEs (EVPN and MVPN) use the same hash-based
   algorithm for UMH determination, they all choose the same upstream
   PE as their UMH for a given (X,G) flow, and thus they all send their
   (X,G) join messages via BGP to the same upstream PE. This results in
   one of the multi-homing PEs receiving the join message and thus
   sending the IP multicast flow for (X,G) over its associated overlay
   tree, even though all of the multi-homing PEs in the All-Active
   redundancy group have received the IP multicast flow (one of them
   directly via its local AC and the rest indirectly via the associated
   intra-ES tunnel). Therefore, only a single copy of the routed IP
   multicast flow is sent over the network regardless of the overlay
   tree type supported by the PEs - i.e., the overlay tree can be a
   selective, aggregate-selective, or inclusive tree. This gives the
   network operator maximum flexibility in choosing whatever overlay
   tree type is suitable for its network operation while still
   delivering only a single copy of each IP multicast flow to the
   egress PEs. In other words, an egress PE receives only a single copy
   of the IP multicast flow over the network, because it receives it
   either via the EVPN intra-ES tunnel or via the MVPN inter-subnet
   tunnel. Furthermore, if it receives it via the MVPN inter-subnet
   tunnel, then only one of the multi-homing PEs associated with the
   source ES sends the IP multicast traffic.

   Since the network of interest for seamless interoperability between
   EVPN and MVPN PEs is MPLS, the EVPN handling of BUM traffic for an
   MPLS network needs to be considered. EVPN [RFC7432] uses the ESI
   MPLS label for split-horizon filtering of Broadcast/Unknown
   unicast/Multicast (BUM) traffic from an All-Active multi-homed
   Ethernet Segment, to ensure that BUM traffic doesn't get looped back
   to the same Ethernet Segment that it came from. This split-horizon
   filtering mechanism applies as-is to the multicast IRB scenario
   because of the use of the intra-ES tunnel among multi-homing PEs.
   Since the multicast traffic received from a TS source on an
   All-Active ES by a multi-homing PE is bridged to all other
   multi-homing PEs in that group, the standard EVPN split-horizon
   filtering described in [RFC7432] applies as-is. Split-horizon
   filtering for non-MPLS encapsulations such as VxLAN is described in
   section 9.2.2, which deals with a DC network that consists of only
   EVPN PEs.

6.4. Mobility for Tenant's Sources and Receivers

   When a tenant system (TS), source or receiver, is multi-homed behind
   a group of multi-homing EVPN PEs, TS mobility SHALL be supported
   among the EVPN PEs. Furthermore, such TS mobility SHALL cause only a
   temporary disruption to the related multicast service among the EVPN
   and MVPN PEs. If a source moves from one EVPN PE to another, the
   EVPN mobility procedure SHALL discover this move; a new unicast
   route advertisement (using both EVPN and IP-VPN routes) is made by
   the EVPN PE to which the source has moved, per section 6.3 above,
   and a unicast route withdrawal (for both EVPN and IP-VPN routes) is
   performed by the EVPN PE from which the source has moved.

   The move of a source results in disruption of the corresponding
   (S,G) IP multicast flow until the new unicast route associated with
   the source is advertised by the new PE along with the VRF Route
   Import EC, the join messages sent by the egress PEs are received by
   the new PE, the multicast state for that flow is installed in the
   new PE, and a new overlay tree is built for that source from the new
   PE to the egress PEs that are interested in receiving that IP
   multicast flow.

   The move of a receiver results in disruption of the IP multicast
   flow to that receiver only until the new PE for that receiver
   discovers the source and joins the overlay tree for that flow.

6.5. Intra-Subnet BUM Traffic Handling

   Link-local IP multicast traffic consists of IPv4 traffic with a
   destination address prefix of 224.0.0.0/24 and IPv6 traffic with a
   destination address prefix of FF02::/16. Such IP multicast traffic,
   as well as non-IP multicast/broadcast traffic, is sent per the EVPN
   [RFC7432] BUM procedures and does not get routed via the IP-VRF.
   Such BUM traffic is thus limited to a given EVI/VLAN (e.g., a given
   subnet), whereas IP multicast traffic is locally L2-switched for
   local interfaces attached to the same subnet and is routed for local
   interfaces attached to a different subnet or for forwarding traffic
   to other EVPN PEs (refer to section 8 for data plane operation).

7. Control Plane Operation

   In seamless interop between EVPN and MVPN PEs, the control plane may
   need to set up the following three types of multicast tunnels. The
   first two are among EVPN PEs only, but the third one is among EVPN
   and MVPN PEs.

   1) Intra-ES IP multicast tunnel

   2) Intra-subnet BUM tunnel

   3) Inter-subnet IP multicast tunnel

7.1. Intra-ES IP Multicast Tunnel

   As described in section 6.3.2, when a multicast source sits behind
   an All-Active ES, a multicast tunnel is needed among the
   multi-homing EVPN PEs for that ES to carry a multicast flow received
   by one of the multi-homing PEs to the other PEs on that ES. We refer
   to this multicast tunnel as the intra-ES tunnel. The vast majority
   of All-Active multi-homing for TOR devices in DC networks is just
   dual-homing, which means the multicast flow received by one of the
   dual-homing PEs only needs to be sent to the other dual-homing PE.
   Therefore, a simple ingress replication tunnel is all that is
   needed. In the case of multi-homing to three or more EVPN PEs, other
   tunnel types such as P2MP, MP2MP, BIER, and Assisted Replication can
   be considered. It should be noted that this intra-ES tunnel is only
   needed for All-Active multi-homing; it is not required for
   Single-Active multi-homing.

   The EVPN PEs belonging to a given All-Active ES discover each other
   using the EVPN Ethernet Segment route per the procedures described
   in [RFC7432]. These EVPN PEs perform DF election per [RFC7432],
   [EVPN-DF-Framework], or other DF election algorithms to decide which
   PE is the DF for a given BD. If the BD belongs to a tenant that has
   IRB IP multicast enabled, then in fixed mode, each PE sets up an
   intra-ES tunnel to forward IP multicast traffic received locally on
   that BD to the other multi-homing PE(s) for that ES. Therefore, IP
   multicast traffic received via a local attachment circuit is sent on
   this tunnel, on the associated IRB interface for that BT, and on
   other local attachment circuits if there are interested receivers
   for them. The other multi-homing EVPN PEs treat this intra-ES tunnel
   just like their local ACs - i.e., multicast traffic received over
   this tunnel is treated as if it were received via a local AC. Thus,
   the multi-homing PEs cannot receive the same IP multicast flow from
   an MVPN tunnel (e.g., over an IRB interface for that BD), because
   between a source behind a local AC and a source behind a remote PE,
   a PE always chooses its local AC.

   When ingress replication is used for the intra-ES tunnel, every PE
   in the All-Active multi-homing ES has all the information needed to
   set up these tunnels: each PE knows the other multi-homing PEs for
   that ES via the EVPN Ethernet Segment route and can use this
   information to set up the intra-ES IP multicast tunnel among them,
   as sketched below.
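   A minimal sketch of this observation, assuming hypothetical data
   structures (the ES route processing itself is per [RFC7432]):

      def intra_es_replication_list(es_route_originators, local_pe):
          # Given the set of PEs that advertised an Ethernet Segment
          # route for this ESI, the ingress-replication list for the
          # intra-ES tunnel is every multi-homing peer except the
          # local PE itself.
          return sorted(set(es_route_originators) - {local_pe})

      # Dual-homing is the common case: the list degenerates to a
      # single peer.
      print(intra_es_replication_list({"192.0.2.1", "192.0.2.2"},
                                      "192.0.2.1"))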
7.2. Intra-Subnet BUM Tunnel

   As the name implies, this tunnel is set up to carry BUM traffic for
   a given subnet/BD among EVPN PEs. In [RFC7432], this overlay tunnel
   is used for transmission of all BUM traffic, including user IP
   multicast traffic. However, for multicast traffic handling in
   EVPN-IRB PEs, this tunnel is used for all broadcast,
   unknown-unicast, non-IP multicast, and link-local IP multicast
   traffic - i.e., it is used for all BUM traffic except user IP
   multicast traffic. This tunnel is set up using the IMET route for a
   given EVI/BD. The composition and advertisement of IMET routes are
   exactly per [RFC7432]. It should be noted that when an EVPN
   All-Active multi-homing PE uses both this tunnel and the intra-ES
   tunnel, there SHALL be no duplication of multicast traffic over the
   network, because the two tunnels carry different types of multicast
   traffic - i.e., the intra-ES tunnel among multi-homing PEs carries
   only user IP multicast traffic, whereas the intra-subnet BUM tunnel
   carries link-local IP multicast traffic and BUM traffic (with non-IP
   multicast).

7.3. Inter-Subnet IP Multicast Tunnel

   As its name implies, this tunnel is set up to carry IP-only
   multicast traffic for a given tenant across all its subnets (BDs)
   among EVPN and MVPN PEs.

   The following NLRIs from [RFC6514] are used for setting up this
   inter-subnet tunnel in the network.

   The Intra-AS I-PMSI A-D route is used to form the default underlay
   tunnel (also called the inclusive tunnel) for a tenant IP-VRF. The
   tunnel attributes are indicated using the PMSI attribute carried
   with this route.

   The S-PMSI A-D route is used to form customer-flow-specific underlay
   tunnels. This enables selective delivery of data to PEs having
   active receivers and optimizes fabric bandwidth utilization. The
   tunnel attributes are indicated using the PMSI attribute carried
   with this route.

   Each EVPN PE supporting a specific MVPN instance discovers the set
   of other PEs in its AS that are attached to sites of that MVPN using
   the Intra-AS I-PMSI A-D route (route type 1) per [RFC6514]. It can
   also discover the set of other ASes that have PEs attached to sites
   of that MVPN using the Inter-AS I-PMSI A-D route (route type 2) per
   [RFC6514]. After the discovery of the PEs that are attached to sites
   of the MVPN, an inclusive overlay tree (I-PMSI) can be set up for
   carrying tenant multicast flows for that MVPN; however, this is not
   a requirement per [RFC6514], and it is possible to adopt a policy in
   which all tenant flows are carried on S-PMSIs.

   An EVPN-IRB PE sends a user IP multicast flow to other EVPN and MVPN
   PEs over this inter-subnet tunnel, which is instantiated using an
   MVPN I-PMSI or S-PMSI. This tunnel can be considered as being
   originated and terminated among the IP-VRFs of the EVPN/MVPN PEs,
   whereas the intra-subnet tunnel is originated/terminated among the
   MAC-VRFs of the EVPN PEs.

7.4. IGMP Hosts as TSes

   If a tenant system that is an IGMP host is multi-homed to two or
   more EVPN PEs using All-Active multi-homing, then IGMP join and
   leave messages are synchronized between these EVPN PEs using the
   EVPN IGMP Join Synch route (route type 7) and the EVPN IGMP Leave
   Synch route (route type 8) per [IGMP-PROXY]. IGMP states are built
   in the corresponding BDs of the multi-homing EVPN PEs. In
   [IGMP-PROXY], the DF PE for that BD originates an EVPN Selective
   Multicast Ethernet Tag (SMET) route to the other EVPN PEs.
   However, here there is no need to use the SMET route, because the
   IGMP messages are terminated by the EVPN-IRB PE, and tenant (*,G) or
   (S,G) join messages are sent via the MVPN Shared Tree Join route
   (route type 6) or Source Tree Join route (route type 7),
   respectively, of the MCAST-VPN NLRI per [RFC6514]. In the case of a
   network with only IGMP hosts, the preferred mode of operation is
   SPT-only per section 14 of [RFC6514]. This mode is only supported
   for PIM-SM and avoids the RP configuration overhead. This mode is
   chosen by provisioning/configuration.

7.5. TS PIM Routers

   Just like an MVPN PE, an EVPN PE runs a separate tenant multicast
   routing instance (VPN-specific) per MVPN instance, and the following
   tenant multicast routing instances are supported:

   - PIM Sparse Mode (PIM-SM) with the ASM service model
   - PIM Sparse Mode with the SSM service model
   - PIM Bidirectional Mode (BIDIR-PIM), which uses bidirectional
   tenant trees to support the ASM service model

   A given tenant's PIM join messages for (*,G) or (S,G) are processed
   by the corresponding tenant multicast routing protocol, and they are
   advertised over the MPLS/IP network using the Shared Tree Join route
   (route type 6) and the Source Tree Join route (route type 7),
   respectively, of the MCAST-VPN NLRI per [RFC6514].

8. Data Plane Operation

   When an EVPN-IRB PE receives an IGMP/MLD join message over one of
   its Attachment Circuits (ACs), it adds that AC to its Layer-2 (L2)
   OIF list. This L2 OIF list is associated with the MAC-VRF/BT
   corresponding to the subnet of the tenant device that sent the
   IGMP/MLD join. Therefore, tenant (S,G) or (*,G) forwarding entries
   are created/updated for the corresponding MAC-VRF/BT based on these
   source and group IP addresses. Furthermore, the IGMP/MLD join
   message is propagated over the corresponding IRB interface and is
   processed by the tenant multicast routing instance, which creates
   the corresponding tenant (S,G) or (*,G) Layer-3 (L3) forwarding
   entries and adds this IRB interface to the L3 OIF list. An IRB
   interface is removed as an L3 OIF when all L2 tenant (S,G) or (*,G)
   forwarding state is removed for the MAC-VRF/BT associated with that
   IRB. Furthermore, tenant (S,G) or (*,G) L3 forwarding state is
   removed when all of its L3 OIFs are removed - i.e., when all the IRB
   and L3 interfaces associated with that tenant (S,G) or (*,G) are
   removed. This interaction between L2 and L3 state is sketched below.
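   The following sketch illustrates that L2/L3 state interaction using
   hypothetical in-memory structures; a real implementation keeps this
   state in the MAC-VRF/BT and IP-VRF forwarding tables.

      from collections import defaultdict

      l2_oifs = defaultdict(set)   # (bt, s, g)  -> set of ACs
      l3_oifs = defaultdict(set)   # (vrf, s, g) -> set of L3 OIFs

      def igmp_join(vrf, bt, ac, s, g):
          # Join on an AC: add the AC to the BT's L2 OIF list and
          # propagate over the BT's IRB, which joins the L3 OIF list
          # of the tenant IP-VRF.
          l2_oifs[(bt, s, g)].add(ac)
          l3_oifs[(vrf, s, g)].add("irb-" + bt)

      def igmp_leave(vrf, bt, ac, s, g):
          # Leave: when the last L2 OIF of the BT is removed, the IRB
          # is withdrawn from the L3 OIF list; when the last L3 OIF is
          # removed, the tenant (S,G)/(*,G) L3 state is deleted.
          l2_oifs[(bt, s, g)].discard(ac)
          if not l2_oifs[(bt, s, g)]:
              del l2_oifs[(bt, s, g)]
              l3_oifs[(vrf, s, g)].discard("irb-" + bt)
              if not l3_oifs[(vrf, s, g)]:
                  del l3_oifs[(vrf, s, g)]

      igmp_join("tenant1", "bt1", "ac1", "10.0.1.5", "232.1.1.1")
      igmp_join("tenant1", "bt2", "ac7", "10.0.1.5", "232.1.1.1")
      igmp_leave("tenant1", "bt1", "ac1", "10.0.1.5", "232.1.1.1")
      print(dict(l3_oifs))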
   When an EVPN PE receives IP multicast traffic from one of its ACs,
   if it has any attached receivers for that subnet, it performs L2
   switching of the intra-subnet traffic within the BT attached to that
   AC. If the multicast flow is received over an AC that belongs to an
   All-Active ES, then the multicast flow is also sent over the
   intra-ES tunnel among the multi-homing PEs. The EVPN PE then sends
   the multicast traffic over the corresponding IRB interface. The
   multicast traffic then gets routed in the corresponding IP-VRF and
   gets forwarded to the interfaces in the L3 OIF list, which can
   include other IRB interfaces, other L3 interfaces directly connected
   to TSes, and the MVPN inter-subnet tunnel instantiated by an I-PMSI
   or S-PMSI tunnel. When the multicast packet is routed within the
   IP-VRF of the EVPN PE, its Ethernet header is stripped and its TTL
   gets decremented as a result of this IP routing. When the multicast
   traffic is received on an IRB interface by the BT corresponding to
   that interface, it gets L2-switched and sent over the ACs that
   belong to the L2 OIF list.

8.1. Intra-Subnet L2 Switching

   Rcvr1 in Figure 1 is connected to PE1 in MAC-VRF1 (the same as Src1)
   and sends an IGMP join for (C-S,C-G); IGMP snooping records this
   state in a local bridging entry. A routing entry is formed as well,
   which points to MAC-VRF1 as the RPF for Src1. We assume that Src1 is
   known via ARP or similar procedures. Rcvr1 gets a locally bridged
   copy of the multicast traffic from Src1. Rcvr3 is also connected in
   MAC-VRF1 but to PE2, and hence sends an IGMP join, which is recorded
   at PE2. PE2 also forms a routing entry, with the RPF assumed to be
   the tenant tunnel "Tenant1" formed beforehand using MVPN procedures.
   This also causes the multicast control plane to initiate a BGP
   MCAST-VPN type 7 route, which includes the VRI for PE1 and hence is
   accepted on PE1. PE1 includes the Tenant1 tunnel as an Outgoing
   Interface (OIF) in its routing entry. Now, since it has knowledge of
   remote receivers via the MVPN control plane, PE1 encapsulates the
   original multicast traffic in the Tenant1 tunnel toward the core.

8.2. Inter-Subnet L3 Routing

   Rcvr2 in Figure 1 is connected to PE1 in MAC-VRF2, and hence PE1
   records its membership in MAC-VRF2. Since MAC-VRF2 is enabled with
   IRB, it gets added as another OIF to the routing entry formed for
   (C-S,C-G). Rcvr2 and Rcvr4 are also in different MAC-VRFs than the
   multicast speaker Src1 and hence need inter-subnet forwarding. PE2
   forms a local bridging entry in MAC-VRF3 due to the IGMP join
   received from Rcvr4. PE2 now adds another OIF, 'MAC-VRF3', to its
   existing routing entry, but there is no change in control plane
   state, since it has already sent the MVPN route and no further
   signaling is required. Also, since Src1 is not part of the MAC-VRF3
   subnet, it is treated as a routing OIF, and hence the MAC header
   gets modified per normal routing procedures. PE3 forms a routing
   entry very similar to PE2's. It is to be noted that PE3 does not
   have MAC-VRF1 configured locally but can still receive the multicast
   data traffic over the Tenant1 tunnel formed via MVPN procedures. The
   resulting state on PE1 and PE2 is summarized below.
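   The following illustration summarizes the (C-S,C-G) entries that
   result from the walkthrough above. It is purely illustrative; names
   such as "Tenant1" follow the example in sections 8.1 and 8.2.

      # Illustrative (C-S,C-G) forwarding entries after the joins in
      # sections 8.1 and 8.2.
      pe1_entry = {
          "flow": ("Src1", "C-G"),
          "rpf":  "MAC-VRF1 (local AC toward Src1)",
          "oifs": ["MAC-VRF1 ACs (L2-switched to Rcvr1)",
                   "IRB -> MAC-VRF2 (routed to Rcvr2)",
                   "Tenant1 tunnel (routed toward remote PEs)"],
      }
      pe2_entry = {
          "flow": ("Src1", "C-G"),
          "rpf":  "Tenant1 tunnel (upstream PE1, selected via VRI)",
          "oifs": ["MAC-VRF1 ACs (Rcvr3)",
                   "IRB -> MAC-VRF3 (routed to Rcvr4)"],
      }
      for entry in (pe1_entry, pe2_entry):
          print(entry["flow"], "RPF:", entry["rpf"],
                "OIFs:", entry["oifs"])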
9. DCs with only EVPN PEs

   As mentioned earlier, the proposed solution can be used as a routed
   multicast solution in data center networks with only EVPN PEs (e.g.,
   routed multicast VPN only among EVPN PEs). It should be noted that
   the scope of intra-subnet forwarding for the solution described in
   this document is limited to a single EVPN PE for Single-Active
   multi-homing and to the multi-homing PEs for All-Active
   multi-homing. In other words, IP multicast traffic that needs to be
   forwarded from the source PE to remote PEs is routed to the remote
   PEs regardless of whether the traffic is intra-subnet or
   inter-subnet. As a result, the TTL value for intra-subnet traffic
   that spans two or more PEs gets decremented. Based on past
   experience with MVPN over the last dozen years for the supported IP
   multicast applications, layer-3 forwarding of intra-subnet multicast
   traffic should be fine. However, if there are applications that
   require intra-subnet multicast traffic to be L2-forwarded (e.g.,
   without decrementing the TTL value), then [EVPN-IRB-MCAST] proposes
   a solution to accommodate such applications.

9.1. Setup of overlay multicast delivery

   It must be emphasized that this solution poses no restriction on the
   setup of the tenant BDs: neither the source PE nor the receiver PEs
   need to know/learn about the BD configuration on other PEs in the
   MVPN. The Reverse Path Forwarder (RPF) is selected per the tenant
   multicast source and the IP-VRF, in compliance with the procedures
   in [RFC6514], using the incoming EVPN route type 2 or 5 NLRI per
   [RFC7432].

   The VRF Route Import (VRI) extended community that is carried with
   the IP-VPN routes in [RFC6514] MUST be carried via the EVPN unicast
   routes instead. The construction and processing of the VRI are
   consistent with [RFC6514]. The VRI MUST uniquely identify the PE
   that is advertising a multicast source and the IP-VRF in which the
   source resides.

   The VRI is constructed as follows:

   - The 4-octet Global Administrator field MUST be set to an IP
     address of the PE. This address SHOULD be common for all the
     IP-VRFs on the PE (e.g., this address may be the PE's loopback
     address).
   - The 2-octet Local Administrator field associated with a given
     IP-VRF contains a number that uniquely identifies that IP-VRF
     within the PE that contains the IP-VRF.
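   A minimal sketch of this construction, assuming the VRI is encoded
   as the transitive IPv4-address-specific "VRF Route Import" extended
   community defined by [RFC6514]:

      import socket
      import struct

      def vrf_route_import_ec(pe_ip: str, ip_vrf_id: int) -> bytes:
          # Type 0x01 (IPv4-address-specific), sub-type 0x0b
          # (VRF Route Import); 4-octet Global Administrator = PE IP,
          # 2-octet Local Administrator = IP-VRF identifier.
          return struct.pack("!BB4sH", 0x01, 0x0b,
                             socket.inet_aton(pe_ip), ip_vrf_id)

      print(vrf_route_import_ec("192.0.2.1", 7).hex())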
Every PE which detects a local receiver via a local IGMP join or a local PIM join for a specific source (overlay SSM mode) MUST terminate the IGMP/PIM signaling at the IP-VRF and generate a (C-S,C-G) join via the BGP MCAST-VPN route type 7 per [RFC6514], if and only if the RPF for the source points to the fabric. If the RPF points to a local multicast source on the same MAC-VRF or a different MAC-VRF on that PE, the MCAST-VPN route MUST NOT be advertised and the data traffic will be locally routed/bridged to the receiver as detailed in section 6.2.

The VRI received with the EVPN route type 2 or 5 NLRI from the source PE will be appended as an export route-target extended community. More details about the handling of the various types of local receivers are in section 10. The PE which has advertised the unicast route with the VRI will import the incoming MCAST-VPN NLRI into the IP-VRF with the same import route-target extended community, and other PEs SHOULD ignore it. Following this procedure, the source PE learns about the existence of at least one remote receiver in the tenant overlay and programs its data plane accordingly, so that a single copy of the multicast data is forwarded into the core using the tenant VRF tunnel.

If the multicast source is unknown (overlay ASM mode), the MCAST-VPN route type 6 (C-*,C-G) join SHOULD be targeted towards the designated overlay Rendezvous Point (RP) by appending the received RP VRI as an export route-target extended community. Every PE which detects a local source registers with its RP PE; that is how the RP learns about the tenant source(s) and group(s) within the MVPN. Once the overlay RP PE receives either the first remote (C-RP,C-G) join or a local IGMP/PIM join, it triggers an MCAST-VPN route type 7 (C-S,C-G) towards the actual source PE for which it has received the PIM register message, in full compliance with regular PIM procedures. This requires the source PE to advertise the MCAST-VPN Source Active A-D route (MCAST-VPN route type 5) towards all PEs. The Source Active A-D route is used to inform all PEs in a given MVPN about the active multicast source, for switching from the RPT to the SPT when MVPNs use tenant RP-shared trees (i.e., trees rooted at the tenant's RP) per section 13 of [RFC6514]. This is done in order to choose a single forwarder PE and to suppress the reception of duplicate traffic. In such scenarios, the active multicast source is used by the receiver PEs to join the SPT if they have not received tenant (S,G) joins, and by the RPT PEs to prune the tenant (S,G) state off the RPT. The Source Active A-D route is also used for MVPN scenarios without tenant RP-shared trees. In such scenarios, the receiver PEs with tenant (*,G) state use the Source Active A-D route to learn which upstream PEs with sources behind them to join, per section 14 of [RFC6514], i.e., to suppress joining the overlay shared tree.

9.2. Handling of different encapsulations

Just as in [RFC6514], the MVPN I-PMSI and S-PMSI A-D routes are used to form the overlay multicast tunnels and to signal the tunnel type using the P-Multicast Service Interface Tunnel (PMSI Tunnel) attribute.

9.2.1. MPLS Encapsulation

[RFC6514] assumes an MPLS/IP core, and there is no modification to the signaling procedures and encoding for PMSI tunnel formation therein. Also, there is no need for a gateway to inter-operate with non-EVPN PEs supporting [RFC6514]-based MVPN over IP/MPLS.

9.2.2. VxLAN Encapsulation

In order to signal VXLAN, the corresponding BGP encapsulation extended community [TUNNEL-ENCAP] SHOULD be appended to the MVPN I-PMSI and S-PMSI A-D routes. The MPLS label in the PMSI Tunnel Attribute MUST be the Virtual Network Identifier (VNI) associated with the customer MVPN. The supported PMSI tunnel types with VXLAN encapsulation are: PIM-SSM Tree, PIM-SM Tree, BIDIR-PIM Tree, and Ingress Replication [RFC6514]. Further details are in [RFC8365].

In this case, a gateway is needed for inter-operation between the EVPN PEs and non-EVPN MVPN PEs. The gateway should re-originate the control plane signaling with the relevant tunnel encapsulation on either side. In the data plane, the gateway terminates the tunnels formed on either side and performs the relevant stitching/re-encapsulation of data packets.

9.2.3. Other Encapsulation

In order to signal a different tunneling encapsulation, such as NVGRE, GPE, or GENEVE, the corresponding BGP encapsulation extended community [TUNNEL-ENCAP] SHOULD be appended to the MVPN I-PMSI and S-PMSI A-D routes. If the Tunnel Type field in the encapsulation extended community is set to a type which requires a Virtual Network Identifier (VNI), e.g., VXLAN-GPE or NVGRE [TUNNEL-ENCAP], then the MPLS label in the PMSI Tunnel Attribute MUST be the VNI associated with the customer MVPN. As in the VXLAN case, a gateway is needed for inter-operation between the EVPN-IRB PEs and non-EVPN MVPN PEs.
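As an informal illustration of the VNI placement described in sections 9.2.2 and 9.2.3, the following sketch packs the value field of a PMSI Tunnel Attribute (Flags, Tunnel Type, MPLS Label, Tunnel Identifier, per [RFC6514]) with a 24-bit VNI in the MPLS Label field; the helper and example values are hypothetical.

   # Informal sketch: build the value field of a PMSI Tunnel Attribute
   # (Flags, Tunnel Type, MPLS Label, Tunnel Identifier per RFC 6514),
   # carrying a 24-bit VNI in the 3-octet MPLS Label field.
   import socket
   import struct

   PMSI_TUNNEL_PIM_SSM = 3        # tunnel type codepoints per RFC 6514
   PMSI_TUNNEL_INGRESS_REPL = 6

   def pack_pmsi_value(tunnel_type: int, vni: int, tunnel_id: bytes,
                       flags: int = 0) -> bytes:
       assert 0 <= vni < (1 << 24)  # VNI must fit the 3-octet label field
       label = vni.to_bytes(3, "big")
       return struct.pack("!BB", flags, tunnel_type) + label + tunnel_id

   # e.g., Ingress Replication with the originating PE's IP as tunnel id
   value = pack_pmsi_value(PMSI_TUNNEL_INGRESS_REPL, vni=10010,
                           tunnel_id=socket.inet_aton("192.0.2.1"))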
10. DCI with MPLS in WAN and VxLAN in DCs

This section describes the inter-operation between MVPN PEs in a WAN using MPLS encapsulation and EVPN PEs in a DC network using VxLAN encapsulation. Since the tunnel encapsulations of these networks are different, at least one gateway is needed in between; usually, two or more are required for redundancy and load-balancing purposes. In such scenarios, a DC network can be represented as a customer network that is multi-homed to two or more MVPN PEs via L3 interfaces, and thus the standard MVPN multi-homing procedures are applicable here. It should be noted that an MVPN overlay tunnel over the DC network is terminated on the IP-VRF of the gateway and not on the MAC-VRF/BTs. Therefore, the considerations for loop prevention and split-horizon filtering described in [INTERCON-EVPN] are not applicable here. Some aspects of the multi-homing between VxLAN DC networks and the MPLS WAN are in common with [INTERCON-EVPN].

10.1. Control plane inter-connect

The gateway(s) MUST be set up with the inclusive set of all the IP-VRFs that span the two domains. On each gateway, there will be at least two BGP sessions: one towards the DC side and the other towards the WAN side. Usually, for redundancy purposes, more sessions are set up on each side. The unicast route propagation follows the exact same procedures as in [INTERCON-EVPN]. Hence, a multicast host located in either domain is advertised with the gateway IP address as the next-hop to the other domain. As a result, PEs view the hosts in the other domain as directly attached to the gateway, and all inter-domain multicast signaling is directed towards the gateway(s). MVPN routes of type 1-7 received from either side of the gateway(s) MUST NOT be reflected back to the same side, but are processed locally and re-advertised (if needed) to the other side:

- Intra-AS I-PMSI A-D Route: these are distributed within each domain to form the overlay tunnels, which terminate at the gateway(s). They are not passed to the other side of the gateway(s).

- C-Multicast Route: joins are imported into the corresponding IP-VRF on each gateway and advertised as a new route to the other side with the following modifications (the rest of the NLRI fields and path attributes remain untouched; see the sketch after this list):
  * the Route Distinguisher is set to that of the IP-VRF
  * the Route-Target is set to the exported route-target list of the IP-VRF
  * the PMSI Tunnel Attribute and the BGP Encapsulation extended community are modified according to section 8
  * the next-hop is set to the IP address which represents the gateway in either domain

- Source Active A-D Route: same as joins

- S-PMSI A-D Route: these are passed to the other side to form selective PMSI tunnels for every (C-S,C-G) from the gateway to the PEs in the other domain, provided that domain contains receivers for the given (C-S,C-G). The same modifications made to joins are made to the newly originated S-PMSI routes.

In addition, the Originating Router's IP Address is set to the GW's IP address. Multicast signaling from/to hosts on local ACs on the gateway(s) is generated and propagated in both domains (if needed) per the procedures in section 7 of this document and in [RFC6514], with no change. It must be noted that, for a locally attached source, the gateway will program in its forwarding plane an OIF for every domain from which it receives a remote join, and a different encapsulation will be used on the data packets.
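The C-Multicast re-origination rules above can be summarized by the following illustrative sketch; the dict-based route model and helper name are invented for illustration and do not reflect any particular BGP implementation.

   # Illustrative only: a gateway re-originating a received C-Multicast
   # route towards the other domain, applying the modifications listed
   # above.  The dict-based route model is hypothetical.
   def reoriginate_cmcast(route: dict, ip_vrf: dict,
                          gw_ip_other_domain: str) -> dict:
       new_route = dict(route)                     # NLRI fields untouched
       new_route["rd"] = ip_vrf["rd"]              # RD of the IP-VRF
       new_route["route_targets"] = ip_vrf["export_rts"]
       new_route["next_hop"] = gw_ip_other_domain  # gateway as next-hop
       # The PMSI Tunnel Attribute and encapsulation community are
       # rewritten for the other domain's encapsulation (see section 8).
       new_route.pop("pmsi_tunnel", None)
       new_route.pop("encap_community", None)
       return new_route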
10.2. Data plane inter-connect

The traffic forwarding procedures on the gateways are the same as those described for PEs in sections 5 and 6, except that, unlike a non-border leaf PE, the gateway not only routes the incoming traffic from one side to its local receivers, but also sends it to the remote receivers in the other domain after de-capsulating it and appending the right encapsulation. The OIF and IIF are programmed in the FIB based on the joins received from either side and the RPF calculation towards the source or RP. The de-capsulation and encapsulation actions are programmed based on the I-PMSI or S-PMSI A-D routes received from either side.

If there is more than one gateway between two domains, the multi-homing procedures described in the following section must be considered so that the incoming traffic from one side is not looped back to the other gateway.

The multicast traffic from local sources on each gateway flows to the other gateway with the preferred WAN encapsulation.

11. IANA Considerations

There are no additional IANA considerations for this document beyond what is already described in [RFC7432].

12. Security Considerations

All the security considerations in [RFC7432] apply directly to this document, because this document leverages the [RFC7432] control plane and its associated procedures.

13. Acknowledgements

The authors would like to thank Niloofar Fazlollahi, Aamod Vyavaharkar, Kesavan Thiruvenkatasamy, and Swadesh Agrawal for their discussions and contributions.

14. References

14.1. Normative References

[RFC7432] A. Sajassi, et al., "BGP MPLS Based Ethernet VPN", RFC 7432, February 2015.

[RFC8365] A. Sajassi, et al., "A Network Virtualization Overlay Solution using EVPN", RFC 8365, February 2018.

[RFC6513] E. Rosen, et al., "Multicast in MPLS/BGP IP VPNs", RFC 6513, February 2012.

[RFC6514] R. Aggarwal, et al., "BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs", RFC 6514, February 2012.

[EVPN-IRB] A. Sajassi, et al., "Integrated Routing and Bridging in EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03, work in progress, February 2017.

[EVPN-IRB-MCAST] A. Rosen, et al., "EVPN Optimized Inter-Subnet Multicast (OISM) Forwarding", draft-lin-bess-evpn-irb-mcast-04, work in progress, October 2017.

14.2. Informative References

[RFC7080] A. Sajassi, et al., "Virtual Private LAN Service (VPLS) Interoperability with Provider Backbone Bridges", RFC 7080, December 2013.

[RFC7209] D. Thaler, et al., "Requirements for Ethernet VPN (EVPN)", RFC 7209, May 2014.

[RFC4389] A. Sajassi, et al., "Neighbor Discovery Proxies (ND Proxy)", RFC 4389, April 2006.

[RFC4761] K. Kompella, et al., "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007.

[INTERCON-EVPN] J. Rabadan, et al., "Interconnect Solution for EVPN Overlay networks", draft-ietf-bess-dci-evpn-overlay-04, work in progress, September 2016.

[TUNNEL-ENCAP] E. Rosen, et al., "The BGP Tunnel Encapsulation Attribute", draft-ietf-idr-tunnel-encaps-06, work in progress, June 2017.

[EVPN-IGMP-PROXY] A. Sajassi, et al., "IGMP and MLD Proxy for EVPN", draft-ietf-bess-evpn-igmp-mld-proxy-01, work in progress, March 2018.
[EVPN-PIM-PROXY] J. Rabadan, et al., "PIM Proxy in EVPN Networks", draft-skr-bess-evpn-pim-proxy-00, work in progress, July 2017.

15. Authors' Addresses

Ali Sajassi
Cisco
170 West Tasman Drive
San Jose, CA 95134, US
Email: sajassi@cisco.com

Samir Thoria
Cisco
170 West Tasman Drive
San Jose, CA 95134, US
Email: sthoria@cisco.com

Ashutosh Gupta
Avi Networks
Email: ashutosh@avinetworks.com

Luay Jalil
Verizon
Email: luay.jalil@verizon.com

Appendix A. Use Cases

A.1. DCs with only IGMP/MLD hosts w/o tenant router

In an EVPN network consisting of only IGMP/MLD hosts, PEs will receive IGMP (*, G) or (S, G) joins from their locally attached hosts and will originate MVPN C-Multicast route type 6 and type 7 NLRIs, respectively. As described in [RFC6514], these NLRIs are directed towards the RP-PE for type 6 and towards the Source-PE for type 7. In the case of a (*, G) join, a Shared-Path Tree is built in the core from the RP-PE towards all Receiver-PEs. Once a source starts to send multicast data to the specified multicast group, the PE directly connected to the source performs PIM registration with the RP. Since there are existing receivers for the group, the RP originates a PIM (S, G) join towards the source, which is converted to an MVPN type 7 NLRI by the RP-PE. Note that the RP-PE is the PE configured as the RP (e.g., via static configuration or by using BSR or Auto-RP procedures); the detailed working of such protocols is beyond the scope of this document. Upon receiving the type 7 NLRI, the Source-PE includes the MVPN tunnel in its Outgoing Interface List. Furthermore, the Source-PE follows the procedures in [RFC6514] to originate an MVPN Source Active A-D route (route type 5) to avoid duplicate traffic and to allow all Receiver-PEs to shift from the Shared Tree to the Shortest-Path Tree rooted at the Source-PE, as described in section 13 of [RFC6514].

However, a network operator can choose to have only Shortest-Path Trees built in the MVPN core, as described in section 14 of [RFC6514]. One way to achieve this is for every PE to act as the RP for its locally connected hosts, thus avoiding sending any Shared-Tree Joins (MVPN type 6) into the core. In this scenario, no PIM registration is needed, since every PE is the first-hop router as well as the acting RP. Once a source starts to send multicast data, the PE directly connected to it originates a Source Active A-D route (route type 5) to all other PEs in the network. Upon receiving a Source Active A-D route, a PE must cache it in its local database and also look for any matching interest for (*, G), where G is the multicast group described in the received Source Active A-D route. If it finds such a matching entry, it must originate a C-Multicast route (route type 7) in order to start receiving traffic from the Source-PE. This procedure must be repeated on the reception of any further Source Active A-D routes, as sketched below.
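A minimal sketch of this cache-and-match behavior, with hypothetical names and structures, might look as follows.

   # Illustrative sketch of the Source Active A-D caching procedure in
   # A.1: cache every received SA-AD route and originate a C-Multicast
   # (type 7) join when a local (*, G) interest matches.  All names are
   # hypothetical.
   sa_cache = {}                     # (source, group) -> SA-AD route
   star_g_interest = {"225.1.1.1"}   # groups with local (*, G) receivers

   def on_sa_ad_route(source: str, group: str, route) -> None:
       sa_cache[(source, group)] = route          # always cache first
       if group in star_g_interest:
           originate_type7_join(source, group)    # join SPT at Source-PE

   def originate_type7_join(source: str, group: str) -> None:
       # Placeholder: would build and advertise the MCAST-VPN type 7
       # (C-S, C-G) route towards the Source-PE per RFC 6514.
       print(f"advertise type 7 join for ({source}, {group})")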
A.2. DCs with a mix of IGMP/MLD hosts & multicast routers running PIM-SSM

This scenario includes multicast routers which can send PIM SSM (S, G) joins. Upon receiving such a join, and if the source described in the join is learnt to be behind an MVPN peer PE, the local PE originates a C-Multicast join (route type 7) towards the Source-PE. It is expected that the PIM SSM group ranges are kept separate from the ASM range for which IGMP hosts can send (*, G) joins; hence the ASM and SSM groups operate without any overlap. No RP is needed for the SSM-range groups, and the Shortest-Path Tree rooted at the source is built once receiver interest is known.

A.3. DCs with a mix of IGMP/MLD hosts & multicast routers running PIM-ASM

This scenario includes the reception of PIM (*, G) joins on a PE's local AC. These joins are handled similarly to the IGMP (*, G) joins explained in the sections above. Another interesting case arises when one of the tenant routers acts as the RP for some of the ASM groups. In such a scenario, an Upstream Multicast Hop (UMH) will be elected by the other PEs in order to send C-Multicast routes (route type 6). All the procedures described in [RFC6513] with respect to UMH should be used to avoid traffic duplication due to incoherent selection of the RP-PE by different Receiver-PEs.

A.4. DCs with a mix of IGMP/MLD hosts & multicast routers running PIM-Bidir

Creating bidirectional (*, G) trees is useful when a customer wants the least amount of control state in the network. On the downside, all receivers for a particular multicast group receive traffic from all sources sending to that group. For the purposes of this document, all the procedures described in [RFC6513] and [RFC6514] apply when PIM-Bidir is used.