BESS Working Group                                            A. Sajassi
Internet Draft                                        K. Thiruvenkatasamy
Category: Standard Track                                        S. Thoria
                                                                    Cisco
                                                                 A. Gupta
                                                             Avi Networks
                                                                 L. Jalil
                                                                  Verizon

Expires: May 21, 2020                                   November 18, 2019

      Seamless Multicast Interoperability between EVPN and MVPN PEs
              draft-ietf-bess-evpn-mvpn-seamless-interop-00

Abstract

The Ethernet Virtual Private Network (EVPN) solution is becoming pervasive for Network Virtualization Overlay (NVO) services in data center (DC) networks and as the next-generation VPN service in service provider (SP) networks.

As service providers transform the networks in their COs toward next-generation data centers with Software Defined Networking (SDN) based fabrics and Network Function Virtualization (NFV), they want to be able to maintain their offered services, including Multicast VPN (MVPN) service, between their existing networks and their new Service Provider Data Center (SPDC) networks seamlessly, without the use of gateway devices. They want such seamless interoperability between their new SPDCs and their existing networks for a) reducing cost, b) having optimum forwarding, and c) reducing provisioning. This document describes a unified solution based on RFCs 6513 and 6514 for seamless interoperability of Multicast VPN between EVPN and MVPN PEs. Furthermore, it describes how the proposed solution can be used as a routed multicast solution in data centers with only EVPN PEs.

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Copyright and License Notice

Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Requirements Language
   3. Terminology
   4. Requirements
      4.1. Optimum Forwarding
      4.2. Optimum Replication
      4.3. All-Active and Single-Active Multi-Homing
      4.4. Inter-AS Tree Stitching
      4.5. EVPN Service Interfaces
      4.6. Distributed Anycast Gateway
      4.7. Selective & Aggregate Selective Tunnels
      4.8. Tenants' (S,G) or (*,G) states
      4.9. Zero Disruption upon BD/Subnet Addition
      4.10. No Changes to Existing EVPN Service Interface Models
      4.11. External source and receivers
      4.12. Tenant RP placement
   5. IRB Unicast versus IRB Multicast
      5.1. Emulated Virtual LAN Service
   6. Solution Overview
      6.1. Operational Model for EVPN IRB PEs
      6.2. Unicast Route Advertisements for IP multicast Source
      6.3. Multi-homing of IP Multicast Source and Receivers
         6.3.1. Single-Active Multi-Homing
         6.3.2. All-Active Multi-Homing
      6.4. Mobility for Tenant's Sources and Receivers
      6.5. Intra-Subnet BUM Traffic Handling
      6.6. EVPN and MVPN interworking with gateway model
   7. Control Plane Operation
      7.1. Intra-ES/Intra-Subnet IP Multicast Tunnel
      7.2. Intra-Subnet BUM Tunnel
      7.3. Inter-Subnet IP Multicast Tunnel
      7.4. IGMP Hosts as TSes
      7.5. TS PIM Routers
   8. Data Plane Operation
      8.1. Intra-Subnet L2 Switching
      8.2. Inter-Subnet L3 Routing
   9. DCs with only EVPN PEs
      9.1. Setup of overlay multicast delivery
      9.2. Handling of different encapsulations
         9.2.1. MPLS Encapsulation
         9.2.2. VxLAN Encapsulation
         9.2.3. Other Encapsulation
   10. DCI with MPLS in WAN and VxLAN in DCs
      10.1. Control plane inter-connect
      10.2. Data plane inter-connect
   11. Supporting application with TTL value 1
      11.1. Policy based model
      11.2. Exercising BUM procedure for VLAN/BD
      11.3. Intra-subnet bridging
   12. Interop with L2 EVPN PEs
   13. Connecting external Multicast networks or PIM routers
   14. RP handling
      14.1. Various RP deployment options
         14.1.1. RP-less mode
         14.1.2. Fabric anycast RP
         14.1.3. Static RP
         14.1.4. Co-existence of Fabric anycast RP and external RP
      14.2. RP configuration options
   15. IANA Considerations
   16. Security Considerations
   17. Acknowledgements
   18. References
      18.1. Normative References
      18.2. Informative References
   19. Authors' Addresses
   Appendix A. Use Cases
      A.1. DCs with only IGMP/MLD hosts w/o tenant router

1. Introduction

The Ethernet Virtual Private Network (EVPN) solution is becoming pervasive for Network Virtualization Overlay (NVO) services in data center (DC) networks and as the next-generation VPN service in service provider (SP) networks.

As service providers transform the networks in their COs toward next-generation data centers with Software Defined Networking (SDN) based fabrics and Network Function Virtualization (NFV), they want to be able to maintain their offered services, including Multicast VPN (MVPN) service, between their existing networks and their new SPDC networks seamlessly, without the use of gateway devices. There are several reasons for having such seamless interoperability between their new DCs and their existing networks:

- Lower Cost: Gateway devices need to have very high scalability to handle the VPN services of their DCs and as such need to handle a large number of VPN instances (in the tens or hundreds of thousands) and a very large number of routes (e.g., in the tens of millions). For the same speed and feed, these high-scale gateway boxes are much more expensive than the edge devices (e.g., PEs and TORs), which support a much lower number of routes and VPN instances.

- Optimum Forwarding: In a given CO, both EVPN PEs and MVPN PEs can be connected to the same fabric/network (e.g., the same IGP domain). In such scenarios, the service providers want to have optimum forwarding among these PE devices without the use of gateway devices, because if gateway devices are used, the IP multicast traffic between EVPN and MVPN PEs can no longer follow an optimum path and, in some cases, may even get tromboned. Furthermore, when an SPDC network spans multiple LATAs (multiple geographic areas) and gateways are used between EVPN and MVPN PEs, then, with respect to IP multicast traffic, only one GW can be the designated forwarder (DF) between the EVPN and MVPN PEs. Such scenarios not only result in non-optimum forwarding but can also result in tromboning of IP multicast traffic between the two LATAs when both the source and destination PEs are in the same LATA and the DF gateway is elected in a different LATA.

- Less Provisioning: If gateways are used, then the operator needs to configure per-tenant information on the gateways. In other words, for each tenant that is configured, one (or maybe two) additional touch points are needed.

This document describes a unified solution based on [RFC6513] and [RFC6514] for seamless interoperability of multicast VPN between EVPN and MVPN PEs.
Furthermore, it describes how the proposed solution can be used as a routed multicast solution in data centers with only EVPN PEs (e.g., routed multicast VPN only among EVPN PEs).

2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in [RFC2119] only when they appear in all upper case. They may also appear in lower or mixed case as English words, without any normative meaning.

3. Terminology

Most of the terminology used in this document comes from [RFC8365].

   Broadcast Domain (BD): In a bridged network, the broadcast domain
   corresponds to a Virtual LAN (VLAN), where a VLAN is typically
   represented by a single VLAN ID (VID) but can be represented by
   several VIDs where Shared VLAN Learning (SVL) is used per [802.1Q].

   Bridge Table (BT): An instantiation of a broadcast domain on a
   MAC-VRF.

   VXLAN: Virtual Extensible LAN

   POD: Point of Delivery

   NV: Network Virtualization

   NVO: Network Virtualization Overlay

   NVE: Network Virtualization Endpoint

   VNI: Virtual Network Identifier (for VXLAN)

   EVPN: Ethernet VPN

   EVI: An EVPN instance spanning the Provider Edge (PE) devices
   participating in that EVPN

   MAC-VRF: A Virtual Routing and Forwarding table for Media Access
   Control (MAC) addresses on a PE

   IP-VRF: A Virtual Routing and Forwarding table for Internet Protocol
   (IP) addresses on a PE

   Ethernet Segment (ES): When a customer site (device or network) is
   connected to one or more PEs via a set of Ethernet links, then that
   set of links is referred to as an 'Ethernet segment'.

   Ethernet Segment Identifier (ESI): A unique non-zero identifier that
   identifies an Ethernet segment is called an 'Ethernet Segment
   Identifier'.

   Ethernet Tag: An Ethernet tag identifies a particular broadcast
   domain, e.g., a VLAN. An EVPN instance consists of one or more
   broadcast domains.

   PE: Provider Edge device.

   Single-Active Redundancy Mode: When only a single PE, among all the
   PEs attached to an Ethernet segment, is allowed to forward traffic
   to/from that Ethernet segment for a given VLAN, then the Ethernet
   segment is defined to be operating in Single-Active redundancy mode.

   All-Active Redundancy Mode: When all PEs attached to an Ethernet
   segment are allowed to forward known unicast traffic to/from that
   Ethernet segment for a given VLAN, then the Ethernet segment is
   defined to be operating in All-Active redundancy mode.

   PIM-SM: Protocol Independent Multicast - Sparse-Mode

   PIM-SSM: Protocol Independent Multicast - Source Specific Multicast

   Bidir PIM: Bidirectional PIM

   FHR: First Hop Router

   LHR: Last Hop Router

   CO: Central Office of a service provider

   SPDC: Service Provider Data Center

   LATA: Local Access and Transport Area

   Border Leafs: A set of EVPN PEs acting as exit points for the EVPN
   fabric.

   L3VNI: A VNI in the tenant VRF that is associated with the
   core-facing interface.

4. Requirements

This section describes the requirements specific to providing seamless multicast VPN service between MVPN- and EVPN-capable networks.

4.1. Optimum Forwarding

The solution SHALL support optimum multicast forwarding between EVPN and MVPN PEs within a network. The network can be confined to a CO or it can span multiple LATAs. The solution SHALL support optimum multicast forwarding with both ingress replication tunnels and P2MP tunnels.

4.2. Optimum Replication

For EVPN PEs with IRB capability, the solution SHALL use only a single multicast tunnel among EVPN and MVPN PEs for IP multicast traffic when both PEs use the same tunnel type. Multicast tunnels can be either ingress replication tunnels or P2MP tunnels. The solution MUST support optimum replication for both intra-subnet and inter-subnet IP multicast traffic:

- Non-IP traffic SHALL be forwarded per the EVPN baseline [RFC7432] or [RFC8365].

- If a Multicast VPN spans both intra- and inter-subnets, then for ingress replication, regardless of whether the traffic is intra- or inter-subnet, only a single copy of the IP multicast traffic SHALL be sent from the source PE to the destination PE.

- If a Multicast VPN spans both intra- and inter-subnets, then for P2MP tunnels, regardless of whether the traffic is intra- or inter-subnet, only a single copy of the multicast data SHALL be transmitted by the source PE. The source PE can be either an EVPN or an MVPN PE, and the receiving PEs can be a mix of EVPN and MVPN PEs - i.e., a multicast VPN can be spread across both EVPN and MVPN PEs.

4.3. All-Active and Single-Active Multi-Homing

The solution MUST support multi-homing of source devices and receivers that sit in the same subnet (e.g., VLAN) and are multi-homed to EVPN PEs. The solution SHALL allow for both Single-Active and All-Active multi-homing. The solution MUST prevent loops during steady and transient states, just like the EVPN baseline solutions [RFC7432] and [RFC8365], for all multi-homing types.

4.4. Inter-AS Tree Stitching

The solution SHALL support multicast tree stitching when the tree spans multiple Autonomous Systems.

4.5. EVPN Service Interfaces

The solution MUST support all EVPN service interfaces listed in section 6 of [RFC7432]:

   - VLAN-based service interface
   - VLAN-bundle service interface
   - VLAN-aware bundle service interface

4.6. Distributed Anycast Gateway

The solution SHALL support distributed anycast gateways for tenant workloads on NVE devices operating in EVPN-IRB mode.

4.7. Selective & Aggregate Selective Tunnels

The solution SHALL support selective and aggregate selective P-tunnels as well as inclusive and aggregate inclusive P-tunnels. When selective tunnels are used, multicast traffic SHOULD only be forwarded to the remote PEs that have receivers - i.e., if there are no receivers at a remote PE, the multicast traffic SHOULD NOT be forwarded to that PE, and if there are no receivers on any remote PEs, the multicast traffic SHOULD NOT be forwarded into the core.

4.8. Tenants' (S,G) or (*,G) states

The solution SHOULD store (C-S,C-G) and (C-*,C-G) states only on PE devices that have interest in such states, hence reducing memory and processing requirements - i.e., only on PE devices that have sources and/or receivers interested in such multicast groups.

4.9. Zero Disruption upon BD/Subnet Addition

In DC environments, various Bridge Domains are provisioned and removed on a regular basis due to host mobility and policy and tenant changes. Such changes in BD configuration should not affect existing flows within the same BD or any other BD in the network.

4.10. No Changes to Existing EVPN Service Interface Models
VLAN-aware bundle service as defined in [RFC7432] typically does not require any VLAN ID translation from one tenant site to another - i.e., the same set of VLAN IDs is configured consistently on all tenant segments. In such scenarios, the EVPN-IRB multicast service MUST maintain the same mode of operation and SHALL NOT require any VLAN ID translation.

4.11. External source and receivers

The solution SHALL support sources and receivers external to the tenant domain - i.e., a multicast source inside the tenant domain can have receivers outside the tenant domain and vice versa.

4.12. Tenant RP placement

The solution SHALL allow a tenant to have its RP anywhere in the network. The RP can be placed inside the EVPN network, inside the MVPN network, or in an external domain.

5. IRB Unicast versus IRB Multicast

[EVPN-IRB] describes the operation of EVPN PEs in IRB mode for unicast traffic. The same IRB model used for unicast traffic in [EVPN-IRB], where an IP-VRF in an EVPN PE is attached to one or more bridge tables (BTs) via virtual IRB interfaces, is also applicable to multicast traffic. However, there are some noticeable differences between the IRB operation for unicast traffic described in [EVPN-IRB] and that for multicast traffic described in this document. For unicast traffic, intra-subnet traffic is bridged within the MAC-VRF associated with that subnet (i.e., a lookup based on the MAC-DA is performed), whereas inter-subnet traffic is routed in the corresponding IP-VRF (i.e., a lookup based on the IP-DA is performed). A given tenant can have one or more IP-VRFs; however, without loss of generality, this document assumes one IP-VRF per tenant. In the context of a given tenant's multicast traffic, intra-subnet traffic is bridged for non-IP traffic and is Layer-2 switched for IP traffic, whereas the tenant's inter-subnet multicast traffic is always routed in the corresponding IP-VRF. The difference between bridging and L2 switching for multicast traffic is that the former uses a MAC-DA lookup to forward the multicast traffic, whereas the latter uses an IP-DA lookup for such forwarding, where the forwarding states are built in the MAC-VRF using IGMP/MLD or PIM snooping.

5.1. Emulated Virtual LAN Service

EVPN does not provide a Virtual LAN (VLAN) service per [IEEE802.1Q] but rather an emulated VLAN service. This VLAN service emulation is done not only for unicast traffic but is also extended to intra-subnet multicast traffic as described in [EVPN-IGMP-PROXY] and [EVPN-PIM-PROXY]. For intra-subnet multicast, an EVPN PE builds multicast forwarding states in its bridge table (BT) based on snooping of IGMP/MLD and/or PIM messages, and the forwarding is performed based on the destination IP multicast address of the Ethernet frame rather than the destination MAC address, as noted above.
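The distinction between the bridging and L2-switching lookups can be illustrated with a small sketch. The following Python fragment is informal and illustrative only; the table layout and names (BridgeTable, forward_in_bt, etc.) are invented for this example and are not part of the solution:

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class BridgeTable:
    # MAC-DA entries: used for bridging of non-IP multicast/broadcast.
    mac_fib: Dict[str, List[str]] = field(default_factory=dict)
    # (S or '*', G) entries built via IGMP/MLD or PIM snooping:
    # used for L2 switching of intra-subnet IP multicast.
    mcast_fib: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

def forward_in_bt(bt: BridgeTable, frame: dict) -> List[str]:
    """Return the output ports for a frame arriving on this BT."""
    if frame["is_ip_multicast"]:
        # L2 switching: lookup keyed on the destination IP group,
        # not on the destination MAC address.
        key = (frame.get("src_ip", "*"), frame["group_ip"])
        return bt.mcast_fib.get(
            key, bt.mcast_fib.get(("*", frame["group_ip"]), []))
    # Bridging: MAC-DA lookup.
    return bt.mac_fib.get(frame["mac_da"], [])

Inter-subnet multicast, by contrast, is handled by an IP-DA lookup in the IP-VRF, as the following sections describe.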
In order to enable seamless integration of EVPN and MVPN PEs, this document extends the concept of an emulated VLAN service to multicast IRB applications such that intra-subnet IP multicast traffic can be treated the same as inter-subnet IP multicast traffic. This means that intra-subnet IP multicast traffic destined to remote PEs gets routed instead of being L2-switched - i.e., the TTL value gets decremented, and the Ethernet header of the L2 frame is decapsulated and encapsulated at both the ingress and egress PEs. It should be noted that non-IP multicast and L2 broadcast traffic still get bridged, and those frames get forwarded based on their destination MAC addresses.

6. Solution Overview

This section describes a multicast VPN solution based on [RFC6513] and [RFC6514] for EVPN PEs operating in IRB mode that want to perform seamless interoperability with their counterpart MVPN PEs.

6.1. Operational Model for EVPN IRB PEs

Without loss of generality, this section assumes that all EVPN PEs have IRB capability and operate in IRB mode for both unicast and multicast traffic (e.g., all EVPN PEs are homogeneous in terms of their capabilities and operational modes). As will be seen later, an EVPN network can consist of a mix of PEs where some are capable of multicast IRB and some are not, and the multicast operation of such a heterogeneous EVPN network is an extension of that of a homogeneous EVPN network. Therefore, we start with the multicast IRB solution description for the homogeneous EVPN network.

The EVPN PEs terminate IGMP/MLD messages from tenant host devices and PIM messages from tenant routers on their IRB interfaces, thus avoiding sending these messages over the MPLS/IP core. A tenant virtual/physical router (e.g., CE) attached to an EVPN PE becomes a multicast routing adjacency of that PE. Furthermore, the PE uses the MVPN BGP protocol and procedures per [RFC6513] and [RFC6514]. With respect to the multicast routing protocol between a tenant's virtual/physical router and the PE to which it is attached, any of the following PIM protocols is supported per [RFC6513]: PIM-SM with Any Source Multicast (ASM) mode, PIM-SM with Source Specific Multicast (SSM) mode, and PIM Bidirectional (BIDIR) mode. Support of PIM-DM (Dense Mode) is excluded in this document per [RFC6513].

The EVPN PEs use the MVPN BGP routes defined in [RFC6514] to convey tenant (S,G) or (*,G) states to other MVPN or EVPN PEs and to set up overlay trees (inclusive or selective) for a given MVPN instance. The root or a leaf of such an overlay tree is terminated on an EVPN or MVPN PE. Furthermore, this inclusive or selective overlay tree is terminated on a single IP-VRF of the EVPN or MVPN PE. In the case of an EVPN PE, these overlay trees never get terminated on the MAC-VRFs of that PE.

Overlay trees are instantiated by underlay provider tunnels (P-tunnels) - e.g., P2MP, MP2MP, or unicast tunnels per [RFC6513]. When there are several overlay trees mapped to a single underlay P-tunnel, the tunnel is referred to as an aggregate tunnel.
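For quick reference, the MCAST-VPN route types of [RFC6514] that this solution reuses can be summarized as an informal Python table (the comments indicate how they are used in this document):

# MCAST-VPN NLRI route types from [RFC6514] reused by EVPN PEs here.
MCAST_VPN_ROUTE_TYPES = {
    1: "Intra-AS I-PMSI A-D",   # PE discovery / inclusive tunnel setup
    2: "Inter-AS I-PMSI A-D",   # cross-AS discovery
    3: "S-PMSI A-D",            # selective tunnel binding per flow
    4: "Leaf A-D",
    5: "Source Active A-D",     # announces active sources
    6: "Shared Tree Join",      # carries tenant (C-*,C-G) joins
    7: "Source Tree Join",      # carries tenant (C-S,C-G) joins
}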
Figure-1 below depicts a scenario where a tenant's MVPN spans both EVPN and MVPN PEs, and where all EVPN PEs have multicast IRB capability. An EVPN PE (with multicast IRB capability) can be modeled as an MVPN PE where the virtual IRB interface of the EVPN PE (the virtual interface between a BT and the IP-VRF) can be considered a routed interface of the MVPN PE.

                EVPN PE1
             +------------+
   Src1 +----|(MAC-VRF1)  |                      MVPN PE3
   Rcvr1+----|       \    |   +---------+      +--------+
             |    (IP-VRF)|---|         |------|(IP-VRF)|--- Rcvr5
             |       /    |   |         |      +--------+
   Rcvr2+----|(MAC-VRF2)  |   |         |
             +------------+   |         |
                              |  MPLS/  |
                EVPN PE2      |   IP    |
             +------------+   |         |
   Rcvr3+----|(MAC-VRF1)  |   |         |       MVPN PE4
             |       \    |   |         |      +--------+
             |    (IP-VRF)|---|         |------|(IP-VRF)|--- Rcvr6
             |       /    |   +---------+      +--------+
   Rcvr4+----|(MAC-VRF3)  |
             +------------+

            Figure-1: EVPN & MVPN PEs Seamless Interop

Figure-2 depicts the modeling of EVPN PEs based on MVPN PEs, where an EVPN PE can be modeled as a PE that consists of an MVPN PE whose routed interfaces (e.g., attachment circuits) are replaced with IRB interfaces connecting each IP-VRF of the MVPN PE to a set of BTs. Similar to an MVPN PE, where an attachment circuit serves as a routed multicast interface for the IP-VRF associated with an MVPN instance, an IRB interface serves as a routed multicast interface for the IP-VRF associated with the MVPN instance. Since EVPN PEs run the MVPN protocols (e.g., [RFC6513] and [RFC6514]), for all practical purposes they look just like MVPN PEs to other PE devices. Such modeling of EVPN PEs transforms the multicast VPN operation of EVPN PEs into that of MVPN and thus simplifies the interoperability between EVPN and MVPN PEs to that of running a single unified solution based on MVPN.

                EVPN PE1
             +------------+
   Src1 +----|(MAC-VRF1)  |
             |       \    |
   Rcvr1+----|  +--------+|   +---------+      +--------+
             |  |MVPN PE1||---|         |------|MVPN PE3|--- Rcvr5
             |  +--------+|   |         |      +--------+
             |       /    |   |         |
   Rcvr2+----|(MAC-VRF2)  |   |         |
             +------------+   |         |
                              |  MPLS/  |
                EVPN PE2      |   IP    |
             +------------+   |         |
   Rcvr3+----|(MAC-VRF1)  |   |         |
             |       \    |   |         |
             |  +--------+|   |         |      +--------+
             |  |MVPN PE2||---|         |------|MVPN PE4|--- Rcvr6
             |  +--------+|   |         |      +--------+
             |       /    |   +---------+
   Rcvr4+----|(MAC-VRF3)  |
             +------------+

             Figure-2: Modeling EVPN PEs as MVPN PEs

Although modeling an EVPN PE as an MVPN PE conceptually simplifies the operation to that of a solution based on MVPN, the following operational aspects of EVPN need to be factored in when considering seamless integration between EVPN and MVPN PEs:

   1) Unicast route advertisements for IP multicast sources
   2) Multi-homing of IP multicast sources and receivers
   3) Mobility for tenant's sources and receivers
   4) Non-IP multicast traffic handling
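Before walking through these four aspects, the modeling of Figure-2 can be rendered as a toy data structure. This sketch is illustrative only (the class and field names are invented); its point is that the IP-VRF, where overlay trees terminate, is identical in both PE types:

from dataclasses import dataclass, field
from typing import List

@dataclass
class IPVRF:
    # Routed multicast interfaces of the tenant IP-VRF.
    routed_interfaces: List[str] = field(default_factory=list)

# MVPN PE: attachment circuits hang directly off the IP-VRF.
mvpn_pe3 = IPVRF(routed_interfaces=["ac-to-Rcvr5"])

# EVPN-IRB PE: the same IP-VRF, but each routed interface is an IRB
# into a BT; overlay trees still terminate on the IP-VRF only.
evpn_pe1 = IPVRF(routed_interfaces=["irb-MAC-VRF1", "irb-MAC-VRF2"])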
6.2. Unicast Route Advertisements for IP multicast Source

When an IP multicast source is attached to an EVPN PE, the unicast route for that IP multicast source needs to be advertised. When the source is attached to a Single-Active multi-homed ES, the EVPN DF PE is the PE that advertises a unicast route corresponding to the source IP address with the VRF Route Import extended community, which in turn is used as the Route Target for the Join (S,G) messages sent toward the source PE by the remote PEs. The EVPN PE advertises this unicast route using an EVPN route type 2 and an IPVPN unicast route, each along with the VRF Route Import extended community. The EVPN route type 2 is advertised with the Route Targets corresponding to both the IP-VRF and the MAC-VRF/BT, whereas the IPVPN unicast route is advertised with the RT corresponding to the IP-VRF. When unicast routes are advertised by MVPN PEs, they are advertised as IPVPN unicast routes along with the VRF Route Import extended community per [RFC6514].

When the source is attached to an All-Active multi-homed ES, the PE that learns the source advertises the unicast route for that source using an EVPN route type 2 and an IPVPN unicast route, along with the VRF Route Import extended community. The EVPN route type 2 is advertised with the Route Targets corresponding to both the IP-VRF and the MAC-VRF/BT, whereas the IPVPN unicast route is advertised with the RT corresponding to the IP-VRF. When the other multi-homing EVPN PEs for that ES receive this unicast EVPN route, they import the route and check whether they have learned the route locally for that ES. If they have, they do nothing. If they have not, they add the IP and MAC addresses to their IP-VRF and MAC-VRF/BT tables, respectively, with the local interface corresponding to that ES as the corresponding route adjacency. Furthermore, these PEs advertise an IPVPN unicast route, along with the VRF Route Import extended community and the Route Target corresponding to the IP-VRF, to the other remote PEs for that MVPN. Therefore, the remote PEs learn the unicast route corresponding to the source from all multi-homing PEs associated with that All-Active Ethernet Segment, even though only one of the multi-homing PEs may have directly learned the IP address of the source.

EVPN PEs advertise unicast routes as host routes using EVPN route type 2 for sources that are directly attached to a tenant BD that has been extended in the EVPN fabric. An EVPN PE may summarize sources (IP networks) behind a router that is attached to the EVPN PE, or sources that are connected to a BD that is not extended across the EVPN fabric, and advertise those routes with EVPN route type 5. EVPN host routes are advertised as IPVPN host routes to MVPN PEs only in case of the seamless interop mode.

Section 6.6 discusses connecting EVPN and MVPN networks with the gateway model. Section 9 extends the seamless interop procedures to EVPN-only fabrics as an IRB solution for multicast.

EVPN PEs only need to advertise unicast routes using EVPN route type 2 or route type 5 and don't need to advertise IPVPN routes within an EVPN-only fabric. No L3VPN provisioning is needed between EVPN PEs.

In the gateway model, an EVPN PE advertises unicast routes as IPVPN routes, along with the VRI extended community, for all multicast sources attached behind EVPN PEs. All IPVPN routes SHOULD be summarized while advertising them to MVPN PEs.
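The dual advertisement described in this section can be sketched as follows. This is a hypothetical illustration; the dict layout and the route/community constructors are stand-ins, not a real BGP API (the VRI construction itself is specified in section 9.1):

def advertise_multicast_source(pe, source_ip, source_mac, ip_vrf, mac_vrf):
    # VRF Route Import EC: (PE address, IP-VRF id); see section 9.1.
    vri = ("vrf-route-import", pe["loopback"], ip_vrf["local_id"])

    # EVPN route type 2 carries the RTs of both the MAC-VRF/BT and
    # the IP-VRF, plus the VRI.
    evpn_rt2 = {
        "route": "evpn-type-2", "mac": source_mac, "ip": source_ip,
        "route_targets": [mac_vrf["rt"], ip_vrf["rt"]],
        "ext_communities": [vri],
    }
    # The IPVPN unicast route carries only the IP-VRF RT, plus the VRI.
    ipvpn = {
        "route": "vpn-ip", "prefix": source_ip + "/32",
        "route_targets": [ip_vrf["rt"]],
        "ext_communities": [vri],
    }
    return [evpn_rt2, ipvpn]

routes = advertise_multicast_source(
    {"loopback": "192.0.2.1"}, "10.1.1.10", "00:aa:bb:cc:dd:ee",
    {"local_id": 1, "rt": "65000:10"}, {"rt": "65000:100"})

Remote PEs later use the received VRI as the Route Target of their (S,G) joins toward this PE.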
6.3. Multi-homing of IP Multicast Source and Receivers

EVPN [RFC7432] has extensive multi-homing capabilities that allow TSes to be multi-homed to two or more EVPN PEs in Single-Active or All-Active mode. In Single-Active mode, only one of the multi-homing EVPN PEs can receive/transmit traffic for a given subnet (a given BD) for that multi-homed Ethernet Segment (ES). In All-Active mode, any of the multi-homing EVPN PEs can receive/transmit unicast traffic, but only one of them (the DF PE) can send BUM traffic to the multi-homed ES for a given subnet.

The multi-homing mode (Single-Active versus All-Active) of a TS source can impact the MVPN procedures as described below.

6.3.1. Single-Active Multi-Homing

When a TS source resides on an ES that is multi-homed to two or more EVPN PEs operating in Single-Active mode, only one of the EVPN PEs can be active for the source subnet on that ES. Therefore, only one of the multi-homing PEs learns the unicast route of the TS source and advertises it using EVPN and IPVPN to the other PEs as described previously.

A downstream PE that receives a Join/Prune message from a TS host/router selects an Upstream Multicast Hop (UMH), which is the upstream PE that receives the IP multicast flow in the case of Single-Active multi-homing. An IP multicast flow belongs either to a source-specific tree (S,G) or to a shared tree (*,G). We use the notation (X,G) to refer to either (S,G) or (*,G), where X refers to S in the case of (S,G) and to the Rendezvous Point (RP) for G in the case of (*,G). Since the active PE (which is also the UMH PE) has advertised the unicast route for X along with the VRF Route Import EC, the downstream PEs select the UMH without any ambiguity based on the MVPN procedures described in section 5.1 of [RFC6513]. Any of the three algorithms described in that section works fine.

The multi-homing PE that receives the IP multicast flow on its local AC performs the following tasks:

- L2-switches the multicast traffic in the BT associated with the local AC over which it received the flow, if there are any interested receivers for that subnet.

- L3-routes the multicast traffic to other BTs for other subnets, if there are any interested receivers for those subnets.

- L3-routes the multicast traffic to other PEs per the MVPN procedures.

The multicast traffic can be sent on an Inclusive, Selective, or Aggregate-Selective tree. Regardless of what type of tree is used, only a single copy of the multicast traffic is received by the downstream PEs, and the multicast traffic is forwarded optimally from the upstream PE to the downstream PEs.

6.3.2. All-Active Multi-Homing

When a TS source resides on an ES that is multi-homed to two or more EVPN PEs operating in All-Active mode, any of the multi-homing PEs can learn the TS source's unicast route; however, that PE may not be the same PE that receives the IP multicast flow. Therefore, the procedures for Single-Active multi-homing need to be augmented for the All-Active scenario as below.

The multi-homing EVPN PE that receives the IP multicast flow on its local AC needs to perform the following task in addition to the ones listed in the previous section for Single-Active multi-homing: L2-switch the multicast traffic to the other multi-homing EVPN PEs for that ES via a multicast tunnel, which is called the intra-ES tunnel. There will be a dedicated tunnel for this purpose, which is different from the inter-subnet overlay tree/tunnel set up by the MVPN procedures.

When the multi-homing EVPN PEs receive the IP multicast flow via this tunnel, they treat it as if they had received the flow via their local ACs and thus perform the tasks mentioned in the previous section for Single-Active multi-homing.

The tunnel type for this intra-ES tunnel can be any of the supported tunnel types, such as ingress replication, P2MP tunnel, BIER, and Assisted Replication; however, given that the vast majority of multi-homing ESes are just dual-homing, a simple ingress replication tunnel can serve well. In the case of multi-homing to three or more EVPN PEs, other tunnel types such as P2MP, MP2MP, BIER, and Assisted Replication can be considered. For a given ES, since multicast traffic that is locally received by one multi-homing PE is sent to the other multi-homing PEs via this intra-ES tunnel, there is no need to send the multicast traffic via the MVPN tunnel to these multi-homing PEs - i.e., MVPN multicast tunnels are used only for remote EVPN and MVPN PEs. Multicast traffic can be sent over this intra-ES tunnel to the other multi-homing PEs (only one other in the case of dual-homing) for a given ES on either a fixed or an on-demand basis. If on demand, then one of the other multi-homing PEs that is selected as a UMH, upon receiving a join message from a downstream PE, sends a request to receive this multicast flow from the source multi-homing PE over the special intra-ES tunnel.

By feeding the IP multicast flow received on one of the EVPN multi-homing PEs to the interested EVPN PEs in the same multi-homing group, we have essentially enabled all the EVPN PEs in the multi-homing group to serve as the UMH for that IP multicast flow. Each of these UMH PEs advertises a unicast route for X in (X,G) along with the VRF Route Import EC to all PEs for that MVPN instance. The downstream PEs build a candidate UMH set based on the procedures described in section 5.1 of [RFC6513] and pick a UMH from the set. It should be noted that both the default UMH selection procedure based on the highest UMH PE IP address and the UMH selection algorithm based on the hash function specified in section 5.1.3 of [RFC6513] (which is also a MUST-implement algorithm) result in the same UMH PE being selected by all downstream PEs running the same algorithm. However, in order to allow a form of "equal cost load balancing", the hash algorithm is recommended to be used among all EVPN and MVPN PEs. This hash algorithm distributes the UMH selection for different IP multicast flows among the multi-homing PEs for a given ES.

Since all downstream PEs (EVPN and MVPN) use the same hash-based algorithm for UMH determination, they all choose the same upstream PE as their UMH for a given (X,G) flow and thus they all send their (X,G) join messages via BGP to the same upstream PE. This results in one of the multi-homing PEs receiving the join messages and thus sending the IP multicast flow for (X,G) over its associated overlay tree, even though all of the multi-homing PEs in the All-Active redundancy group have received the IP multicast flow (one of them directly via its local AC and the rest indirectly via the associated intra-ES tunnel). Therefore, only a single copy of the routed IP multicast flow is sent over the network regardless of the overlay tree type supported by the PEs - i.e., the overlay tree can be of type selective, aggregate selective, or inclusive. This gives the network operator the maximum flexibility to choose any overlay tree type that is suitable for its network operation and still be able to deliver only a single copy of the IP multicast flows to the egress PEs.
In other words, an egress PE only receives a single copy of the IP multicast flow over the network, because it receives the flow either via the EVPN intra-ES tunnel or via the MVPN inter-subnet tunnel. Furthermore, if it receives the flow via the MVPN inter-subnet tunnel, then only one of the multi-homing PEs associated with the source ES sends the IP multicast traffic.

Since the network of interest for seamless interoperability between EVPN and MVPN PEs is MPLS, the EVPN handling of BUM traffic for an MPLS network needs to be considered. EVPN [RFC7432] uses the ESI MPLS label for split-horizon filtering of Broadcast/Unknown unicast/Multicast (BUM) traffic from an All-Active multi-homing Ethernet Segment, to ensure that BUM traffic doesn't get looped back to the same Ethernet Segment that it came from. This split-horizon filtering mechanism applies as-is to the multicast IRB scenario because of the use of the intra-ES tunnel among multi-homing PEs. Since the multicast traffic received from a TS source on an All-Active ES by a multi-homing PE is bridged to all other multi-homing PEs in that group, the standard EVPN split-horizon filtering described in [RFC7432] applies as-is. Split-horizon filtering for non-MPLS encapsulations such as VxLAN is described in section 9.2.2, which deals with a DC network that consists of only EVPN PEs.
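To illustrate the hash-based UMH selection discussed above, the following sketch shows why all downstream PEs converge on the same upstream PE while different flows spread across the multi-homing PEs. The byte-sum below is a stand-in approximating the hash of section 5.1.3 of [RFC6513]; consult that section for the normative definition:

import ipaddress

def select_umh(candidate_pes, c_root, c_g):
    """candidate_pes: UMH candidate PE addresses (strings). Every
    downstream PE computes this identically over the same ordered
    candidate set, so they all pick the same UMH for a given (X,G)."""
    ordered = sorted(candidate_pes,
                     key=lambda a: int(ipaddress.ip_address(a)))
    digest = sum(ipaddress.ip_address(c_root).packed) + \
             sum(ipaddress.ip_address(c_g).packed)
    return ordered[digest % len(ordered)]

pes = ["192.0.2.1", "192.0.2.2"]           # All-Active ES peers
print(select_umh(pes, "10.0.0.10", "232.1.1.1"))
print(select_umh(pes, "10.0.0.10", "232.1.1.2"))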
6.4. Mobility for Tenant's Sources and Receivers

When a tenant system (TS), source or receiver, is multi-homed behind a group of multi-homing EVPN PEs, TS mobility SHALL be supported among the EVPN PEs. Furthermore, such TS mobility SHALL only cause a temporary disruption to the related multicast service among the EVPN and MVPN PEs. If a source moves from one EVPN PE to another, the EVPN mobility procedure SHALL discover this move; a new unicast route advertisement (using both EVPN and IP-VPN routes) is made by the EVPN PE to which the source has moved, per section 6.3 above, and a unicast route withdrawal (for both EVPN and IP-VPN routes) is performed by the EVPN PE from which the source has moved.

The move of a source results in disruption of the corresponding (S,G) IP multicast flow until the new unicast route associated with the source is advertised by the new PE along with the VRF Route Import EC, the join messages sent by the egress PEs are received by the new PE, the multicast state for that flow is installed in the new PE, and a new overlay tree is built for that source from the new PE to the egress PEs that are interested in receiving that IP multicast flow.

The move of a receiver results in disruption of the IP multicast flow to that receiver only until the new PE for that receiver discovers the source and joins the overlay tree for that flow.

6.5. Intra-Subnet BUM Traffic Handling

Link-local IP multicast traffic consists of IPv4 traffic with a destination address prefix of 224/8 and IPv6 traffic with a destination address prefix of FF02/16. Such IP multicast traffic, as well as non-IP multicast/broadcast traffic, is sent per the EVPN [RFC7432] BUM procedures and does not get routed via the IP-VRF for these multicast addresses. So, such BUM traffic will be limited to a given EVI/VLAN (e.g., a given subnet), whereas IP multicast traffic will be locally L2-switched for local interfaces attached to the same subnet and will be routed for local interfaces attached to a different subnet or for forwarding traffic to other EVPN PEs (refer to section 8 for the data plane operation).

6.6. EVPN and MVPN interworking with gateway model

The procedures specified in this document offer optimal multicast forwarding within a data center and also enable seamless interoperability of multicast traffic between EVPN and MVPN networks when the same tunnel types are used in the data plane.

There are a few other use cases for connecting MVPN networks to the EVPN fabric, other than the seamless interop model, where a gateway model is used to interconnect both networks:

   Case 1: All EVPN PEs in the fabric can be made MVPN exit points.
   Case 2: An MVPN network can be attached behind one EVPN PE or a
           subset of EVPN PEs.
   Case 3: An MVPN network (MVPN PEs) that uses a different tunnel
           model can be directly attached to the EVPN fabric.

In the gateway model, MVPN routes from one domain are terminated at the gateway PE and re-originated for the other domain.

With use cases 1 and 2, all PEs connected to an EVPN fabric can use one data plane to send and receive traffic within the fabric/data center. Also, IPVPN routes need not be advertised inside the fabric. Instead, the PE where MVPN is terminated should advertise the IPVPN routes as EVPN routes.

With use case 3, the fabric will get two copies per multicast flow if receivers exist in both the MVPN and EVPN networks (two different data planes are used to send the traffic in the fabric: one for the EVPN network and one for the MVPN network).

7. Control Plane Operation

For seamless interop between EVPN and MVPN PEs, the control plane may need to set up the following three types of multicast tunnels. The first two are among EVPN PEs only, but the third one is among EVPN and MVPN PEs.

   1) Intra-ES IP multicast tunnel

   2) Intra-subnet BUM tunnel

   3) Inter-subnet IP multicast tunnel

7.1. Intra-ES/Intra-Subnet IP Multicast Tunnel

As described in section 6.3.2, when a multicast source sits behind an All-Active ES, an intra-subnet multicast tunnel is needed among the multi-homing EVPN PEs for that ES to carry a multicast flow received by one of the multi-homing PEs to the other PEs in that ES. We refer to this multicast tunnel as the Intra-ES/Intra-Subnet tunnel. The vast majority of All-Active multi-homing for TOR devices in DC networks is just dual-homing, which means that a multicast flow received by one of the dual-homing PEs only needs to be sent to the other dual-homing PE. Therefore, a simple ingress replication tunnel is all that is needed. In the case of multi-homing to three or more EVPN PEs, other tunnel types such as P2MP, MP2MP, BIER, and Assisted Replication can be considered. It should be noted that this intra-ES tunnel is only needed for All-Active multi-homing; it is not required for Single-Active multi-homing.

The EVPN PEs belonging to a given All-Active ES discover each other using the EVPN Ethernet Segment route per the procedures described in [RFC7432]. These EVPN PEs perform DF election per [RFC7432], [EVPN-DF-Framework], or other DF election algorithms to decide who is the DF for a given BD.
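As an informal illustration of this discovery-plus-election step, the sketch below applies the default service-carving algorithm of [RFC7432] section 8.5 (the DF for VLAN V is the PE at position V mod N in the numerically ordered list of the PE IP addresses attached to the ES); other algorithms from [EVPN-DF-Framework] may be used instead:

import ipaddress

def elect_df(es_peer_ips, vlan):
    """DF for a BD = PE at position (V mod N) in the numerically
    ordered list of PEs attached to the ES."""
    ordered = sorted(es_peer_ips,
                     key=lambda a: int(ipaddress.ip_address(a)))
    return ordered[vlan % len(ordered)]

peers = ["192.0.2.2", "192.0.2.1"]   # learned via ES route (RT-4)
print(elect_df(peers, 100))          # -> 192.0.2.1
print(elect_df(peers, 101))          # -> 192.0.2.2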
If the BD belongs to a tenant that has IRB IP multicast enabled, then for the fixed mode, each PE sets up an intra-ES tunnel to forward IP multicast traffic received locally on that BD to the other multi-homing PE(s) for that ES. Therefore, IP multicast traffic received via a local attachment circuit is sent on this tunnel, on the associated IRB interface for that BT, and on other local attachment circuits if there are interested receivers for them. The other multi-homing EVPN PEs treat this intra-ES tunnel just like their local ACs - i.e., the multicast traffic received over this tunnel is treated as if it were received via a local AC. Thus, the multi-homing PEs cannot receive the same IP multicast flow from an MVPN tunnel (e.g., over an IRB interface for that BD), because between a source behind a local AC and a source behind a remote PE, the PE always chooses its local AC.

When ingress replication is used for the intra-ES tunnel, every PE in the All-Active multi-homing ES has all the information needed to set up these tunnels - i.e., each PE knows, via the EVPN Ethernet Segment route, which other multi-homing PEs serve that ES and can use this information to set up the intra-ES/intra-subnet IP multicast tunnel among them.

7.2. Intra-Subnet BUM Tunnel

As the name implies, this tunnel is set up to carry BUM traffic for a given subnet/BD among EVPN PEs. In [RFC7432], this overlay tunnel is used for transmission of all BUM traffic, including user IP multicast traffic. However, for multicast traffic handling in EVPN-IRB PEs, this tunnel is used for all broadcast, unknown-unicast, and non-IP multicast traffic, as well as link-local IP multicast traffic - i.e., it is used for all BUM traffic except user IP multicast traffic. This tunnel is set up using the IMET route for a given EVI/BD. The composition and advertisement of IMET routes are exactly per [RFC7432]. It should be noted that when an EVPN All-Active multi-homing PE uses both this tunnel and the intra-ES tunnel, there SHALL be no duplication of multicast traffic over the network, because the two tunnels carry different types of multicast traffic - i.e., the intra-ES tunnel among multi-homing PEs carries only user IP multicast traffic, whereas the intra-subnet BUM tunnel carries link-local IP multicast traffic and BUM traffic (with non-IP multicast).

7.3. Inter-Subnet IP Multicast Tunnel

As its name implies, this tunnel is set up to carry IP-only multicast traffic for a given tenant across all its subnets (BDs) among EVPN and MVPN PEs.

The following NLRIs from [RFC6514] are used for setting up this inter-subnet tunnel in the network:

   The Intra-AS I-PMSI A-D route is used for the setup of the default
   underlay tunnel (also called the inclusive tunnel) for a tenant
   IP-VRF. The tunnel attributes are indicated using the PMSI attribute
   carried with this route.

   The S-PMSI A-D route is used for the setup of customer-flow-specific
   underlay tunnels. This enables selective delivery of data to PEs
   having active receivers and optimizes fabric bandwidth utilization.
   The tunnel attributes are indicated using the PMSI attribute carried
   with this route.

Each EVPN PE supporting a specific MVPN instance discovers the set of other PEs in its AS that are attached to sites of that MVPN using the Intra-AS I-PMSI A-D route (route type 1) per [RFC6514].
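A hypothetical composition of these two routes is sketched below; the dict layout is a stand-in for the real NLRI and attribute encoding of [RFC6514]:

def intra_as_ipmsi_ad(originator, rd, tunnel_type, tunnel_id):
    """Route type 1: advertises the default (inclusive) tunnel of a
    tenant IP-VRF; tunnel parameters ride in the PMSI Tunnel attr."""
    return {
        "mcast_vpn_route_type": 1,
        "rd": rd, "originator": originator,
        "pmsi_tunnel": {"type": tunnel_type, "id": tunnel_id},
    }

def spmsi_ad(originator, rd, c_s, c_g, tunnel_type, tunnel_id):
    """Route type 3: binds one (C-S,C-G) flow to a selective tunnel
    so traffic reaches only PEs with active receivers."""
    return {
        "mcast_vpn_route_type": 3,
        "rd": rd, "originator": originator, "c_s": c_s, "c_g": c_g,
        "pmsi_tunnel": {"type": tunnel_type, "id": tunnel_id},
    }

print(intra_as_ipmsi_ad("192.0.2.1", "65000:1",
                        "ingress-replication", "192.0.2.1"))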
Such a PE can also discover the set of other ASes that have PEs attached to sites of that MVPN using the Inter-AS I-PMSI A-D route (route type 2) per [RFC6514]. After the discovery of the PEs that are attached to sites of the MVPN, an inclusive overlay tree (I-PMSI) can be set up for carrying tenant multicast flows for that MVPN; however, this is not a requirement per [RFC6514], and it is possible to adopt a policy in which all tenant flows are carried on S-PMSIs.

An EVPN-IRB PE sends a user IP multicast flow to other EVPN and MVPN PEs over this inter-subnet tunnel, which is instantiated using an MVPN I-PMSI or S-PMSI. This tunnel can be considered as being originated and terminated from/to the IP-VRFs of the EVPN/MVPN PEs, whereas the intra-subnet tunnel is originated/terminated among the MAC-VRFs of the EVPN PEs.

7.4. IGMP Hosts as TSes

If a tenant system which is an IGMP host is multi-homed to two or more EVPN PEs using All-Active multi-homing, then IGMP join and leave messages are synchronized between these EVPN PEs using the EVPN IGMP Join Synch route (route type 7) and the EVPN IGMP Leave Synch route (route type 8) per [EVPN-IGMP-PROXY]. IGMP states are built in the corresponding BDs of the multi-homing EVPN PEs. In [EVPN-IGMP-PROXY], the DF PE for that BD originates an EVPN Selective Multicast Ethernet Tag (SMET) route to other EVPN PEs. However, here there is no need to use SMET routes, because the IGMP messages are terminated by the EVPN-IRB PE, and tenant (*,G) or (S,G) join messages are sent via the MVPN Shared Tree Join route (route type 6) or Source Tree Join route (route type 7), respectively, of the MCAST-VPN NLRI per [RFC6514]. In the case of a network with only IGMP hosts, the preferred mode of operation is that of the Shortest Path Tree (SPT) per section 14 of [RFC6514]. This mode is only supported for PIM-SM and avoids the RP configuration overhead. Such a mode is chosen by provisioning/configuration.

7.5. TS PIM Routers

Just like an MVPN PE, an EVPN PE runs a separate tenant multicast routing instance (VPN-specific) per MVPN instance, and the following tenant multicast routing instances are supported:

   - PIM Sparse Mode (PIM-SM) with the ASM service model
   - PIM Sparse Mode with the SSM service model
   - PIM Bidirectional Mode (BIDIR-PIM), which uses bidirectional
     tenant-trees to support the ASM service model

A given tenant's PIM join messages for (*,G) or (S,G) are processed by the corresponding tenant multicast routing protocol and are advertised over the MPLS/IP network using the Shared Tree Join route (route type 6) and the Source Tree Join route (route type 7), respectively, of the MCAST-VPN NLRI per [RFC6514].

8. Data Plane Operation

When an EVPN-IRB PE receives an IGMP/MLD join message over one of its Attachment Circuits (ACs), it adds that AC to its Layer-2 (L2) OIF list. This L2 OIF list is associated with the MAC-VRF/BT corresponding to the subnet of the tenant device that sent the IGMP/MLD join. Therefore, tenant (S,G) or (*,G) forwarding entries are created/updated for the corresponding MAC-VRF/BT based on these source and group IP addresses. Furthermore, the IGMP/MLD join message is propagated over the corresponding IRB interface and is processed by the tenant multicast routing instance, which creates the corresponding tenant (S,G) or (*,G) Layer-3 (L3) forwarding entries and adds this IRB interface to the L3 OIF list. An IRB interface is removed as an L3 OIF when all L2 tenant (S,G) or (*,G) forwarding state is removed for the MAC-VRF/BT associated with that IRB interface. Furthermore, a tenant (S,G) or (*,G) L3 forwarding state is removed when all of its L3 OIFs are removed - i.e., when all the IRB and L3 interfaces associated with that tenant (S,G) or (*,G) are removed.
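The state maintenance just described can be sketched as follows; the container names are invented for this illustration, with one entry per tenant (S,G) or (*,G):

from collections import defaultdict

l2_oifs = defaultdict(set)   # (bt, s, g)  -> set of ACs
l3_oifs = defaultdict(set)   # (vrf, s, g) -> set of IRB/L3 interfaces

def igmp_join(vrf, bt, ac, s, g):
    l2_oifs[(bt, s, g)].add(ac)            # AC joins the L2 OIF list
    l3_oifs[(vrf, s, g)].add(f"irb-{bt}")  # join propagated over IRB

def igmp_leave(vrf, bt, ac, s, g):
    l2_oifs[(bt, s, g)].discard(ac)
    if not l2_oifs[(bt, s, g)]:
        # Last L2 state gone: the IRB is removed as an L3 OIF; the L3
        # entry itself disappears once every L3 OIF is removed.
        l3_oifs[(vrf, s, g)].discard(f"irb-{bt}")
        if not l3_oifs[(vrf, s, g)]:
            del l3_oifs[(vrf, s, g)]

igmp_join("tenant1", "bd10", "ac1", "10.0.0.10", "239.1.1.1")
igmp_leave("tenant1", "bd10", "ac1", "10.0.0.10", "239.1.1.1")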
When an EVPN PE receives IP multicast traffic from one of its ACs, if it has any attached receivers for that subnet, it performs L2 switching of the intra-subnet traffic within the BT attached to that AC. If the multicast flow is received over an AC that belongs to an All-Active ES, then the multicast flow is also sent over the intra-ES/intra-subnet tunnel among the multi-homing PEs. The EVPN PE then sends the multicast traffic over the corresponding IRB interface. The multicast traffic then gets routed in the corresponding IP-VRF and gets forwarded to the interfaces in the L3 OIF list, which can include other IRB interfaces, other L3 interfaces directly connected to TSes, and the MVPN inter-subnet tunnel, which is instantiated by an I-PMSI or S-PMSI tunnel. When the multicast packet is routed within the IP-VRF of the EVPN PE, its Ethernet header is stripped and its TTL gets decremented as the result of this IP routing. When the multicast traffic is received on an IRB interface by the BT corresponding to that interface, it gets L2-switched and sent over the ACs that belong to the L2 OIF list.

8.1. Intra-Subnet L2 Switching

Rcvr1 in Figure 1 is connected to PE1 in MAC-VRF1 (same as Src1) and sends an IGMP join for (C-S,C-G); IGMP snooping will record this state in a local bridging entry. A routing entry will be formed as well, which will point to MAC-VRF1 as the RPF for Src1. We assume that Src1 is known via ARP or similar procedures. Rcvr1 will get a locally bridged copy of the multicast traffic from Src1. Rcvr3 is also connected in MAC-VRF1, but to PE2, and hence will send an IGMP join, which will be recorded at PE2. PE2 will also form a routing entry, and the RPF will be assumed to be the tenant tunnel "Tenant1" formed beforehand using the MVPN procedures. This will also cause the multicast control plane to initiate a BGP MCAST-VPN type 7 route, which will include the VRI for PE1 and hence will be accepted on PE1. PE1 will include the Tenant1 tunnel as an Outgoing Interface (OIF) in the routing entry. Now, since it has knowledge of remote receivers via the MVPN control plane, it will encapsulate the original multicast traffic in the Tenant1 tunnel towards the core.

8.2. Inter-Subnet L3 Routing

Rcvr2 in Figure 1 is connected to PE1 in MAC-VRF2, and hence PE1 will record its membership in MAC-VRF2. Since MAC-VRF2 is enabled with IRB, it gets added as another OIF to the routing entry formed for (C-S,C-G). Rcvr2 and Rcvr4 are also in different MAC-VRFs than the multicast speaker Src1 and hence need inter-subnet forwarding. PE2 will form local bridging entries due to the IGMP joins received from Rcvr3 and Rcvr4, respectively. PE2 now adds another OIF "MAC-VRF2" to its existing routing entry, but there is no change in the control plane states, since PE2 has already sent the MVPN route and no further signaling is required. Also, since Src1 is not part of the MAC-VRF2 subnet, it is treated as a routing OIF, and hence the MAC header gets modified as per the normal procedures for routing.
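The (C-S,C-G) entries that this walkthrough leaves behind on PE1 and PE2 can be written out as plain data (an informal rendering; the field names are illustrative):

pe1_entry = {
    "flow": ("C-S", "C-G"),
    "rpf": "MAC-VRF1",             # Src1's subnet, local AC
    "oifs": ["MAC-VRF1",           # Rcvr1, intra-subnet (bridged)
             "MAC-VRF2",           # Rcvr2, via IRB (routed OIF)
             "Tenant1-tunnel"],    # remote receivers via MVPN
}

pe2_entry = {
    "flow": ("C-S", "C-G"),
    "rpf": "Tenant1-tunnel",       # source reached over the core
    "oifs": ["MAC-VRF1",           # Rcvr3, same subnet as Src1
             "MAC-VRF2"],          # routed OIF added locally
}

# Only the first OIF added at PE2 triggered BGP signaling (the type 7
# join toward PE1); adding further local OIFs is a purely local matter.
print(pe1_entry, pe2_entry, sep="\n")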
8.1. Intra-Subnet L2 Switching

Rcvr1 in Figure 1 is connected to PE1 in MAC-VRF1 (same as Src1) and sends an IGMP join for (C-S, C-G); IGMP snooping will record this state in a local bridging entry. A routing entry will be formed as well, pointing to MAC-VRF1 as the RPF for Src1. We assume that Src1 is known via ARP or similar procedures. Rcvr1 will get a locally bridged copy of the multicast traffic from Src1. Rcvr3 is also connected in MAC-VRF1 but to PE2, and hence sends an IGMP join, which is recorded at PE2. PE2 will also form a routing entry, with the RPF assumed to be the tenant tunnel "Tenant1" formed beforehand using MVPN procedures. This also causes the multicast control plane to initiate a BGP MCAST-VPN type 7 route, which includes the VRI for PE1 and hence is accepted on PE1. PE1 will include the Tenant1 tunnel as an Outgoing Interface (OIF) in the routing entry. Now, since it has knowledge of remote receivers via the MVPN control plane, it will encapsulate the original multicast traffic in the Tenant1 tunnel towards the core.

8.2. Inter-Subnet L3 Routing

Rcvr2 in Figure 1 is connected to PE1 in MAC-VRF2, and hence PE1 will record its membership in MAC-VRF2. Since MAC-VRF2 is enabled with IRB, it gets added as another OIF to the routing entry formed for (C-S, C-G). Rcvr2 and Rcvr4 are also in different MAC-VRFs than the multicast speaker Src1 and hence need inter-subnet forwarding. PE2 will form local bridging entries due to the IGMP joins received from Rcvr3 and Rcvr4, respectively. PE2 now adds another OIF, 'MAC-VRF2', to its existing routing entry. There is no change in control plane state, since PE2 has already sent the MVPN route and no further signaling is required. Also, since Src1 is not part of the MAC-VRF2 subnet, MAC-VRF2 is treated as a routing OIF, and hence the MAC header gets modified as per normal routing procedures. PE3 forms a routing entry very similar to PE2. It is to be noted that PE3 does not have MAC-VRF1 configured locally but can still receive the multicast data traffic over the Tenant1 tunnel formed via MVPN procedures.

9. DCs with only EVPN PEs

As mentioned earlier, the proposed solution can be used as a routed multicast solution in data center networks with only EVPN PEs (e.g., routed multicast VPN only among EVPN PEs). It should be noted that the scope of intra-subnet forwarding for the solution described in this document is limited to a single EVPN PE for Single-Active multi-homing and to the multi-homing PEs for All-Active multi-homing. In other words, IP multicast traffic that needs to be forwarded from the source PE to remote PEs is routed to the remote PEs regardless of whether the traffic is intra-subnet or inter-subnet. As a result, the TTL value for intra-subnet traffic that spans two or more PEs gets decremented.

However, if there are applications that require intra-subnet multicast traffic to be L2 forwarded (i.e., traffic sent with a TTL value of 1), Section 11 discusses some options to support such applications.

9.1. Setup of overlay multicast delivery

It must be emphasized that this solution poses no restriction on the setup of the tenant BDs: neither the source PE nor the receiver PEs need to know or learn about the BD configuration on other PEs in the MVPN. The Reverse Path Forwarder (RPF) is selected per the tenant multicast source and the IP-VRF, in compliance with the procedures in [RFC6514], using the incoming EVPN route type 2 or 5 NLRI per [RFC7432].

The VRF Route Import (VRI) extended community that is carried with the IP-VPN routes in [RFC6514] MUST be carried with the EVPN unicast routes when these routes are used. The construction and processing of the VRI are consistent with [RFC6514]. The VRI MUST uniquely identify the PE which is advertising a multicast source and the IP-VRF it resides in.

The VRI is constructed as follows:

   - The 4-octet Global Administrator field MUST be set to an IP
     address of the PE. This address SHOULD be common for all the
     IP-VRFs on the PE (e.g., this address may be the PE's loopback
     address or VTEP address).

   - The 2-octet Local Administrator field associated with a given
     IP-VRF contains a number that uniquely identifies that IP-VRF
     within the PE that contains the IP-VRF.

An EVPN PE MUST have a Route Target extended community to import/export MVPN routes. In a data center environment, it is desirable to have this RT auto-generated rather than statically configured. The following is one recommended model to auto-generate the MVPN RT:

   - The Global Administrator field of the MVPN RT MAY be set
     to the BGP AS number.

   - The Local Administrator field of the MVPN RT MAY be set to
     the VNI associated with the tenant VRF.
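A non-normative sketch of these two constructions follows; the field packing is schematic (actual encodings follow [RFC6514]), and all names and values are illustrative assumptions.

      import ipaddress, struct

      def build_vri(pe_ip, vrf_id):
          # VRF Route Import: 4-octet Global Administrator = an IP
          # address of the PE (e.g., loopback or VTEP address);
          # 2-octet Local Administrator = a number unique to the
          # IP-VRF within that PE.
          ga = int(ipaddress.IPv4Address(pe_ip))
          return struct.pack("!IH", ga, vrf_id)

      def auto_mvpn_rt(bgp_as, l3vni):
          # One recommended auto-generation model: Global
          # Administrator = BGP AS number, Local Administrator =
          # VNI associated with the tenant VRF.
          return (bgp_as, l3vni)

      # Example with illustrative values: VRI for IP-VRF 10 on the
      # PE whose loopback is 192.0.2.1; RT for AS 65001, VNI 5000.
      vri = build_vri("192.0.2.1", 10)
      rt  = auto_mvpn_rt(65001, 5000)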
Every PE which detects a local receiver via a local IGMP join or a local PIM join for a specific source (overlay SSM mode) MUST terminate the IGMP/PIM signaling at the IP-VRF and generate a (C-S,C-G) join via the BGP MCAST-VPN route type 7 per [RFC6514] if and only if the RPF for the source points to the fabric. If the RPF points to a local multicast source on the same MAC-VRF or a different MAC-VRF on that PE, the MCAST-VPN route MUST NOT be advertised, and data traffic will be locally routed/bridged to the receiver as detailed in section 6.2.

The VRI received with the EVPN route type 2 or 5 NLRI from the source PE will be appended as an export route-target extended community. More details about the handling of various types of local receivers are in section 10. The PE which has advertised the unicast route with the VRI will import the incoming MCAST-VPN NLRI in the IP-VRF with the same import route-target extended community, and other PEs SHOULD ignore it. Following this procedure, the source PE learns about the existence of at least one remote receiver in the tenant overlay and programs its data plane accordingly, so that a single copy of the multicast data is forwarded into the fabric using the tenant VRF tunnel.

If the multicast source is unknown (overlay ASM mode), the MCAST-VPN route type 6 (C-*,C-G) join SHOULD be targeted towards the designated overlay Rendezvous Point (RP) by appending the received RP VRI as an export route-target extended community. Every PE which detects a local source registers with its RP PE. That is how the RP learns about the tenant source(s) and group(s) within the MVPN. Once the overlay RP PE receives either the first remote (C-RP,C-G) join or a local IGMP/PIM join, it will trigger an MCAST-VPN route type 7 (C-S,C-G) join towards the actual source PE for which it has received a PIM register message, in full compliance with regular PIM procedures. This causes the source PE to advertise the MCAST-VPN Source Active A-D route (MCAST-VPN route type 5) towards all PEs. The Source Active A-D route is used to inform all PEs in a given MVPN about the active multicast source for switching from the RPT to the SPT when MVPNs use tenant RP-shared trees (i.e., rooted at the tenant's RP) per section 13 of [RFC6514]. This is done in order to choose a single forwarder PE and to suppress receiving duplicate traffic. In such scenarios, the active multicast source is used by the receiver PEs to join the SPT if they have not received tenant (S,G) joins, and by the RPT PEs to prune off the tenant (S,G) state from the RPT. The Source Active A-D route is also used for MVPN scenarios without tenant RP-shared trees. In such scenarios, the receiver PEs with tenant (*,G) state use the Source Active A-D route to determine which upstream PEs, with sources behind them, to join per section 14 of [RFC6514] - i.e., to suppress joining the overlay shared tree.
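The ASM sequence above can be summarized with the following non-normative sketch, where "advertise" stands in for the BGP machinery and all names are illustrative:

      # Receiver PE: the source is unknown, so target a Shared Tree
      # Join (route type 6) at the overlay RP PE using the RP's VRI.
      def on_local_asm_receiver(pe, group, rp_vri):
          pe.advertise(route_type=6, key=("*", group), export_rt=rp_vri)

      # RP PE: on the first remote (C-RP,C-G) join or a local join,
      # pull the flow from the source PE learned via PIM register.
      def on_first_join_at_rp(rp_pe, source, group, src_vri):
          rp_pe.advertise(route_type=7, key=(source, group),
                          export_rt=src_vri)

      # Source PE: announce the active source to all PEs so that
      # receiver PEs can switch from the RPT to the SPT.
      def on_type7_at_source_pe(src_pe, source, group):
          src_pe.advertise(route_type=5, key=(source, group))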
9.2. Handling of different encapsulations

Just as in [RFC6514], the MVPN I-PMSI and S-PMSI A-D routes are used to form the overlay multicast tunnels and to signal the tunnel type using the P-Multicast Service Interface Tunnel (PMSI Tunnel) attribute.

9.2.1. MPLS Encapsulation

[RFC6514] assumes an MPLS/IP core, and there is no modification to its signaling procedures and encoding for PMSI tunnel formation. Also, there is no need for a gateway to interoperate with non-EVPN PEs supporting [RFC6514]-based MVPN over IP/MPLS.

9.2.2. VxLAN Encapsulation

In order to signal VXLAN, the corresponding BGP encapsulation extended community [TUNNEL-ENCAP] SHOULD be appended to the MVPN I-PMSI and S-PMSI A-D routes. The MPLS label in the PMSI Tunnel attribute MUST be the Virtual Network Identifier (VNI) associated with the customer MVPN. The supported PMSI tunnel types with VXLAN encapsulation are: PIM-SSM Tree, PIM-SM Tree, BIDIR-PIM Tree, and Ingress Replication [RFC6514]. Further details are in [RFC8365].

In this case, a gateway is needed for interoperation between the EVPN PEs and non-EVPN MVPN PEs. The gateway should re-originate the control plane signaling with the relevant tunnel encapsulation on either side. In the data plane, the gateway terminates the tunnels formed on either side and performs the relevant stitching/re-encapsulation on data packets.

9.2.3. Other Encapsulations

In order to signal a different tunneling encapsulation, such as NVGRE, VXLAN-GPE, or GENEVE, the corresponding BGP encapsulation extended community [TUNNEL-ENCAP] SHOULD be appended to the MVPN I-PMSI and S-PMSI A-D routes. If the Tunnel Type field in the encapsulation extended community is set to a type which requires a Virtual Network Identifier (VNI), e.g., VXLAN-GPE or NVGRE [TUNNEL-ENCAP], then the MPLS label in the PMSI Tunnel attribute MUST be the VNI associated with the customer MVPN. As in the VXLAN case, a gateway is needed for interoperation between the EVPN-IRB PEs and non-EVPN MVPN PEs.
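The following non-normative sketch shows how the PMSI Tunnel attribute and encapsulation extended community could be populated; the constants follow the PMSI tunnel type values of [RFC6514], while the structure and names are illustrative assumptions.

      # PMSI tunnel types per [RFC6514] usable with VXLAN:
      PIM_SSM_TREE, PIM_SM_TREE, BIDIR_PIM_TREE, INGRESS_REPL = 3, 4, 5, 6

      def build_pmsi_ad_route(tunnel_type, encap, vni=None):
          route = {"pmsi": {"tunnel_type": tunnel_type}}
          if encap != "MPLS":
              # e.g., VXLAN, NVGRE, VXLAN-GPE, GENEVE [TUNNEL-ENCAP]
              route["ext_communities"] = [("bgp-encap", encap)]
              # For encapsulations that require a VNI, the MPLS label
              # field of the PMSI Tunnel attribute carries the VNI of
              # the customer MVPN.
              route["pmsi"]["label"] = vni
          return route

      # Example with illustrative values: an I-PMSI A-D route for a
      # VXLAN ingress-replication tunnel with VNI 5000.
      r = build_pmsi_ad_route(INGRESS_REPL, "VXLAN", vni=5000)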
10. DCI with MPLS in WAN and VxLAN in DCs

This section describes the interoperation between MVPN PEs in a WAN using MPLS encapsulation and EVPN PEs in a DC network using VxLAN encapsulation. Since the tunnel encapsulations in these networks are different, at least one gateway is needed between them; usually, two or more are required for redundancy and load balancing purposes. In such scenarios, a DC network can be represented as a customer network that is multi-homed to two or more MVPN PEs via L3 interfaces, and thus standard MVPN multi-homing procedures are applicable here. It should be noted that an MVPN overlay tunnel over the DC network is terminated on the IP-VRF of the gateway and not on the MAC-VRF/BTs. Therefore, the considerations for loop prevention and split-horizon filtering described in [INTERCON-EVPN] are not applicable here. Some aspects of the multi-homing between VxLAN DC networks and the MPLS WAN are in common with [INTERCON-EVPN].

10.1. Control plane inter-connect

The gateway(s) MUST be set up with the inclusive set of all the IP-VRFs that span the two domains. On each gateway, there will be at least two BGP sessions: one towards the DC side and the other towards the WAN side. Usually, for redundancy purposes, more sessions are set up on each side. The unicast route propagation follows the exact same procedures as in [INTERCON-EVPN]. Hence, a multicast host located in either domain is advertised with the gateway IP address as the next hop to the other domain. As a result, PEs view the hosts in the other domain as directly attached to the gateway, and all inter-domain multicast signaling is directed towards the gateway(s). MVPN routes of type 1-7 received from either side of the gateway(s) MUST NOT be reflected back to the same side; they are processed locally and re-advertised (if needed) to the other side:

   - Intra-AS I-PMSI A-D Route: these are distributed within
     each domain to form the overlay tunnels, which terminate at
     the gateway(s). They are not passed to the other side of the
     gateway(s).

   - C-Multicast Route: joins are imported into the corresponding
     IP-VRF on each gateway and advertised as a new route to the
     other side with the following modifications (the rest of the
     NLRI fields and path attributes remain untouched):
     * The Route Distinguisher is set to that of the IP-VRF
     * The Route Target is set to the exported route-target
       list of the IP-VRF
     * The PMSI Tunnel attribute and BGP encapsulation
       extended community are modified according to
       section 9.2
     * The next hop is set to the IP address which
       represents the gateway in either domain

   - Source Active A-D Route: same as joins

   - S-PMSI A-D Route: these are passed to the other side to form
     selective PMSI tunnels per (C-S,C-G) from the gateway
     to the PEs in the other domain, provided that domain contains
     receivers for the given (C-S,C-G). Modifications similar to
     those made to joins are made to the newly originated S-PMSI.

In addition, the Originating Router's IP address is set to the GW's IP address. Multicast signaling from/to hosts on local ACs on the gateway(s) is generated and propagated in both domains (if needed) per the procedures in section 7 of this document and in [RFC6514] with no change. It must be noted that for a locally attached source, the gateway will program in its forwarding plane an OIF for every domain from which it receives a remote join, and a different encapsulation will be used on the data packets sent to each domain.
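A non-normative sketch of the C-Multicast re-origination above follows; the helper names (ip_vrf_for, address_in, encap_for, advertise) are illustrative assumptions standing in for a gateway's BGP and VRF machinery.

      # Sketch: re-originate a C-Multicast join received on one side
      # of the gateway towards the other domain.
      def reoriginate_cmcast(gw, route, side):      # side: "DC" or "WAN"
          other = "WAN" if side == "DC" else "DC"
          vrf = gw.ip_vrf_for(route)                # import into the IP-VRF
          new = dict(route)                         # other fields untouched
          new["rd"] = vrf.rd                        # RD of the IP-VRF
          new["rt"] = vrf.export_rts                # IP-VRF export RT list
          new["next_hop"] = gw.address_in(other)    # GW address in domain
          new["encap"] = gw.encap_for(other)        # per section 9.2
          gw.advertise(new, domain=other)           # never back to 'side'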
10.2. Data plane inter-connect

Traffic forwarding procedures on the gateways are the same as those described for PEs in sections 5 and 6, except that, unlike a non-border leaf PE, the gateway will not only route the incoming traffic from one side to its local receivers, but will also send it to the remote receivers in the other domain after decapsulating it and appending the right encapsulation. The OIF and IIF are programmed in the FIB based on the received joins from either side and the RPF calculation to the source or RP. The decapsulation and encapsulation actions are programmed based on the received I-PMSI or S-PMSI A-D routes from either side. If there is more than one gateway between two domains, the multi-homing procedures described in the following section must be considered so that incoming traffic from one side is not looped back to the other gateway.

The multicast traffic from local sources on each gateway flows to the other gateway with the preferred WAN encapsulation.

11. Supporting applications with TTL value 1

It is possible that some deployments have a host in the tenant domain that sends multicast traffic with a TTL value of 1. The interested receivers for that traffic flow may be attached to different PEs on the same subnet. The procedures specified in section 6 always route the traffic between PEs for both intra- and inter-subnet traffic. Hence, traffic with a TTL value of 1 is dropped due to the nature of routing.

This section discusses a few possible ways to support traffic having a TTL value of 1. Implementations MAY support any of the following models.

11.1. Policy based model

Policies may be used to enforce the EVPN BUM procedures for traffic flows with a TTL value of 1. Traffic flows that match the policy are excluded from the seamless interop procedures specified in this document; hence, the TTL decrement issue does not apply.

11.2. Exercising BUM procedure for VLAN/BD

Servers/hosts sending traffic with a TTL value of 1 may be attached to a separate VLAN/BD where multicast routing is disabled. When multicast routing is disabled, the EVPN BUM procedures may be applied to all traffic ingressing on that VLAN/BD. On the egress PE, the RPF for such traffic may be set to the BD interface where the source is attached.

11.3. Intra-subnet bridging

The procedure specified in this section enables a PE to detect an attached subnet source (i.e., a source that is directly attached in the tenant BD/VLAN). By applying the following procedure for the attached source, traffic flows having a TTL value of 1 can be supported:

   - On the ingress PE, bridge the traffic towards the core
     interface.
   - On the egress side, decide whether to bridge or route at the
     outgoing interface (OIF) based on whether or not the source is
     attached to the OIF's BD/VLAN.

Recent ASICs support single-lookup forwarding for bridging and routing (L2+L3). The procedure described here leverages this ASIC capability.

                      PE1
                 +------------+
         S11 +---+ (BD1)      |      +---------+
                 |   \        |      |         |
                 | (IP-VRF)-(CORE)   |         |
                 |   /        |      |         |
         R12 +---+ (BD2)      |      |         |
                 +------------+      |         |
                                     |         |
                      PE2            |  VXLAN  |
                 +------------+      |         |
         R21 +---+ (BD1)      |      |         |
                 |   \        |      |         |
                 | (IP-VRF)-(CORE)   |         |
                 |   /        |      |         |
         R22 +---+ (BD3)      |      +---------+
                 +------------+

                 Figure 3  Intra-subnet bridging

Consider the above figure, in which:

   - PE1 and PE2 are seamless-interop-capable PEs
   - S11 is a multicast host directly attached to PE1 in BD1
   - Source S11 sends traffic to group G11
   - R21 and R22 are IGMP receivers for group G11
   - R21 and R22 are attached to BD1 and BD3, respectively, at PE2

When source S11 starts sending traffic, PE1 learns the source and announces it to the remote PEs using MVPN procedures.

At PE2, the IGMP joins from R21 and R22 result in the creation of a (*, G11) entry with the IRB interfaces of BD1 and BD3 as OIFs. When PE2 learns the source information from PE1, it installs the route (S11, G11) in the tenant VRF with the CORE interface as the RPF.

PE2 inherits the (*, G11) OIFs into the (S11, G11) entry. While inheriting an OIF, PE2 checks whether the source is attached to the OIF's subnet. An OIF matching the source subnet is added with a flag indicating a bridge-only interface (see the sketch at the end of this section). In the case of the (S11, G11) entry, BD1 is added as the bridge-only OIF, while BD3 is added as a normal OIF (L3 OIF).

PE2 sends an MVPN join (S11, G11) towards PE1, since it has local receivers.

At the ingress PE (PE1), the CORE interface is added to the (S11, G11) entry as an OIF (outgoing interface) with a flag indicating a bridge-only interface. With this procedure, the ingress PE (PE1) bridges the traffic onto the CORE interface, retaining the TTL and source MAC. The traffic is encapsulated with the VNI associated with the CORE interface (L3VNI). PE1 also routes the traffic to R12, which is attached to BD2 on the same device.

PE2 decapsulates the traffic from PE1 and does an inner lookup in the tenant VRF associated with the incoming VNI. The lookup in the tenant VRF yields the (S11, G11) entry as the matching entry. The traffic gets bridged on BD1 (PE2 retains the TTL and source MAC) since that OIF is marked as a bridge-only interface, and it gets routed on BD3.
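The following non-normative sketch shows the OIF-inheritance check described above; names and types are illustrative assumptions.

      import ipaddress

      # Sketch: inherit (*,G) OIFs into the (S,G) entry, marking any
      # OIF whose BD/VLAN contains the source as bridge-only (forward
      # with TTL and source MAC preserved); all other OIFs route.
      def inherit_oifs(entry_sg, star_g_oifs, source_ip):
          src = ipaddress.ip_address(source_ip)
          for oif in star_g_oifs:                  # e.g., IRBs of BD1, BD3
              bridge_only = src in ipaddress.ip_network(oif.subnet)
              entry_sg.add_oif(oif, bridge_only=bridge_only)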
12. Interop with L2 EVPN PEs

A gateway device is needed to interoperate between EVPN PEs that support the seamless interop procedures specified in this document and native EVPN PEs (L2EVPN PEs). The gateway device uses the BUM tunnel when interworking with L2EVPN PEs.

The interop procedure will be covered in the next version of this draft.

13. Connecting external Multicast networks or PIM routers

External multicast networks or PIM routers can be attached to any seamless-interop-capable EVPN PE or set of EVPN PEs. A multicast network or PIM router can also be attached to any IRB-enabled BDI interface, any L3-enabled interface, or a set of such interfaces. The fabric can be used as a transit network. All PIM signaling is terminated at the EVPN PEs.

No additional procedures are required when connecting external multicast networks.

14. RP handling

This section describes various RP models for a tenant VRF. The RP model SHOULD be consistent across all EVPN PEs for a given group or group range in the tenant VRF. The behavior of each model is summarized in a non-normative sketch at the end of Section 14.1.

14.1. Various RP deployment options

14.1.1. RP-less mode

An EVPN fabric that does not have any external multicast network or attached MVPN network does not need RP configuration. A configuration option SHALL be provided to the end user to operate the fabric in RP-less mode. When an EVPN PE is operating in RP-less mode, it MUST advertise all attached sources to remote EVPN PEs using the procedure specified in [RFC6514].

In RP-less mode, the (C-*,C-G) RPF may be set to NULL or to a wildcard interface (any interface on the tenant VRF). In RP-less mode, traffic is always forwarded based on (C-S,C-G) state.

14.1.2. Fabric anycast RP

In this model, the anycast GW IP address is configured as the RP on all EVPN PEs. When an EVPN PE is operating in Fabric anycast RP mode, it MUST advertise all sources behind that PE to other EVPN PEs using the procedure specified in [RFC6514]. In this model, sources may be directly attached to tenant BDs, or they may be attached behind a PIM router (in which case the EVPN PE learns the source information from the PIM register terminating at the RP interface on the tenant VRF side).

In RP-less mode and Fabric anycast RP mode, an EVPN PE operates in SPT-only mode per section 14 of [RFC6514].

14.1.3. Static RP

The procedure specified in this document supports configuring the EVPN fabric with a static RP. The RP can be configured on the EVPN PE itself in the tenant VRF, in an external multicast network connected behind an EVPN PE, or in the MVPN network. When the RP is not local to the EVPN PE, the EVPN PE operates in rpt-spt mode per the procedures specified in section 13 of [RFC6514].

14.1.4. Co-existence of Fabric anycast RP and external RP

An external multicast network using its own RP may be connected to an EVPN fabric operating in Fabric anycast RP mode. In this case, a subset of the EVPN PEs may be designated as border leafs. Anycast RP may be configured between the border leafs and the external RP. The border leafs originate SA-AD routes towards the fabric PEs for external sources and act as the FHR for sources inside the fabric. A configuration option may be provided to define a PE's role as BL.
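The following non-normative sketch summarizes the per-model behavior described above; the mode names and structure are illustrative assumptions.

      # Sketch: behavior implied by each RP deployment model.
      RP_MODES = {
          # Advertise all attached sources; SPT-only mode per
          # section 14 of [RFC6514].
          "rp-less":        dict(advertise_all_sources=True,
                                 tree_mode="spt-only"),
          "fabric-anycast": dict(advertise_all_sources=True,
                                 tree_mode="spt-only"),
          # RP not local to the EVPN PE: rpt-spt mode per
          # section 13 of [RFC6514].
          "static-rp":      dict(advertise_all_sources=False,
                                 tree_mode="rpt-spt"),
      }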
14.2. RP configuration options

PIM-Bidir and PIM-SM ASM modes require Rendezvous Point (RP) configuration; the RP acts as a shared root for a multicast shared tree. The RP can be configured statically or by using BSR or Auto-RP procedures on the tenant VRF. This document only discusses static RP configuration. The use of BSR or Auto-RP procedures in the EVPN fabric is beyond the scope of this document.

15. IANA Considerations

IANA is requested to assign new flags in the "Multicast Flags Extended Community Flags" registry for the following:

   o  Seamless interop capable PE

16. Security Considerations

All the security considerations in [RFC7432] apply directly to this document, because this document leverages the [RFC7432] control plane and its associated procedures.

17. Acknowledgements

The authors would like to thank Niloofar Fazlollahi, Aamod Vyavaharkar, Raunak Banthia, and Swadesh Agrawal for their discussions and contributions.

18. References

18.1. Normative References

   [RFC7432]  A. Sajassi, et al., "BGP MPLS Based Ethernet VPN",
              RFC 7432, February 2015.

   [RFC8365]  A. Sajassi, et al., "A Network Virtualization Overlay
              Solution Using Ethernet VPN (EVPN)", RFC 8365,
              March 2018.

   [RFC6513]  E. Rosen, et al., "Multicast in MPLS/BGP IP VPNs",
              RFC 6513, February 2012.

   [RFC6514]  R. Aggarwal, et al., "BGP Encodings and Procedures for
              Multicast in MPLS/BGP IP VPNs", RFC 6514, February 2012.

   [EVPN-IRB] A. Sajassi, et al., "Integrated Routing and Bridging in
              EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03,
              February 2017.

   [EVPN-IRB-MCAST] W. Lin, et al., "EVPN Optimized Inter-Subnet
              Multicast (OISM) Forwarding", draft-lin-bess-evpn-irb-
              mcast-04, October 2017.

18.2. Informative References

   [RFC7080]  A. Sajassi, et al., "Virtual Private LAN Service (VPLS)
              Interoperability with Provider Backbone Bridges",
              RFC 7080, December 2013.

   [RFC7209]  A. Sajassi, et al., "Requirements for Ethernet VPN
              (EVPN)", RFC 7209, May 2014.

   [RFC4389]  D. Thaler, et al., "Neighbor Discovery Proxies (ND
              Proxy)", RFC 4389, April 2006.

   [RFC4761]  K. Kompella, et al., "Virtual Private LAN Service (VPLS)
              Using BGP for Auto-Discovery and Signaling", RFC 4761,
              January 2007.

   [INTERCON-EVPN] J. Rabadan, et al., "Interconnect Solution for EVPN
              Overlay networks", draft-ietf-bess-dci-evpn-overlay-04,
              September 2016.

   [TUNNEL-ENCAP] E. Rosen, et al., "The BGP Tunnel Encapsulation
              Attribute", draft-ietf-idr-tunnel-encaps-06, work in
              progress, June 2017.

   [EVPN-IGMP-PROXY] A. Sajassi, et al., "IGMP and MLD Proxy for EVPN",
              draft-ietf-bess-evpn-igmp-mld-proxy-01, work in
              progress, March 2018.

   [EVPN-PIM-PROXY] J. Rabadan, et al., "PIM Proxy in EVPN Networks",
              draft-skr-bess-evpn-pim-proxy-00, work in progress,
              July 2017.
19. Authors' Addresses

   Ali Sajassi
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: sajassi@cisco.com

   Kesavan Thiruvenkatasamy
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: kethiruv@cisco.com

   Samir Thoria
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: sthoria@cisco.com

   Ashutosh Gupta
   Avi Networks
   Email: ashutosh@avinetworks.com

   Luay Jalil
   Verizon
   Email: luay.jalil@verizon.com

Appendix A. Use Cases

A.1. DCs with only IGMP/MLD hosts w/o tenant router

In an EVPN network consisting of only IGMP/MLD hosts, PEs will receive IGMP (*, G) or (S, G) joins from their locally attached hosts and will originate MVPN C-Multicast route type 6 and 7 NLRIs, respectively. As described in [RFC6514], these NLRIs are directed towards the RP-PE for type 6 or the Source-PE for type 7. In the case of a (*, G) join, a Shared-Path Tree will be built in the core from the RP-PE towards all Receiver-PEs. Once a source starts to send multicast data to the specified multicast group, the PE directly connected to the source will do PIM registration with the RP. Since there are existing receivers for the group, the RP will originate a PIM (S, G) join towards the source. This will be converted to an MVPN type 7 NLRI by the RP-PE. Please note that the RP-PE is the PE configured as RP (e.g., using static configuration or by using BSR or Auto-RP procedures). The detailed working of such protocols is beyond the scope of this document. Upon receiving the type 7 NLRI, the Source-PE will include the MVPN tunnel in its Outgoing Interface List. Furthermore, the Source-PE will follow the procedures in [RFC6514] to originate an MVPN SA-AD route (RT 5) to avoid duplicate traffic and to allow all Receiver-PEs to shift from the Shared-Tree to the Shortest-Path Tree rooted at the Source-PE. Section 13 of [RFC6514] describes this.

However, a network operator can choose to have only Shortest-Path Trees built in the MVPN core, as described in section 14 of [RFC6514]. One way to achieve this is for all PEs to act as RP for their locally connected hosts and thus avoid sending any Shared-Tree Join (MVPN type 6) into the core. In this scenario, no PIM registration is needed, since every PE is a first-hop router as well as an acting RP. Once a source starts to send multicast data, the PE directly connected to it originates a Source Active A-D route (RT 5) to all other PEs in the network. Upon receiving a Source Active A-D route, a PE must cache it in its local database and also look for any matching interest for (*, G), where G is the multicast group described in the received Source Active A-D route. If it finds any such matching entry, it must originate a C-Multicast route (RT 7) in order to start receiving traffic from the Source-PE. This procedure must be repeated on the reception of any further Source Active A-D routes, as sketched below.
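A non-normative sketch of this Source Active A-D handling follows; the names are illustrative assumptions.

      # Sketch: SPT-only operation per section 14 of [RFC6514].
      def on_source_active_ad(pe, source, group):
          pe.sa_cache.add((source, group))       # always cache the RT 5
          if ("*", group) in pe.local_interest:  # matching (*,G) state?
              # Join the SPT directly at the Source-PE (RT 7).
              pe.advertise(route_type=7, key=(source, group))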
A.2. DCs with a mix of IGMP/MLD hosts and multicast routers running PIM-SSM

This scenario has multicast routers which can send PIM SSM (S, G) joins. Upon receiving these joins, and if the source described in the join is learnt to be behind an MVPN peer PE, the local PE will originate a C-Multicast join (RT 7) towards the Source-PE. It is expected that PIM SSM group ranges are kept separate from the ASM range for which IGMP hosts can send (*, G) joins. Hence, both ASM and SSM groups shall operate without any overlap. No RP is needed for SSM-range groups, and the Shortest-Path Tree rooted at the source is built once receiver interest is known.

A.3. DCs with a mix of IGMP/MLD hosts and multicast routers running PIM-ASM

This scenario includes the reception of PIM (*, G) joins on a PE's local AC. These joins are handled similarly to the IGMP (*, G) joins explained in the sections above. Another interesting case that can arise here is when one of the tenant routers acts as the RP for some of the ASM groups. In such a scenario, an Upstream Multicast Hop (UMH) will be elected by the other PEs in order to send C-Multicast routes (RT 6). All procedures described in [RFC6513] with respect to UMH should be used to avoid traffic duplication due to incoherent selection of the RP-PE by different Receiver-PEs.

A.4. DCs with a mix of IGMP/MLD hosts and multicast routers running PIM-Bidir

Creating bidirectional (*, G) trees is useful when a customer wants the least amount of control state in the network. On the downside, all receivers for a particular multicast group receive traffic from all sources sending to that group. For the purposes of this document, all procedures described in [RFC6513] and [RFC6514] apply when PIM-Bidir is used.