BESS WorkGroup                                                A. Sajassi
Internet-Draft                                       K. Thiruvenkatasamy
Intended status: Standards Track                               S. Thoria
Expires: August 20, 2021                                           Cisco
                                                                A. Gupta
                                                                  VMware
                                                                L. Jalil
                                                                 Verizon
                                                       February 16, 2021

     Seamless Multicast Interoperability between EVPN and MVPN PEs
             draft-ietf-bess-evpn-mvpn-seamless-interop-02

Abstract

   The Ethernet Virtual Private Network (EVPN) solution is becoming
   pervasive for Network Virtualization Overlay (NVO) services in data
   center (DC) networks and as the next-generation VPN service in
   service provider (SP) networks.
   As service providers transform the networks in their COs toward
   next-generation data centers with Software Defined Networking (SDN)
   based fabrics and Network Function Virtualization (NFV), they want
   to be able to maintain their offered services, including Multicast
   VPN (MVPN) service, between their existing network and their new
   Service Provider Data Center (SPDC) network seamlessly, without the
   use of gateway devices.  They want to have such seamless
   interoperability between their new SPDCs and their existing
   networks for a) reduced cost, b) optimum forwarding, and c) reduced
   provisioning.  This document describes a unified solution based on
   RFC 6513 and RFC 6514 for seamless interoperability of Multicast
   VPN between EVPN and MVPN PEs.  Furthermore, it describes how the
   proposed solution can be used as a routed multicast solution in
   data centers with only EVPN PEs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 20, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Terminology
   4.  Requirements
     4.1.  Optimum Forwarding
     4.2.  Optimum Replication
     4.3.  All-Active and Single-Active Multi-Homing
     4.4.  Inter-AS Tree Stitching
     4.5.  EVPN Service Interfaces
     4.6.  Distributed Anycast Gateway
     4.7.  Selective & Aggregate Selective Tunnels
     4.8.  Tenants' (S,G) or (*,G) states
     4.9.  Zero Disruption upon BD/Subnet Addition
     4.10. No Changes to Existing EVPN Service Interface Models
     4.11. External source and receivers
     4.12. Tenant RP placement
   5.  IRB Unicast versus IRB Multicast
     5.1.  Emulated Virtual LAN Service
   6.  Solution Overview
     6.1.  Operational Model for EVPN IRB PEs
     6.2.  Unicast Route Advertisements for IP multicast Source
     6.3.  Multi-homing of IP Multicast Source and Receivers
       6.3.1.  Single-Active Multi-Homing
       6.3.2.  All-Active Multi-Homing
     6.4.  Mobility for Tenant's Sources and Receivers
     6.5.  Intra-Subnet BUM Traffic Handling
     6.6.  EVPN and MVPN interworking with gateway model
   7.  Control Plane Operation
     7.1.  Intra-ES/Intra-Subnet IP Multicast Tunnel
     7.2.  Intra-Subnet BUM Tunnel
     7.3.  Inter-Subnet IP Multicast Tunnel
     7.4.  IGMP Hosts as TSes
     7.5.  TS PIM Routers
   8.  Data Plane Operation
     8.1.  Intra-Subnet L2 Switching
     8.2.  Inter-Subnet L3 Routing
   9.  DCs with only EVPN PEs
     9.1.  Setup of overlay multicast delivery
     9.2.  Handling of different encapsulations
       9.2.1.  MPLS Encapsulation
       9.2.2.  VxLAN Encapsulation
       9.2.3.  Other Encapsulations
   10. DCI with MPLS in WAN and VxLAN in DCs
     10.1.  Control plane inter-connect
     10.2.  Data plane inter-connect
   11. Supporting applications with TTL value 1
     11.1.  Policy based model
     11.2.  Exercising BUM procedure for VLAN/BD
     11.3.  Intra-subnet bridging
   12. Interop with L2 EVPN PEs
     12.1.  Interaction with L2 EVPN PEs and seamless-interop-capable
            PEs
     12.2.  Network having L2 EVPN PEs, seamless-interop-capable PEs,
            and MVPN PEs
   13. Connecting external Multicast networks or PIM routers
   14. RP handling
     14.1.  Various RP deployment options
       14.1.1.  RP-less mode
       14.1.2.  Fabric anycast RP
       14.1.3.  Static RP
       14.1.4.  Co-existence of Fabric anycast RP and external RP
     14.2.  RP configuration options
   15. IANA Considerations
   16. Security Considerations
   17. Acknowledgements
   18. References
     18.1.  Normative References
     18.2.  Informative References
   Appendix A.  Use Cases
     A.1.  DCs with only IGMP/MLD hosts w/o tenant router
     A.2.  DCs with a mix of IGMP/MLD hosts & multicast routers
           running PIM-SSM
     A.3.  DCs with a mix of IGMP/MLD hosts & multicast routers
           running PIM-ASM
     A.4.  DCs with a mix of IGMP/MLD hosts & multicast routers
           running PIM-Bidir
   Authors' Addresses

1.  Introduction

   The Ethernet Virtual Private Network (EVPN) solution is becoming
   pervasive for Network Virtualization Overlay (NVO) services in data
   center (DC) networks and as the next-generation VPN service in
   service provider (SP) networks.

   As service providers transform the networks in their COs toward
   next-generation data centers with Software Defined Networking (SDN)
   based fabrics and Network Function Virtualization (NFV), they want
   to be able to maintain their offered services, including Multicast
   VPN (MVPN) service, between their existing network and their new
   SPDC network seamlessly, without the use of gateway devices.  There
   are several reasons for having such seamless interoperability
   between their new DCs and their existing networks:

   - Lower Cost: Gateway devices need very high scalability to handle
   VPN services for their DCs and as such need to handle a large
   number of VPN instances (in the tens or hundreds of thousands) and
   a very large number of routes (e.g., in the tens of millions).  For
   the same speeds and feeds, these high-scale gateway boxes are much
   more expensive than the edge devices (e.g., PEs and TORs) that
   support a much smaller number of routes and VPN instances.

   - Optimum Forwarding: In a given CO, both EVPN PEs and MVPN PEs can
   be connected to the same fabric/network (e.g., the same IGP
   domain).  In such scenarios, service providers want optimum
   forwarding among these PE devices without the use of gateway
   devices, because if gateway devices are used, the IP multicast
   traffic between EVPN and MVPN PEs can no longer be forwarded
   optimally and, in some cases, may even get tromboned.  Furthermore,
   when an SPDC network spans multiple LATAs (multiple geographic
   areas) and gateways are used between EVPN and MVPN PEs, then with
   respect to IP multicast traffic, only one GW can be the designated
   forwarder (DF) between the EVPN and MVPN PEs.  Such scenarios not
   only result in non-optimum forwarding but can also result in
   tromboning of IP multicast traffic between the two LATAs when both
   the source and destination PEs are in the same LATA and the DF
   gateway is elected in a different LATA.

   - Less Provisioning: If gateways are used, the operator needs to
   configure per-tenant information on the gateways.  In other words,
   for each tenant that is configured, one (or maybe two) additional
   touch points are needed.

   This document describes a unified solution based on [RFC6513] and
   [RFC6514] for seamless interoperability of multicast VPN between
   EVPN and MVPN PEs.  Furthermore, it describes how the proposed
   solution can be used as a routed multicast solution in data centers
   with only EVPN PEs (e.g., routed multicast VPN only among EVPN
   PEs).
2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to
   be interpreted as described in [RFC2119] only when they appear in
   all upper case.  They may also appear in lower or mixed case as
   English words, without any normative meaning.

3.  Terminology

   Most of the terminology used in this document comes from [RFC8365].

   Broadcast Domain (BD): In a bridged network, the broadcast domain
   corresponds to a Virtual LAN (VLAN), where a VLAN is typically
   represented by a single VLAN ID (VID) but can be represented by
   several VIDs where Shared VLAN Learning (SVL) is used per [802.1Q].

   Bridge Table (BT): An instantiation of a broadcast domain on a
   MAC-VRF.

   VXLAN: Virtual Extensible LAN

   POD: Point of Delivery

   NV: Network Virtualization

   NVO: Network Virtualization Overlay

   NVE: Network Virtualization Endpoint

   VNI: Virtual Network Identifier (for VXLAN)

   EVPN: Ethernet VPN

   EVI: An EVPN instance spanning the Provider Edge (PE) devices
   participating in that EVPN.

   MAC-VRF: A Virtual Routing and Forwarding table for Media Access
   Control (MAC) addresses on a PE.

   IP-VRF: A Virtual Routing and Forwarding table for Internet
   Protocol (IP) addresses on a PE.

   Ethernet Segment (ES): When a customer site (device or network) is
   connected to one or more PEs via a set of Ethernet links, that set
   of links is referred to as an 'Ethernet segment'.

   Ethernet Segment Identifier (ESI): A unique non-zero identifier
   that identifies an Ethernet segment is called an 'Ethernet Segment
   Identifier'.

   Ethernet Tag: An Ethernet tag identifies a particular broadcast
   domain, e.g., a VLAN.  An EVPN instance consists of one or more
   broadcast domains.

   PE: Provider Edge device.

   Single-Active Redundancy Mode: When only a single PE, among all the
   PEs attached to an Ethernet segment, is allowed to forward traffic
   to/from that Ethernet segment for a given VLAN, the Ethernet
   segment is defined to be operating in Single-Active redundancy
   mode.

   All-Active Redundancy Mode: When all PEs attached to an Ethernet
   segment are allowed to forward known unicast traffic to/from that
   Ethernet segment for a given VLAN, the Ethernet segment is defined
   to be operating in All-Active redundancy mode.

   PIM-SM: Protocol Independent Multicast - Sparse-Mode

   PIM-SSM: Protocol Independent Multicast - Source Specific Multicast

   Bidir PIM: Bidirectional PIM

   FHR: First Hop Router

   LHR: Last Hop Router

   CO: Central Office of a service provider

   SPDC: Service Provider Data Center

   LATA: Local Access and Transport Area

   Border Leafs: A set of EVPN PEs acting as exit points for the EVPN
   fabric.

   L3VNI: A VNI in the tenant VRF that is associated with the
   core-facing interface.

4.  Requirements

   This section describes the requirements specific to providing
   seamless multicast VPN service between MVPN and EVPN capable
   networks.

4.1.  Optimum Forwarding

   The solution SHALL support optimum multicast forwarding between
   EVPN and MVPN PEs within a network.  The network can be confined to
   a CO, or it can span multiple LATAs.  The solution SHALL support
   optimum multicast forwarding with both ingress replication tunnels
   and P2MP tunnels.
4.2.  Optimum Replication

   For EVPN PEs with IRB capability, the solution SHALL use only a
   single multicast tunnel among EVPN and MVPN PEs for IP multicast
   traffic, when both PEs use the same tunnel type.  Multicast tunnels
   can be either ingress replication tunnels or P2MP tunnels.  The
   solution MUST support optimum replication for both intra-subnet and
   inter-subnet IP multicast traffic:

   - Non-IP traffic SHALL be forwarded per the EVPN baseline [RFC7432]
   or [RFC8365].

   - If a multicast VPN spans both intra- and inter-subnet traffic,
   then for ingress replication, regardless of whether the traffic is
   intra- or inter-subnet, only a single copy of the IP multicast
   traffic SHALL be sent from the source PE to the destination PE.

   - If a multicast VPN spans both intra- and inter-subnet traffic,
   then for P2MP tunnels, regardless of whether the traffic is intra-
   or inter-subnet, only a single copy of the multicast data SHALL be
   transmitted by the source PE.  The source PE can be either an EVPN
   or an MVPN PE, and the receiving PEs can be a mix of EVPN and MVPN
   PEs - i.e., a multicast VPN can be spread across both EVPN and MVPN
   PEs.

4.3.  All-Active and Single-Active Multi-Homing

   The solution MUST support multi-homing of source devices and
   receivers that sit in the same subnet (e.g., VLAN) and are
   multi-homed to EVPN PEs.  The solution SHALL allow for both
   Single-Active and All-Active multi-homing.  The solution MUST
   prevent loops during steady and transient states, just like the
   EVPN baseline solutions [RFC7432] and [RFC8365], for all
   multi-homing types.

4.4.  Inter-AS Tree Stitching

   The solution SHALL support multicast tree stitching when the tree
   spans multiple Autonomous Systems.

4.5.  EVPN Service Interfaces

   The solution MUST support all EVPN service interfaces listed in
   section 6 of [RFC7432]:

   o  VLAN-based service interface

   o  VLAN-bundle service interface

   o  VLAN-aware bundle service interface

4.6.  Distributed Anycast Gateway

   The solution SHALL support distributed anycast gateways for tenant
   workloads on NVE devices operating in EVPN-IRB mode.

4.7.  Selective & Aggregate Selective Tunnels

   The solution SHALL support selective and aggregate selective
   P-tunnels as well as inclusive and aggregate inclusive P-tunnels.
   When selective tunnels are used, multicast traffic SHOULD only be
   forwarded to the remote PEs that have receivers - i.e., if there
   are no receivers at a remote PE, the multicast traffic SHOULD NOT
   be forwarded to that PE, and if there are no receivers on any
   remote PEs, the multicast traffic SHOULD NOT be forwarded to the
   core.

4.8.  Tenants' (S,G) or (*,G) states

   The solution SHOULD store (C-S,C-G) and (C-*,C-G) states only on PE
   devices that have interest in such states - i.e., PE devices that
   have sources and/or receivers interested in such multicast groups -
   hence reducing memory and processing requirements.

4.9.  Zero Disruption upon BD/Subnet Addition

   In DC environments, various Bridge Domains are provisioned and
   removed on a regular basis due to host mobility, policy, and tenant
   changes.  Such changes in BD configuration should not affect
   existing flows within the same BD or any other BD in the network.
4.10.  No Changes to Existing EVPN Service Interface Models

   The VLAN-aware bundle service as defined in [RFC7432] typically
   does not require any VLAN ID translation from one tenant site to
   another - i.e., the same set of VLAN IDs are configured
   consistently on all tenant segments.  In such scenarios, the
   EVPN-IRB multicast service MUST maintain the same mode of operation
   and SHALL NOT require any VLAN ID translation.

4.11.  External source and receivers

   The solution SHALL support sources and receivers external to the
   tenant domain - i.e., a multicast source inside the tenant domain
   can have receivers outside the tenant domain and vice versa.

4.12.  Tenant RP placement

   The solution SHALL allow a tenant to have its RP anywhere in the
   network.  The RP can be placed inside the EVPN network, inside the
   MVPN network, or in an external domain.

5.  IRB Unicast versus IRB Multicast

   [I-D.ietf-bess-evpn-inter-subnet-forwarding] describes the
   operation of EVPN PEs in IRB mode for unicast traffic.  The same
   IRB model used for unicast traffic in
   [I-D.ietf-bess-evpn-inter-subnet-forwarding], where an IP-VRF in an
   EVPN PE is attached to one or more bridge tables (BTs) via virtual
   IRB interfaces, is also applicable to multicast traffic.  However,
   there are some noticeable differences between the IRB operation for
   unicast traffic described in
   [I-D.ietf-bess-evpn-inter-subnet-forwarding] and the IRB operation
   for multicast traffic described in this document.  For unicast
   traffic, intra-subnet traffic is bridged within the MAC-VRF
   associated with that subnet (i.e., a lookup based on the MAC-DA is
   performed), whereas inter-subnet traffic is routed in the
   corresponding IP-VRF (i.e., a lookup based on the IP-DA is
   performed).  A given tenant can have one or more IP-VRFs; however,
   without loss of generality, this document assumes one IP-VRF per
   tenant.  In the context of a given tenant's multicast traffic,
   intra-subnet traffic is bridged for non-IP traffic and is
   L2-switched for IP traffic, whereas the tenant's inter-subnet
   multicast traffic is always routed in the corresponding IP-VRF.
   The difference between bridging and L2-switching for multicast
   traffic is that the former uses a MAC-DA lookup to forward the
   multicast traffic, whereas the latter uses an IP-DA lookup, with
   the forwarding states built in the MAC-VRF using IGMP/MLD or PIM
   snooping.

5.1.  Emulated Virtual LAN Service

   EVPN does not provide a Virtual LAN (VLAN) service per [IEEE802.1Q]
   but rather an emulated VLAN service.  This VLAN service emulation
   is not only done for unicast traffic but is also extended to
   intra-subnet multicast traffic as described in
   [I-D.ietf-bess-evpn-igmp-mld-proxy] and
   [I-D.skr-bess-evpn-pim-proxy].  For intra-subnet multicast, an EVPN
   PE builds multicast forwarding states in its bridge table (BT)
   based on snooping of IGMP/MLD and/or PIM messages, and forwarding
   is performed based on the destination IP multicast address of the
   Ethernet frame rather than the destination MAC address, as noted
   above.
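   The three forwarding behaviors above can be summarized with a
   short, non-normative sketch (Python).  All data structures and
   names here are hypothetical illustrations, not part of this
   specification:

   def forward_tenant_mcast(frame, bt, ip_vrf):
       # frame:  {"is_ip": bool, "dst_mac": str, "src_ip": str,
       #          "group": str}
       # bt:     {"mac": {dst_mac: [oif]}, "snoop": {(s, g): [oif]}}
       # ip_vrf: {"mroute": {(s, g): [oif]}}
       if not frame["is_ip"]:
           # Non-IP multicast/broadcast: bridged, i.e., forwarded on a
           # destination MAC address lookup in the MAC-VRF/BT
           # [RFC7432].
           return bt["mac"].get(frame["dst_mac"], [])
       key = (frame["src_ip"], frame["group"])
       # Intra-subnet IP multicast: L2-switched on the (S,G)/(*,G)
       # state built in the BT via IGMP/MLD or PIM snooping and keyed
       # by the destination IP address.
       oifs = list(bt["snoop"].get(key, []))
       # Inter-subnet IP multicast: routed in the tenant's IP-VRF via
       # the IRB interface (TTL decremented, Ethernet header
       # rewritten).
       oifs += ip_vrf["mroute"].get(key, [])
       return oifs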
   In order to enable seamless integration of EVPN and MVPN PEs, this
   document extends the concept of an emulated VLAN service for
   multicast IRB applications such that intra-subnet IP multicast
   traffic can be treated the same as inter-subnet IP multicast
   traffic.  This means that intra-subnet IP multicast traffic
   destined to remote PEs gets routed instead of being L2-switched -
   i.e., the TTL value gets decremented, and the Ethernet header of
   the L2 frame is decapsulated and encapsulated at the ingress and
   egress PEs.  It should be noted that non-IP multicast and L2
   broadcast traffic still gets bridged, and such frames get forwarded
   based on their destination MAC addresses.

6.  Solution Overview

   This section describes a multicast VPN solution based on [RFC6513]
   and [RFC6514] for EVPN PEs operating in IRB mode that want to
   perform seamless interoperability with their MVPN PE counterparts.

6.1.  Operational Model for EVPN IRB PEs

   Without loss of generality, this section assumes that all EVPN PEs
   have IRB capability and operate in IRB mode for both unicast and
   multicast traffic (e.g., all EVPN PEs are homogeneous in terms of
   their capabilities and operational modes).  As will be seen later,
   an EVPN network can consist of a mix of PEs, where some are capable
   of multicast IRB and some are not, and the multicast operation of
   such a heterogeneous EVPN network is an extension of that of a
   homogeneous EVPN network.  Therefore, we start with the multicast
   IRB solution description for the homogeneous EVPN network.

   The EVPN PEs terminate IGMP/MLD messages from tenant host devices
   and PIM messages from tenant routers on their IRB interfaces, thus
   avoiding sending these messages over the MPLS/IP core.  A tenant
   virtual/physical router (e.g., CE) attached to an EVPN PE becomes a
   multicast routing adjacency of that PE.  Furthermore, the PE uses
   the MVPN BGP protocol and procedures per [RFC6513] and [RFC6514].
   With respect to the multicast routing protocol between a tenant's
   virtual/physical router and the PE that it is attached to, any of
   the following PIM protocols is supported per [RFC6513]: PIM-SM with
   Any Source Multicast (ASM) mode, PIM-SM with Source Specific
   Multicast (SSM) mode, and PIM Bidirectional (BIDIR) mode.  Support
   of PIM-DM (Dense Mode) is excluded in this document per [RFC6513].

   The EVPN PEs use the MVPN BGP routes defined in [RFC6514] to convey
   tenant (S,G) or (*,G) states to other MVPN or EVPN PEs and to set
   up overlay trees (inclusive or selective) for a given MVPN
   instance.  The root or a leaf of such an overlay tree is terminated
   on an EVPN or MVPN PE.  Furthermore, this inclusive or selective
   overlay tree is terminated on a single IP-VRF of the EVPN or MVPN
   PE.  In the case of an EVPN PE, these overlay trees never get
   terminated on the MAC-VRFs of that PE.

   Overlay trees are instantiated by underlay provider tunnels
   (P-tunnels) - e.g., P2MP, MP2MP, or unicast tunnels per [RFC6513].
   When several overlay trees are mapped to a single underlay
   P-tunnel, the tunnel is referred to as an aggregate tunnel.
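   The aggregation concept can be illustrated with a small
   non-normative sketch.  The mapping below is hypothetical; per
   [RFC6514], an upstream-assigned MPLS label carried in the PMSI
   Tunnel attribute allows egress PEs to demultiplex the overlay trees
   sharing one P-tunnel:

   # One underlay P-tunnel carrying the overlay trees of two MVPN
   # instances; names and label values are illustrative only.
   aggregate_tunnel = {
       "p_tunnel": "P2MP-LSP-42",
       "overlay_trees": {
           # MVPN instance -> upstream-assigned demux label
           "tenant-red-ipvrf": 1001,
           "tenant-blue-ipvrf": 1002,
       },
   }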
   Figure 1 below depicts a scenario where a tenant's MVPN spans both
   EVPN and MVPN PEs, and all the EVPN PEs have multicast IRB
   capability.  An EVPN PE (with multicast IRB capability) can be
   modeled as an MVPN PE where the virtual IRB interface of the EVPN
   PE (the virtual interface between a BT and the IP-VRF) can be
   considered a routed interface of the MVPN PE.

                EVPN PE1
              +------------+
   Src1 +-----|(MAC-VRF1)  |                     MVPN PE3
   Rcvr1+-----|     \      |    +---------+     +--------+
              |   (IP-VRF) |----|         |-----|(IP-VRF)|--- Rcvr5
              |     /      |    |         |     +--------+
   Rcvr2+-----|(MAC-VRF2)  |    |         |
              +------------+    |         |
                                |  MPLS/  |
                EVPN PE2        |   IP    |
              +------------+    |         |
   Rcvr3+-----|(MAC-VRF1)  |    |         |      MVPN PE4
              |     \      |    |         |     +--------+
              |   (IP-VRF) |----|         |-----|(IP-VRF)|--- Rcvr6
              |     /      |    +---------+     +--------+
   Rcvr4+-----|(MAC-VRF3)  |
              +------------+

           Figure-1: EVPN & MVPN PEs Seamless Interop

   Figure 2 depicts the modeling of EVPN PEs based on MVPN PEs, where
   an EVPN PE can be modeled as a PE that consists of an MVPN PE whose
   routed interfaces (e.g., attachment circuits) are replaced with IRB
   interfaces connecting each IP-VRF of the MVPN PE to a set of BTs.
   Similar to an MVPN PE, where an attachment circuit serves as a
   routed multicast interface for the IP-VRF associated with an MVPN
   instance, an IRB interface serves as a routed multicast interface
   for the IP-VRF associated with the MVPN instance.  Since EVPN PEs
   run MVPN protocols (e.g., [RFC6513] and [RFC6514]), for all
   practical purposes they look just like MVPN PEs to other PE
   devices.  Such modeling of EVPN PEs transforms the multicast VPN
   operation of EVPN PEs to that of MVPN and thus simplifies the
   interoperability between EVPN and MVPN PEs to that of running a
   single unified solution based on MVPN.

                EVPN PE1
              +------------+
   Src1 +-----|(MAC-VRF1)  |
              |     \      |
   Rcvr1+-----| +--------+ |    +---------+     +--------+
              | |MVPN PE1| |----|         |-----|MVPN PE3|--- Rcvr5
              | +--------+ |    |         |     +--------+
              |     /      |    |         |
   Rcvr2+-----|(MAC-VRF2)  |    |         |
              +------------+    |  MPLS/  |
                                |   IP    |
                EVPN PE2        |         |
              +------------+    |         |
   Rcvr3+-----|(MAC-VRF1)  |    |         |
              |     \      |    |         |
              | +--------+ |    |         |     +--------+
              | |MVPN PE2| |----|         |-----|MVPN PE4|--- Rcvr6
              | +--------+ |    |         |     +--------+
              |     /      |    +---------+
   Rcvr4+-----|(MAC-VRF3)  |
              +------------+

           Figure-2: Modeling EVPN PEs as MVPN PEs

   Although modeling an EVPN PE as an MVPN PE conceptually simplifies
   the operation to that of a solution based on MVPN, the following
   operational aspects of EVPN need to be factored in when considering
   seamless integration between EVPN and MVPN PEs:

   o  Unicast route advertisements for IP multicast sources

   o  Multi-homing of IP multicast sources and receivers

   o  Mobility for tenant's sources and receivers

   o  Non-IP multicast traffic handling

6.2.  Unicast Route Advertisements for IP multicast Source

   When an IP multicast source is attached to an EVPN PE, the unicast
   route for that IP multicast source needs to be advertised.  When
   the source is attached to a Single-Active multi-homed ES, the EVPN
   DF PE is the PE that advertises a unicast route corresponding to
   the source IP address with the VRF Route Import extended community,
   which in turn is used as the Route Target for (S,G) join messages
   sent toward the source PE by the remote PEs.  The EVPN PE
   advertises this unicast route using EVPN route type 2 and an IPVPN
   unicast route, along with the VRF Route Import extended community.
   EVPN route type 2 is advertised with the Route Targets
   corresponding to both the IP-VRF and the MAC-VRF/BT, whereas the
   IPVPN unicast route is advertised with the RT corresponding to the
   IP-VRF.  When unicast routes are advertised by MVPN PEs, they are
   advertised as IPVPN unicast routes along with the VRF Route Import
   extended community per [RFC6514].

   When the source is attached to an All-Active multi-homed ES, the PE
   that learns the source advertises the unicast route for that source
   using EVPN route type 2 and an IPVPN unicast route, along with the
   VRF Route Import extended community.  EVPN route type 2 is
   advertised with the Route Targets corresponding to both the IP-VRF
   and the MAC-VRF/BT, whereas the IPVPN unicast route is advertised
   with the RT corresponding to the IP-VRF.  When the other
   multi-homing EVPN PEs for that ES receive this unicast EVPN route,
   they import the route and check whether they have learned the route
   locally for that ES.  If they have, they do nothing; if they have
   not, they add the IP and MAC addresses to their IP-VRF and
   MAC-VRF/BT tables, respectively, with the local interface
   corresponding to that ES as the route adjacency.  Furthermore,
   these PEs advertise an IPVPN unicast route, along with the VRF
   Route Import extended community and the Route Target corresponding
   to the IP-VRF, to the other remote PEs for that MVPN.  Therefore,
   the remote PEs learn the unicast route corresponding to the source
   from all the multi-homing PEs associated with that All-Active
   Ethernet Segment, even though only one of the multi-homing PEs may
   have directly learned the IP address of the source.

   EVPN PEs advertise unicast routes as host routes using EVPN route
   type 2 for sources that are directly attached to a tenant BD that
   has been extended in the EVPN fabric.  An EVPN PE may summarize
   sources (IP networks) behind a router that is attached to the EVPN
   PE, or sources that are connected to a BD that is not extended
   across the EVPN fabric, and advertise those routes with EVPN route
   type 5.  EVPN host routes are advertised as IPVPN host routes to
   MVPN PEs only in the case of seamless interop mode.

   Section 6.6 discusses connecting EVPN and MVPN networks with the
   gateway model.  Section 9 extends the seamless interop procedures
   to EVPN-only fabrics as an IRB solution for multicast.

   EVPN PEs only need to advertise unicast routes using EVPN route
   type 2 or route type 5 and do not need to advertise IPVPN routes
   within an EVPN-only fabric.  No L3VPN provisioning is needed
   between EVPN PEs.

   In the gateway model, an EVPN PE advertises unicast routes as IPVPN
   routes, along with the VRI extended community, for all multicast
   sources attached behind EVPN PEs.  All IPVPN routes SHOULD be
   summarized when advertising to MVPN PEs.
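   The advertisements described in this section can be sketched as
   follows.  This is a non-normative illustration; the route encoding
   below is schematic (Python dictionaries), and all field names are
   hypothetical:

   def vrf_route_import(pe_ip: str, vrf_id: int) -> dict:
       # VRF Route Import EC (RFC 6514): PE address + local VRF id.
       return {"type": "vrf-route-import",
               "global_admin": pe_ip, "local_admin": vrf_id}

   def advertise_source(pe_ip, src_ip, src_mac, mac_vrf_rt, ip_vrf_rt,
                        ip_vrf_id):
       vri = vrf_route_import(pe_ip, ip_vrf_id)
       # EVPN route type 2 (MAC/IP): RTs of both MAC-VRF/BT and
       # IP-VRF.
       evpn_rt2 = {"afi": "l2vpn/evpn", "route_type": 2,
                   "mac": src_mac, "ip": src_ip,
                   "route_targets": [mac_vrf_rt, ip_vrf_rt],
                   "ext_communities": [vri]}
       # IPVPN host route: RT of the IP-VRF only (sent toward MVPN PEs
       # in seamless-interop mode).
       ipvpn = {"afi": "ipv4/mpls-vpn", "prefix": src_ip + "/32",
                "route_targets": [ip_vrf_rt],
                "ext_communities": [vri]}
       return evpn_rt2, ipvpn

   Note how the same VRI appears in both routes, so remote EVPN and
   MVPN PEs resolve the same source to the same PE and IP-VRF when
   originating their C-multicast joins.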
6.3.  Multi-homing of IP Multicast Source and Receivers

   EVPN [RFC7432] has extensive multi-homing capabilities that allow
   TSes to be multi-homed to two or more EVPN PEs in Single-Active or
   All-Active mode.  In Single-Active mode, only one of the
   multi-homing EVPN PEs can receive/transmit traffic for a given
   subnet (a given BD) for that multi-homed Ethernet Segment (ES).  In
   All-Active mode, any of the multi-homing EVPN PEs can
   receive/transmit unicast traffic, but only one of them (the DF PE)
   can send BUM traffic to the multi-homed ES for a given subnet.

   The multi-homing mode (Single-Active versus All-Active) of a TS
   source can impact the MVPN procedures, as described below.

6.3.1.  Single-Active Multi-Homing

   When a TS source resides on an ES that is multi-homed to two or
   more EVPN PEs operating in Single-Active mode, only one of the EVPN
   PEs can be active for the source subnet on that ES.  Therefore,
   only one of the multi-homing PEs learns the unicast route of the TS
   source and advertises it to the other PEs using EVPN and IPVPN
   routes, as described previously.

   A downstream PE that receives a Join/Prune message from a TS
   host/router selects an Upstream Multicast Hop (UMH), which, in the
   case of Single-Active multi-homing, is the upstream PE that
   receives the IP multicast flow.  An IP multicast flow belongs to
   either a source-specific tree (S,G) or a shared tree (*,G).  We use
   the notation (X,G) to refer to either (S,G) or (*,G), where X
   refers to S in the case of (S,G) and to the Rendezvous Point (RP)
   for G in the case of (*,G).  Since the active PE (which is also the
   UMH PE) has advertised the unicast route for X along with the VRF
   Route Import EC, the downstream PEs select the UMH without any
   ambiguity based on the MVPN procedures described in section 5.1 of
   [RFC6513].  Any of the three algorithms described in that section
   works fine.

   The multi-homing PE that receives the IP multicast flow on its
   local AC performs the following tasks:

   - L2-switches the multicast traffic in the BT associated with the
   local AC over which it received the flow, if there are any
   interested receivers for that subnet.

   - L3-routes the multicast traffic to other BTs for other subnets,
   if there are any interested receivers for those subnets.

   - L3-routes the multicast traffic to other PEs per MVPN procedures.

   The multicast traffic can be sent on an Inclusive, Selective, or
   Aggregate-Selective tree.  Regardless of what type of tree is used,
   only a single copy of the multicast traffic is received by the
   downstream PEs, and the multicast traffic is forwarded optimally
   from the upstream PE to the downstream PEs.

6.3.2.  All-Active Multi-Homing

   When a TS source resides on an ES that is multi-homed to two or
   more EVPN PEs operating in All-Active mode, any of the multi-homing
   PEs can learn the TS source's unicast route; however, that PE may
   not be the same PE that receives the IP multicast flow.  Therefore,
   the procedures for Single-Active multi-homing need to be augmented
   for the All-Active scenario, as follows.

   The multi-homing EVPN PE that receives the IP multicast flow on its
   local AC needs to perform the following task in addition to the
   ones listed in the previous section for Single-Active multi-homing:
   L2-switch the multicast traffic to the other multi-homing EVPN PEs
   for that ES via a multicast tunnel, which is called the intra-ES
   tunnel.  There is a dedicated tunnel for this purpose, which is
   different from the inter-subnet overlay tree/tunnel set up by MVPN
   procedures.

   When the multi-homing EVPN PEs receive the IP multicast flow via
   this tunnel, they treat it as if they had received the flow via
   their local ACs and thus perform the tasks mentioned in the
   previous section for Single-Active multi-homing.
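   The combined behavior on the PE that receives the flow locally can
   be sketched as follows (non-normative; all names are illustrative):

   def on_local_ip_mcast(pkt, local_l2_oifs, intra_es_tunnel,
                         all_active_es, route_in_ip_vrf):
       # (1) L2-switch within the source BD to locally interested ACs
       #     (snooping-built state).
       for ac in local_l2_oifs:
           ac.send(pkt)
       # (2) All-Active ES only: relay the flow to the other
       #     multi-homing PEs over the intra-ES tunnel, so that any of
       #     them can serve as UMH toward remote PEs.
       if all_active_es and intra_es_tunnel is not None:
           intra_es_tunnel.send(pkt)
       # (3) Route via the IRB into the tenant IP-VRF: other local BDs
       #     with receivers, local L3 interfaces, and the MVPN
       #     inter-subnet tunnel toward remote EVPN/MVPN PEs.
       route_in_ip_vrf(pkt)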
   The tunnel type for this intra-ES tunnel can be any of the
   supported tunnel types, such as ingress replication, P2MP tunnel,
   BIER, and Assisted Replication; however, given that the vast
   majority of multi-homing ESes are just dual-homing, a simple
   ingress replication tunnel can serve well.  For a given ES, since
   multicast traffic that is locally received by one multi-homing PE
   is sent to the other multi-homing PEs via this intra-ES tunnel,
   there is no need to send the multicast traffic via the MVPN tunnel
   to these multi-homing PEs - i.e., MVPN multicast tunnels are used
   only for remote EVPN and MVPN PEs.  Multicast traffic sent over
   this intra-ES tunnel to the other multi-homing PEs (only one other
   in the case of dual-homing) for a given ES can be forwarded either
   unconditionally (fixed) or on an on-demand basis.  Section 7.1
   covers on-demand forwarding.

   By feeding the IP multicast flow received on one of the EVPN
   multi-homing PEs to the interested EVPN PEs in the same
   multi-homing group, we have essentially enabled all the EVPN PEs in
   the multi-homing group to serve as UMH for that IP multicast flow.
   Each of these UMH PEs advertises a unicast route for X in (X,G)
   along with the VRF Route Import EC to all PEs for that MVPN
   instance.  The downstream PEs build a candidate UMH set based on
   the procedures described in section 5.1 of [RFC6513] and pick a UMH
   from the set.  It should be noted that both the default UMH
   selection procedure, based on the highest UMH PE IP address, and
   the UMH selection algorithm based on the hash function specified in
   section 5.1.3 of [RFC6513] (which is also a MUST-implement
   algorithm) result in the same UMH PE being selected by all
   downstream PEs running the same algorithm.  However, in order to
   allow a form of "equal cost load balancing", the hash algorithm is
   recommended to be used among all EVPN and MVPN PEs.  This hash
   algorithm distributes UMH selection for different IP multicast
   flows among the multi-homing PEs for a given ES.

   Since all downstream PEs (EVPN and MVPN) use the same hash-based
   algorithm for UMH determination, they all choose the same upstream
   PE as their UMH for a given (X,G) flow, and thus they all send
   their (X,G) join messages via BGP to the same upstream PE.  This
   results in one of the multi-homing PEs receiving the join messages
   and thus sending the IP multicast flow for (X,G) over its
   associated overlay tree, even though all of the multi-homing PEs in
   the All-Active redundancy group have received the IP multicast flow
   (one of them directly via its local AC and the rest indirectly via
   the associated intra-ES tunnel).  Therefore, only a single copy of
   the routed IP multicast flow is sent over the network, regardless
   of the overlay tree type supported by the PEs - i.e., the overlay
   tree can be a selective, aggregate selective, or inclusive tree.
   This gives the network operator maximum flexibility to choose any
   overlay tree type that is suitable for its network operation and
   still deliver only a single copy of the IP multicast flows to the
   egress PEs.  In other words, an egress PE receives only a single
   copy of the IP multicast flow over the network, because it receives
   it either via the EVPN intra-ES tunnel or via the MVPN inter-subnet
   tunnel.  Furthermore, if it receives it via the MVPN inter-subnet
   tunnel, only one of the multi-homing PEs associated with the source
   ES sends the IP multicast traffic.
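   Both UMH selection methods referenced above can be sketched as
   follows (non-normative; per our reading of section 5.1 of
   [RFC6513]):

   import ipaddress

   def umh_default(candidates):
       # Default rule: pick the candidate with the highest PE address.
       return max(candidates, key=lambda pe: ipaddress.ip_address(pe))

   def umh_hash(candidates, c_root: str, c_g: str):
       # Section 5.1.3 hash: order candidates by ascending PE address,
       # XOR all bytes of the C-root and C-G addresses, take the
       # result modulo the number of candidates, and pick the PE at
       # that position.
       ordered = sorted(candidates,
                        key=lambda pe: ipaddress.ip_address(pe))
       acc = 0
       for b in (ipaddress.ip_address(c_root).packed +
                 ipaddress.ip_address(c_g).packed):
           acc ^= b
       return ordered[acc % len(ordered)]

   print(umh_hash(["192.0.2.1", "192.0.2.2"],
                  "198.51.100.7", "233.252.0.1"))   # -> '192.0.2.1'

   All downstream PEs evaluating the same (X,G) against the same
   candidate set pick the same UMH, while different flows are spread
   across the multi-homing PEs of the ES.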
   Since the network of interest for seamless interoperability between
   EVPN and MVPN PEs is MPLS, the EVPN handling of BUM traffic for an
   MPLS network needs to be considered.  EVPN [RFC7432] uses the ESI
   MPLS label for split-horizon filtering of Broadcast, Unknown
   unicast, and Multicast (BUM) traffic from an All-Active
   multi-homing Ethernet Segment, to ensure that BUM traffic does not
   get looped back to the same Ethernet Segment that it came from.
   This split-horizon filtering mechanism applies as-is to the
   multicast IRB scenario because the intra-ES tunnel is used among
   the multi-homing PEs.  Since the multicast traffic received from a
   TS source on an All-Active ES by a multi-homing PE is bridged to
   all the other multi-homing PEs in that group, the standard EVPN
   split-horizon filtering described in [RFC7432] applies as-is.
   Split-horizon filtering for non-MPLS encapsulations such as VxLAN
   is described in section 9.2.2, which deals with a DC network that
   consists of only EVPN PEs.

6.4.  Mobility for Tenant's Sources and Receivers

   When a tenant system (TS), source or receiver, is multi-homed
   behind a group of multi-homing EVPN PEs, TS mobility SHALL be
   supported among the EVPN PEs.  Furthermore, such TS mobility SHALL
   only cause a temporary disruption to the related multicast service
   among EVPN and MVPN PEs.  If a source moves from one EVPN PE to
   another, the EVPN mobility procedure SHALL discover this move; a
   new unicast route advertisement (using both EVPN and IP-VPN routes)
   is made by the EVPN PE that the source has moved to, per section
   6.3 above, and a unicast route withdrawal (for both EVPN and IP-VPN
   routes) is performed by the EVPN PE that the source has moved from.

   The move of a source results in disruption of the corresponding
   (S,G) IP multicast flow until the new unicast route associated with
   the source is advertised by the new PE along with the VRF Route
   Import EC, the join messages sent by the egress PEs are received by
   the new PE, the multicast state for that flow is installed in the
   new PE, and a new overlay tree is built for that source from the
   new PE to the egress PEs that are interested in receiving that IP
   multicast flow.

   The move of a receiver results in disruption of the IP multicast
   flow to that receiver only until the new PE for that receiver
   discovers the source and joins the overlay tree for that flow.

6.5.  Intra-Subnet BUM Traffic Handling

   Link-local IP multicast traffic consists of IPv4 traffic with a
   destination address in 224.0.0.0/24 and IPv6 traffic with a
   destination address in FF02::/16.  Such IP multicast traffic, as
   well as non-IP multicast/broadcast traffic, is sent per the EVPN
   [RFC7432] BUM procedures and does not get routed via the IP-VRF.
   So, such BUM traffic is limited to a given EVI/VLAN (e.g., a given
   subnet), whereas routable IP multicast traffic is locally
   L2-switched for local interfaces attached to the same subnet and is
   routed for local interfaces attached to a different subnet or for
   forwarding traffic to other EVPN PEs (refer to section 8 for data
   plane operation).
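   This classification can be expressed as a small non-normative
   sketch:

   from ipaddress import ip_address, ip_network
   from typing import Optional

   LINK_LOCAL_V4 = ip_network("224.0.0.0/24")
   LINK_LOCAL_V6 = ip_network("ff02::/16")

   def is_link_local_mcast(dst: str) -> bool:
       ip = ip_address(dst)
       return ip in (LINK_LOCAL_V4 if ip.version == 4
                     else LINK_LOCAL_V6)

   def tunnel_for(dst_ip: Optional[str]) -> str:
       # Non-IP traffic (dst_ip None) and link-local IP multicast
       # follow EVPN BUM procedures within the BD; all other IP
       # multicast is routed and uses the tenant inter-subnet tunnel.
       if dst_ip is None or is_link_local_mcast(dst_ip):
           return "intra-subnet BUM tunnel (per BD, [RFC7432])"
       return "inter-subnet IP multicast tunnel (per IP-VRF, [RFC6514])"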
6.6.  EVPN and MVPN interworking with gateway model

   The procedures specified in this document offer optimal multicast
   forwarding within a data center and also enable seamless
   interoperability of multicast traffic between EVPN and MVPN
   networks when the same tunnel types are used in the data plane.

   There are a few other use cases for connecting MVPN networks to the
   EVPN fabric, other than the seamless interop model, where a gateway
   model is used to interconnect the two networks:

   Case 1: All EVPN PEs in the fabric can be made MVPN exit points.

   Case 2: An MVPN network can be attached behind one EVPN PE or a
   subset of the EVPN PEs.

   Case 3: An MVPN network (MVPN PEs) that uses a different tunnel
   model can be directly attached to the EVPN fabric.

   In the gateway model, MVPN routes from one domain are terminated at
   the gateway PE and re-originated for the other domain.

   With use cases 1 and 2, all PEs connected to the EVPN fabric can
   use one data plane to send and receive traffic within the
   fabric/data center.  Also, IPVPN routes need not be advertised
   inside the fabric; instead, the PE where the MVPN is terminated
   should advertise the IPVPN routes as EVPN routes.

   With use case 3, the fabric will get two copies per multicast flow
   if receivers exist in both the MVPN and EVPN networks (two
   different data planes are used to send the traffic in the fabric:
   one for the EVPN network and one for the MVPN network).

7.  Control Plane Operation

   In seamless interop between EVPN and MVPN PEs, the control plane
   may need to set up the following three types of multicast tunnels.
   The first two are among EVPN PEs only, but the third one is among
   both EVPN and MVPN PEs.

   1) Intra-ES IP multicast tunnel

   2) Intra-subnet BUM tunnel

   3) Inter-subnet IP multicast tunnel

7.1.  Intra-ES/Intra-Subnet IP Multicast Tunnel

   As described in section 6.3.2, when a multicast source sits behind
   an All-Active ES, an intra-subnet multicast tunnel is needed among
   the multi-homing EVPN PEs for that ES to carry a multicast flow
   received by one of the multi-homing PEs to the other PEs of that
   ES.  We refer to this multicast tunnel as the Intra-ES/Intra-Subnet
   tunnel.  The vast majority of All-Active multi-homing for TOR
   devices in DC networks is just dual-homing, which means the
   multicast flow received by one of the dual-homing PEs only needs to
   be sent to the other dual-homing PE.  Therefore, a simple ingress
   replication tunnel is all that is needed.  In the case of
   multi-homing to three or more EVPN PEs, other tunnel types such as
   P2MP, MP2MP, BIER, and Assisted Replication can be considered.  It
   should be noted that this intra-ES tunnel is only needed for
   All-Active multi-homing; it is not required for Single-Active
   multi-homing.

   The EVPN PEs belonging to a given All-Active ES discover each other
   using the EVPN Ethernet Segment route per the procedures described
   in [RFC7432].  These EVPN PEs perform DF election per [RFC7432],
   [I-D.ietf-bess-evpn-df-election-framework], or other DF election
   algorithms to decide who is the DF for a given BD.  If the BD
   belongs to a tenant that has IRB IP multicast enabled, then in
   fixed mode each PE sets up an intra-ES tunnel to forward IP
   multicast traffic received locally on that BD to the other
   multi-homing PE(s) for that ES.  Therefore, IP multicast traffic
   received via a local attachment circuit is sent on this tunnel, on
   the associated IRB interface for that BT, and on other local
   attachment circuits if there are interested receivers for them.
   The other multi-homing EVPN PEs treat this intra-ES tunnel just
   like their local ACs - i.e., the multicast traffic received over
   this tunnel is treated as if it were received via a local AC.
   Thus, the multi-homing PEs cannot receive the same IP multicast
   flow from an MVPN tunnel (e.g., over an IRB interface for that BD),
   because between a source behind a local AC and a source behind a
   remote PE, the PE always chooses its local AC.
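   For the ingress replication case, deriving the intra-ES flooding
   set is a simple set operation over the received Ethernet Segment
   routes, as the following non-normative sketch shows (illustrative
   names):

   from collections import namedtuple

   ESRoute = namedtuple("ESRoute", ["esi", "originator_ip"])

   def intra_es_peers(my_ip, es_routes, esi):
       # EVPN Ethernet Segment routes (route type 4) identify the PEs
       # attached to an ESI; the intra-ES ingress-replication tunnel
       # floods to every other member of that set.
       return sorted({r.originator_ip for r in es_routes
                      if r.esi == esi} - {my_ip})

   # Dual-homing example: PE1 replicates only to PE2.
   routes = [ESRoute("ESI-1", "192.0.2.1"),
             ESRoute("ESI-1", "192.0.2.2")]
   print(intra_es_peers("192.0.2.1", routes, "ESI-1"))  # ['192.0.2.2']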
   When ingress replication is used for the intra-ES tunnel, every PE
   in the All-Active multi-homing ES has all the information needed to
   set up these tunnels: each PE knows the other multi-homing PEs for
   that ES via the EVPN Ethernet Segment route and can use this
   information to set up the intra-ES/intra-subnet IP multicast tunnel
   among them.

   When all the multi-homing PEs support
   [I-D.ietf-bess-evpn-igmp-mld-proxy], traffic can be forwarded on an
   on-demand basis.  Based on the IGMP synchronization procedure
   specified in [I-D.ietf-bess-evpn-igmp-mld-proxy], join state can be
   synchronized among all the multi-homing PEs.  A multi-homing PE
   that receives multicast traffic from its attachment circuit sends
   the traffic onto the intra-ES tunnel only if it has received an
   IGMP sync message from one of the other multi-homing PEs.

   Special handling is required when an MVPN join is received via the
   inter-subnet tunnel from remote PEs.  The multi-homing PE that is
   selected as UMH, upon receiving a join message from a downstream
   PE, originates an IGMP sync request using the BD corresponding to
   the tenant VRF (L3VNI).  At the receiving end, this route must be
   treated as if an IGMP join had been received for all attached BDs
   in the tenant domain.  Traffic should be forwarded to the remote
   multi-homing PE using the intra-ES tunnel corresponding to the
   source BD.

   If a source exists behind the inter-subnet tunnel, it is possible
   that more than one multi-homing PE sends an MVPN join toward the
   remote PE, based on incoming joins on their local interfaces.  When
   the traffic is received on the inter-subnet tunnel, it is sent
   toward locally attached receivers.  Only the DF sends traffic
   toward the multi-homed Ethernet Segment.  Traffic received on the
   inter-subnet tunnel should not be sent onto the intra-ES tunnel.
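   The intra-ES forwarding rules of this section can be condensed into
   a single predicate (non-normative sketch):

   def send_on_intra_es(arrived_on_inter_subnet_tunnel: bool,
                        fixed_mode: bool,
                        peer_has_synced_join: bool) -> bool:
       # Traffic delivered by the inter-subnet (MVPN) tunnel is never
       # relayed onto the intra-ES tunnel.
       if arrived_on_inter_subnet_tunnel:
           return False
       # Fixed mode always relays locally received flows to the ES
       # peers; on-demand mode relays only when a peer has signaled
       # interest (an IGMP sync route or an MVPN-join-triggered sync
       # request).
       return fixed_mode or peer_has_synced_join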
7.2.  Intra-Subnet BUM Tunnel

   As the name implies, this tunnel is set up to carry BUM traffic for
   a given subnet/BD among EVPN PEs.  In [RFC7432], this overlay
   tunnel is used for transmission of all BUM traffic, including user
   IP multicast traffic.  However, for multicast traffic handling in
   EVPN-IRB PEs, this tunnel is used for all broadcast,
   unknown-unicast, non-IP multicast, and link-local IP multicast
   traffic - i.e., it is used for all BUM traffic except user IP
   multicast traffic.  This tunnel is set up using the IMET route for
   a given EVI/BD.  The composition and advertisement of IMET routes
   are exactly per [RFC7432].  It should be noted that when an EVPN
   All-Active multi-homing PE uses both this tunnel and the intra-ES
   tunnel, there SHALL be no duplication of multicast traffic over the
   network, because the two tunnels carry different types of multicast
   traffic - i.e., the intra-ES tunnel among multi-homing PEs carries
   only user IP multicast traffic, whereas the intra-subnet BUM tunnel
   carries link-local IP multicast traffic and BUM traffic (including
   non-IP multicast).

7.3.  Inter-Subnet IP Multicast Tunnel

   As its name implies, this tunnel is set up to carry IP-only
   multicast traffic for a given tenant across all its subnets (BDs)
   among EVPN and MVPN PEs.

   The following NLRIs from [RFC6514] are used for setting up this
   inter-subnet tunnel in the network:

      The Intra-AS I-PMSI A-D route is used for the setup of the
      default underlay tunnel (also called the inclusive tunnel) for a
      tenant IP-VRF.  The tunnel attributes are indicated using the
      PMSI Tunnel attribute carried with this route.

      The S-PMSI A-D route is used for the setup of customer-flow-
      specific underlay tunnels.  This enables selective delivery of
      data to PEs having active receivers and optimizes fabric
      bandwidth utilization.  The tunnel attributes are indicated
      using the PMSI Tunnel attribute carried with this route.

   Each EVPN PE supporting a specific MVPN instance discovers the set
   of other PEs in its AS that are attached to sites of that MVPN
   using the Intra-AS I-PMSI A-D route (route type 1) per [RFC6514].
   It can also discover the set of other ASes that have PEs attached
   to sites of that MVPN using the Inter-AS I-PMSI A-D route (route
   type 2) per [RFC6514].  After the discovery of the PEs that are
   attached to sites of the MVPN, an inclusive overlay tree (I-PMSI)
   can be set up for carrying tenant multicast flows for that MVPN;
   however, this is not a requirement per [RFC6514], and it is
   possible to adopt a policy in which all tenant flows are carried on
   S-PMSIs.

   An EVPN-IRB PE sends a user IP multicast flow to other EVPN and
   MVPN PEs over this inter-subnet tunnel, which is instantiated using
   an MVPN I-PMSI or S-PMSI.  This tunnel can be considered as being
   originated and terminated among the IP-VRFs of the EVPN/MVPN PEs,
   whereas the intra-subnet tunnel is originated and terminated among
   the MAC-VRFs of the EVPN PEs.

7.4.  IGMP Hosts as TSes

   If a tenant system that is an IGMP host is multi-homed to two or
   more EVPN PEs using All-Active multi-homing, IGMP join and leave
   messages are synchronized between these EVPN PEs using the EVPN
   IGMP Join Synch route (route type 7) and the EVPN IGMP Leave Synch
   route (route type 8) per [I-D.ietf-bess-evpn-igmp-mld-proxy].  IGMP
   states are built in the corresponding BDs of the multi-homing EVPN
   PEs.  In [I-D.ietf-bess-evpn-igmp-mld-proxy], the DF PE for that BD
   originates an EVPN Selective Multicast Ethernet Tag (SMET) route to
   the other EVPN PEs.  However, here there is no need to use SMET
   routes, because the IGMP messages are terminated by the EVPN-IRB
   PE, and tenant (*,G) or (S,G) join messages are sent via the MVPN
   Shared Tree Join route (route type 6) or Source Tree Join route
   (route type 7), respectively, of the MCAST-VPN NLRI per [RFC6514].
   In the case of a network with only IGMP hosts, the preferred mode
   of operation is that of the Shortest Path Tree (SPT) per section 14
   of [RFC6514].  This mode is only supported for PIM-SM and avoids
   the RP configuration overhead.  Such a mode is chosen by
   provisioning/configuration.
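   The translation described above can be sketched as follows
   (non-normative; the route encoding is schematic, and the types and
   fields follow our reading of [RFC6514]):

   from dataclasses import dataclass
   from typing import Optional

   @dataclass
   class Upstream:
       rd: str    # RD of the upstream PE's IP-VRF
       vri: str   # VRF Route Import EC that PE advertised (our RT)

   def igmp_to_mcast_vpn(group: str, source: Optional[str],
                         upstream: Upstream,
                         c_rp: Optional[str] = None) -> dict:
       if source is not None:
           # IGMPv3 (S,G) report -> Source Tree Join, MCAST-VPN route
           # type 7, targeted at the UMH PE selected for S.
           return {"type": 7, "rd": upstream.rd, "c_s": source,
                   "c_g": group, "route_target": upstream.vri}
       # (*,G) report -> Shared Tree Join, MCAST-VPN route type 6,
       # targeted at the upstream PE toward the C-RP for this group.
       return {"type": 6, "rd": upstream.rd, "c_rp": c_rp,
               "c_g": group, "route_target": upstream.vri}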
7.5.  TS PIM Routers

   Just like an MVPN PE, an EVPN PE runs a separate tenant multicast
   routing instance (VPN-specific) per MVPN instance, and the
   following tenant multicast routing instances are supported:

   - PIM Sparse Mode (PIM-SM) with the ASM service model

   - PIM Sparse Mode with the SSM service model

   - PIM Bidirectional Mode (BIDIR-PIM), which uses bidirectional
   tenant-trees to support the ASM service model

   A given tenant's PIM join messages for (*,G) or (S,G) are processed
   by the corresponding tenant multicast routing protocol, and they
   are advertised over the MPLS/IP network using the Shared Tree Join
   route (route type 6) and the Source Tree Join route (route type 7),
   respectively, of the MCAST-VPN NLRI per [RFC6514].

8.  Data Plane Operation

   When an EVPN-IRB PE receives an IGMP/MLD join message over one of
   its Attachment Circuits (ACs), it adds that AC to its Layer-2 (L2)
   OIF list.  This L2 OIF list is associated with the MAC-VRF/BT
   corresponding to the subnet of the tenant device that sent the
   IGMP/MLD join.  Therefore, tenant (S,G) or (*,G) forwarding entries
   are created/updated for the corresponding MAC-VRF/BT based on these
   source and group IP addresses.  Furthermore, the IGMP/MLD join
   message is propagated over the corresponding IRB interface, and it
   is processed by the tenant multicast routing instance, which
   creates the corresponding tenant (S,G) or (*,G) Layer-3 (L3)
   forwarding entries and adds this IRB interface to the L3 OIF list.
   An IRB interface is removed as an L3 OIF when all the L2 tenant
   (S,G) or (*,G) forwarding state is removed for the MAC-VRF/BT
   associated with that IRB interface.  Furthermore, a tenant (S,G) or
   (*,G) L3 forwarding state is removed when all of its L3 OIFs are
   removed - i.e., when all the IRB and L3 interfaces associated with
   that tenant (S,G) or (*,G) are removed.

   When an EVPN PE receives IP multicast traffic from one of its ACs,
   and it has attached receivers for that subnet, it performs L2
   switching of the intra-subnet traffic within the BT attached to
   that AC.  If the multicast flow is received over an AC that belongs
   to an All-Active ES, the multicast flow is also sent over the
   intra-ES/intra-subnet tunnel among the multi-homing PEs.  The EVPN
   PE then sends the multicast traffic over the corresponding IRB
   interface.  The multicast traffic then gets routed in the
   corresponding IP-VRF, and it gets forwarded to the interfaces in
   the L3 OIF list, which can include other IRB interfaces, other L3
   interfaces directly connected to TSes, and the MVPN inter-subnet
   tunnel, which is instantiated by an I-PMSI or S-PMSI tunnel.  When
   the multicast packet is routed within the IP-VRF of the EVPN PE,
   its Ethernet header is stripped and its TTL gets decremented as the
   result of this IP routing.  When the multicast traffic is received
   on an IRB interface by the BT corresponding to that interface, it
   gets L2-switched and sent over the ACs that belong to the L2 OIF
   list.
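   The OIF bookkeeping described above can be sketched as follows.
   This non-normative sketch simplifies the removal rule by tracking
   L2 state per (S,G); the text above removes an IRB interface as an
   L3 OIF only once all L2 state for its BT is gone:

   class TenantMcastState:
       """Illustrative per-tenant state; not an implementation."""
       def __init__(self):
           self.l2_oifs = {}   # (bd, s, g) -> set of local ACs
           self.l3_oifs = {}   # (s, g)     -> set of IRB/L3 interfaces

       def igmp_join(self, bd, irb, ac, s, g):
           # Join on an AC: add the AC to the BT's L2 OIF list and,
           # via the IRB, add the IRB interface as an L3 OIF.
           self.l2_oifs.setdefault((bd, s, g), set()).add(ac)
           self.l3_oifs.setdefault((s, g), set()).add(irb)

       def igmp_leave(self, bd, irb, ac, s, g):
           l2 = self.l2_oifs.get((bd, s, g), set())
           l2.discard(ac)
           if not l2:
               # No L2 state left in this BT for (s, g): the IRB stops
               # being an L3 OIF, and the L3 entry disappears when its
               # last OIF is removed.
               self.l3_oifs.get((s, g), set()).discard(irb)
               if not self.l3_oifs.get((s, g)):
                   self.l3_oifs.pop((s, g), None)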
8.1. Intra-Subnet L2 Switching

Rcvr1 in Figure 1 is connected to PE1 in MAC-VRF1 (the same MAC-VRF as Src1) and sends an IGMP join for (C-S, C-G); IGMP snooping records this state in a local bridging entry. A routing entry is formed as well, which points to MAC-VRF1 as the RPF interface for Src1. We assume that Src1 is known via ARP or similar procedures. Rcvr1 will get a locally bridged copy of the multicast traffic from Src1. Rcvr3 is also connected in MAC-VRF1 but to PE2, and hence it sends an IGMP join which is recorded at PE2. PE2 also forms a routing entry, with the RPF assumed to be the tenant tunnel "Tenant1" formed beforehand using MVPN procedures. This also causes the multicast control plane to initiate a BGP MCAST-VPN type 7 route, which includes the VRI for PE1 and hence is accepted on PE1. PE1 will include the Tenant1 tunnel as an Outgoing Interface (OIF) in its routing entry. Now, since PE1 has knowledge of remote receivers via the MVPN control plane, it encapsulates the original multicast traffic in the Tenant1 tunnel towards the core.

8.2. Inter-Subnet L3 Routing

Rcvr2 in Figure 1 is connected to PE1 in MAC-VRF2, and hence PE1 records its membership in MAC-VRF2. Since MAC-VRF2 is enabled with IRB, it gets added as another OIF to the routing entry formed for (C-S, C-G). Rcvr2 and Rcvr4 are in different MAC-VRFs than the multicast speaker Src1 and hence need inter-subnet forwarding. PE2 forms local bridging entries in MAC-VRF1 and MAC-VRF2 due to the IGMP joins received from Rcvr3 and Rcvr4, respectively. PE2 now adds another OIF, MAC-VRF2, to its existing routing entry. There is no change in control plane state, since PE2 has already sent the MVPN route and no further signaling is required. Also, since Src1 is not part of the MAC-VRF2 subnet, MAC-VRF2 is treated as a routing OIF, and hence the MAC header gets modified as per normal routing procedures. PE3 forms a routing entry very similar to PE2's. It should be noted that PE3 does not have MAC-VRF1 configured locally but can still receive the multicast data traffic over the Tenant1 tunnel formed by the MVPN procedures.

9. DCs with only EVPN PEs

As mentioned earlier, the proposed solution can be used as a routed multicast solution in data center networks with only EVPN PEs (e.g., routed multicast VPN only among EVPN PEs). It should be noted that the scope of intra-subnet forwarding for the solution described in this document is limited to a single EVPN PE for Single-Active multi-homing and to the multi-homing PEs for All-Active multi-homing. In other words, the IP multicast traffic that needs to be forwarded from the source PE to remote PEs is routed to the remote PEs regardless of whether the traffic is intra-subnet or inter-subnet. As a result, the TTL value of intra-subnet traffic that spans two or more PEs gets decremented.

However, if there are applications that require intra-subnet multicast traffic to be L2 forwarded, the procedures discussed in Section 11 may be used to support applications whose traffic has a TTL value of 1.

9.1. Setup of overlay multicast delivery

It must be emphasized that this solution poses no restriction on the setup of the tenant BDs: neither the source PE nor the receiver PEs need to know/learn about the BD configuration on other PEs in the MVPN. The Reverse Path Forwarding (RPF) interface is selected per the tenant multicast source and the IP-VRF, in compliance with the procedures in [RFC6514], using the incoming EVPN route type 2 or 5 NLRI per [RFC7432].
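As a non-normative illustration, the following Python sketch shows an upstream-PE (RPF) selection of this kind: a longest-prefix match over the EVPN route type 2/5 routes imported into the IP-VRF, returning the advertising PE and the route's VRF Route Import community (introduced below). The data model and field names are assumptions made for illustration.

   # Non-normative sketch (assumed data model): selecting the upstream
   # PE for a tenant multicast source via a longest-prefix match over
   # the EVPN route type 2/5 routes imported into the IP-VRF.

   import ipaddress

   def select_upstream(ip_vrf_routes, c_source):
       """Return (upstream_pe, vri) for the best route covering C-S."""
       best, best_len = None, -1
       for route in ip_vrf_routes:      # imported EVPN RT-2/RT-5 routes
           prefix = ipaddress.ip_network(route["prefix"])
           if (ipaddress.ip_address(c_source) in prefix
                   and prefix.prefixlen > best_len):
               best, best_len = route, prefix.prefixlen
       if best is None:
           return None
       # The route's VRI (see below) is later attached as an export
       # route-target on the C-multicast join, so that only the
       # upstream PE imports the join.
       return best["nexthop_pe"], best["vri"]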
The VRF Route Import (VRI) extended community that is carried with the IP-VPN routes in [RFC6514] MUST be carried with the EVPN unicast routes when these routes are used. The construction and processing of the VRI are consistent with [RFC6514]. The VRI MUST uniquely identify the PE that is advertising a multicast source and the IP-VRF in which the source resides.

The VRI is constructed as follows:

   - The 4-octet Global Administrator field MUST be set to an IP address of the PE. This address SHOULD be common for all the IP-VRFs on the PE (e.g., this address may be the PE's loopback address or VTEP address).

   - The 2-octet Local Administrator field associated with a given IP-VRF contains a number that uniquely identifies that IP-VRF within the PE that contains the IP-VRF.

An EVPN PE MUST have a Route Target extended community to import/export MVPN routes. In a data center environment, it is desirable to have this RT auto-generated rather than statically configured. The following is one recommended model to auto-generate the MVPN RT; a sketch of both the VRI and RT constructions follows the list:

   - The Global Administrator field of the MVPN RT MAY be set to the BGP AS number.

   - The Local Administrator field of the MVPN RT MAY be set to the VNI associated with the tenant VRF.
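The following Python fragment is an informal sketch of the two constructions above, packed as 8-octet BGP extended communities. The type/sub-type constants shown (0x01/0x0b for the transitive IPv4-address-specific VRF Route Import, 0x00/0x02 for the transitive 2-octet-AS-specific Route Target) are illustrative of common encodings and are not specified by this document.

   # Informal sketch: packing the VRI and the auto-generated MVPN RT
   # as 8-octet BGP extended communities.

   import socket
   import struct

   def build_vri(pe_ip, ip_vrf_id):
       # Global Administrator: 4-octet PE address (loopback/VTEP);
       # Local Administrator: 2-octet id unique to the IP-VRF on the PE.
       return struct.pack("!BB4sH", 0x01, 0x0b,
                          socket.inet_aton(pe_ip), ip_vrf_id)

   def auto_mvpn_rt(bgp_as, vni):
       # Global Administrator: BGP AS number;
       # Local Administrator: VNI of the tenant VRF (fits in 4 octets).
       return struct.pack("!BBHI", 0x00, 0x02, bgp_as, vni)

   # Example: VRI for IP-VRF 10 on PE 192.0.2.1; RT for AS 65001/VNI 5000
   vri = build_vri("192.0.2.1", 10)
   rt = auto_mvpn_rt(65001, 5000)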
Every PE that detects a local receiver via a local IGMP join or a local PIM join for a specific source (overlay SSM mode) MUST terminate the IGMP/PIM signaling at the IP-VRF and generate a (C-S,C-G) join via the BGP MCAST-VPN route type 7 per [RFC6514] if and only if the RPF for the source points to the fabric. If the RPF points to a local multicast source on the same MAC-VRF or a different MAC-VRF on that PE, the MCAST-VPN route MUST NOT be advertised, and the data traffic will be locally routed/bridged to the receiver as detailed in section 6.2.

The VRI received with the EVPN route type 2 or 5 NLRI from the source PE will be appended as an export route-target extended community. More details about the handling of various types of local receivers are in section 10. The PE that has advertised the unicast route with the VRI will import the incoming MCAST-VPN NLRI into the IP-VRF with the same import route-target extended community, and other PEs SHOULD ignore it. Following this procedure, the source PE learns about the existence of at least one remote receiver in the tenant overlay and programs its data plane accordingly, so that a single copy of the multicast data is forwarded into the fabric using the tenant VRF tunnel.

If the multicast source is unknown (overlay ASM mode), the MCAST-VPN route type 6 (C-*,C-G) join SHOULD be targeted towards the designated overlay Rendezvous Point (RP) by appending the received RP VRI as an export route-target extended community. Every PE that detects a local source registers with its RP PE. That is how the RP learns about the tenant source(s) and group(s) within the MVPN. Once the overlay RP PE receives either the first remote (C-RP,C-G) join or a local IGMP/PIM join, it triggers an MCAST-VPN route type 7 (C-S,C-G) join towards the actual source PE for which it has received a PIM register message, in full compliance with regular PIM procedures. This causes the source PE to advertise the MCAST-VPN Source Active A-D route (MCAST-VPN route type 5) towards all PEs. The Source Active A-D route is used to inform all PEs in a given MVPN about the active multicast source for switching from the RPT to the SPT when MVPNs use tenant RP-shared trees (i.e., trees rooted at the tenant's RP) per section 13 of [RFC6514]. This is done in order to choose a single forwarder PE and to suppress receiving duplicate traffic. In such scenarios, the active multicast source is used by the receiver PEs to join the SPT if they have not received tenant (S,G) joins, and by the RPT PEs to prune off the tenant (S,G) state from the RPT. The Source Active A-D route is also used in MVPN scenarios without tenant RP-shared trees. In such scenarios, the receiver PEs with tenant (*,G) states use the Source Active A-D route to know which upstream PEs, with sources behind them, to join per section 14 of [RFC6514] - i.e., to suppress joining the overlay shared tree.

9.2. Handling of different encapsulations

Just as in [RFC6514], the MVPN I-PMSI and S-PMSI A-D routes are used to form the overlay multicast tunnels and to signal the tunnel type using the P-Multicast Service Interface Tunnel (PMSI Tunnel) attribute.

9.2.1. MPLS Encapsulation

[RFC6514] assumes an MPLS/IP core, and there is no modification to its signaling procedures and encodings for PMSI tunnel formation. Also, no gateway is needed to inter-operate with non-EVPN PEs supporting [RFC6514]-based MVPN over IP/MPLS.

9.2.2. VXLAN Encapsulation

In order to signal VXLAN, the corresponding BGP encapsulation extended community [I-D.ietf-idr-tunnel-encaps] SHOULD be appended to the MVPN I-PMSI and S-PMSI A-D routes. The MPLS Label field in the PMSI Tunnel attribute MUST carry the Virtual Network Identifier (VNI) associated with the customer MVPN. The supported PMSI tunnel types with VXLAN encapsulation are: PIM-SSM Tree, PIM-SM Tree, BIDIR-PIM Tree, and Ingress Replication [RFC6514]. Further details are in [RFC8365].

In this case, a gateway is needed for inter-operation between the EVPN PEs and non-EVPN MVPN PEs. The gateway re-originates the control plane signaling with the relevant tunnel encapsulation on either side. In the data plane, the gateway terminates the tunnels formed on either side and performs the relevant stitching/re-encapsulation of data packets.

9.2.3. Other Encapsulation

In order to signal a different tunneling encapsulation such as NVGRE, VXLAN-GPE, or GENEVE, the corresponding BGP encapsulation extended community [I-D.ietf-idr-tunnel-encaps] SHOULD be appended to the MVPN I-PMSI and S-PMSI A-D routes. If the Tunnel Type field in the encapsulation extended community is set to a type that requires a Virtual Network Identifier (VNI), e.g., VXLAN-GPE or NVGRE [I-D.ietf-idr-tunnel-encaps], then the MPLS Label field in the PMSI Tunnel attribute MUST carry the VNI associated with the customer MVPN. As in the VXLAN case, a gateway is needed for inter-operation between the EVPN-IRB PEs and non-EVPN MVPN PEs.
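As an informal illustration, the sketch below encodes a PMSI Tunnel attribute with the VNI carried in the 3-octet MPLS Label field, as described above. The field layout (Flags, Tunnel Type, MPLS Label, Tunnel Identifier) follows [RFC6514]; the assumption that the 24-bit VNI occupies the whole label field is illustrative, and the tunnel-type constant is shown only as an example (type 6 is Ingress Replication in [RFC6514]).

   # Informal sketch: PMSI Tunnel attribute with the VNI in the
   # MPLS Label field, as used for VXLAN per this section and [RFC8365].

   import socket
   import struct

   def pmsi_tunnel_attr(tunnel_type, vni, tunnel_id, flags=0):
       # Flags (1 octet) | Tunnel Type (1) | MPLS Label (3) | Tunnel Id
       label_field = struct.pack("!I", vni & 0xFFFFFF)[1:]  # 3 octets
       return (struct.pack("!BB", flags, tunnel_type)
               + label_field + tunnel_id)

   # Ingress Replication to VTEP 192.0.2.1 with tenant VNI 5000:
   attr = pmsi_tunnel_attr(6, 5000, socket.inet_aton("192.0.2.1"))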
10. DCI with MPLS in the WAN and VXLAN in DCs

This section describes the inter-operation between MVPN PEs in a WAN using MPLS encapsulation and EVPN PEs in a DC network using VXLAN encapsulation. Since the tunnel encapsulations of these networks are different, there must be at least one gateway in between; usually, two or more are required for redundancy and load-balancing purposes. In such scenarios, a DC network can be represented as a customer network that is multi-homed to two or more MVPN PEs via L3 interfaces, and thus standard MVPN multi-homing procedures are applicable here. It should be noted that an MVPN overlay tunnel over the DC network is terminated on the IP-VRF of the gateway and not on the MAC-VRF/BTs. Therefore, the considerations for loop prevention and split-horizon filtering described in [I-D.ietf-bess-dci-evpn-overlay] are not applicable here. Some aspects of the multi-homing between VXLAN DC networks and the MPLS WAN are in common with [I-D.ietf-bess-dci-evpn-overlay].

10.1. Control plane inter-connect

The gateway(s) MUST be set up with the inclusive set of all the IP-VRFs that span the two domains. On each gateway, there will be at least two BGP sessions: one towards the DC side and the other towards the WAN side. Usually, for redundancy purposes, more sessions are set up on each side. The unicast route propagation follows the exact same procedures as in [I-D.ietf-bess-dci-evpn-overlay]. Hence, a multicast host located in either domain is advertised to the other domain with the gateway IP address as the next-hop. As a result, PEs view the hosts in the other domain as directly attached to the gateway, and all inter-domain multicast signaling is directed towards the gateway(s). Received MVPN route types 1-7 from either side of the gateway(s) MUST NOT be reflected back to the same side, but are processed locally and re-advertised (if needed) to the other side:

o Intra-AS I-PMSI A-D Route: these are distributed within each domain to form the overlay tunnels, which terminate at the gateway(s). They are not passed to the other side of the gateway(s).

o C-Multicast Route: joins are imported into the corresponding IP-VRF on each gateway and advertised as a new route to the other side with the following modifications (the rest of the NLRI fields and path attributes remain untouched):

   * The Route Distinguisher is set to that of the IP-VRF.

   * The Route Target is set to the exported route-target list of the IP-VRF.

   * The PMSI Tunnel attribute and the BGP encapsulation extended community are modified according to section 8.

   * The next-hop is set to the IP address that represents the gateway in either domain.

o Source Active A-D Route: same as joins.

o S-PMSI A-D Route: these are passed to the other side to form selective PMSI tunnels per every (C-S,C-G) from the gateway to the PEs in the other domain, provided that domain contains receivers for the given (C-S,C-G). The same modifications made to joins are made to the newly originated S-PMSI; in addition, the Originating Router's IP Address is set to the GW's IP address. (A sketch of this re-origination appears at the end of this section.)

Multicast signaling from/to hosts on local ACs on the gateway(s) is generated and propagated in both domains (if needed) per the procedures in section 7 of this document and in [RFC6514], with no change. It must be noted that for a locally attached source, the gateway will program in its forwarding plane an OIF for every domain from which it receives a remote join, and a different encapsulation will be used on the data packets of each domain.
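The re-origination listed above can be summarized with a short non-normative sketch; routes are modeled as plain dictionaries, and all field names are assumptions made for illustration.

   # Minimal sketch (assumed data model) of a gateway re-originating a
   # C-multicast join to the other domain, per the modifications above.

   def reoriginate_cmcast(route, ip_vrf, gw_ip, other_side_encap):
       new = dict(route)                       # other fields untouched
       new["rd"] = ip_vrf["rd"]                # RD of the IP-VRF
       new["rt"] = list(ip_vrf["export_rts"])  # exported RT list
       new["encap"] = other_side_encap         # e.g. "mpls" or "vxlan"
       new["nexthop"] = gw_ip                  # gateway in that domain
       return new

   # A join learned from the DC (VXLAN) side, re-advertised to the WAN:
   join = {"type": 7, "c_s": "198.51.100.9", "c_g": "233.252.0.1",
           "rd": "dc-rd", "rt": ["dc-rt"], "nexthop": "dc-gw"}
   vrf = {"rd": "65001:10", "export_rts": ["65001:10"]}
   wan_join = reoriginate_cmcast(join, vrf, gw_ip="203.0.113.1",
                                 other_side_encap="mpls")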
10.2. Data plane inter-connect

The traffic forwarding procedures on the gateways are the same as those described for PEs in sections 5 and 6, except that, unlike a non-border-leaf PE, the gateway will not only route the incoming traffic from one side to its local receivers, but will also send it to the remote receivers in the other domain after decapsulation and appending the right encapsulation. The OIFs and IIF are programmed in the FIB based on the joins received from either side and the RPF calculation towards the source or RP. The decapsulation and encapsulation actions are programmed based on the I-PMSI or S-PMSI A-D routes received from either side. If there is more than one gateway between two domains, the multi-homing procedures described in the following section must be considered so that incoming traffic from one side is not looped back to the other gateway.

The multicast traffic from local sources on each gateway flows to the other gateway with the preferred WAN encapsulation.

11. Supporting applications with TTL value 1

It is possible that some deployments have a host on the tenant domain that sends multicast traffic with a TTL value of 1, while the interested receivers for that traffic flow are attached to different PEs on the same subnet. The procedures specified in section 6 always route the traffic between PEs for both intra- and inter-subnet traffic; hence, traffic with a TTL value of 1 is dropped as a consequence of routing.

This section discusses a few possible ways to support traffic having a TTL value of 1. An implementation MAY support any of the following models.

11.1. Policy based model

Policies may be used to enforce the EVPN BUM procedure for traffic flows with a TTL value of 1. A traffic flow that matches the policy is excluded from the seamless interop procedures specified in this document; hence, the TTL decrement issue does not apply.

11.2. Exercising the BUM procedure for a VLAN/BD

Servers/hosts sending traffic with a TTL value of 1 may be attached to a separate VLAN/BD where multicast routing is disabled. When multicast routing is disabled, the EVPN BUM procedure may be applied to all traffic ingressing on that VLAN/BD. On the egress PE, the RPF for such traffic may be set to the BD interface where the source is attached.

11.3. Intra-subnet bridging

The procedure specified in this section enables a PE to detect an attached subnet source (i.e., a source that is directly attached in the tenant BD/VLAN). By applying the following procedure for the attached source, traffic flows having a TTL value of 1 can be supported.

   - On the ingress PE, bridge the traffic on the interface towards the core.

   - On the egress PE, decide whether to bridge or route at each outgoing interface (OIF) based on whether the source is attached to the OIF's BD/VLAN or not.

Recent ASICs support single-lookup forwarding for bridging and routing (L2+L3). The procedure described here leverages this ASIC capability.

                 PE1
          +------------+
   S11+---+(BD1)       |          +---------+
          |      \     |          |         |
          |  (IP-VRF)-(CORE)------|         |
          |      /     |          |         |
   R12+---+(BD2)       |          |         |
          +------------+          |         |
                                  |         |
                 PE2              |  VXLAN  |
          +------------+          |         |
   R21+---+(BD1)       |          |         |
          |      \     |          |         |
          |  (IP-VRF)-(CORE)------|         |
          |      /     |          |         |
   R22+---+(BD3)       |          +---------+
          +------------+

              Figure 3  Intra-subnet bridging
Consider Figure 3 above, in which:

   - PE1 and PE2 are seamless-interop-capable PEs;
   - S11 is a multicast source directly attached to PE1 in BD1;
   - Source S11 sends traffic to group G11;
   - R21 and R22 are IGMP receivers for group G11;
   - R21 and R22 are attached to BD1 and BD3, respectively, at PE2.

When source S11 starts sending traffic, PE1 learns the source and announces it to the remote PEs using the MVPN procedures.

At PE2, the IGMP joins from R21 and R22 result in the creation of a (*,G11) entry with the IRB interfaces of BD1 and BD3 as outgoing interfaces (OIFs). When PE2 learns the source information from PE1, it installs the (S11,G11) route in the tenant VRF with the CORE interface as the RPF interface.

PE2 inherits the (*,G11) OIFs into the (S11,G11) entry. While inheriting each OIF, PE2 checks whether the source is attached to the OIF's subnet; an OIF matching the source subnet is added with a flag indicating a bridge-only interface. In the case of the (S11,G11) entry, BD1 is added as a bridge-only OIF, while BD3 is added as a normal (L3) OIF. PE2 sends an MVPN (S11,G11) join towards PE1, since it has local receivers.

At the ingress PE (PE1), the CORE interface is added to the (S11,G11) entry as an OIF with a flag indicating a bridge-only interface. With this procedure, the ingress PE (PE1) bridges the traffic onto the CORE interface (PE1 retains the TTL and source MAC). The traffic is encapsulated with the VNI associated with the CORE interface (L3 VNI). PE1 also routes the traffic to R12, which is attached to BD2 on the same device.

PE2 decapsulates the traffic from PE1 and does an inner lookup in the tenant VRF associated with the incoming VNI. The lookup in the tenant VRF yields the (S11,G11) entry as the matching entry. The traffic gets bridged on BD1 (PE2 retains the TTL and source MAC), since that OIF is marked as a bridge-only interface, and it gets routed on BD3.
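The OIF inheritance with the bridge-only flag can be sketched as follows; the data model (BD names mapped to IP subnets) is a simplification assumed for illustration.

   # Minimal sketch (assumed data model) of the OIF inheritance above:
   # while copying (*,G) OIFs into the (S,G) entry, an OIF whose BD
   # matches the source's subnet is flagged "bridge-only" so that the
   # frame is bridged (TTL and source MAC preserved) instead of routed.

   import ipaddress

   def inherit_oifs(star_g_oifs, source_ip, bd_subnets):
       """star_g_oifs: list of BD names; bd_subnets: BD -> IP prefix."""
       s = ipaddress.ip_address(source_ip)
       sg_oifs = []
       for bd in star_g_oifs:
           bridge_only = s in ipaddress.ip_network(bd_subnets[bd])
           sg_oifs.append((bd, "bridge-only" if bridge_only else "L3"))
       return sg_oifs

   # S11 = 192.0.2.11 sits in BD1's subnet, so BD1 becomes bridge-only:
   oifs = inherit_oifs(["BD1", "BD3"], "192.0.2.11",
                       {"BD1": "192.0.2.0/24", "BD3": "198.51.100.0/24"})
   # -> [("BD1", "bridge-only"), ("BD3", "L3")]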
12. Interop with L2 EVPN PEs

A gateway device is needed for interop between EVPN PEs that support the seamless interop procedures specified in this document and L2 EVPN PEs. A tenant domain can be provisioned with one or more such gateway devices, known as "Seamless interop EVPN Multicast Gateways (SEMGs)". A PE that is configured as a SEMG must be provisioned with all BDs that are available in the tenant domain.

When advertising the IMET route for a BD, a PE configured as a SEMG advertises the EVPN Multicast Flags extended community with the SEMG flag set. From the set of eligible PEs, one PE is selected as the SEMG designated forwarder (SEMG-DF). PEs should use the procedure specified in [I-D.ietf-bess-evpn-df-election-framework] for the SEMG DF election.

There are multiple possibilities that need to be considered here:

o The L2 EVPN PE may or may not support [I-D.ietf-bess-evpn-igmp-mld-proxy].

o The seamless interop PE may or may not support [I-D.ietf-bess-evpn-igmp-mld-proxy].

o The network may have only L2 EVPN PEs and seamless-interop-capable PEs.

o The network may have L2 EVPN PEs, seamless-interop-capable PEs, and MVPN PEs.

Multicast sources and receivers can exist anywhere in the network. These use cases are discussed below.

12.1. Interaction between L2 EVPN PEs and seamless-interop-capable PEs

The following cases are considered in this section:

o Case 1: [I-D.ietf-bess-evpn-igmp-mld-proxy] is supported both at the seamless-interop-capable PE and at the L2 EVPN PE.

o Case 2: [I-D.ietf-bess-evpn-igmp-mld-proxy] is supported only at the seamless-interop-capable PE.

o Case 3: [I-D.ietf-bess-evpn-igmp-mld-proxy] is not supported at the seamless-interop-capable PE.

[I-D.ietf-bess-evpn-igmp-mld-proxy] support is recommended for seamless-interop-capable PEs. A SEMG can group L2 EVPN PEs into two separate groups (one that supports [I-D.ietf-bess-evpn-igmp-mld-proxy] and one that does not) from the IMET routes that it receives from the remote peers. The interop procedures for handling these two different sets of remote L2 EVPN PEs are captured in cases 1 and 2.

Case 1: [I-D.ietf-bess-evpn-igmp-mld-proxy] is supported both at the seamless-interop-capable PE and at the L2 EVPN PE

This may be the most common use case.

The SEMG-DF has the following special responsibilities on a BD for which it is the DF (a sketch of the SMET handling follows this list):

o It processes EVPN SMET routes from the remote L2 EVPN PEs that support [I-D.ietf-bess-evpn-igmp-mld-proxy] and creates L2 multicast state. The SMET route in turn triggers the creation of L3 multicast state, similar to an IGMP join received on a local AC. The SEMG-DF exercises the MVPN procedures for the join.

o It should not process IGMP control packets from L2 EVPN PEs that support [I-D.ietf-bess-evpn-igmp-mld-proxy].

o It originates an SMET (*,*) route towards the L2 EVPN PEs. This is to attract traffic from multicast sources that are connected behind L2 EVPN PEs.

o When the SEMG-DF receives traffic from an L2 EVPN PE on the intra-subnet tunnel on BD-X, it does the following:

   * performs the FHR functionality;

   * advertises the host route with the L3 label and the VRF Route Import corresponding to the tenant domain;

   * sends the traffic towards the locally attached receivers;

   * sends the traffic towards L2 EVPN receivers on BDs other than the incoming BD (after multicast routing);

   * sends the traffic towards the remote seamless-interop-capable PEs that have receivers attached/connected behind them.

o When the SEMG-DF receives traffic from the MVPN tunnel, it does the following:

   * sends the traffic towards the IRB interfaces where receivers exist;

   * the BDs corresponding to those IRB interfaces may have local receivers or remote receivers behind L2 EVPN PEs; the SEMG-DF sends the traffic on the intra-subnet tunnel for the remote receivers.
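The SMET-route handling of the SEMG-DF in Case 1 can be summarized with the following non-normative sketch; is_semg_df(), add_l2_state(), and send_mvpn_join() are hypothetical helpers.

   # Minimal sketch (assumed names) of SEMG handling of a received
   # SMET route, per Case 1 above.

   def on_smet_route(bd, source, group, from_l2_pe,
                     is_semg_df, add_l2_state, send_mvpn_join):
       if not is_semg_df(bd):
           # Non-DF SEMGs and other seamless-interop PEs discard SMET
           # routes coming from L2 EVPN PEs (see end of this section).
           return
       # L2 state in the BD, as if an IGMP join arrived on a local AC...
       add_l2_state(bd, source, group, via=from_l2_pe)
       # ...which in turn triggers L3 state and the MVPN procedures:
       # route type 6 for (*,G), route type 7 for (S,G), per [RFC6514].
       send_mvpn_join(route_type=7 if source else 6,
                      c_s=source, c_g=group)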
Case 2: [I-D.ietf-bess-evpn-igmp-mld-proxy] is not supported at the L2 EVPN PE

This case only differs from case 1 in the way the SEMG-DF learns receivers behind the L2 EVPN PEs and how it attracts traffic from sources behind the L2 EVPN PEs. The rest of the procedures specified above are applicable to this case.

The SEMG-DF has the following special responsibilities on a BD for which it is the DF:

o It processes IGMP control packets from remote L2 EVPN PEs that do not support [I-D.ietf-bess-evpn-igmp-mld-proxy] and creates L2 and L3 state.

o When an IGMP query is received on the intra-subnet tunnel on BD-X, the SEMG-DF needs to send proxy IGMP reports for all groups that it has learned from remote L2 EVPN PEs on that BD.

o Connecting a multicast router behind an L2 EVPN PE is not recommended. If a multicast router is connected behind an L2 EVPN PE, the BD corresponding to the VRF tunnel needs to be configured on the L2 EVPN PE, so that the PIM router can get all joins that are received in the BD corresponding to the MVPN tunnel interface at the SEMG-DF.

o The SEMG-DF should get all multicast traffic from the L2 EVPN PEs. This may be achieved by sending an IGMP query or PIM hello on the intra-subnet tunnel.

Case 3: [I-D.ietf-bess-evpn-igmp-mld-proxy] is not supported at the seamless-interop-capable PE

The procedure for handling this case is exactly the same as for case 2.

All seamless-interop-capable PEs other than the SEMG should discard SMET routes coming from L2 EVPN PEs and must discard any IGMP control packets received on the intra-subnet tunnel. The SEMG should discard incoming SMET routes and IGMP joins from L2 EVPN PEs if it is not the DF for the incoming BD.

When [I-D.ietf-bess-evpn-igmp-mld-proxy] is supported both at the seamless-interop-capable PE and at the L2 EVPN PE, selective forwarding is done based on receiver interest at the egress PE when the overlay tunnel type is ingress replication or a selective tunnel.

12.2. Network having L2 EVPN PEs, seamless-interop-capable PEs, and MVPN PEs

Since MVPN PEs can only interact with seamless-interop-capable PEs, the SEMG-DF acts as the FHR and LHR for sources and receivers behind L2 EVPN PEs. Only the SEMG-DF advertises the IP-VPN unicast route, along with the VRF Route Import extended community, for hosts behind L2 EVPN PEs. No additional procedures are required when they all co-exist.

13. Connecting external multicast networks or PIM routers

External multicast networks or PIM routers can be attached to any seamless-interop-capable EVPN PE or set of EVPN PEs. A multicast network or PIM router can also be attached to any IRB-enabled BDI interface, L3-enabled interface, or set of interfaces. The fabric can be used as a transit network. All PIM signaling is terminated at the EVPN PEs.

No additional procedures are required when connecting external multicast networks.

14. RP handling

This section describes various RP models for a tenant VRF. The RP model SHOULD be consistent across all EVPN PEs for a given group/group range in the tenant VRF.

14.1. Various RP deployment options

14.1.1. RP-less mode

An EVPN fabric without any attached external multicast network or MVPN network does not need RP configuration. A configuration option SHALL be provided to the end user to operate the fabric in RP-less mode. When an EVPN PE is operating in RP-less mode, the EVPN PE MUST advertise all attached sources to the remote EVPN PEs using the procedure specified in [RFC6514].

In RP-less mode, the (C-*,C-G) RPF may be set to NULL or to a wildcard interface (any interface on the tenant VRF). In RP-less mode, traffic is always forwarded based on (C-S,C-G) state.

14.1.2. Fabric anycast RP

In this model, the anycast GW IP address is configured as the RP on all EVPN PEs. When an EVPN PE is operating in fabric anycast-RP mode, the EVPN PE MUST advertise all sources behind that PE to the other EVPN PEs using the procedure specified in [RFC6514]. In this model, sources may be directly attached to tenant BDs, or sources may be attached behind a PIM router (in which case the EVPN PE learns the source information from the PIM register terminating at the RP interface on the tenant VRF side).

In RP-less mode and fabric anycast-RP mode, an EVPN PE operates in SPT-only mode as per section 14 of [RFC6514].

14.1.3. Static RP

The procedure specified in this document supports configuring the EVPN fabric with a static RP. The RP can be configured on the EVPN PE itself in the tenant VRF, in an external multicast network connected behind an EVPN PE, or in the MVPN network. When the RP is not local to the EVPN PE, the EVPN PE operates in rpt-spt mode as per the procedures specified in section 13 of [RFC6514].

14.1.4. Co-existence of fabric anycast RP and external RP

An external multicast network using its own RP may be connected to an EVPN fabric operating in fabric anycast-RP mode. In this case, a subset of the EVPN PEs may be designated as border leafs. Anycast RP may be configured between the border leafs and the external RP. The border leafs originate SA A-D routes for external sources towards the fabric PEs, and a border leaf acts as the FHR for the sources inside the fabric. A configuration option may be provided to define the PE role as BL.
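The mode selection implied by the RP models above can be summarized in a small non-normative sketch; the strings are illustrative, and the assumption that a locally configured static RP also yields SPT-only operation is an inference from sections 14.1.1 through 14.1.3 rather than a statement of this document.

   # Informal sketch of the mapping in sections 14.1.1-14.1.3: which
   # [RFC6514] mode of operation a PE uses for a tenant VRF under each
   # RP model.  The enum-like strings are illustrative only.

   def mvpn_mode(rp_model, rp_is_local=False):
       if rp_model in ("rp-less", "fabric-anycast-rp"):
           # Sources are always advertised via SA A-D routes; receivers
           # join (C-S,C-G) directly: SPT-only, section 14 of [RFC6514].
           return "spt-only"
       if rp_model == "static-rp":
           # With a remote (non-local) RP, shared-tree procedures apply:
           # rpt-spt mode, section 13 of [RFC6514].
           return "spt-only" if rp_is_local else "rpt-spt"
       raise ValueError("unknown RP model: " + rp_model)

   assert mvpn_mode("fabric-anycast-rp") == "spt-only"
   assert mvpn_mode("static-rp", rp_is_local=False) == "rpt-spt"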
14.2. RP configuration options

PIM BIDIR and PIM-SM ASM modes require Rendezvous Point (RP) configuration; the RP acts as the shared root for a multicast shared tree. The RP can be configured statically or by using the BSR or Auto-RP procedures on the tenant VRF. This document only discusses static RP configuration; the use of the BSR or Auto-RP procedures in the EVPN fabric is beyond the scope of this document.

15. IANA Considerations

IANA is requested to assign new flags in the "Multicast Flags Extended Community Flags" registry for the following:

o Seamless interop capable PE

o SEMG

16. Security Considerations

All the security considerations in [RFC7432] apply directly to this document, because this document leverages the [RFC7432] control plane and its associated procedures.

17. Acknowledgements

The authors would like to thank Niloofar Fazlollahi, Aamod Vyavaharkar, Raunak Banthia, and Swadesh Agrawal for their discussions and contributions.

18. References

18.1. Normative References

   [I-D.ietf-bess-dci-evpn-overlay]
              Rabadan, J., Sathappan, S., Henderickx, W., Sajassi, A.,
              and J. Drake, "Interconnect Solution for EVPN Overlay
              networks", draft-ietf-bess-dci-evpn-overlay-10 (work in
              progress), March 2018.

   [I-D.ietf-bess-evpn-df-election-framework]
              Rabadan, J., Mohanty, S., Sajassi, A., Drake, J.,
              Nagaraj, K., and S. Sathappan, "Framework for EVPN
              Designated Forwarder Election Extensibility",
              draft-ietf-bess-evpn-df-election-framework-09 (work in
              progress), January 2019.

   [I-D.ietf-bess-evpn-igmp-mld-proxy]
              Sajassi, A., Thoria, S., Drake, J., and W. Lin, "IGMP and
              MLD Proxy for EVPN", draft-ietf-bess-evpn-igmp-mld-
              proxy-06 (work in progress), January 2021.

   [I-D.ietf-bess-evpn-inter-subnet-forwarding]
              Sajassi, A., Salam, S., Thoria, S., Drake, J., and J.
              Rabadan, "Integrated Routing and Bridging in EVPN",
              draft-ietf-bess-evpn-inter-subnet-forwarding-11 (work in
              progress), October 2020.

   [I-D.ietf-idr-tunnel-encaps]
              Patel, K., Van de Velde, G., Sangli, S., and J. Scudder,
              "The BGP Tunnel Encapsulation Attribute",
              draft-ietf-idr-tunnel-encaps-21 (work in progress),
              January 2021.
   [I-D.skr-bess-evpn-pim-proxy]
              Rabadan, J., Kotalwar, J., Sathappan, S., Zhang, Z., and
              A. Sajassi, "PIM Proxy in EVPN Networks", draft-skr-bess-
              evpn-pim-proxy-01 (work in progress), October 2017.

   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
              BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
              2012, <https://www.rfc-editor.org/info/rfc6513>.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012,
              <https://www.rfc-editor.org/info/rfc6514>.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015, <https://www.rfc-editor.org/info/rfc7432>.

   [RFC8365]  Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
              Uttaro, J., and W. Henderickx, "A Network Virtualization
              Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
              DOI 10.17487/RFC8365, March 2018,
              <https://www.rfc-editor.org/info/rfc8365>.

Appendix A. Use Cases

A.1. DCs with only IGMP/MLD hosts without tenant routers

In an EVPN network consisting of only IGMP/MLD hosts, PEs receive IGMP (*,G) or (S,G) joins from their locally attached hosts and originate MVPN C-multicast route type 6 and type 7 NLRIs, respectively. As described in [RFC6514], these NLRIs are directed towards the RP-PE for type 6 or the source PE for type 7. In the case of a (*,G) join, a shared-path tree is built in the core from the RP-PE towards all receiver PEs. Once a source starts to send multicast data to the specified multicast group, the PE directly connected to the source performs PIM registration with the RP. Since there are existing receivers for the group, the RP originates a PIM (S,G) join towards the source, which is converted into an MVPN type 7 NLRI by the RP-PE. Note that the RP-PE is the PE configured as the RP (e.g., using static configuration or the BSR or Auto-RP procedures); the detailed workings of those protocols are beyond the scope of this document. Upon receiving the type 7 NLRI, the source PE includes the MVPN tunnel in its outgoing interface list. Furthermore, the source PE follows the procedures in [RFC6514] to originate an MVPN Source Active A-D route (route type 5) to avoid duplicate traffic and to allow all receiver PEs to shift from the shared tree to the shortest-path tree rooted at the source PE, as described in section 13 of [RFC6514].

However, a network operator can choose to have only shortest-path trees built in the MVPN core, as described in section 14 of [RFC6514]. One way to achieve this is for all PEs to act as the RP for their locally connected hosts and thus avoid sending any shared-tree joins (MVPN type 6) into the core. In this scenario, no PIM registration is needed, since every PE is a first-hop router as well as the acting RP. Once a source starts to send multicast data, the PE directly connected to it originates a Source Active A-D route (route type 5) to all other PEs in the network.

Upon receiving a Source Active A-D route, a PE must cache it in its local database and also look for any matching interest for (*,G), where G is the multicast group described in the received Source Active A-D route. If it finds any such matching entry, it must originate a C-multicast route (route type 7) in order to start receiving traffic from the source PE. This procedure must be repeated on the reception of any further Source Active A-D routes.
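The caching behavior above can be summarized with a short non-normative sketch; originate_rt7() is a hypothetical helper that emits the C-multicast Source Tree Join.

   # Minimal sketch (assumed names) of the Source Active A-D handling
   # described above: cache the route, and originate a C-multicast
   # Source Tree Join (route type 7) if local (*,G) interest exists.

   sa_cache = set()            # cached (C-S, C-G) from SA A-D routes
   star_g_interest = set()     # groups with local (*,G) state

   def on_source_active_ad(c_s, c_g, originate_rt7):
       sa_cache.add((c_s, c_g))
       if c_g in star_g_interest:
           # Matching (*,G) interest: join the SPT at the source PE.
           originate_rt7(c_s, c_g)

   def on_star_g_join(c_g, originate_rt7):
       star_g_interest.add(c_g)
       # Also check the already-cached sources for this group.
       for (s, g) in sa_cache:
           if g == c_g:
               originate_rt7(s, c_g)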
A.2. DCs with a mix of IGMP/MLD hosts and multicast routers running PIM-SSM

This scenario includes multicast routers that can send PIM SSM (S,G) joins. Upon receiving such a join, if the source described in the join is learned to be behind an MVPN peer PE, the local PE originates a C-multicast Source Tree Join (route type 7) towards the source PE. It is expected that the PIM SSM group ranges are kept separate from the ASM ranges for which IGMP hosts can send (*,G) joins; hence, the ASM and SSM groups operate without any overlap. No RP is needed for the SSM-range groups, and a shortest-path tree rooted at the source is built once receiver interest is known.

A.3. DCs with a mix of IGMP/MLD hosts and multicast routers running PIM-ASM

This scenario includes the reception of PIM (*,G) joins on a PE's local ACs. These joins are handled similarly to the IGMP (*,G) joins explained in the sections above. Another interesting case arises when one of the tenant routers acts as the RP for some of the ASM groups. In such a scenario, an Upstream Multicast Hop (UMH) will be elected by the other PEs in order to send C-multicast routes (route type 6). All procedures described in [RFC6513] with respect to UMH should be used to avoid traffic duplication due to incoherent selection of the RP-PE by different receiver PEs.

A.4. DCs with a mix of IGMP/MLD hosts and multicast routers running PIM-BIDIR

Creating bidirectional (*,G) trees is useful when a customer wants the least amount of control state in the network. The downside is that all receivers for a particular multicast group receive traffic from all sources sending to that group. For the purposes of this document, all procedures described in [RFC6513] and [RFC6514] apply when PIM-BIDIR is used.

Authors' Addresses

   Ali Sajassi
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US

   Email: sajassi@cisco.com

   Kesavan Thiruvenkatasamy
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US

   Email: kethiruv@cisco.com

   Samir Thoria
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US

   Email: sthoria@cisco.com

   Ashutosh Gupta
   VMware
   3401 Hillview Ave, Palo Alto, CA 94304

   Email: ashutoshgupta@vmware.com

   Luay Jalil
   Verizon

   Email: luay.jalil@verizon.com