idnits 2.17.1

draft-ietf-bess-evpn-igmp-mld-proxy-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (May 29, 2021) is 1056 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-14) exists of
     draft-ietf-bess-evpn-bum-procedure-updates-08

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

BESS WorkGroup                                                A. Sajassi
Internet-Draft                                                 S. Thoria
Intended status: Standards Track                               M. Mishra
Expires: November 30, 2021                                 Cisco Systems
                                                                K. Patel
                                                                  Arrcus
                                                                J. Drake
                                                                  W. Lin
                                                        Juniper Networks
                                                            May 29, 2021

                      IGMP and MLD Proxy for EVPN
                 draft-ietf-bess-evpn-igmp-mld-proxy-10

Abstract

   The Ethernet Virtual Private Network (EVPN) solution is becoming
   pervasive in data center (DC) applications for Network Virtualization
   Overlay (NVO) and DC interconnect (DCI) services, and in service
   provider (SP) applications for next generation virtual private LAN
   services.

   This draft describes how to efficiently support endpoints running
   IGMP for the above services over an EVPN network by incorporating
   IGMP proxy procedures on EVPN PEs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 30, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Specification of Requirements
   3.  Terminology
   4.  IGMP/MLD Proxy
     4.1.  Proxy Reporting
       4.1.1.  IGMP/MLD Membership Report Advertisement in BGP
       4.1.2.  IGMP/MLD Leave Group Advertisement in BGP
     4.2.  Proxy Querier
   5.  Operation
     5.1.  PE with only attached hosts/VMs for a given subnet
     5.2.  PE with a mix of attached hosts/VMs and multicast source
     5.3.  PE with a mix of attached hosts/VMs, a multicast source
           and a router
   6.  All-Active Multi-Homing
     6.1.  Local IGMP/MLD Join Synchronization
     6.2.  Local IGMP/MLD Leave Group Synchronization
       6.2.1.  Remote Leave Group Synchronization
       6.2.2.  Common Leave Group Synchronization
     6.3.  Mass Withdraw of Multicast join Sync route in case of
           failure
   7.  Single-Active Multi-Homing
   8.  Selective Multicast Procedures for IR tunnels
   9.  BGP Encoding
     9.1.  Selective Multicast Ethernet Tag Route
       9.1.1.  Constructing the Selective Multicast Ethernet Tag
               route
       9.1.2.  Default Selective Multicast Route
     9.2.  Multicast Join Synch Route
       9.2.1.  Constructing the Multicast Join Synch Route
     9.3.  Multicast Leave Synch Route
       9.3.1.  Constructing the Multicast Leave Synch Route
     9.4.  Multicast Flags Extended Community
     9.5.  EVI-RT Extended Community
     9.6.  Rewriting of RT ECs and EVI-RT ECs by ASBRs
     9.7.  BGP Error Handling
   10. IGMP/MLD Immediate Leave
   11. IGMP Version 1 Membership Report
   12. Security Considerations
   13. IANA Considerations
   14. Acknowledgement
   15. Contributors
   16. References
     16.1.  Normative References
     16.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Ethernet Virtual Private Network (EVPN) solution [RFC7432] is
   becoming pervasive in data center (DC) applications for Network
   Virtualization Overlay (NVO) and DC interconnect (DCI) services, and
   in service provider (SP) applications for next generation virtual
   private LAN services.

   In DC applications, a point of delivery (POD) can consist of a
   collection of servers supported by several top of rack (ToR) and
   Spine switches.  This collection of servers and switches is self-
   contained and may have its own control protocol for intra-POD
   communication and orchestration.  However, EVPN is used as the
   standard way of inter-POD communication for both intra-DC and
   inter-DC.  A subnet can span across multiple PODs and DCs.
   EVPN provides a robust multi-tenant solution with extensive multi-
   homing capabilities to stretch a subnet (VLAN) across multiple PODs
   and DCs.  There can be many hosts/VMs (several hundred) attached to a
   subnet that is stretched across several PODs and DCs.

   These hosts/VMs express their interest in multicast groups on a
   given subnet/VLAN by sending IGMP Membership Reports (Joins) for
   the multicast group(s) they are interested in.  Furthermore, an IGMP
   router periodically sends membership queries to find out if there are
   hosts on that subnet that are still interested in receiving multicast
   traffic for that group.  The IGMP/MLD Proxy solution described in
   this draft has three objectives:

   1.  Reduce flooding of IGMP messages: just like the ARP/ND
       suppression mechanism in EVPN that reduces the flooding of ARP
       messages over EVPN, it is also desired to have a mechanism to
       reduce the flooding of IGMP messages (both Queries and Reports)
       in EVPN.

   2.  Distributed anycast multicast proxy: it is desirable for the EVPN
       network to act as a distributed anycast multicast router with
       respect to the IGMP/MLD proxy function for all the hosts attached
       to that subnet.

   3.  Selective Multicast: to forward multicast traffic over the EVPN
       network such that it only gets forwarded to the PEs that have
       interest in the multicast group(s); multicast traffic will not be
       forwarded to the PEs that have no receivers attached to them for
       that multicast group.  This draft shows how this objective may be
       achieved when Ingress Replication is used to distribute the
       multicast traffic among the PEs.
       Procedures for supporting selective multicast using P2MP tunnels
       can be found in [I-D.ietf-bess-evpn-bum-procedure-updates].

   The first two objectives are achieved by using IGMP/MLD proxy on the
   PE, and the third objective is achieved by setting up a multicast
   tunnel (e.g., ingress replication) only among the PEs that have
   interest in the multicast group(s), based on the trigger from the
   IGMP/MLD proxy processes.  The proposed solutions for each of these
   objectives are discussed in the following sections.

2.  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Terminology

   o  POD: Point of Delivery

   o  ToR: Top of Rack

   o  NV: Network Virtualization

   o  NVO: Network Virtualization Overlay

   o  EVPN: Ethernet Virtual Private Network

   o  IGMP: Internet Group Management Protocol

   o  MLD: Multicast Listener Discovery

   o  EVI: An EVPN instance spanning the Provider Edge (PE) devices
      participating in that EVPN

   o  MAC-VRF: A Virtual Routing and Forwarding table for Media Access
      Control (MAC) addresses on a PE

   o  IR: Ingress Replication

   o  Ethernet Segment (ES): When a customer site (device or network) is
      connected to one or more PEs via a set of Ethernet links, then
      that set of links is referred to as an 'Ethernet Segment'.

   o  Ethernet Segment Identifier (ESI): A unique non-zero identifier
      that identifies an Ethernet Segment is called an 'Ethernet Segment
      Identifier'.

   o  PE: Provider Edge.

   o  BD: Broadcast Domain.  As per [RFC7432], an EVI consists of a
      single or multiple BDs.  In the case of the VLAN-bundle and
      VLAN-aware bundle service models, an EVI contains multiple BDs.
      Also, in this document, BD and subnet are equivalent terms.

   o  Ethernet Tag: An Ethernet tag identifies a particular broadcast
      domain, e.g., a VLAN.  An EVPN instance consists of one or more
      broadcast domains.

   o  Single-Active Redundancy Mode: When only a single PE, among all
      the PEs attached to an Ethernet segment, is allowed to forward
      traffic to/from that Ethernet segment for a given VLAN, then the
      Ethernet segment is defined to be operating in Single-Active
      redundancy mode.

   o  All-Active Redundancy Mode: When all PEs attached to an Ethernet
      segment are allowed to forward known unicast traffic to/from that
      Ethernet segment for a given VLAN, then the Ethernet segment is
      defined to be operating in All-Active redundancy mode.

   o  PMSI: P-Multicast Service Interface - a conceptual interface for a
      PE to send customer multicast traffic to all or some PEs in the
      same VPN.

   o  S-PMSI: Selective PMSI - a PMSI to some of the PEs in the same
      VPN.

   This document also assumes familiarity with the terminology of
   [RFC7432].  Although this document mostly uses the term IGMP
   Membership Report (Join), the text applies equally to MLD Membership
   Reports.  Similarly, text for IGMPv2 applies to MLDv1 and text for
   IGMPv3 applies to MLDv2.  The IGMP/MLD version encoding in the BGP
   update is stated in Section 9.

4.  IGMP/MLD Proxy

   The IGMP Proxy mechanism is used to reduce the flooding of IGMP
   messages over an EVPN network, similar to the ARP proxy used to
   reduce the flooding of ARP messages over EVPN.  It also provides a
   triggering mechanism for the PEs to set up their underlay multicast
   tunnels.  The IGMP Proxy mechanism consists of two components:

   1.  Proxy for IGMP Reports.

   2.  Proxy for IGMP Queries.

4.1.  Proxy Reporting

   When the IGMP protocol is used between hosts/VMs and their first hop
   EVPN router (EVPN PE), proxy reporting is used by the EVPN PE to
   summarize (when possible) reports received from downstream hosts and
   propagate them in BGP to other PEs that are interested in the
   information.  This is done by terminating the IGMP Reports in the
   first hop PE, and translating and exchanging the relevant information
   among EVPN BGP speakers.  The information is translated back into an
   IGMP message at the recipient EVPN speaker.  Thus it helps create an
   IGMP overlay subnet using BGP.  In order to facilitate such an
   overlay, this document also defines a new EVPN route type NLRI, the
   EVPN Selective Multicast Ethernet Tag route, along with its
   procedures to help exchange and register IGMP multicast groups (see
   Section 9).

4.1.1.  IGMP/MLD Membership Report Advertisement in BGP

   When a PE wants to advertise an IGMP Membership Report (Join) using
   the BGP EVPN route, it follows these rules (the BGP encoding is
   stated in Section 9):

   1.  When the first hop PE receives several IGMP Membership Reports
       (Joins), belonging to the same IGMP version, from different
       attached hosts/VMs for the same (*,G) or (S,G), it SHOULD send
       only a single BGP message corresponding to the very first IGMP
       Membership Report (sending the BGP update as soon as possible)
       for that (*,G) or (S,G).  This is because BGP is a stateful
       protocol and no further transmission of the same report is
       needed.  If the IGMP Membership Report is for (*,G), then the
       multicast group address MUST be sent along with the corresponding
       version flag (v2 or v3) set.  In the case of IGMPv3, the exclude
       flag MUST also be set to indicate that no source IP address is to
       be excluded (i.e., include all sources, "*").
       If the IGMP Join is for (S,G), then besides setting the multicast
       group address along with the version flag v3, the source IP
       address and the include/exclude flag MUST be set.  It should be
       noted that when advertising the EVPN route for (S,G), the only
       valid version flag is v3 (the v2 flag MUST be set to zero).

   2.  When the first hop PE receives an IGMPv3 Join for (S,G) on a
       given BD, it SHOULD advertise the corresponding EVPN Selective
       Multicast Ethernet Tag (SMET) route regardless of whether the
       source (S) is attached to itself or not, in order to facilitate
       a source move in the future.

   3.  When the first hop PE receives an IGMP version-X Join first for
       (*,G) and then later receives an IGMP version-Y Join for the
       same (*,G), it MUST re-advertise the same EVPN SMET route with
       the flag for version-Y set in addition to any previously set
       version flag(s).  In other words, the first hop PE MUST NOT
       withdraw the EVPN route before sending the new route because the
       flag field is not part of BGP route key processing.

   4.  When the first hop PE receives an IGMP version-X Join first for
       (*,G) and then later receives an IGMPv3 Join for the same
       multicast group address but for a specific source address S, then
       the PE MUST advertise a new EVPN SMET route with the v3 flag set
       (and v2 reset).  The include/exclude flag also needs to be set
       accordingly.  Since the source IP address is used as part of BGP
       route key processing, this is considered a new BGP route
       advertisement.

   5.  When a PE receives an EVPN SMET route with more than one version
       flag set, it will generate the corresponding IGMP report for
       (*,G) for each version specified in the flags field.  With
       multiple version flags set, there MUST NOT be a source IP address
       in the received EVPN route.  If there is, then an error SHOULD be
       logged.
       If the v3 flag is set (in addition to v2), then the include/
       exclude flag MUST indicate "exclude".  If not, then an error
       SHOULD be logged.  The PE MUST generate an IGMP Membership Report
       (Join) for that (*,G) and each IGMP version in the version flag.

   6.  When a PE receives a list of EVPN SMET NLRIs in its BGP update
       message, each with a different source IP address and the same
       multicast group address, and the version flag is set to v3, then
       the PE generates an IGMPv3 Membership Report with a record
       corresponding to the list of source IP addresses and the group
       address, along with the proper indication of inclusion/exclusion.

   7.  Upon receiving EVPN SMET route(s) and before generating the
       corresponding IGMP Membership Report(s), the PE checks whether it
       has any CE multicast router for that BD on any of its ESs.  The
       PE performs this check by listening for PIM Hello messages on
       each AC (i.e., (ES,BD)).  If the PE does have router ACs, then
       the generated IGMP Membership Report(s) are sent to those ACs.
       If it doesn't have any router AC, then no IGMP Membership
       Report(s) need to be generated.  This is because sending IGMP
       Membership Reports to other hosts can unintentionally prevent a
       host from joining a specific multicast group using IGMPv2.  Per
       [RFC4541], when an IGMPv2 host receives a Membership Report for a
       group address that it intends to join, the host will suppress its
       own Membership Report for the same group, and if the PE does not
       receive an IGMP Join from the host it will not forward multicast
       data to it.  In other words, an IGMPv2 Join MUST NOT be sent on
       an AC that does not lead to a CE multicast router.  This message
       suppression is a requirement for IGMPv2 hosts.
       This is not a problem for hosts running IGMPv3 because there is
       no suppression of IGMP Membership Reports.

4.1.2.  IGMP/MLD Leave Group Advertisement in BGP

   When a PE wants to withdraw an EVPN SMET route corresponding to an
   IGMPv2 Leave Group (Leave) or IGMPv3 "Leave" equivalent message, it
   follows these rules:

   1.  When a PE receives an IGMPv2 Leave Group or its "Leave"
       equivalent message for IGMPv3 from its attached host, it checks
       whether this host is the last host that is interested in this
       multicast group by sending a query for the multicast group.  If
       the host was indeed the last one (i.e., no responses are received
       for the query), then the PE MUST re-advertise the EVPN SMET route
       with the corresponding version flag reset.  If this is the last
       version flag to be reset, then instead of re-advertising the EVPN
       route with all version flags reset, the PE MUST withdraw the EVPN
       route for that (*,G).

   2.  When a PE receives an EVPN SMET route for a given (*,G), it
       compares the version flags received in the route with its per-PE
       stored version flags.  If the PE finds that a version flag
       associated with the (*,G) for the remote PE is reset, then the PE
       MUST generate an IGMP Leave for that (*,G) toward its local
       interface (if any) attached to the multicast router for that
       multicast group.  It should be noted that the received EVPN route
       SHOULD have at least one version flag set.  If all version flags
       are reset, it is an error because the PE should have received an
       EVPN route withdraw for the last version flag.  This error MUST
       be treated as a BGP error, and the PE MUST apply the "treat-as-
       withdraw" procedure of [RFC7606].

   3.  When a PE receives an EVPN SMET route withdraw, it removes the
       remote PE from its OIF list for that multicast group, and if
       there are no more OIF entries for that multicast group (either
       locally or remotely), then the PE MUST stop responding to queries
       from the locally attached router (if any).  If there is a source
       for that multicast group, the PE stops sending multicast traffic
       for that source.

4.2.  Proxy Querier

   As mentioned in the previous sections, each PE MUST have proxy
   querier functionality for the following reasons:

   1.  To enable the collection of EVPN PEs providing L2VPN service to
       act as a distributed multicast router with an Anycast IP address
       for all attached hosts/VMs in that subnet.

   2.  To enable suppression of IGMP Membership Reports and queries over
       the MPLS/IP core.

5.  Operation

   Consider the EVPN network of Figure 1, where there is an EVPN
   instance configured across the PEs shown in this figure (namely PE1,
   PE2, and PE3).  Let's consider that this EVPN instance consists of a
   single bridge domain (single subnet) with all the hosts, sources, and
   the multicast router connected to this subnet.  PE1 only has hosts
   connected to it.  PE2 has a mix of hosts and a multicast source.  PE3
   has a mix of hosts, a multicast source, and a multicast router.
   Furthermore, let's consider that for (S1,G1), R1 is used as the
   multicast router.
   The following subsections describe the IGMP proxy operation in
   different PEs with regard to whether the locally attached devices for
   that subnet are:

   o  only hosts/VMs

   o  mix of hosts/VMs and multicast source

   o  mix of hosts/VMs, multicast source, and multicast router

                          +--------------+
                          |              |
                          |              |
                   +----+ |              | +----+
    H1:(*,G1)v2 ---|    | |              | |    |---- H6(*,G1)v2
    H2:(*,G1)v2 ---| PE1| |   IP/MPLS    | | PE2|---- H7(S2,G2)v3
    H3:(*,G1)v3 ---|    | |   Network    | |    |---- S2
    H4:(S2,G2)v3 --|    | |              | |    |
                   +----+ |              | +----+
                          |              |
                   +----+ |              |
    H5:(S1,G1)v3 --|    | |              |
             S1 ---| PE3| |              |
             R1 ---|    | |              |
                   +----+ |              |
                          |              |
                          +--------------+

                      Figure 1: EVPN network

5.1.  PE with only attached hosts/VMs for a given subnet

   When PE1 receives an IGMPv2 Join Report from H1, it does not forward
   this join to any of its other ports (for this subnet) because all
   these local ports are associated with hosts/VMs.  PE1 advertises an
   EVPN SMET route corresponding to this join for (*,G1) with the v2
   flag set.  This EVPN route is received by PE2 and PE3, which are
   members of the same BD (i.e., the same EVI in the case of VLAN-based
   service or (EVI,VLAN) in the case of VLAN-aware bundle service).  PE3
   reconstructs the IGMPv2 Join Report from this EVPN BGP route and
   sends it only to the port(s) with multicast routers attached (for
   that subnet).  In this example, PE3 sends the reconstructed IGMPv2
   Join Report for (*,G1) only to R1.  Furthermore, even though PE2
   receives the EVPN BGP route, it does not send it to any of its ports
   for that subnet; viz., the ports associated with H6 and H7.

   When PE1 receives the second IGMPv2 Join from H2 for the same
   multicast group (*,G1), it only adds that port to its OIF list; it
   doesn't send any EVPN BGP route because there is no change in
   information.  However, when it receives the IGMPv3 Join from H3 for
   the same (*,G1), besides adding the corresponding port to its OIF
   list, it re-advertises the previously sent EVPN SMET route with the
   v3 and exclude flags set.

   Finally, when PE1 receives the IGMPv3 Join from H4 for (S2,G2), it
   advertises a new EVPN SMET route corresponding to it.

5.2.  PE with a mix of attached hosts/VMs and multicast source

   The main difference in this case is that when PE2 receives the IGMPv3
   Join from H7 for (S2,G2), it does advertise it in BGP to support
   source move even though PE2 knows that S2 is attached to its local
   AC.  PE2 adds the port associated with H7 to its OIF list for
   (S2,G2).  The processing for the IGMPv2 Join received from H6 is the
   same as for the IGMPv2 Join described in the previous section.

5.3.  PE with a mix of attached hosts/VMs, a multicast source and a
      router

   The main difference in this case relative to the previous two
   sections is that IGMP v2/v3 Join messages received locally need to
   be sent to the port associated with router R1.  Furthermore, the
   Joins received via BGP (SMET) need to be passed to the R1 port but
   filtered for all other ports.

6.  All-Active Multi-Homing

   Because the LAG flow hashing algorithm used by the CE is unknown at
   the PE, in All-Active redundancy mode it must be assumed that the
   CE can send a given IGMP message to any one of the multi-homed PEs,
   either DF or non-DF; i.e., different IGMP Membership Report messages
   can arrive at different PEs in the redundancy group, and furthermore
   their corresponding Leave messages can arrive at PEs that are
   different from the ones that received the Join messages.  Therefore,
   all PEs attached to a given ES must coordinate IGMP Membership
   Request and Leave Group (x,G) state, where x may be either '*' or a
   particular source S, for each BD on that ES.
   This allows the DF for that (ES,BD) to correctly advertise or
   withdraw a Selective Multicast Ethernet Tag (SMET) route for that
   (x,G) group in that BD when needed.  All-Active multihoming PEs for a
   given ES MUST support the IGMP synchronization procedures described
   in this section if they need to perform IGMP proxy for hosts
   connected to that ES.

6.1.  Local IGMP/MLD Join Synchronization

   When a PE, either DF or non-DF, receives, on a given multihomed ES
   operating in All-Active redundancy mode, an IGMP Membership Report
   for (x,G), it determines the BD to which the IGMP Membership Report
   belongs.  If the PE doesn't already have local IGMP Membership
   Request (x,G) state for that BD on that ES, it MUST instantiate local
   IGMP Membership Request (x,G) state and MUST advertise a BGP IGMP
   Join Synch route for that (ES,BD).  Local IGMP Membership Request
   (x,G) state refers to IGMP Membership Request (x,G) state that is
   created as a result of processing an IGMP Membership Report for
   (x,G).

   The IGMP Join Synch route MUST carry the ES-Import RT for the ES on
   which the IGMP Membership Report was received.  Thus it MUST only be
   imported by the PEs attached to that ES and not by any other PEs.

   When a PE, either DF or non-DF, receives an IGMP Join Synch route, it
   installs that route, and if it doesn't already have IGMP Membership
   Request (x,G) state for that (ES,BD), it MUST instantiate that IGMP
   Membership Request (x,G) state - i.e., IGMP Membership Request (x,G)
   state is the union of the local IGMP Join (x,G) state and the
   installed IGMP Join Synch route.  If the DF did not already advertise
   (originate) a SMET route for that (x,G) group in that BD, it MUST do
   so now.

   When a PE, either DF or non-DF, deletes its local IGMP Membership
   Request (x,G) state for that (ES,BD), it MUST withdraw its BGP IGMP
   Join Synch route for that (ES,BD).
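
   The join synchronization rules above can be sketched as a small
   state table.  This is an illustrative sketch only, not part of the
   specification: the class and method names are hypothetical, the DF
   election result is reduced to a single boolean per PE, and BGP is
   replaced by an in-memory log of advertisements and withdrawals.

```python
from dataclasses import dataclass, field


@dataclass
class JoinState:
    """(x,G) membership state for one (ES,BD) on one PE."""
    local: bool = False                          # created by a locally received Report
    synched: set = field(default_factory=set)    # peers whose Join Synch route is installed

    def exists(self) -> bool:
        # State is the union of local state and installed Join Synch routes.
        return self.local or bool(self.synched)


class MultihomedPE:
    def __init__(self, name, is_df):
        self.name = name
        self.is_df = is_df      # assumption: one DF result covers every (ES,BD) here
        self.state = {}         # (es, bd, x, g) -> JoinState
        self.smet = set()       # (bd, x, g) for which this PE originates a SMET route
        self.bgp_log = []       # stand-in for BGP advertisements and withdrawals

    def _ensure_smet(self, bd, x, g):
        # Only the DF originates the SMET route, and only if not yet advertised.
        if self.is_df and (bd, x, g) not in self.smet:
            self.smet.add((bd, x, g))
            self.bgp_log.append(("advertise", "SMET", bd, x, g))

    def on_local_report(self, es, bd, x, g):
        st = self.state.setdefault((es, bd, x, g), JoinState())
        if not st.local:
            st.local = True
            self.bgp_log.append(("advertise", "JoinSynch", es, bd, x, g))
        self._ensure_smet(bd, x, g)

    def on_join_synch(self, es, bd, x, g, peer):
        st = self.state.setdefault((es, bd, x, g), JoinState())
        st.synched.add(peer)    # install the received Join Synch route
        self._ensure_smet(bd, x, g)

    def on_local_state_deleted(self, es, bd, x, g):
        st = self.state.get((es, bd, x, g))
        if st and st.local:
            st.local = False
            self.bgp_log.append(("withdraw", "JoinSynch", es, bd, x, g))
```

   In this sketch a non-DF PE that receives a Report only advertises
   the Join Synch route, while the DF that installs that route
   originates the SMET route even though it never saw the Report
   itself.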

   When a PE, either DF or non-DF, receives the withdrawal of an IGMP
   Join Synch route from another PE, it MUST remove that route.  When a
   PE has no local IGMP Membership Request (x,G) state and it has no
   installed IGMP Join Synch routes, it MUST remove IGMP Membership
   Request (x,G) state for that (ES,BD).  If the DF no longer has IGMP
   Membership Request (x,G) state for that BD on any ES for which it is
   DF, it MUST withdraw its SMET route for that (x,G) group in that BD.

   In other words, a PE advertises a SMET route for that (x,G) group in
   that BD when it has IGMP Membership Request (x,G) state in that BD on
   at least one ES for which it is DF, and it withdraws that SMET route
   when it does not have IGMP Membership Request (x,G) state in that BD
   on any ES for which it is DF.

6.2.  Local IGMP/MLD Leave Group Synchronization

   When a PE, either DF or non-DF, receives, on a given multihomed ES
   operating in All-Active redundancy mode, an IGMP Leave Group message
   for (x,G) from the attached CE, it determines the BD to which the
   IGMP Leave Group belongs.  Regardless of whether it has IGMP
   Membership Request (x,G) state for that (ES,BD), it initiates the
   (x,G) leave group synchronization procedure, which consists of the
   following steps:

   1.  It computes the Maximum Response Time, which is the duration of
       the (x,G) leave group synchronization procedure.  This is the
       product of two locally configured values, Last Member Query Count
       and Last Member Query Interval (described in Section 3 of
       [RFC2236]), plus a delta corresponding to the time it takes for a
       BGP advertisement to propagate between the PEs attached to the
       multihomed ES (delta is a consistently configured value on all
       PEs attached to the multihomed ES).

   2.  It starts the Maximum Response Time timer.
       Note that the receipt of subsequent IGMP Leave Group messages or
       BGP Leave Synch routes for (x,G) does not change the value of a
       currently running Maximum Response Time timer; they are ignored
       by the PE.

   3.  It initiates the Last Member Query procedure described in
       Section 3 of [RFC2236]; viz., it sends a number of Group-Specific
       Query (x,G) messages (Last Member Query Count) at a fixed
       interval (Last Member Query Interval) to the attached CE.

   4.  It advertises an IGMP Leave Synch route for that (ES,BD).  This
       route notifies the other multihomed PEs attached to the given
       multihomed ES that it has initiated an (x,G) leave group
       synchronization procedure; i.e., it carries the ES-Import RT for
       the ES on which the IGMP Leave Group was received.  It also
       contains the Maximum Response Time.

   5.  When the Maximum Response Time timer expires, the PE that has
       advertised the IGMP Leave Synch route withdraws it.

6.2.1.  Remote Leave Group Synchronization

   When a PE, either DF or non-DF, receives an IGMP Leave Synch route,
   it installs that route and starts a timer for (x,G) on the specified
   (ES,BD) whose value is set to the Maximum Response Time in the
   received IGMP Leave Synch route.  Note that the receipt of subsequent
   IGMPv2 Leave Group messages or BGP Leave Synch routes for (x,G) does
   not change the value of a currently running Maximum Response Time
   timer; they are ignored by the PE.

6.2.2.  Common Leave Group Synchronization

   If a PE attached to the multihomed ES receives an IGMP Membership
   Report for (x,G) before the Maximum Response Time timer expires, it
   advertises a BGP IGMP Join Synch route for that (ES,BD).  If it
   doesn't already have local IGMP Membership Request (x,G) state for
   that (ES,BD), it instantiates local IGMP Membership Request (x,G)
   state.
   If the DF is not currently advertising (originating) a SMET route for
   that (x,G) group in that BD, it does so now.

   If a PE attached to the multihomed ES receives an IGMP Join Synch
   route for (x,G) before the Maximum Response Time timer expires, it
   installs that route, and if it doesn't already have IGMP Membership
   Request (x,G) state for that BD on that ES, it instantiates that IGMP
   Membership Request (x,G) state.  If the DF has not already advertised
   (originated) a SMET route for that (x,G) group in that BD, it does so
   now.

   When the Maximum Response Time timer expires, a PE that has
   advertised an IGMP Leave Synch route withdraws it.  Any PE attached
   to the multihomed ES that started the Maximum Response Time timer and
   has no local IGMP Membership Request (x,G) state and no installed
   IGMP Join Synch routes removes IGMP Membership Request (x,G) state
   for that (ES,BD).  If the DF no longer has IGMP Membership Request
   (x,G) state for that BD on any ES for which it is DF, it withdraws
   its SMET route for that (x,G) group in that BD.

6.3.  Mass Withdraw of Multicast join Sync route in case of failure

   A PE that has received an IGMP Membership Report would have synced
   the IGMP Join by the procedure defined in Section 6.1.  If a PE with
   local join state goes down, or the PE-to-CE link goes down, it would
   lead to a mass withdraw of multicast routes.  Remote PEs (PEs where
   these routes were remote IGMP Joins) SHOULD NOT remove the state
   immediately; instead, a General Query SHOULD be generated to refresh
   the states.  There are several ways to detect failure at a peer,
   e.g., using IGP next hop tracking or ES route withdraw.

7.  Single-Active Multi-Homing

   Note that to facilitate state synchronization after failover, the PEs
   attached to a multihomed ES operating in Single-Active redundancy
   mode SHOULD also coordinate IGMP Join (x,G) state.
   In this case all IGMP Join messages are received by the DF and
   distributed to the non-DF PEs using the procedures described above.

8.  Selective Multicast Procedures for IR tunnels

   If an ingress PE uses ingress replication, then for a given (x,G)
   group in a given BD:

   1.  It sends (x,G) traffic to the set of PEs not supporting IGMP
       Proxy. This set consists of any PE that has advertised an
       Inclusive Multicast Tag route for the BD without the "IGMP Proxy
       Support" flag.

   2.  It sends (x,G) traffic to the set of PEs supporting IGMP Proxy
       and having listeners for that (x,G) group in that BD. This set
       consists of any PE that has advertised an Inclusive Multicast
       Tag route for the BD with the "IGMP Proxy Support" flag and that
       has advertised a SMET route for that (x,G) group in that BD.

   If an ingress PE's Selective P-Tunnel for a given BD uses P2MP and
   all of the PEs in the BD support that tunnel type and IGMP proxy,
   then for a given (x,G) group in a given BD it sends (x,G) traffic
   using the Selective P-Tunnel for that (x,G) group in that BD. This
   tunnel includes those PEs that have advertised a SMET route for that
   (x,G) group on that BD (for Selective P-tunnel), but it may include
   other PEs as well (for Aggregate Selective P-tunnel).

9.  BGP Encoding

   This document defines three new BGP EVPN route types to carry IGMP
   Membership Reports. The route types are:

   + 6 - Selective Multicast Ethernet Tag Route

   + 7 - Multicast Join Synch Route

   + 8 - Multicast Leave Synch Route

   The detailed encoding and procedures for these route types are
   described in subsequent sections.

9.1.
Selective Multicast Ethernet Tag Route

   A Selective Multicast Ethernet Tag route type specific EVPN NLRI
   consists of the following:

   +---------------------------------------+
   |  RD (8 octets)                        |
   +---------------------------------------+
   |  Ethernet Tag ID (4 octets)           |
   +---------------------------------------+
   |  Multicast Source Length (1 octet)    |
   +---------------------------------------+
   |  Multicast Source Address (variable)  |
   +---------------------------------------+
   |  Multicast Group Length (1 octet)     |
   +---------------------------------------+
   |  Multicast Group Address (variable)   |
   +---------------------------------------+
   |  Originator Router Length (1 octet)   |
   +---------------------------------------+
   |  Originator Router Address (variable) |
   +---------------------------------------+
   |  Flags (1 octet)                      |
   +---------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the
   one-octet Flags field. The Flags field is defined as follows:

     0  1  2  3  4  5  6  7
   +--+--+--+--+--+--+--+--+
   | reserved  |IE|v3|v2|v1|
   +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1. Since IGMP V1 is being deprecated, the sender MUST set
      it to 0 for IGMP and the receiver MUST ignore it.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  This EVPN route type is used to carry tenant IGMP multicast group
      information.
      The Flags field assists in distributing the IGMP Membership
      Report of a given host/VM for a given multicast route. The
      version bits help associate the IGMP version of receivers
      participating within the EVPN domain.

   o  The include/exclude bit helps in creating filters for a given
      multicast route.

   o  If the route is used for IPv6 (MLD), then bit 7 indicates support
      for MLD version 1. The second least significant bit, bit 6,
      indicates support for MLD version 2. Since there is no MLD
      version 3, in the case of an IPv6 route the third least
      significant bit MUST be 0. In the case of IPv6 routes, the fourth
      least significant bit MUST be ignored if bit 6 is not set.

   o  Reserved bits SHOULD be set to 0 by the sender, and the receiver
      SHOULD ignore them.

9.1.1.  Constructing the Selective Multicast Ethernet Tag route

   This section describes the procedures used to construct the
   Selective Multicast Ethernet Tag (SMET) route.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364]. The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Tag ID MUST be set as follows:

   o  EVI is VLAN-Based or VLAN Bundle service - set to 0

   o  EVI is VLAN-Aware Bundle service without translation - set to the
      customer VID for that BD

   o  EVI is VLAN-Aware Bundle service with translation - set to the
      normalized Ethernet Tag ID - e.g., normalized VID

   The Multicast Source Length MUST be set to the length of the
   Multicast Source Address in bits. If the Multicast Source Address
   field contains an IPv4 address, then the value of the Multicast
   Source Length field is 32. If the Multicast Source Address field
   contains an IPv6 address, then the value of the Multicast Source
   Length field is 128. In case of a (*,G) Join, the Multicast Source
   Length is set to 0.
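
   The Ethernet Tag ID and Multicast Source Length rules above can be
   sketched as follows. This is only an illustrative sketch, not part of
   the specification: the function names and the service-type strings
   ("vlan-based", "vlan-bundle", "vlan-aware", "vlan-aware-translated")
   are hypothetical labels chosen for the example, and a wildcard source
   is represented here by None.

```python
import ipaddress


def address_length_bits(address):
    """Value for a Multicast Source Length field, in bits.

    `address` is an IPv4/IPv6 address string, or None for the wildcard
    source of a (*,G) Join (an assumption of this sketch).
    """
    if address is None:
        return 0
    # max_prefixlen is 32 for IPv4 addresses and 128 for IPv6 addresses.
    return ipaddress.ip_address(address).max_prefixlen


def ethernet_tag_id(service, customer_vid=0, normalized_tag=0):
    """Pick the Ethernet Tag ID from the (hypothetical) service label."""
    if service in ("vlan-based", "vlan-bundle"):
        return 0
    if service == "vlan-aware":
        return customer_vid        # no translation: customer VID of the BD
    if service == "vlan-aware-translated":
        return normalized_tag      # translation: normalized Ethernet Tag ID
    raise ValueError("unknown service type: %r" % (service,))
```

   For instance, address_length_bits("2001:db8::1") yields 128, and
   ethernet_tag_id("vlan-aware", customer_vid=100) yields 100.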

   The Multicast Source Address is the source IP address from the IGMP
   Membership Report. In case of a (*,G) Join, this field is not used.

   The Multicast Group Length MUST be set to the length of the
   multicast group address in bits. If the Multicast Group Address
   field contains an IPv4 address, then the value of the Multicast
   Group Length field is 32. If the Multicast Group Address field
   contains an IPv6 address, then the value of the Multicast Group
   Length field is 128.

   The Multicast Group Address is the Group address from the IGMP or
   MLD Membership Report.

   The Originator Router Length is the length of the Originator Router
   Address in bits.

   The Originator Router Address is the IP address of the router
   originating this route. The SMET Originator Router IP address MUST
   match that of the IMET (or S-PMSI AD) route originated for the same
   EVI by the same downstream PE.

   The Flags field indicates the version of the IGMP protocol from
   which the Membership Report was received. It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

   Reserved bits MUST be set to 0. They may be defined in the future by
   other documents.

   IGMP is used by TORs to receive group membership information from
   hosts/VMs. When a host/VM expresses interest in a particular group
   membership, this information is forwarded using a SMET route. The
   NLRI also keeps track of the receiver's IGMP protocol version and
   any source filtering for a given group membership. All EVPN SMET
   routes are announced with per-EVI Route Target extended communities.

9.1.2.  Default Selective Multicast Route

   If there is a multicast router connected behind the EVPN domain, the
   PE MAY originate a default SMET (*,*) route to receive all multicast
   traffic in the domain.
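
   As a rough illustration of the field layout in Section 9.1 and of
   the default (*,*) route, the sketch below packs the route-type
   specific part of a SMET NLRI. It is not a definitive encoder: the
   function and constant names are hypothetical, the EVPN route-type and
   length octets of [RFC7432] are left out, and wildcards are assumed to
   be encoded as zero-length source/group fields in the manner of
   [RFC6625].

```python
import ipaddress
import struct

# Flags octet masks; bit 7 is the least significant bit of the octet.
FLAG_V1, FLAG_V2, FLAG_V3, FLAG_EXCLUDE = 0x01, 0x02, 0x04, 0x08


def encode_rd_type1(pe_ip, number):
    """Type 1 RD: 2-octet type, 4-octet IPv4 address, 2-octet number."""
    return struct.pack("!H4sH", 1, ipaddress.IPv4Address(pe_ip).packed, number)


def _length_and_address(address):
    """One-octet length in bits followed by the address octets.

    None stands for a wildcard and is assumed to encode as length 0.
    """
    if address is None:
        return b"\x00"
    packed = ipaddress.ip_address(address).packed
    return bytes([len(packed) * 8]) + packed


def encode_smet_nlri(rd, ethernet_tag, source, group, originator, flags):
    """Concatenate the SMET fields in the order shown in Section 9.1."""
    return (rd
            + struct.pack("!I", ethernet_tag)
            + _length_and_address(source)
            + _length_and_address(group)
            + _length_and_address(originator)
            + bytes([flags]))


rd = encode_rd_type1("192.0.2.1", 10)
# (*,G) join learned from an IGMPv2 report.
star_g = encode_smet_nlri(rd, 0, None, "239.1.1.1", "192.0.2.1", FLAG_V2)
# Default (*,*) route: both source and group are wildcards.
default = encode_smet_nlri(rd, 0, None, None, "192.0.2.1", FLAG_V2)
```

   Since the Flags octet is not part of the route key, an implementation
   would compare everything except the last octet when matching routes.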

                          +--------------+
                          |              |
                          |              |
                          |              |   +----+
                          |              |   |    |---- H1(*,G1)v2
                          |   IP/MPLS    |   | PE1|---- H2(S2,G2)v3
                          |   Network    |   |    |---- S2
                          |              |   |    |
                          |              |   +----+
                          |              |
                   +----+ |              |
   +----+          |    | |              |
   |    |   S1  ---| PE2| |              |
   |PIM |----R1 ---|    | |              |
   |ASM |          +----+ |              |
   |    |                 |              |
   +----+                 +--------------+

           Figure 2: Multicast Router behind EVPN domain

   Consider the EVPN network of Figure 2, where there is an EVPN
   instance configured across the PEs. Suppose PE2 is connected to
   multicast router R1 and there is a network running PIM ASM behind
   R1. If there are receivers behind the PIM ASM network, the PIM Join
   would be forwarded to the PIM RP (Rendezvous Point). If receivers
   behind the PIM ASM network are interested in a multicast flow
   originated by multicast source S2 (behind PE1), it is necessary for
   PE2 to receive multicast traffic. In this case PE2 MUST originate a
   (*,*) SMET route to receive all of the multicast traffic in the EVPN
   domain. To generate wildcard (*,*) routes, the procedure from
   [RFC6625] SHOULD be used.

9.2.
Multicast Join Synch Route

   This EVPN route type is used to coordinate IGMP Join (x,G) state for
   a given BD between the PEs attached to a given ES operating in
   All-Active (or Single-Active) redundancy mode and it consists of the
   following:

   +--------------------------------------------------+
   |  RD (8 octets)                                   |
   +--------------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)         |
   +--------------------------------------------------+
   |  Ethernet Tag ID (4 octets)                      |
   +--------------------------------------------------+
   |  Multicast Source Length (1 octet)               |
   +--------------------------------------------------+
   |  Multicast Source Address (variable)             |
   +--------------------------------------------------+
   |  Multicast Group Length (1 octet)                |
   +--------------------------------------------------+
   |  Multicast Group Address (variable)              |
   +--------------------------------------------------+
   |  Originator Router Length (1 octet)              |
   +--------------------------------------------------+
   |  Originator Router Address (variable)            |
   +--------------------------------------------------+
   |  Flags (1 octet)                                 |
   +--------------------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the
   one-octet Flags field, whose fields are defined as follows:

     0  1  2  3  4  5  6  7
   +--+--+--+--+--+--+--+--+
   | reserved  |IE|v3|v2|v1|
   +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  Reserved bits MUST be set to 0. They may be defined in the future
      by other documents.

   The Flags field assists in distributing the IGMP Membership Report
   of a given host/VM for a given multicast route. The version bits
   help associate the IGMP version of receivers participating within
   the EVPN domain. The include/exclude bit helps in creating filters
   for a given multicast route.

   If the route is being prepared for IPv6 (MLD), then bit 7 indicates
   support for MLD version 1. The second least significant bit, bit 6,
   indicates support for MLD version 2. Since there is no MLD version
   3, in the case of an IPv6 route the third least significant bit MUST
   be 0. In the case of an IPv6 route, the fourth least significant bit
   MUST be ignored if bit 6 is not set.

9.2.1.  Constructing the Multicast Join Synch Route

   This section describes the procedures used to construct the IGMP
   Join Synch route. Support for this route type is optional. If a PE
   does not support this route, then it MUST NOT indicate that it
   supports 'IGMP proxy' in the Multicast Flags extended community for
   the EVIs corresponding to its multi-homed Ethernet Segments (ESs).

   An IGMP Join Synch route MUST carry exactly one ES-Import Route
   Target extended community, the one that corresponds to the ES on
   which the IGMP Join was received. It MUST also carry exactly one
   EVI-RT EC, the one that corresponds to the EVI on which the IGMP
   Join was received. See Section 9.5 for details on how to encode and
   construct the EVI-RT EC.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364].
   The value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet
   value defined for the ES.

   The Ethernet Tag ID MUST be set as follows:

   o  EVI is VLAN-Based or VLAN Bundle service - set to 0

   o  EVI is VLAN-Aware Bundle service without translation - set to the
      customer VID for the BD

   o  EVI is VLAN-Aware Bundle service with translation - set to the
      normalized Ethernet Tag ID - e.g., normalized VID

   The Multicast Source Length MUST be set to the length of the
   Multicast Source address in bits. If the Multicast Source field
   contains an IPv4 address, then the value of the Multicast Source
   Length field is 32. If the Multicast Source field contains an IPv6
   address, then the value of the Multicast Source Length field is 128.
   In case of a (*,G) Join, the Multicast Source Length is set to 0.

   The Multicast Source is the Source IP address of the IGMP Membership
   Report. In case of a (*,G) Join, this field does not exist.

   The Multicast Group Length MUST be set to the length of the
   multicast group address in bits. If the Multicast Group field
   contains an IPv4 address, then the value of the Multicast Group
   Length field is 32. If the Multicast Group field contains an IPv6
   address, then the value of the Multicast Group Length field is 128.

   The Multicast Group is the Group address of the IGMP Membership
   Report.

   The Originator Router Length is the length of the Originator Router
   address in bits.

   The Originator Router Address is the IP address of the router
   originating the prefix.

   The Flags field indicates the version of the IGMP protocol from
   which the Membership Report was received. It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

   Reserved bits MUST be set to 0. They may be defined in the future by
   other documents.

9.3.
Multicast Leave Synch Route

   This EVPN route type is used to coordinate IGMP Leave Group (x,G)
   state for a given BD between the PEs attached to a given ES
   operating in All-Active (or Single-Active) redundancy mode and it
   consists of the following:

   +--------------------------------------------------+
   |  RD (8 octets)                                   |
   +--------------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)         |
   +--------------------------------------------------+
   |  Ethernet Tag ID (4 octets)                      |
   +--------------------------------------------------+
   |  Multicast Source Length (1 octet)               |
   +--------------------------------------------------+
   |  Multicast Source Address (variable)             |
   +--------------------------------------------------+
   |  Multicast Group Length (1 octet)                |
   +--------------------------------------------------+
   |  Multicast Group Address (variable)              |
   +--------------------------------------------------+
   |  Originator Router Length (1 octet)              |
   +--------------------------------------------------+
   |  Originator Router Address (variable)            |
   +--------------------------------------------------+
   |  Reserved (4 octets)                             |
   +--------------------------------------------------+
   |  Maximum Response Time (1 octet)                 |
   +--------------------------------------------------+
   |  Flags (1 octet)                                 |
   +--------------------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the
   Reserved, Maximum Response Time, and one-octet Flags fields. The
   Flags field is defined as follows:

     0  1  2  3  4  5  6  7
   +--+--+--+--+--+--+--+--+
   | reserved  |IE|v3|v2|v1|
   +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  Reserved bits MUST be set to 0. They may be defined in the future
      by other documents.

   The Flags field assists in distributing the IGMP Membership Report
   of a given host/VM for a given multicast route. The version bits
   help associate the IGMP version of receivers participating within
   the EVPN domain. The include/exclude bit helps in creating filters
   for a given multicast route.

   If the route is being prepared for IPv6 (MLD), then bit 7 indicates
   support for MLD version 1. The second least significant bit, bit 6,
   indicates support for MLD version 2. Since there is no MLD version
   3, in the case of an IPv6 route the third least significant bit MUST
   be 0. In the case of an IPv6 route, the fourth least significant bit
   MUST be ignored if bit 6 is not set.

   Reserved bits in the Flags field MUST be set to 0. They may be
   defined in the future by other documents.

9.3.1.  Constructing the Multicast Leave Synch Route

   This section describes the procedures used to construct the IGMP
   Leave Synch route. Support for this route type is optional. If a PE
   does not support this route, then it MUST NOT indicate that it
   supports 'IGMP proxy' in the Multicast Flags extended community for
   the EVIs corresponding to its multi-homed Ethernet Segments.

   An IGMP Leave Synch route MUST carry exactly one ES-Import Route
   Target extended community, the one that corresponds to the ES on
   which the IGMP Leave was received.
   It MUST also carry exactly one EVI-RT EC, the one that corresponds
   to the EVI on which the IGMP Leave was received. See Section 9.5 for
   details on how to form the EVI-RT EC.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364]. The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet
   value defined for the ES.

   The Ethernet Tag ID MUST be set as follows:

   o  EVI is VLAN-Based or VLAN Bundle service - set to 0

   o  EVI is VLAN-Aware Bundle service without translation - set to the
      customer VID for the BD

   o  EVI is VLAN-Aware Bundle service with translation - set to the
      normalized Ethernet Tag ID - e.g., normalized VID

   The Multicast Source Length MUST be set to the length of the
   multicast source address in bits. If the Multicast Source field
   contains an IPv4 address, then the value of the Multicast Source
   Length field is 32. If the Multicast Source field contains an IPv6
   address, then the value of the Multicast Source Length field is 128.
   In case of a (*,G) Join, the Multicast Source Length is set to 0.

   The Multicast Source is the Source IP address of the IGMP Membership
   Report. In case of a (*,G) Join, this field does not exist.

   The Multicast Group Length MUST be set to the length of the
   multicast group address in bits. If the Multicast Group field
   contains an IPv4 address, then the value of the Multicast Group
   Length field is 32. If the Multicast Group field contains an IPv6
   address, then the value of the Multicast Group Length field is 128.

   The Multicast Group is the Group address of the IGMP Membership
   Report.

   The Originator Router Length is the length of the Originator Router
   address in bits.

   The Originator Router Address is the IP address of the router
   originating the prefix.

   The Reserved field is not part of the route key. The originator MUST
   set the Reserved field to zero; the receiver SHOULD ignore it and,
   if it needs to be propagated, MUST propagate it unchanged.

   Maximum Response Time is the value to be used when sending a query,
   as defined in [RFC2236].

   The Flags field indicates the version of the IGMP protocol from
   which the Membership Report was received. It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

9.4.  Multicast Flags Extended Community

   The 'Multicast Flags' extended community is a new EVPN extended
   community. EVPN extended communities are transitive extended
   communities with a Type field value of 6. IANA will assign a
   Sub-Type from the 'EVPN Extended Community Sub-Types' registry.

   A PE that supports IGMP proxy on a given BD MUST attach this
   extended community to the Inclusive Multicast Ethernet Tag (IMET)
   route it advertises for that BD and it MUST set the IGMP Proxy
   Support flag to 1. Note that an [RFC7432] compliant PE will not
   advertise this extended community, so its absence indicates that the
   advertising PE does not support IGMP Proxy.

   The advertisement of this extended community enables more efficient
   multicast tunnel setup from the source PE, especially for ingress
   replication - i.e., if an egress PE supports IGMP proxy but doesn't
   have any interest in a given (x,G), it advertises its IGMP proxy
   capability using this extended community but it does not advertise
   any SMET route for that (x,G). When the source PE (ingress PE)
   receives such advertisements from the egress PE, it does not
   replicate the multicast traffic to that egress PE; however, it does
   replicate the multicast traffic to the egress PEs that don't
   advertise such capability even if they don't have any interest in
   that (x,G).
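
   The replication decision described above can be sketched as
   follows. This is a minimal illustration under stated assumptions,
   not an implementation of any particular product: the RemotePE class,
   its field names, and the use of an exact (x,G) key lookup are all
   hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RemotePE:
    name: str
    # True if the PE's IMET route carried the Multicast Flags extended
    # community with the IGMP Proxy Support flag set to 1.
    igmp_proxy: bool
    # (x, G) keys for which this PE has advertised a SMET route.
    smet_routes: set = field(default_factory=set)


def replication_targets(remote_pes, x, group):
    """Names of the egress PEs that should receive (x,G) traffic."""
    targets = []
    for pe in remote_pes:
        if not pe.igmp_proxy:
            # No proxy capability advertised: always replicate.
            targets.append(pe.name)
        elif (x, group) in pe.smet_routes:
            # Proxy-capable PE with a listener for this (x,G).
            targets.append(pe.name)
    return targets


pes = [
    RemotePE("PE2", igmp_proxy=False),
    RemotePE("PE3", igmp_proxy=True, smet_routes={("*", "239.1.1.1")}),
    RemotePE("PE4", igmp_proxy=True),
]
targets = replication_targets(pes, "*", "239.1.1.1")
```

   In this example PE4 advertises the capability but no SMET route for
   the group, so it is pruned, while PE2 still receives the traffic
   because it never advertised the capability.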

   A Multicast Flags extended community is encoded as an 8-octet value,
   as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x09 |     Flags (2 Octets)      |M|I|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Reserved=0                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The low-order (least significant) two bits are defined as the "IGMP
   Proxy Support" and "MLD Proxy Support" bits. The absence of this
   extended community also means that the PE does not support IGMP
   proxy. In the encoding above:

   o  Type is 0x06 as registered with IANA for EVPN Extended
      Communities.

   o  Sub-Type is 0x09.

   o  Flags is a two-octet value.

      *  Bit 15 (shown as I) defines IGMP Proxy Support. A value of 1
         for bit 15 means that the PE supports IGMP Proxy. A value of 0
         for bit 15 means that the PE does not support IGMP Proxy.

      *  Bit 14 (shown as M) defines MLD Proxy Support. A value of 1
         for bit 14 means that the PE supports MLD Proxy. A value of 0
         for bit 14 means that the PE does not support MLD Proxy.

      *  Bits 0 to 13 are reserved for future use. The sender MUST set
         them to 0 and the receiver MUST ignore them.

   o  Reserved bits are set to 0. The sender MUST set them to 0 and the
      receiver MUST ignore them.

   If a router does not support this specification, it MUST NOT add the
   Multicast Flags Extended Community to a BGP route. If a router
   receives a BGP update in which both the M and I flags are zero (0),
   it MUST treat the update as malformed. The receiver of such an
   update MUST ignore the extended community.

9.5.  EVI-RT Extended Community

   In EVPN, every EVI is associated with one or more Route Targets
   (RTs). These Route Targets serve two functions:

   1.  Distribution control: RTs control the distribution of the routes.
       If a route carries the RT associated with a particular EVI, it
       will be distributed to all the PEs on which that EVI exists.

   2.  EVI identification: Once a route has been received by a
       particular PE, the RT is used to identify the EVI to which it
       applies.

   An IGMP Join Synch or IGMP Leave Synch route is associated with a
   particular combination of ES and EVI. These routes need to be
   distributed only to PEs that are attached to the associated ES.
   Therefore these routes carry the ES-Import RT for that ES.

   Since an IGMP Join Synch or IGMP Leave Synch route does not need to
   be distributed to all the PEs on which the associated EVI exists,
   these routes cannot carry the RT associated with that EVI.
   Therefore, when such a route arrives at a particular PE, the route's
   RTs cannot be used to identify the EVI to which the route applies.
   Some other means of associating the route with an EVI must be used.

   This document specifies four new Extended Communities (EC) that can
   be used to identify the EVI with which a route is associated, but
   which do not have any effect on the distribution of the route. These
   new ECs are known as the "Type 0 EVI-RT EC", the "Type 1 EVI-RT EC",
   the "Type 2 EVI-RT EC", and the "Type 3 EVI-RT EC".

   1.  A Type 0 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xA.

   2.  A Type 1 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xB.

   3.  A Type 2 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xC.

   4.  A Type 3 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xD.

   Each IGMP Join Synch or IGMP Leave Synch route MUST carry exactly
   one EVI-RT EC. The EVI-RT EC carried by a particular route is
   constructed as follows. Each such route is the result of having
   received an IGMP Join or an IGMP Leave message from a particular BD.
   The route is said to be associated with that BD.
   For each BD, there is a corresponding RT that is used to ensure that
   routes "about" that BD are distributed to all PEs attached to that
   BD. So suppose a given IGMP Join Synch or Leave Synch route is
   associated with a given BD, say BD1, and suppose that the
   corresponding RT for BD1 is RT1. Then:

   o  0. If RT1 is a Transitive Two-Octet AS-specific EC, then the
      EVI-RT EC carried by the route is a Type 0 EVI-RT EC. The value
      field of the Type 0 EVI-RT EC is identical to the value field of
      RT1.

   o  1. If RT1 is a Transitive IPv4-Address-specific EC, then the
      EVI-RT EC carried by the route is a Type 1 EVI-RT EC. The value
      field of the Type 1 EVI-RT EC is identical to the value field of
      RT1.

   o  2. If RT1 is a Transitive Four-Octet AS-specific EC, then the
      EVI-RT EC carried by the route is a Type 2 EVI-RT EC. The value
      field of the Type 2 EVI-RT EC is identical to the value field of
      RT1.

   o  3. If RT1 is a Transitive IPv6-Address-specific EC, then the
      EVI-RT EC carried by the route is a Type 3 EVI-RT EC. The value
      field of the Type 3 EVI-RT EC is identical to the value field of
      RT1.

   An IGMP Join Synch or Leave Synch route MUST carry exactly one
   EVI-RT EC.

   Suppose a PE receives a particular IGMP Join Synch or IGMP Leave
   Synch route, say R1, and suppose that R1 carries an ES-Import RT
   that is one of the PE's Import RTs. If R1 has no EVI-RT EC, or has
   more than one EVI-RT EC, the PE MUST apply the "treat-as-withdraw"
   procedure of [RFC7606].

   Note that an EVI-RT EC is not a Route Target Extended Community, is
   not visible to the RT Constrain mechanism [RFC4684], and is not
   intended to influence the propagation of routes by BGP.
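
   The mapping from RT1 to an EVI-RT EC, and the "exactly one EVI-RT
   EC" check, can be sketched as follows. This is an illustrative
   sketch only: the function names are hypothetical, extended
   communities are represented as raw 8-octet byte strings, and only
   EVI-RT Types 0 to 2 are covered (IPv6-Address-specific RTs are not
   handled here). The RT type octets 0x00, 0x01, and 0x02 and the Route
   Target sub-type 0x02 are taken from the standard BGP extended
   community registries.

```python
EVPN_EC_TYPE = 0x06
ROUTE_TARGET_SUBTYPE = 0x02

# RT type octet -> EVI-RT EC sub-type (EVI-RT Types 0, 1 and 2).
EVI_RT_SUBTYPE_BY_RT_TYPE = {0x00: 0x0A, 0x01: 0x0B, 0x02: 0x0C}
EVI_RT_SUBTYPES = (0x0A, 0x0B, 0x0C, 0x0D)


def evi_rt_from_rt(rt):
    """Build the EVI-RT EC whose value field equals that of `rt`."""
    if len(rt) != 8 or rt[1] != ROUTE_TARGET_SUBTYPE:
        raise ValueError("not an 8-octet Route Target EC")
    if rt[0] not in EVI_RT_SUBTYPE_BY_RT_TYPE:
        raise ValueError("RT type not covered by this sketch")
    # Keep the 6-octet value field; replace only type and sub-type.
    return bytes([EVPN_EC_TYPE, EVI_RT_SUBTYPE_BY_RT_TYPE[rt[0]]]) + rt[2:]


def single_evi_rt(extended_communities):
    """Return the route's only EVI-RT EC, or None when there is not
    exactly one (the case where treat-as-withdraw applies)."""
    found = [ec for ec in extended_communities
             if ec[0] == EVPN_EC_TYPE and ec[1] in EVI_RT_SUBTYPES]
    return found[0] if len(found) == 1 else None


# RT 65000:100 as a Transitive Two-Octet AS-specific Route Target.
rt1 = bytes([0x00, 0x02]) + (65000).to_bytes(2, "big") + (100).to_bytes(4, "big")
evi_rt = evi_rt_from_rt(rt1)
```

   Because only the first two octets change, a receiver can recover the
   EVI by comparing the value field of the EVI-RT EC against the value
   fields of its locally configured RTs.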

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     |  Sub-Type=n   |    RT associated with EVI     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              RT associated with the EVI (cont.)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Where the value of 'n' is 0x0A, 0x0B, 0x0C, or 0x0D corresponding to
   EVI-RT type 0, 1, 2, or 3 respectively.

9.6.  Rewriting of RT ECs and EVI-RT ECs by ASBRs

   There are certain situations in which an ES is attached to a set of
   PEs that are not all in the same AS, or not all operated by the same
   provider. In some such situations, the RT that corresponds to a
   particular EVI may be different in each AS. If a route is propagated
   from AS1 to AS2, an ASBR at the AS1/AS2 border may be provisioned
   with a policy that removes the RTs that are meaningful in AS1 and
   replaces them with the corresponding (i.e., RTs corresponding to the
   same EVIs) RTs that are meaningful in AS2. This is known as
   RT-rewriting.

   Note that if a given route's RTs are rewritten, and the route
   carries an EVI-RT EC, the EVI-RT EC needs to be rewritten as well.

9.7.  BGP Error Handling

   If a received BGP update contains Flags not in accordance with the
   IGMP/MLD version-X expectation, the PE MUST apply the
   "treat-as-withdraw" procedure as per [RFC7606].

   If a received BGP update is malformed such that BGP route keys
   cannot be extracted, then the BGP update MUST be considered invalid.
   The receiving PE MUST apply the "Session reset" procedure of
   [RFC7606].

10.  IGMP/MLD Immediate Leave

   IGMP MAY be configured with the immediate leave option. This allows
   the device to remove the group entry from the multicast routing
   table immediately upon receiving an IGMP Leave message for (x,G).
   In the case of All-Active multi-homing, while synchronizing the IGMP
   Leave state to redundancy peers, the Maximum Response Time MAY be
   filled in as zero. Implementations SHOULD have identical
   configuration across multi-homed peers. If an IGMP Leave Synch route
   is received with a Maximum Response Time of zero, it MAY be
   processed as an immediate leave irrespective of the local IGMP
   configuration.

11.  IGMP Version 1 Membership Report

   This document does not provide any detail about IGMPv1 processing.
   The multicast working group is in the process of deprecating the use
   of IGMPv1. Implementations MUST only use IGMPv2 and above for IPv4
   and MLDv1 and above for IPv6. IGMP V1 routes MUST be considered
   invalid and the PE MUST apply the "treat-as-withdraw" procedure as
   per [RFC7606].

12.  Security Considerations

   The security considerations of [RFC7432], [RFC2236], [RFC3376],
   [RFC2710], and [RFC3810] apply.

13.  IANA Considerations

   IANA has allocated the following codepoints from the EVPN Extended
   Community sub-types registry.

      0x09     Multicast Flags Extended Community  [this document]
      0x0A     EVI-RT Type 0                       [this document]
      0x0B     EVI-RT Type 1                       [this document]
      0x0C     EVI-RT Type 2                       [this document]

   IANA is requested to allocate a new codepoint from the EVPN Extended
   Community sub-types registry for the following.

      0x0D     EVI-RT Type 3                       [this document]

   IANA has allocated the following EVPN route types from the EVPN
   Route Type registry.

      6 - Selective Multicast Ethernet Tag Route
      7 - Multicast Join Synch Route
      8 - Multicast Leave Synch Route

   The Multicast Flags Extended Community contains a 16-bit Flags
   field. The bits are numbered 0-15, from high-order to low-order.
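
   Because bit 0 is the high-order bit, bits 14 and 15 correspond to
   the two low-order bits of the 16-bit field. The following sketch
   shows that numbering and the resulting 8-octet encoding from
   Section 9.4; the function and constant names are hypothetical and
   the code is an illustration, not a normative encoder.

```python
import struct


def flag_mask(bit):
    """Mask for a flag bit, with bit 0 as the high-order bit of 16."""
    return 1 << (15 - bit)


MLD_PROXY_SUPPORT = flag_mask(14)    # 0x0002, shown as M
IGMP_PROXY_SUPPORT = flag_mask(15)   # 0x0001, shown as I


def encode_multicast_flags_ec(igmp_proxy, mld_proxy):
    """Type 0x06, Sub-Type 0x09, 2-octet Flags, 4 octets Reserved=0."""
    flags = 0
    if igmp_proxy:
        flags |= IGMP_PROXY_SUPPORT
    if mld_proxy:
        flags |= MLD_PROXY_SUPPORT
    return struct.pack("!BBHI", 0x06, 0x09, flags, 0)


igmp_only = encode_multicast_flags_ec(igmp_proxy=True, mld_proxy=False)
both = encode_multicast_flags_ec(igmp_proxy=True, mld_proxy=True)
```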

   The registry should be initialized as follows:

      Bit        Name                  Reference
      ----       --------------        -------------
      0 - 13     Unassigned
      14         MLD Proxy Support     This document
      15         IGMP Proxy Support    This document

   The registration policy should be "First Come First Served".

14.  Acknowledgement

   The authors would like to thank Stephane Litkowski, Jorge Rabadan,
   Anoop Ghanwani, Jeffrey Haas, Krishna Muddenahally Ananthamurthy, and
   Swadesh Agrawal for reviewing and providing valuable comments.

15.  Contributors

   Derek Yeung
   Arrcus

   Email: derek@arrcus.com

16.  References

16.1.  Normative References

   [I-D.ietf-bess-evpn-bum-procedure-updates]
              Zhang, Z., Lin, W., Rabadan, J., Patel, K., and A.
              Sajassi, "Updates on EVPN BUM Procedures", draft-ietf-
              bess-evpn-bum-procedure-updates-08 (work in progress),
              November 2019.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2236]  Fenner, W., "Internet Group Management Protocol, Version
              2", RFC 2236, DOI 10.17487/RFC2236, November 1997.

   [RFC2710]  Deering, S., Fenner, W., and B. Haberman, "Multicast
              Listener Discovery (MLD) for IPv6", RFC 2710,
              DOI 10.17487/RFC2710, October 1999.

   [RFC3376]  Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
              Thyagarajan, "Internet Group Management Protocol, Version
              3", RFC 3376, DOI 10.17487/RFC3376, October 2002.

   [RFC3810]  Vida, R., Ed. and L. Costa, Ed., "Multicast Listener
              Discovery Version 2 (MLDv2) for IPv6", RFC 3810,
              DOI 10.17487/RFC3810, June 2004.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364,
              February 2006.

   [RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
              R., Patel, K., and J.
              Guichard, "Constrained Route Distribution for Border
              Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS)
              Internet Protocol (IP) Virtual Private Networks (VPNs)",
              RFC 4684, DOI 10.17487/RFC4684, November 2006.

   [RFC6625]  Rosen, E., Ed., Rekhter, Y., Ed., Hendrickx, W., and R.
              Qiu, "Wildcards in Multicast VPN Auto-Discovery Routes",
              RFC 6625, DOI 10.17487/RFC6625, May 2012.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
              Patel, "Revised Error Handling for BGP UPDATE Messages",
              RFC 7606, DOI 10.17487/RFC7606, August 2015.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

16.2.  Informative References

   [RFC4541]  Christensen, M., Kimball, K., and F. Solensky,
              "Considerations for Internet Group Management Protocol
              (IGMP) and Multicast Listener Discovery (MLD) Snooping
              Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006.

Authors' Addresses

   Ali Sajassi
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: sajassi@cisco.com

   Samir Thoria
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: sthoria@cisco.com

   Mankamana Mishra
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: mankamis@cisco.com

   Keyur Patel
   Arrcus
   UNITED STATES

   Email: keyur@arrcus.com

   John Drake
   Juniper Networks

   Email: jdrake@juniper.net

   Wen Lin
   Juniper Networks

   Email: wlin@juniper.net