BESS WorkGroup                                                A. Sajassi
Internet-Draft                                                 S. Thoria
Intended status: Standards Track                               M. Mishra
Expires: April 28, 2022                                    Cisco Systems
                                                                K. Patel
                                                                  Arrcus
                                                                J. Drake
                                                                  W. Lin
                                                        Juniper Networks
                                                        October 25, 2021


                      IGMP and MLD Proxy for EVPN
                 draft-ietf-bess-evpn-igmp-mld-proxy-14

Abstract

   This document describes how to efficiently support endpoints running
   IGMP (Internet Group Management Protocol) or MLD (Multicast Listener
   Discovery) for the multicast services over an EVPN network by
   incorporating IGMP/MLD proxy procedures on EVPN (Ethernet VPN) PEs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 28, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Specification of Requirements
   3.  Terminology
   4.  IGMP/MLD Proxy
     4.1.  Proxy Reporting
       4.1.1.  IGMP/MLD Membership Report Advertisement in BGP
       4.1.2.  IGMP/MLD Leave Group Advertisement in BGP
     4.2.  Proxy Querier
   5.  Operation
     5.1.  PE with only attached hosts for a given subnet
     5.2.  PE with a mix of attached hosts and multicast source
     5.3.  PE with a mix of attached hosts, a multicast source and a
           router
   6.  All-Active Multi-Homing
     6.1.  Local IGMP/MLD Join Synchronization
     6.2.  Local IGMP/MLD Leave Group Synchronization
       6.2.1.  Remote Leave Group Synchronization
       6.2.2.  Common Leave Group Synchronization
     6.3.  Mass Withdraw of Multicast join Sync route in case of
           failure
   7.  Single-Active Multi-Homing
   8.  Selective Multicast Procedures for IR tunnels
   9.  BGP Encoding
     9.1.  Selective Multicast Ethernet Tag Route
       9.1.1.  Constructing the Selective Multicast Ethernet Tag
               route
       9.1.2.  Default Selective Multicast Route
     9.2.  Multicast Join Synch Route
       9.2.1.  Constructing the Multicast Join Synch Route
     9.3.  Multicast Leave Synch Route
       9.3.1.  Constructing the Multicast Leave Synch Route
     9.4.  Multicast Flags Extended Community
     9.5.  EVI-RT Extended Community
     9.6.  Rewriting of RT ECs and EVI-RT ECs by ASBRs
     9.7.  BGP Error Handling
   10. IGMP/MLD Immediate Leave
   11. IGMP Version 1 Membership Report
   12. Security Considerations
   13. IANA Considerations
   14. Acknowledgement
   15. Contributors
   16. References
     16.1.  Normative References
     16.2.  Informative References
   Authors' Addresses

1.  Introduction

   In data center (DC) applications, a point of delivery (POD) can
   consist of a collection of servers supported by several top of rack
   (ToR) and spine switches.  This collection of servers and switches
   is self-contained and may have its own control protocol for
   intra-POD communication and orchestration.  However, EVPN is used as
   the standard way of inter-POD communication, both intra-DC and
   inter-DC.  A subnet can span across multiple PODs and DCs.  EVPN
   provides a robust multi-tenant solution with extensive multi-homing
   capabilities to stretch a subnet (VLAN) across multiple PODs and
   DCs.  There can be many hosts (several hundreds) attached to a
   subnet that is stretched across several PODs and DCs.

   These hosts express their interest in multicast groups on a given
   subnet/VLAN by sending IGMP Membership Reports (Joins) for the
   multicast group(s) they are interested in.
   Furthermore, an IGMP router periodically sends membership queries to
   find out if there are hosts on that subnet that are still interested
   in receiving multicast traffic for that group.  The IGMP/MLD Proxy
   solution described in this document accomplishes three objectives:

   1.  Reduce flooding of IGMP messages: just as the ARP/ND suppression
       mechanism in EVPN reduces the flooding of ARP messages over
       EVPN, a mechanism is desired to reduce the flooding of IGMP
       messages (both Queries and Reports) in EVPN.

   2.  Distributed anycast multicast proxy: it is desirable for the EVPN
       network to act as a distributed anycast multicast router with
       respect to IGMP/MLD proxy function for all the hosts attached to
       that subnet.

   3.  Selective Multicast: multicast traffic is forwarded over the
       EVPN network such that it only reaches the PEs that have
       interest in the multicast group(s).  This document shows how
       this objective may be achieved when Ingress Replication is used
       to distribute the multicast traffic among the PEs.  Procedures
       for supporting selective multicast using P2MP tunnels can be
       found in [I-D.ietf-bess-evpn-bum-procedure-updates].

   The first two objectives are achieved by using IGMP/MLD proxy on the
   PE.  The third objective is achieved by setting up a multicast tunnel
   only among the PEs that have interest in the multicast group(s),
   based on the trigger from IGMP/MLD proxy processes.  The proposed
   solutions for each of these objectives are discussed in the following
   sections.

2.  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Terminology

   o  AC: Attachment Circuit.

   o  All-Active Redundancy Mode: When all PEs attached to an Ethernet
      segment are allowed to forward known unicast traffic to/from that
      Ethernet segment for a given VLAN, then the Ethernet segment is
      defined to be operating in All-Active redundancy mode.

   o  BD: Broadcast Domain.  As per [RFC7432], an EVI consists of a
      single or multiple BDs.  In case of VLAN-bundle and VLAN-aware
      bundle service model, an EVI contains multiple BDs.  Also, in this
      document, BD and subnet are equivalent terms.

   o  Ethernet Segment (ES): When a customer site (device or network) is
      connected to one or more PEs via a set of Ethernet links, that
      set of links is referred to as an Ethernet segment.

   o  Ethernet Segment Identifier (ESI): A unique non-zero identifier
      that identifies an Ethernet Segment.

   o  Ethernet Tag: It identifies a particular broadcast domain, e.g., a
      VLAN.  An EVPN instance consists of one or more broadcast domains.

   o  EVI: An EVPN instance spanning the Provider Edge (PE) devices
      participating in that EVPN.

   o  EVPN: Ethernet Virtual Private Network.

   o  IGMP: Internet Group Management Protocol.

   o  IR: Ingress Replication.

   o  MAC-VRF: A Virtual Routing and Forwarding table for Media Access
      Control (MAC) addresses on a PE.

   o  MLD: Multicast Listener Discovery.

   o  NV: Network Virtualization.

   o  NVO: Network Virtualization Overlay.

   o  OIF: Outgoing Interface for multicast.  It can be a physical
      interface, a virtual interface, or a tunnel.

   o  PE: Provider Edge.

   o  PMSI: P-Multicast Service Interface - a conceptual interface for a
      PE to send customer multicast traffic to all or some PEs in the
      same VPN.

   o  POD: Point of Delivery.

   o  S-PMSI: Selective PMSI - a PMSI to some of the PEs in the same
      VPN.

   o  Single-Active Redundancy Mode: When only a single PE, among all
      the PEs attached to an Ethernet segment, is allowed to forward
      traffic to/from that Ethernet segment for a given VLAN, then the
      Ethernet segment is defined to be operating in Single-Active
      redundancy mode.

   o  ToR: Top of Rack.

   This document also assumes familiarity with the terminology of
   [RFC7432].  Although this document mostly uses the term IGMP
   Membership Report (Join), the text applies equally to MLD Membership
   Reports.  Similarly, text for IGMPv2 applies to MLDv1 and text for
   IGMPv3 applies to MLDv2.  IGMP/MLD version encoding in the BGP
   update is stated in Section 9.

4.  IGMP/MLD Proxy

   The IGMP Proxy mechanism is used to reduce the flooding of IGMP
   messages over an EVPN network, similar to the way ARP proxy is used
   to reduce the flooding of ARP messages over EVPN.  It also provides
   a triggering mechanism for the PEs to set up their underlay
   multicast tunnels.  The IGMP Proxy mechanism consists of two
   components:

   1.  Proxy for IGMP Reports.

   2.  Proxy for IGMP Queries.

4.1.  Proxy Reporting

   When IGMP is used between hosts and their first hop EVPN router
   (EVPN PE), Proxy-reporting is used by the EVPN PE to summarize (when
   possible) reports received from downstream hosts and propagate them
   in BGP to other PEs that are interested in the information.  This is
   done by terminating the IGMP Reports in the first hop PE, and
   translating and exchanging the relevant information among EVPN BGP
   speakers.  The information is then translated back into an IGMP
   message at the recipient EVPN speaker.  Thus it helps create an IGMP
   overlay subnet using BGP.  In order to facilitate such an overlay,
   this document also defines a new EVPN route type NLRI, the EVPN
   Selective Multicast Ethernet Tag route, along with its procedures to
   help exchange and register IGMP multicast groups (see Section 9).

4.1.1.  IGMP/MLD Membership Report Advertisement in BGP

   When a PE wants to advertise an IGMP Membership Report (Join) using
   the BGP EVPN route, it applies the following rules (BGP encoding is
   stated in Section 9):

   1.  When the first hop PE receives several IGMP Membership Reports
       (Joins), belonging to the same IGMP version, from different
       attached hosts for the same (*,G) or (S,G), it SHOULD send a
       single BGP message corresponding to the very first IGMP
       Membership Request (BGP update as soon as possible) for that
       (*,G) or (S,G).  This is because BGP is a stateful protocol and
       no further transmission of the same report is needed.  If the
       IGMP Membership Request is for (*,G), then the multicast group
       address MUST be sent along with the corresponding version flag
       (v2 or v3) set.  In case of IGMPv3, the exclude flag MUST also be
       set to indicate that no source IP address must be excluded
       (include all sources "*").  If the IGMP Join is for (S,G), then
       besides setting the multicast group address along with the
       version flag v3, the source IP address and the IE flag MUST be
       set.  It should be noted that when advertising the EVPN route
       for (S,G), the only valid version flag is v3 (v2 flags MUST be
       set to zero).

   2.  When the first hop PE receives an IGMPv3 Join for (S,G) on a
       given BD, it SHOULD advertise the corresponding EVPN Selective
       Multicast Ethernet Tag (SMET) route regardless of whether the
       source (S) is attached to itself or not, in order to facilitate
       a source move in the future.

   3.  When the first hop PE receives an IGMP version-X Join first for
       (*,G) and then later it receives an IGMP version-Y Join for the
       same (*,G), then it MUST re-advertise the same EVPN SMET route
       with the flag for version-Y set in addition to any previously-set
       version flag(s).
       In other words, the first hop PE MUST NOT withdraw the EVPN route
       before sending the new route because the flag field is not part
       of BGP route key processing.

   4.  When the first hop PE receives an IGMP version-X Join first for
       (*,G) and then later it receives an IGMPv3 Join for the same
       multicast group address but for a specific source address S, then
       the PE MUST advertise a new EVPN SMET route with the v3 flag set
       (and v2 reset).  The IE flag also needs to be set accordingly.
       Since the source IP address is used as part of BGP route key
       processing, it is considered a new BGP route advertisement.
       When different versions of IGMP Joins are received, the final
       state MUST be as per Section 5.1 of [RFC3376].  At the end of
       route processing, local and remote group record state MUST be as
       per Section 5.1 of [RFC3376].

   5.  When a PE receives an EVPN SMET route with more than one version
       flag set, it will generate the corresponding IGMP report for
       (*,G) for each version specified in the flags field.  With
       multiple version flags set, there must not be a source IP
       address in the received EVPN route.  If there is, then an error
       SHOULD be logged.  If the v3 flag is set (in addition to v2),
       then the IE flag MUST indicate "exclude".  If not, then an error
       SHOULD be logged.  The PE MUST generate an IGMP Membership Report
       (Join) for that (*,G) and each IGMP version in the version flag.

   6.  When a PE receives a list of EVPN SMET NLRIs in its BGP update
       message, each with a different source IP address and the same
       multicast group address, and the version flag is set to v3, then
       the PE generates an IGMPv3 Membership Report with a record
       corresponding to the list of source IP addresses and the group
       address along with the proper indication of inclusion/exclusion.

   7.  Upon receiving EVPN SMET route(s) and before generating the
       corresponding IGMP Membership Request(s), the PE checks to see
       whether it has any CE multicast router for that BD on any of its
       ESs.  The PE performs this check by listening for PIM Hello
       messages on that AC (i.e., (ES,BD)).  If the PE does have the
       router's ACs, then the generated IGMP Membership Request(s) are
       sent to those ACs.  If it doesn't have any of the router's ACs,
       then no IGMP Membership Request(s) need to be generated.  This
       is because sending IGMP Membership Requests to other hosts can
       unintentionally prevent a host from joining a specific multicast
       group using IGMPv2.  Per [RFC4541], when an IGMPv2 host receives
       a Membership Report for a group address that it intends to join,
       the host will suppress its own membership report for the same
       group, and if the PE does not receive an IGMP Join from the host
       it will not forward multicast data to it.  In other words, an
       IGMPv2 Join MUST NOT be sent on an AC that does not lead to a CE
       multicast router.  This message suppression is a requirement for
       IGMPv2 hosts.  This is not a problem for hosts running IGMPv3
       because there is no suppression of IGMP Membership Reports.

4.1.2.  IGMP/MLD Leave Group Advertisement in BGP

   When a PE wants to withdraw an EVPN SMET route corresponding to an
   IGMPv2 Leave Group or IGMPv3 "Leave" equivalent message, it applies
   the following rules:

   1.  When a PE receives an IGMPv2 Leave Group or its "Leave"
       equivalent message for IGMPv3 from its attached host, it checks
       to see if this host is the last host that is interested in this
       multicast group by sending a query for the multicast group.  If
       the host was indeed the last one (i.e.,
       no responses are received for the query), then the PE MUST
       re-advertise the EVPN SMET route with the corresponding version
       flag reset.  If this is the last version flag to be reset, then
       instead of re-advertising the EVPN route with all version flags
       reset, the PE MUST withdraw the EVPN route for that (*,G).

   2.  When a PE receives an EVPN SMET route for a given (*,G), it
       compares the received version flags from the route with its per-
       PE stored version flags.  If the PE finds that a version flag
       associated with the (*,G) for the remote PE is reset, then the PE
       MUST generate an IGMP Leave for that (*,G) toward its local
       interface (if any) attached to the multicast router for that
       multicast group.  It should be noted that the received EVPN route
       MUST have at least one version flag set.  If all version flags
       are reset, it is an error because the PE should have received an
       EVPN route withdraw for the last version flag.  This error MUST
       be treated as a BGP error and the PE MUST apply the "treat-as-
       withdraw" procedure of [RFC7606].

   3.  When a PE receives an EVPN SMET route withdraw, it removes the
       remote PE from its OIF list for that multicast group and if there
       are no more OIF entries for that multicast group (either locally
       or remotely), then the PE MUST stop responding to queries from
       the locally attached router (if any).  If there is a source for
       that multicast group, the PE stops sending multicast traffic for
       that source.

4.2.  Proxy Querier

   As mentioned in the previous sections, each PE MUST have proxy
   querier functionality for the following reasons:

   1.  To enable the collection of EVPN PEs providing L2VPN service to
       act as a distributed multicast router with an Anycast IP address
       for all attached hosts in that subnet.

   2.  To enable suppression of IGMP Membership Reports and queries over
       the MPLS/IP core.

5.  Operation

   Consider the EVPN network of Figure 1, where there is an EVPN
   instance configured across the PEs shown in this figure (namely PE1,
   PE2, and PE3).  Assume that this EVPN instance consists of a single
   bridge domain (single subnet) with all the hosts, sources, and the
   multicast router connected to this subnet.  PE1 only has hosts
   connected to it.  PE2 has a mix of hosts and a multicast source.  PE3
   has a mix of hosts, a multicast source, and a multicast router.
   Furthermore, assume that for (S1,G1), R1 is used as the multicast
   router.  The following subsections describe the IGMP proxy operation
   in different PEs with regard to whether the locally attached devices
   for that subnet are:

   o  only hosts

   o  mix of hosts and multicast source

   o  mix of hosts, multicast source, and multicast router

                         +--------------+
                         |              |
                         |              |
                  +----+ |              | +----+
   H1:(*,G1)v2 ---|    | |              | |    |---- H6(*,G1)v2
   H2:(*,G1)v2 ---| PE1| |   IP/MPLS    | | PE2|---- H7(S2,G2)v3
   H3:(*,G1)v3 ---|    | |   Network    | |    |---- S2
   H4:(S2,G2)v3 --|    | |              | |    |
                  +----+ |              | +----+
                         |              |
                  +----+ |              |
   H5:(S1,G1)v3 --|    | |              |
   S1          ---| PE3| |              |
   R1          ---|    | |              |
                  +----+ |              |
                         |              |
                         +--------------+

                        Figure 1: EVPN network

5.1.  PE with only attached hosts for a given subnet

   When PE1 receives an IGMPv2 Membership Report from H1, it does not
   forward this Join to any of its other ports (for this subnet) because
   all these local ports are associated with the hosts.  PE1 sends an
   EVPN SMET route corresponding to this Join for (*,G1) with the v2
   flag set.  This EVPN route is received by PE2 and PE3, which are
   members of the same BD (i.e., same EVI in case of VLAN-based service
   or (EVI,VLAN) in case of VLAN-aware bundle service).
   PE3 reconstructs the IGMPv2 Membership Report from this EVPN BGP
   route and only sends it to the port(s) with multicast routers
   attached to them (for that subnet).  In this example, PE3 sends the
   reconstructed IGMPv2 Membership Report for (*,G1) only to R1.
   Furthermore, even though PE2 receives the EVPN BGP route, it does not
   send it to any of its ports for that subnet; viz., the ports
   associated with H6 and H7.

   When PE1 receives the second IGMPv2 Join from H2 for the same
   multicast group (*,G1), it only adds that port to its OIF list but it
   doesn't send any EVPN BGP route because there is no change in
   information.  However, when it receives the IGMPv3 Join from H3 for
   the same (*,G1), besides adding the corresponding port to its OIF
   list, it re-advertises the previously sent EVPN SMET route with the
   v3 and exclude flags set.

   Finally, when PE1 receives the IGMPv3 Join from H4 for (S2,G2), it
   advertises a new EVPN SMET route corresponding to it.

5.2.  PE with a mix of attached hosts and multicast source

   The main difference in this case is that when PE2 receives the IGMPv3
   Join from H7 for (S2,G2), it does advertise it in BGP to support
   source move even though PE2 knows that S2 is attached to its local
   AC.  PE2 adds the port associated with H7 to its OIF list for
   (S2,G2).  The processing for the IGMPv2 Join received from H6 is the
   same as for the IGMPv2 Join described in the previous section.

5.3.  PE with a mix of attached hosts, a multicast source and a router

   The main difference in this case relative to the previous two
   sections is that IGMP v2/v3 Join messages received locally need to be
   sent to the port associated with router R1.  Furthermore, the Joins
   received via BGP (SMET) need to be passed to the R1 port but filtered
   for all other ports.

6.  All-Active Multi-Homing

   Because the LAG flow hashing algorithm used by the CE is unknown at
   the PE, in an All-Active redundancy mode it must be assumed that the
   CE can send a given IGMP message to any one of the multi-homed PEs,
   either DF or non-DF; i.e., different IGMP Membership Request messages
   can arrive at different PEs in the redundancy group and, furthermore,
   their corresponding Leave messages can arrive at PEs that are
   different from the ones that received the Join messages.  Therefore,
   all PEs attached to a given ES must coordinate IGMP Membership
   Request and Leave Group (x,G) state, where x may be either '*' or a
   particular source S, for each BD on that ES.  This allows the DF for
   that (ES,BD) to correctly advertise or withdraw a Selective Multicast
   Ethernet Tag (SMET) route for that (x,G) group in that BD when
   needed.  All-Active multihoming PEs for a given ES MUST support the
   IGMP synchronization procedures described in this section if they
   need to perform IGMP proxy for hosts connected to that ES.

6.1.  Local IGMP/MLD Join Synchronization

   When a PE, either DF or non-DF, receives, on a given multihomed ES
   operating in All-Active redundancy mode, an IGMP Membership Report
   for (x,G), it determines the BD to which the IGMP Membership Report
   belongs.  If the PE doesn't already have local IGMP Membership
   Request (x,G) state for that BD on that ES, it MUST instantiate local
   IGMP Membership Request (x,G) state and MUST advertise a BGP IGMP
   Join Synch route for that (ES,BD).  Local IGMP Membership Request
   (x,G) state refers to IGMP Membership Request (x,G) state that is
   created as a result of processing an IGMP Membership Report for
   (x,G).

   The IGMP Join Synch route MUST carry the ES-Import RT for the ES on
   which the IGMP Membership Report was received.  Thus it MUST only be
   imported by the PEs attached to that ES and not any other PEs.
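
   As a non-normative illustration, the local handling described in the
   two paragraphs above could be sketched as follows.  This is a
   minimal model under stated assumptions, not an implementation of the
   BGP encoding: the class names (PE, JoinSynchRoute), the method names,
   and the string form of the ES-Import RT are all hypothetical and
   chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class JoinSynchRoute:
    """Illustrative stand-in for a BGP IGMP Join Synch route.

    Hypothetical structure for this sketch; it is not the wire format.
    """
    originator: str
    es: str
    bd: str
    source: str        # "*" or a specific source S
    group: str
    es_import_rt: str  # ES-Import RT of the ES where the report arrived


@dataclass
class PE:
    name: str
    attached_es: dict                               # ES id -> ES-Import RT (assumed config)
    local_state: set = field(default_factory=set)   # {(es, bd, source, group)}
    advertised: list = field(default_factory=list)  # Join Synch routes originated

    def on_igmp_report(self, es, bd, source, group):
        """Handle a Membership Report for (x,G) received on an All-Active ES.

        Applies to both DF and non-DF PEs.
        """
        key = (es, bd, source, group)
        if key in self.local_state:
            return None                # local state exists: nothing new to advertise
        self.local_state.add(key)      # instantiate local (x,G) state
        route = JoinSynchRoute(self.name, es, bd, source, group,
                               self.attached_es[es])
        self.advertised.append(route)  # advertise the Join Synch route
        return route

    def imports(self, route):
        """Import the route only if this PE is attached to the ES its RT names."""
        return route.es_import_rt in self.attached_es.values()
```

   In this sketch, a route originated by a PE for a report received on
   a given ES would be imported by another PE configured with the same
   ES-Import RT and ignored by a PE that is attached only to a
   different ES.  A second report for the same (x,G) on the same
   (ES,BD) produces no additional route.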

   When a PE, either DF or non-DF, receives an IGMP Join Synch route, it
   installs that route and if it doesn't already have IGMP Membership
   Request (x,G) state for that (ES,BD), it MUST instantiate that IGMP
   Membership Request (x,G) state - i.e., IGMP Membership Request (x,G)
   state is the union of the local IGMP Join (x,G) state and the
   installed IGMP Join Synch route.  If the DF did not already advertise
   (originate) a SMET route for that (x,G) group in that BD, it MUST do
   so now.

   When a PE, either DF or non-DF, deletes its local IGMP Membership
   Request (x,G) state for that (ES,BD), it MUST withdraw its BGP IGMP
   Join Synch route for that (ES,BD).

   When a PE, either DF or non-DF, receives the withdrawal of an IGMP
   Join Synch route from another PE, it MUST remove that route.  When a
   PE has no local IGMP Membership Request (x,G) state and it has no
   installed IGMP Join Synch routes, it MUST remove IGMP Membership
   Request (x,G) state for that (ES,BD).  If the DF no longer has IGMP
   Membership Request (x,G) state for that BD on any ES for which it is
   DF, it MUST withdraw its SMET route for that (x,G) group in that BD.

   In other words, a PE advertises an SMET route for that (x,G) group in
   that BD when it has IGMP Membership Request (x,G) state in that BD on
   at least one ES for which it is DF, and it withdraws that SMET route
   when it does not have IGMP Membership Request (x,G) state in that BD
   on any ES for which it is DF.

6.2.  Local IGMP/MLD Leave Group Synchronization

   When a PE, either DF or non-DF, receives, on a given multihomed ES
   operating in All-Active redundancy mode, an IGMP Leave Group message
   for (x,G) from the attached CE, it determines the BD to which the
   IGMPv2 Leave Group belongs.
   Regardless of whether it has IGMP Membership Request (x,G) state for
   that (ES,BD), it initiates the (x,G) leave group synchronization
   procedure, which consists of the following steps:

   1.  It computes the Maximum Response Time, which is the duration of
       the (x,G) leave group synchronization procedure.  This is the
       product of two locally configured values, Last Member Query
       Count and Last Member Query Interval (described in Section 3 of
       [RFC2236]), plus a delta corresponding to the time it takes for
       a BGP advertisement to propagate between the PEs attached to the
       multihomed ES (delta is a consistently configured value on all
       PEs attached to the multihomed ES).

   2.  It starts the Maximum Response Time timer.  Note that subsequent
       IGMP Leave Group messages or BGP Leave Synch routes received for
       (x,G) do not change the value of a currently running Maximum
       Response Time timer and are ignored by the PE.

   3.  It initiates the Last Member Query procedure described in
       Section 3 of [RFC2236]; viz., it sends a number of Group-Specific
       Query (x,G) messages (Last Member Query Count) at a fixed
       interval (Last Member Query Interval) to the attached CE.

   4.  It advertises an IGMP Leave Synch route for that (ES,BD).  This
       route notifies the other multihomed PEs attached to the given
       multihomed ES that it has initiated an (x,G) leave group
       synchronization procedure; i.e., it carries the ES-Import RT for
       the ES on which the IGMP Leave Group was received.  It also
       contains the Maximum Response Time.

   5.  When the Maximum Response Time timer expires, the PE that has
       advertised the IGMP Leave Synch route withdraws it.

6.2.1.  Remote Leave Group Synchronization

   When a PE, either DF or non-DF, receives an IGMP Leave Synch route,
   it installs that route and it starts a timer for (x,G) on the
   specified (ES,BD) whose value is set to the Maximum Response Time in
   the received IGMP Leave Synch route.  Note that subsequent IGMPv2
   Leave Group messages or BGP Leave Synch routes received for (x,G) do
   not change the value of a currently running Maximum Response Time
   timer and are ignored by the PE.

6.2.2.  Common Leave Group Synchronization

   If a PE attached to the multihomed ES receives an IGMP Membership
   Report for (x,G) before the Maximum Response Time timer expires, it
   advertises a BGP IGMP Join Synch route for that (ES,BD).  If it
   doesn't already have local IGMP Membership Request (x,G) state for
   that (ES,BD), it instantiates local IGMP Membership Request (x,G)
   state.  If the DF is not currently advertising (originating) a SMET
   route for that (x,G) group in that BD, it does so now.

   If a PE attached to the multihomed ES receives an IGMP Join Synch
   route for (x,G) before the Maximum Response Time timer expires, it
   installs that route and if it doesn't already have IGMP Membership
   Request (x,G) state for that BD on that ES, it instantiates that IGMP
   Membership Request (x,G) state.  If the DF has not already advertised
   (originated) a SMET route for that (x,G) group in that BD, it does so
   now.

   When the Maximum Response Time timer expires, a PE that has
   advertised an IGMP Leave Synch route withdraws it.  Any PE attached
   to the multihomed ES that started the Maximum Response Time timer and
   has no local IGMP Membership Request (x,G) state and no installed
   IGMP Join Synch routes removes IGMP Membership Request (x,G) state
   for that (ES,BD).
If the DF no longer has IGMP Membership Request (x,G) state 610 for that BD on any ES for which it is DF, it withdraws its SMET route 611 for that (x,G) group in that BD. 613 6.3. Mass Withdraw of Multicast join Sync route in case of failure 615 A PE which has received an IGMP Membership Request would have synced 616 the IGMP Join by the procedure defined in section 6.1. If a PE with 617 local join state goes down or the PE to CE link goes down, it would 618 lead to a mass withdraw of multicast routes. Remote PEs (PEs where 619 these routes were remote IGMP Joins) SHOULD NOT remove the state 620 immediately; instead General Query SHOULD be generated to refresh the 621 states. There are several ways to detect failure at a peer, e.g. 622 using IGP next hop tracking or ES route withdraw. 624 7. Single-Active Multi-Homing 626 Note that to facilitate state synchronization after failover, the PEs 627 attached to a multihomed ES operating in Single-Active redundancy 628 mode SHOULD also coordinate IGMP Join (x,G) state. In this case all 629 IGMP Join messages are received by the DF and distributed to the non- 630 DF PEs using the procedures described above. 632 8. Selective Multicast Procedures for IR tunnels 634 If an ingress PE uses ingress replication, then for a given (x,G) 635 group in a given BD: 637 1. It sends (x,G) traffic to the set of PEs not supporting IGMP 638 Proxy. This set consists of any PE that has advertised an 639 Inclusive Multicast Tag route for the BD without the "IGMP Proxy 640 Support" flag. 642 2. It sends (x,G) traffic to the set of PEs supporting IGMP Proxy 643 and having listeners for that (x,G) group in that BD. This set 644 consists of any PE that has advertised an Inclusive Multicast 645 Ethernet Tag route for the BD with the "IGMP Proxy Support" flag 646 and that has advertised a SMET route for that (x,G) group in that 647 BD. 649 9. BGP Encoding 651 This document defines three new BGP EVPN routes to carry IGMP 652 Membership Reports. 
   The route types are known as:

   + 6 - Selective Multicast Ethernet Tag Route

   + 7 - Multicast Join Synch Route

   + 8 - Multicast Leave Synch Route

   The detailed encoding and procedures for these route types are
   described in subsequent sections.

9.1.  Selective Multicast Ethernet Tag Route

   A Selective Multicast Ethernet Tag route type specific EVPN NLRI
   consists of the following:

   +---------------------------------------+
   |  RD (8 octets)                        |
   +---------------------------------------+
   |  Ethernet Tag ID (4 octets)           |
   +---------------------------------------+
   |  Multicast Source Length (1 octet)    |
   +---------------------------------------+
   |  Multicast Source Address (variable)  |
   +---------------------------------------+
   |  Multicast Group Length (1 octet)     |
   +---------------------------------------+
   |  Multicast Group Address (variable)   |
   +---------------------------------------+
   |  Originator Router Length (1 octet)   |
   +---------------------------------------+
   |  Originator Router Address (variable) |
   +---------------------------------------+
   |  Flags (1 octet)                      |
   +---------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the
   one-octet Flags field.  The Flags field is defined as follows:

       0  1  2  3  4  5  6  7
      +--+--+--+--+--+--+--+--+
      | reserved  |IE|v3|v2|v1|
      +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1.  Since IGMPv1 is being deprecated, the sender MUST set
      it to 0 for IGMP and the receiver MUST ignore it.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  This EVPN route type is used to carry tenant IGMP multicast group
      information.  The Flags field assists in distributing the IGMP
      Membership Report of a given host for a given multicast route.
      The version bits help associate the IGMP version of receivers
      participating within the EVPN domain.

   o  The include/exclude (IE) bit helps in creating filters for a
      given multicast route.

   o  If the route is used for IPv6 (MLD), then bit 7 indicates support
      for MLD version 1.  The second least significant bit, bit 6,
      indicates support for MLD version 2.  Since there is no MLD
      version 3, in the case of an IPv6 route the third least
      significant bit MUST be 0.  In the case of IPv6 routes, the
      fourth least significant bit MUST be ignored if bit 6 is not set.

   o  Reserved bits MUST be set to 0 by the sender, and the receiver
      SHOULD ignore the Reserved bits.

9.1.1.  Constructing the Selective Multicast Ethernet Tag route

   This section describes the procedures used to construct the
   Selective Multicast Ethernet Tag (SMET) route.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364].  The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Tag ID MUST be set as per the procedure defined in
   [RFC7432].

   The Multicast Source Length MUST be set to the length of the
   Multicast Source Address in bits.  If the Multicast Source Address
   field contains an IPv4 address, then the value of the Multicast
   Source Length field is 32.  If the Multicast Source Address field
   contains an IPv6 address, then the value of the Multicast Source
   Length field is 128.  In case of a (*,G) Join, the Multicast Source
   Length is set to 0.

   The Multicast Source Address is the source IP address from the IGMP
   Membership Report.  In case of a (*,G) Join, this field is not used.

   The Multicast Group Length MUST be set to the length of the
   Multicast Group Address in bits.  If the Multicast Group Address
   field contains an IPv4 address, then the value of the Multicast
   Group Length field is 32.  If the Multicast Group Address field
   contains an IPv6 address, then the value of the Multicast Group
   Length field is 128.

   The Multicast Group Address is the Group address from the IGMP or
   MLD Membership Report.

   The Originator Router Length is the length of the Originator Router
   Address in bits.

   The Originator Router Address is the IP address of the router
   originating this route.  The SMET Originator Router IP address MUST
   match that of the IMET (or S-PMSI AD) route originated for the same
   EVI by the same downstream PE.

   The Flags field indicates the version of the IGMP protocol from
   which the Membership Report was received.  It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

   Reserved bits MUST be set to 0.  They can be defined in the future
   by other documents.

   IGMP is used by TORs to receive group membership information from
   hosts.  Upon receiving a host's expression of interest in a
   particular group membership, this information is then forwarded
   using the SMET route.  The NLRI also keeps track of the receiver's
   IGMP protocol version and any source filtering for a given group
   membership.  All EVPN SMET routes are announced with per-EVI Route
   Target extended communities.

9.1.2.  Default Selective Multicast Route

   If there is a multicast router connected behind the EVPN domain, the
   PE MAY originate a default SMET (*,*) route to get all multicast
   traffic in the domain.
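   As a non-normative illustration of the field rules in Section 9.1.1
   and of the default (*,*) route, the following Python sketch
   assembles the route-type-specific part of an SMET NLRI.  The helper
   names (encode_smet_nlri, _length_and_address) and the FLAG_*
   constants are hypothetical and not part of this document.
   Representing a wildcard by None and encoding it as a zero length
   with no address octets is an assumption based on the (*,G) rule
   above and on [RFC6625].

```python
import ipaddress
import struct

# Flag bit values for the one-octet Flags field; bit 7 of the diagram
# in Section 9.1 is the least significant bit (assumed names).
FLAG_V1 = 0x01  # bit 7: IGMPv1 (MLDv1 for IPv6 routes)
FLAG_V2 = 0x02  # bit 6: IGMPv2 (MLDv2 for IPv6 routes)
FLAG_V3 = 0x04  # bit 5: IGMPv3
FLAG_IE = 0x08  # bit 4: 0 = Include, 1 = Exclude
_DEFINED_FLAGS = FLAG_V1 | FLAG_V2 | FLAG_V3 | FLAG_IE


def _length_and_address(addr):
    """Return the length octet (in bits) followed by the address octets.

    None stands for a wildcard: length 0 and no address octets.
    """
    if addr is None:
        return b"\x00"
    packed = ipaddress.ip_address(addr).packed  # 4 or 16 octets
    return bytes([len(packed) * 8]) + packed


def encode_smet_nlri(rd, ethernet_tag, source, group, originator, flags):
    """Hypothetical helper: build the route-type-specific SMET NLRI."""
    if len(rd) != 8:
        raise ValueError("RD must be exactly 8 octets")
    if originator is None:
        raise ValueError("Originator Router Address is required")
    if flags & ~_DEFINED_FLAGS:
        raise ValueError("reserved flag bits must be 0")
    return (
        rd
        + struct.pack("!I", ethernet_tag)   # 4-octet Ethernet Tag ID
        + _length_and_address(source)       # 0, 32, or 128 bits
        + _length_and_address(group)
        + _length_and_address(originator)
        + bytes([flags])
    )
```

   For example, a (*,G) route for an IPv4 group learned via IGMPv2
   would be built with source=None and flags=FLAG_V2, which yields a
   24-octet value under these assumptions.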
                            +--------------+
                            |              |
                            |              |
                            |              |     +----+
                            |              |     |    |---- H1(*,G1)v2
                            |   IP/MPLS    |     | PE1|---- H2(S2,G2)v3
                            |   Network    |     |    |---- S2
                            |              |     |    |
                            |              |     +----+
                            |              |
                     +----+ |              |
   +----+            |    | |              |
   |    |    S1   ---| PE2| |              |
   |PIM |----R1   ---|    | |              |
   |ASM |            +----+ |              |
   |    |                   |              |
   +----+                   +--------------+

              Figure 2: Multicast Router behind EVPN domain

   Consider the EVPN network of Figure 2, where there is an EVPN
   instance configured across the PEs.  Let's consider that PE2 is
   connected to multicast router R1 and there is a network running PIM
   ASM behind R1.  If there are receivers behind the PIM ASM network,
   the PIM Join would be forwarded to the PIM RP (Rendezvous Point).
   If receivers behind the PIM ASM network are interested in a
   multicast flow originated by multicast source S2 (behind PE1), it is
   necessary for PE2 to receive multicast traffic.  In this case, PE2
   MUST originate a (*,*) SMET route to receive all of the multicast
   traffic in the EVPN domain.  To generate wildcard (*,*) routes, the
   procedure from [RFC6625] SHOULD be used.

9.2.  Multicast Join Synch Route

   This EVPN route type is used to coordinate IGMP Join (x,G) state for
   a given BD between the PEs attached to a given ES operating in
   All-Active (or Single-Active) redundancy mode, and it consists of
   the following:

   +--------------------------------------------------+
   |  RD (8 octets)                                   |
   +--------------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)         |
   +--------------------------------------------------+
   |  Ethernet Tag ID (4 octets)                      |
   +--------------------------------------------------+
   |  Multicast Source Length (1 octet)               |
   +--------------------------------------------------+
   |  Multicast Source Address (variable)             |
   +--------------------------------------------------+
   |  Multicast Group Length (1 octet)                |
   +--------------------------------------------------+
   |  Multicast Group Address (variable)              |
   +--------------------------------------------------+
   |  Originator Router Length (1 octet)              |
   +--------------------------------------------------+
   |  Originator Router Address (variable)            |
   +--------------------------------------------------+
   |  Flags (1 octet)                                 |
   +--------------------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the
   one-octet Flags field, whose fields are defined as follows:

       0  1  2  3  4  5  6  7
      +--+--+--+--+--+--+--+--+
      | reserved  |IE|v3|v2|v1|
      +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  Reserved bits MUST be set to 0.

   The Flags field assists in distributing the IGMP Membership Report
   of a given host for a given multicast route.  The version bits help
   associate the IGMP version of receivers participating within the
   EVPN domain.  The include/exclude bit helps in creating filters for
   a given multicast route.

   If the route is being prepared for IPv6 (MLD), then bit 7 indicates
   support for MLD version 1.  The second least significant bit, bit 6,
   indicates support for MLD version 2.  Since there is no MLD version
   3, in the case of an IPv6 route the third least significant bit MUST
   be 0.  In the case of an IPv6 route, the fourth least significant
   bit MUST be ignored if bit 6 is not set.

9.2.1.  Constructing the Multicast Join Synch Route

   This section describes the procedures used to construct the IGMP
   Join Synch route.  Support for this route type is optional.  If a PE
   does not support this route, then it MUST NOT indicate that it
   supports 'IGMP proxy' in the Multicast Flags extended community for
   the EVIs corresponding to its multi-homed Ethernet Segments (ESs).

   An IGMP Join Synch route MUST carry exactly one ES-Import Route
   Target extended community, the one that corresponds to the ES on
   which the IGMP Join was received.  It MUST also carry exactly one
   EVI-RT EC, the one that corresponds to the EVI on which the IGMP
   Join was received.  See Section 9.5 for details on how to encode and
   construct the EVI-RT EC.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364].  The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.
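   The Type 1 RD layout referred to above can be sketched as follows.
   This is a minimal, non-normative Python illustration assuming the
   [RFC4364] encoding (a 2-octet Type field set to 1, a 4-octet IPv4
   address, and a 2-octet assigned number); the function name
   encode_type1_rd is hypothetical.

```python
import ipaddress
import struct


def encode_type1_rd(pe_address, assigned_number):
    """Hypothetical helper: build an 8-octet Type 1 RD.

    pe_address is the PE's IPv4 address (typically the loopback) and
    assigned_number is a 16-bit number unique to the PE.
    """
    ip = ipaddress.IPv4Address(pe_address)
    if not 0 <= assigned_number <= 0xFFFF:
        raise ValueError("assigned number must fit in 2 octets")
    # Type (2 octets) | IPv4 address (4 octets) | number (2 octets)
    return struct.pack("!H4sH", 1, ip.packed, assigned_number)
```

   The same helper applies wherever this document calls for a Type 1
   RD, i.e., for the SMET, Join Synch, and Leave Synch routes.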
   The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet
   value defined for the ES.

   The Ethernet Tag ID MUST be set as per the procedure defined in
   [RFC7432].

   The Multicast Source Length MUST be set to the length of the
   Multicast Source Address in bits.  If the Multicast Source Address
   field contains an IPv4 address, then the value of the Multicast
   Source Length field is 32.  If the Multicast Source Address field
   contains an IPv6 address, then the value of the Multicast Source
   Length field is 128.  In case of a (*,G) Join, the Multicast Source
   Length is set to 0.

   The Multicast Source Address is the Source IP address of the IGMP
   Membership Report.  In case of a (*,G) Join, this field does not
   exist.

   The Multicast Group Length MUST be set to the length of the
   Multicast Group Address in bits.  If the Multicast Group Address
   field contains an IPv4 address, then the value of the Multicast
   Group Length field is 32.  If the Multicast Group Address field
   contains an IPv6 address, then the value of the Multicast Group
   Length field is 128.

   The Multicast Group Address is the Group address of the IGMP
   Membership Report.

   The Originator Router Length is the length of the Originator Router
   Address in bits.

   The Originator Router Address is the IP address of the router
   originating the prefix.

   The Flags field indicates the version of the IGMP protocol from
   which the Membership Report was received.  It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

   Reserved bits MUST be set to 0.

9.3.  Multicast Leave Synch Route

   This EVPN route type is used to coordinate IGMP Leave Group (x,G)
   state for a given BD between the PEs attached to a given ES
   operating in All-Active (or Single-Active) redundancy mode, and it
   consists of the following:

   +--------------------------------------------------+
   |  RD (8 octets)                                   |
   +--------------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)         |
   +--------------------------------------------------+
   |  Ethernet Tag ID (4 octets)                      |
   +--------------------------------------------------+
   |  Multicast Source Length (1 octet)               |
   +--------------------------------------------------+
   |  Multicast Source Address (variable)             |
   +--------------------------------------------------+
   |  Multicast Group Length (1 octet)                |
   +--------------------------------------------------+
   |  Multicast Group Address (variable)              |
   +--------------------------------------------------+
   |  Originator Router Length (1 octet)              |
   +--------------------------------------------------+
   |  Originator Router Address (variable)            |
   +--------------------------------------------------+
   |  Reserved (4 octets)                             |
   +--------------------------------------------------+
   |  Maximum Response Time (1 octet)                 |
   +--------------------------------------------------+
   |  Flags (1 octet)                                 |
   +--------------------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the
   Reserved field, the Maximum Response Time field, and the one-octet
   Flags field, whose fields are defined as follows:

       0  1  2  3  4  5  6  7
      +--+--+--+--+--+--+--+--+
      | reserved  |IE|v3|v2|v1|
      +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  Reserved bits MUST be set to 0.  They can be defined in the
      future by other documents.

   The Flags field assists in distributing the IGMP Membership Report
   of a given host for a given multicast route.  The version bits help
   associate the IGMP version of receivers participating within the
   EVPN domain.  The include/exclude bit helps in creating filters for
   a given multicast route.

   If the route is being prepared for IPv6 (MLD), then bit 7 indicates
   support for MLD version 1.  The second least significant bit, bit 6,
   indicates support for MLD version 2.  Since there is no MLD version
   3, in the case of an IPv6 route the third least significant bit MUST
   be 0.  In the case of an IPv6 route, the fourth least significant
   bit MUST be ignored if bit 6 is not set.

   Reserved bits in the Flags field MUST be set to 0.  They can be
   defined in the future by other documents.

9.3.1.  Constructing the Multicast Leave Synch Route

   This section describes the procedures used to construct the IGMP
   Leave Synch route.  Support for this route type is optional.  If a
   PE does not support this route, then it MUST NOT indicate that it
   supports 'IGMP proxy' in the Multicast Flags extended community for
   the EVIs corresponding to its multi-homed Ethernet Segments.

   An IGMP Leave Synch route MUST carry exactly one ES-Import Route
   Target extended community, the one that corresponds to the ES on
   which the IGMP Leave was received.  It MUST also carry exactly one
   EVI-RT EC, the one that corresponds to the EVI on which the IGMP
   Leave was received.  See Section 9.5 for details on how to form the
   EVI-RT EC.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364].  The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet
   value defined for the ES.

   The Ethernet Tag ID MUST be set as per the procedure defined in
   [RFC7432].

   The Multicast Source Length MUST be set to the length of the
   Multicast Source Address in bits.  If the Multicast Source Address
   field contains an IPv4 address, then the value of the Multicast
   Source Length field is 32.  If the Multicast Source Address field
   contains an IPv6 address, then the value of the Multicast Source
   Length field is 128.  In case of a (*,G) Join, the Multicast Source
   Length is set to 0.

   The Multicast Source Address is the Source IP address of the IGMP
   Membership Report.  In case of a (*,G) Join, this field does not
   exist.

   The Multicast Group Length MUST be set to the length of the
   Multicast Group Address in bits.  If the Multicast Group Address
   field contains an IPv4 address, then the value of the Multicast
   Group Length field is 32.  If the Multicast Group Address field
   contains an IPv6 address, then the value of the Multicast Group
   Length field is 128.

   The Multicast Group Address is the Group address of the IGMP
   Membership Report.

   The Originator Router Length is the length of the Originator Router
   Address in bits.

   The Originator Router Address is the IP address of the router
   originating the prefix.

   The Reserved field is not part of the route key.  The originator
   MUST set the Reserved field to zero, the receiver SHOULD ignore it,
   and if it needs to be propagated, it MUST be propagated unchanged.

   The Maximum Response Time is the value to be used while sending a
   query, as defined in [RFC2236].

   The Flags field indicates the version of the IGMP protocol from
   which the Membership Report was received.  It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

9.4.  Multicast Flags Extended Community

   The 'Multicast Flags' extended community is a new EVPN extended
   community.  EVPN extended communities are transitive extended
   communities with a Type field value of 6.  IANA will assign a
   Sub-Type from the 'EVPN Extended Community Sub-Types' registry.

   A PE that supports IGMP proxy on a given BD MUST attach this
   extended community to the Inclusive Multicast Ethernet Tag (IMET)
   route it advertises for that BD, and it MUST set the IGMP Proxy
   Support flag to 1.  Note that an [RFC7432]-compliant PE will not
   advertise this extended community, so its absence indicates that the
   advertising PE does not support IGMP Proxy.

   The advertisement of this extended community enables more efficient
   multicast tunnel setup from the source PE, especially for ingress
   replication; i.e., if an egress PE supports IGMP proxy but doesn't
   have any interest in a given (x,G), it advertises its IGMP proxy
   capability using this extended community but it does not advertise
   any SMET route for that (x,G).  When the source PE (ingress PE)
   receives such advertisements from the egress PE, it does not
   replicate the multicast traffic to that egress PE; however, it does
   replicate the multicast traffic to the egress PEs that don't
   advertise such capability, even if they don't have any interest in
   that (x,G).
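   The ingress replication decision described above (and in Section 8)
   can be sketched as follows.  This is a non-normative Python
   illustration; the RemotePE class, its attribute names, and the
   replication_set function are hypothetical.  As a simplifying
   assumption, an SMET route matches only when its (x,G) key is
   identical to the one looked up; default (*,*) routes and
   include/exclude source filters are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class RemotePE:
    """State an ingress PE is assumed to keep per remote PE in a BD."""
    name: str
    # True if the PE's IMET route carried the Multicast Flags extended
    # community with the IGMP Proxy Support flag set.
    igmp_proxy: bool = False
    # (x, G) keys taken from the SMET routes advertised by this PE.
    smet_routes: set = field(default_factory=set)


def replication_set(remote_pes, xg):
    """Return the names of the PEs that receive a copy of (x,G) traffic."""
    targets = []
    for pe in remote_pes:
        if not pe.igmp_proxy:
            # No proxy capability advertised: always replicate.
            targets.append(pe.name)
        elif xg in pe.smet_routes:
            # Proxy-capable PE that signaled interest in this (x,G).
            targets.append(pe.name)
    return targets
```

   With one legacy PE, one proxy-capable PE that advertised a SMET
   route for the group, and one proxy-capable PE that did not, only the
   first two would be returned.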
   A Multicast Flags extended community is encoded as an 8-octet value,
   as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x09 |       Flags (2 Octets)    |M|I|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved=0                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The low-order (least significant) two bits of the Flags field are
   defined as the "IGMP Proxy Support" and "MLD Proxy Support" bits.
   The absence of this extended community also means that the PE does
   not support IGMP proxy.  The fields are defined as follows:

   o  Type is 0x06 as registered with IANA for EVPN Extended
      Communities.

   o  Sub-Type: 0x09

   o  Flags is a two-octet value.

      *  Bit 15 (shown as I) defines IGMP Proxy Support.  A value of 1
         for bit 15 means that the PE supports IGMP Proxy.  A value of
         0 for bit 15 means that the PE does not support IGMP Proxy.

      *  Bit 14 (shown as M) defines MLD Proxy Support.  A value of 1
         for bit 14 means that the PE supports MLD Proxy.  A value of 0
         for bit 14 means that the PE does not support MLD Proxy.

      *  Bits 0 to 13 are reserved for future use.  The sender MUST set
         them to 0 and the receiver MUST ignore them.

   o  Reserved bits are set to 0.  The sender MUST set them to 0 and
      the receiver MUST ignore them.

   If a router does not support this specification, it MUST NOT add the
   Multicast Flags Extended Community to a BGP route.  If a router
   receives a BGP update in which both the M and I flags are zero (0),
   the router MUST treat this update as malformed.  The receiver of
   such an update MUST ignore the extended community.

9.5.  EVI-RT Extended Community

   In EVPN, every EVI is associated with one or more Route Targets
   (RTs).  These Route Targets serve two functions:

   1.  Distribution control: RTs control the distribution of the
       routes.  If a route carries the RT associated with a particular
       EVI, it will be distributed to all the PEs on which that EVI
       exists.

   2.  EVI identification: Once a route has been received by a
       particular PE, the RT is used to identify the EVI to which it
       applies.

   An IGMP Join Synch or IGMP Leave Synch route is associated with a
   particular combination of ES and EVI.  These routes need to be
   distributed only to PEs that are attached to the associated ES.
   Therefore, these routes carry the ES-Import RT for that ES.

   Since an IGMP Join Synch or IGMP Leave Synch route does not need to
   be distributed to all the PEs on which the associated EVI exists,
   these routes cannot carry the RT associated with that EVI.
   Therefore, when such a route arrives at a particular PE, the route's
   RTs cannot be used to identify the EVI to which the route applies.
   Some other means of associating the route with an EVI must be used.

   This document specifies four new Extended Communities (ECs) that can
   be used to identify the EVI with which a route is associated, but
   which do not have any effect on the distribution of the route.
   These new ECs are known as the "Type 0 EVI-RT EC", the "Type 1
   EVI-RT EC", the "Type 2 EVI-RT EC", and the "Type 3 EVI-RT EC".

   1.  A Type 0 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xA.

   2.  A Type 1 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xB.

   3.  A Type 2 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xC.

   4.  A Type 3 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xD.

   Each IGMP Join Synch or IGMP Leave Synch route MUST carry exactly
   one EVI-RT EC.  The EVI-RT EC carried by a particular route is
   constructed as follows.  Each such route is the result of having
   received an IGMP Join or an IGMP Leave message from a particular BD.
   The route is said to be associated with that BD.  For each BD, there
   is a corresponding RT that is used to ensure that routes "about"
   that BD are distributed to all PEs attached to that BD.  So suppose
   a given IGMP Join Synch or Leave Synch route is associated with a
   given BD, say BD1, and suppose that the corresponding RT for BD1 is
   RT1.  Then:

   o  0.  If RT1 is a Transitive Two-Octet AS-specific EC, then the
      EVI-RT EC carried by the route is a Type 0 EVI-RT EC.  The value
      field of the Type 0 EVI-RT EC is identical to the value field of
      RT1.

   o  1.  If RT1 is a Transitive IPv4-Address-specific EC, then the
      EVI-RT EC carried by the route is a Type 1 EVI-RT EC.  The value
      field of the Type 1 EVI-RT EC is identical to the value field of
      RT1.

   o  2.  If RT1 is a Transitive Four-Octet AS-specific EC, then the
      EVI-RT EC carried by the route is a Type 2 EVI-RT EC.  The value
      field of the Type 2 EVI-RT EC is identical to the value field of
      RT1.

   o  3.  If RT1 is a Transitive IPv6-Address-specific EC, then the
      EVI-RT EC carried by the route is a Type 3 EVI-RT EC.  The value
      field of the Type 3 EVI-RT EC is identical to the value field of
      RT1.

   An IGMP Join Synch or Leave Synch route MUST carry exactly one
   EVI-RT EC.

   Suppose a PE receives a particular IGMP Join Synch or IGMP Leave
   Synch route, say R1, and suppose that R1 carries an ES-Import RT
   that is one of the PE's Import RTs.  If R1 has no EVI-RT EC, or has
   more than one EVI-RT EC, the PE MUST apply the "treat-as-withdraw"
   procedure of [RFC7606].

   Note that an EVI-RT EC is not a Route Target Extended Community, is
   not visible to the RT Constrain mechanism [RFC4684], and is not
   intended to influence the propagation of routes by BGP.
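   The derivation of an EVI-RT EC from the RT of a BD, as described
   above, can be sketched as follows.  This non-normative Python
   illustration covers EVI-RT Types 0 to 2 only.  It assumes that RT1
   is given as an 8-octet extended community whose first octet is the
   transitive type 0x00, 0x01, or 0x02 and whose second octet is the
   Route Target sub-type 0x02; the function name evi_rt_from_rt is
   hypothetical.

```python
EVPN_EC_TYPE = 0x06
ROUTE_TARGET_SUBTYPE = 0x02  # assumed RT sub-type in all three formats

# High-order type octet of RT1 -> sub-type of the resulting EVI-RT EC.
_RT_TYPE_TO_EVI_RT_SUBTYPE = {
    0x00: 0x0A,  # Transitive Two-Octet AS-specific   -> Type 0 EVI-RT
    0x01: 0x0B,  # Transitive IPv4-Address-specific   -> Type 1 EVI-RT
    0x02: 0x0C,  # Transitive Four-Octet AS-specific  -> Type 2 EVI-RT
}


def evi_rt_from_rt(rt):
    """Hypothetical helper: build the EVI-RT EC that mirrors RT1."""
    if len(rt) != 8:
        raise ValueError("expected an 8-octet extended community")
    if rt[1] != ROUTE_TARGET_SUBTYPE:
        raise ValueError("extended community is not a Route Target")
    try:
        sub_type = _RT_TYPE_TO_EVI_RT_SUBTYPE[rt[0]]
    except KeyError:
        raise ValueError("RT format not covered by this sketch")
    # The 6-octet value field is copied from RT1 unchanged.
    return bytes([EVPN_EC_TYPE, sub_type]) + rt[2:]
```

   Because only the first two octets differ from RT1, a receiving PE
   could recover the EVI by reversing the same mapping and looking up
   the resulting RT among its import RTs.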
1231 1 2 3 1232 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1233 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1234 | Type=0x06 | Sub-Type=n | RT associated with EVI | 1235 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1236 | RT associated with the EVI (cont.) | 1237 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1239 Where the value of 'n' is 0x0A, 0x0B, 0x0C, or 0x0D corresponding to 1240 EVI-RT type 0, 1, 2, or 3 respectively. 1242 9.6. Rewriting of RT ECs and EVI-RT ECs by ASBRs 1244 There are certain situations in which an ES is attached to a set of 1245 PEs that are not all in the same AS, or not all operated by the same 1246 provider. In some such situations, the RT that corresponds to a 1247 particular EVI may be different in each AS. If a route is propagated 1248 from AS1 to AS2, an ASBR at the AS1/AS2 border may be provisioned 1249 with a policy that removes the RTs that are meaningful in AS1 and 1250 replaces them with the corresponding (i.e., RTs corresponding to the 1251 same EVIs) RTs that are meaningful in AS2. This is known as RT- 1252 rewriting. 1254 Note that if a given route's RTs are rewritten, and the route carries 1255 an EVI-RT EC, the EVI-RT EC needs to be rewritten as well. 1257 9.7. BGP Error Handling 1259 If a received BGP update contains Flags not in accordance with IGMP/ 1260 MLD version-X expectation, the PE MUST apply the "treat-as-withdraw" 1261 procedure as per [RFC7606] 1263 If a received BGP update is malformed such that BGP route keys cannot 1264 be extracted, then BGP update MUST be considered as invalid. 1265 Receiving PE MUST apply the "Session reset" procedure of [RFC7606]. 1267 10. IGMP/MLD Immediate Leave 1269 IGMP MAY be configured with immediate leave option. This allows the 1270 device to remove the group entry from the multicast routing table 1271 immediately upon receiving a IGMP leave message for (x,G). 
In case 1272 of all active multi-homing while synchronizing the IGMP Leave state 1273 to redundancy peers, Maximum Response Time MAY be filled in as Zero. 1274 Implementations SHOULD have identical configuration across multi- 1275 homed peers. In case IGMP Leave Synch route is received with Maximum 1276 Response Time Zero, irrespective of local IGMP configuration it MAY 1277 be processed as an immediate leave. 1279 11. IGMP Version 1 Membership Report 1281 This document does not provide any detail about IGMPv1 processing. 1282 Multicast working group are in process of deprecating uses of IGMPv1. 1283 Implementations MUST only use IGMPv2 and above for IPv4 and MLDv1 and 1284 above for IPv6. IGMP V1 routes MUST be considered as invalid and the 1285 PE MUST apply the "treat-as-withdraw" procedure as per [RFC7606]. 1286 Initial version of document did mention use of IGMPv1 and flag had 1287 provision to support IGMPv1. There may be an implementation which is 1288 deployed as initial version of document, to interop flag has not been 1289 changed. 1291 12. Security Considerations 1293 This document describes a means to efficiently operate IGMP and MLD 1294 on a subnet constructed across multiple PODs or DCs via an EVPN 1295 solution. The security considerations for the operation of the 1296 underlying EVPN and BGP substrate are described in [RFC7432], and 1297 specific multicast considerations are outlined in [RFC6513] and 1298 [RFC6514]. The EVPN and associated IGMP proxy provides a single 1299 broadcast domain so the same security considerations of IGMPv2 1300 [RFC2236], [RFC3376], MLD [RFC2710], or MLDv2 [RFC3810] apply. 1302 13. IANA Considerations 1304 IANA has allocated the following codepoints from the EVPN Extended 1305 Community Sub-Types sub-registry of the BGP Extended Communities 1306 registry. 
1308 0x09 Multicast Flags Extended Community [this document] 1309 0x0A EVI-RT Type 0 [this document] 1310 0x0B EVI-RT Type 1 [this document] 1311 0x0C EVI-RT Type 2 [this document] 1313 IANA is requested to allocate a new codepoint from the EVPN Extended 1314 Community sub-types registry for the following. 1316 0x0D EVI-RT Type 3 [this document] 1318 IANA has allocated the following EVPN route types from the EVPN Route 1319 Type registry. 1321 6 - Selective Multicast Ethernet Tag Route 1322 7 - Multicast Join Synch Route 1323 8 - Multicast Leave Synch Route 1325 The Multicast Flags Extended Community contains a 16-bit Flags field. 1326 The bits are numbered 0-15, from high-order to low-order. 1328 The registry should be initialized as follows: 1329 Bit Name Reference 1330 ---- -------------- ------------- 1331 0 - 13 Unassigned 1332 14 MLD Proxy Support This document 1333 15 IGMP Proxy Support This document 1335 The registration policy should be "First Come First Served". 1337 14. Acknowledgement 1339 The authors would like to thank Stephane Litkowski, Jorge Rabadan, 1340 Anoop Ghanwani, Jeffrey Haas, Krishna Muddenahally Ananthamurthy, 1341 Swadesh Agrawal for reviewing and providing valuable comment. 1343 15. Contributors 1345 Derek Yeung 1347 Arrcus 1349 Email: derek@arrcus.com 1351 16. References 1353 16.1. Normative References 1355 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1356 Requirement Levels", BCP 14, RFC 2119, 1357 DOI 10.17487/RFC2119, March 1997, 1358 . 1360 [RFC2236] Fenner, W., "Internet Group Management Protocol, Version 1361 2", RFC 2236, DOI 10.17487/RFC2236, November 1997, 1362 . 1364 [RFC2710] Deering, S., Fenner, W., and B. Haberman, "Multicast 1365 Listener Discovery (MLD) for IPv6", RFC 2710, 1366 DOI 10.17487/RFC2710, October 1999, 1367 . 1369 [RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A. 
              Thyagarajan, "Internet Group Management Protocol, Version
              3", RFC 3376, DOI 10.17487/RFC3376, October 2002.

   [RFC3810]  Vida, R., Ed. and L. Costa, Ed., "Multicast Listener
              Discovery Version 2 (MLDv2) for IPv6", RFC 3810,
              DOI 10.17487/RFC3810, June 2004.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
              R., Patel, K., and J. Guichard, "Constrained Route
              Distribution for Border Gateway Protocol/MultiProtocol
              Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
              Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
              November 2006.

   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
              BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
              2012.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

   [RFC6625]  Rosen, E., Ed., Rekhter, Y., Ed., Hendrickx, W., and R.
              Qiu, "Wildcards in Multicast VPN Auto-Discovery Routes",
              RFC 6625, DOI 10.17487/RFC6625, May 2012.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
              Patel, "Revised Error Handling for BGP UPDATE Messages",
              RFC 7606, DOI 10.17487/RFC7606, August 2015.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

16.2.  Informative References

   [I-D.ietf-bess-evpn-bum-procedure-updates]
              Zhang, Z., Lin, W., Rabadan, J., Patel, K., and A.
              Sajassi, "Updates on EVPN BUM Procedures",
              draft-ietf-bess-evpn-bum-procedure-updates-11 (work in
              progress), October 2021.

   [RFC4541]  Christensen, M., Kimball, K., and F. Solensky,
              "Considerations for Internet Group Management Protocol
              (IGMP) and Multicast Listener Discovery (MLD) Snooping
              Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006.

Authors' Addresses

   Ali Sajassi
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: sajassi@cisco.com


   Samir Thoria
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: sthoria@cisco.com


   Mankamana Mishra
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: mankamis@cisco.com


   Keyur Patel
   Arrcus
   UNITED STATES

   Email: keyur@arrcus.com


   John Drake
   Juniper Networks

   Email: jdrake@juniper.net


   Wen Lin
   Juniper Networks

   Email: wlin@juniper.net