idnits 2.17.1 draft-ietf-bess-evpn-igmp-mld-proxy-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There is 1 instance of too long lines in the document, the longest one being 7 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: 3. When the first hop PE receives an IGMP version-X Join first for (*,G) and then later it receives an IGMP version-Y Join for the same (*,G), then it MUST re-advertise the same EVPN SMET route with flag for version-Y set in addition to any previously-set version flag(s). In other words, the first hop PE MUST not withdraw the EVPN route before sending the new route because the flag field is not part of BGP route key processing. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: 5. 
When a PE receives an EVPN SMET route with more than one version flag set, it will generate the corresponding IGMP report for (*,G) for each version specified in the flags field. With multiple version flags set, there MUST not be source IP address in the receive EVPN route. If there is, then an error SHOULD be logged . If the v3 flag is set (in addition to v2), then the include/exclude flag MUST indicate "exclude". If not, then an error SHOULD be logged. The PE MUST generate an IGMP membership report (Join) for that (*,G) and each IGMP version in the version flag. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: A PE which has received an IGMP Join, would have synced the IGMP Join by the procedure defined in section 6.1. If a PE with local join state goes down or the PE to CE link goes down, it would lead to a mass withdraw of multicast routes. Remote PEs (PEs where these routes were remote IGMP Joins) SHOULD not remove the state immediately; instead General Query SHOULD be generated to refresh the states. There are several ways to Some of the way to detect failure at a peer, e.g. using IGP next hop tracking or ES route withdraw. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: This section describes the procedures used to construct the IGMP Leave Synch route. Support for this route type is optional. If a PE does not support this route, then it MUST not indicate that it supports 'IGMP proxy' in Multicast Flag extended community for the EVIs corresponding to its multi-homed Ethernet Segments. 
== Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: If a router does not support this specification, it MUST not add Multicast Flags Extended Community in BGP route. A router receiving BGP update , if M and I both flag are zero (0), the router MUST treat this Update as malformed . Receiver of such update MUST ignore the extended community. -- The document date (April 28, 2020) is 1459 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'ES' is mentioned on line 608, but not defined == Missing Reference: 'BD' is mentioned on line 608, but not defined Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS WorkGroup A. Sajassi 3 Internet-Draft S. Thoria 4 Intended status: Standards Track Cisco Systems 5 Expires: October 30, 2020 K. Patel 6 Arrcus 7 J. Drake 8 W. Lin 9 Juniper Networks 10 April 28, 2020 12 IGMP and MLD Proxy for EVPN 13 draft-ietf-bess-evpn-igmp-mld-proxy-05 15 Abstract 17 Ethernet Virtual Private Network (EVPN) solution is becoming 18 pervasive in data center (DC) applications for Network Virtualization 19 Overlay (NVO) and DC interconnect (DCI) services, and in service 20 provider (SP) applications for next generation virtual private LAN 21 services. 
23 This draft describes how to efficiently support endpoints running 24 IGMP for the above services over an EVPN network by incorporating 25 IGMP proxy procedures on EVPN PEs. 27 Status of This Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF). Note that other groups may also distribute 34 working documents as Internet-Drafts. The list of current Internet- 35 Drafts is at https://datatracker.ietf.org/drafts/current/. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 This Internet-Draft will expire on October 30, 2020. 44 Copyright Notice 46 Copyright (c) 2020 IETF Trust and the persons identified as the 47 document authors. All rights reserved. 49 This document is subject to BCP 78 and the IETF Trust's Legal 50 Provisions Relating to IETF Documents 51 (https://trustee.ietf.org/license-info) in effect on the date of 52 publication of this document. Please review these documents 53 carefully, as they describe your rights and restrictions with respect 54 to this document. Code Components extracted from this document must 55 include Simplified BSD License text as described in Section 4.e of 56 the Trust Legal Provisions and are provided without warranty as 57 described in the Simplified BSD License. 59 Table of Contents
61 1. Introduction
62 2. Specification of Requirements
63 3. Terminology
64 4. IGMP/MLD Proxy
65    4.1. Proxy Reporting
66      4.1.1. IGMP/MLD Membership Report Advertisement in BGP
67      4.1.2. IGMP/MLD Leave Group Advertisement in BGP
68    4.2. Proxy Querier
69 5. Operation
70    5.1. PE with only attached hosts/VMs for a given subnet
71    5.2. PE with a mix of attached hosts/VMs and multicast source
72    5.3. PE with a mix of attached hosts/VMs, a multicast source
73         and a router
74 6. All-Active Multi-Homing
75    6.1. Local IGMP/MLD Join Synchronization
76    6.2. Local IGMP/MLD Leave Group Synchronization
77      6.2.1. Remote Leave Group Synchronization
78      6.2.2. Common Leave Group Synchronization
79    6.3. Mass Withdraw of Multicast join Sync route in case of
80         failure
81 7. Single-Active Multi-Homing
82 8. Selective Multicast Procedures for IR tunnels
83 9. BGP Encoding
84    9.1. Selective Multicast Ethernet Tag Route
85      9.1.1. Constructing the Selective Multicast Ethernet Tag
86             route
87      9.1.2. Default Selective Multicast Route
88    9.2. Multicast Join Synch Route
89      9.2.1. Constructing the Multicast Join Synch Route
90    9.3. Multicast Leave Synch Route
91      9.3.1. Constructing the Multicast Leave Synch Route
92    9.4. Multicast Flags Extended Community
93    9.5. EVI-RT Extended Community
94    9.6. Rewriting of RT ECs and EVI-RT ECs by ASBRs
95 10. IGMP/MLD Immediate Leave
96 11. IGMP Version 1 Membership Request
97 12. Security Considerations
98 13. IANA Considerations
99 14. Acknowledgement
100 15. Contributors
101 16. References
102    16.1. Normative References
103    16.2. Informative References
104 Authors' Addresses
106 1. Introduction 108 The Ethernet Virtual Private Network (EVPN) solution [RFC7432] is 109 becoming pervasive in data center (DC) applications for Network 110 Virtualization Overlay (NVO) and DC interconnect (DCI) services, and 111 in service provider (SP) applications for next generation virtual 112 private LAN services. 114 In DC applications, a point of delivery (POD) can consist of a 115 collection of servers supported by several top of rack (TOR) and 116 Spine switches. This collection of servers and switches is 117 self-contained and may have its own control protocol for intra-POD 118 communication and orchestration. However, EVPN is used as the standard 119 way of inter-POD communication for both intra-DC and inter-DC. A 120 subnet can span across multiple PODs and DCs. EVPN provides a robust 121 multi-tenant solution with extensive multi-homing capabilities to 122 stretch a subnet (VLAN) across multiple PODs and DCs. There can be 123 many hosts/VMs (several hundred) attached to a subnet that is 124 stretched across several PODs and DCs. 126 These hosts/VMs express their interest in multicast groups on a 127 given subnet/VLAN by sending IGMP membership reports (Joins) for 128 the multicast group(s) they are interested in.
Furthermore, an IGMP router 129 periodically sends membership queries to find out if there are hosts 130 on that subnet that are still interested in receiving multicast 131 traffic for that group. The IGMP/MLD Proxy solution described in 132 this draft has three objectives: 134 1. Reduce flooding of IGMP messages: just as the ARP/ND 135 suppression mechanism in EVPN reduces the flooding of ARP 136 messages over EVPN, a mechanism is desired to 137 reduce the flooding of IGMP messages (both Queries and Reports) 138 in EVPN. 140 2. Distributed anycast multicast proxy: it is desirable for the EVPN 141 network to act as a distributed anycast multicast router with 142 respect to the IGMP/MLD proxy function for all the hosts attached to 143 that subnet. 145 3. Selective Multicast: forward multicast traffic over the EVPN 146 network such that it is forwarded only to the PEs that have 147 interest in the multicast group(s); multicast traffic is not 148 forwarded to the PEs that have no receivers attached to them for 149 that multicast group. This draft shows how this objective may be 150 achieved when Ingress Replication is used to distribute the 151 multicast traffic among the PEs. Procedures for supporting 152 selective multicast using P2MP tunnels can be found in [bum- 153 procedure-updates]. 155 The first two objectives are achieved by using IGMP/MLD proxy on the 156 PE, and the third objective is achieved by setting up a multicast 157 tunnel (e.g., ingress replication) only among the PEs that have 158 interest in the multicast group(s), based on the trigger from the IGMP/ 159 MLD proxy processes. The proposed solutions for each of these 160 objectives are discussed in the following sections. 162 2.
Specification of Requirements 164 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 165 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 166 "OPTIONAL" in this document are to be interpreted as described in BCP 167 14 [RFC2119] [RFC8174] when, and only when, they appear in all 168 capitals, as shown here. 170 3. Terminology 172 o POD: Point of Delivery 174 o ToR: Top of Rack 176 o NV: Network Virtualization 178 o NVO: Network Virtualization Overlay 180 o EVPN: Ethernet Virtual Private Network 182 o IGMP: Internet Group Management Protocol 184 o MLD: Multicast Listener Discovery 186 o EVI: An EVPN instance spanning the Provider Edge (PE) devices 187 participating in that EVPN 189 o MAC-VRF: A Virtual Routing and Forwarding table for Media Access 190 Control (MAC) addresses on a PE 192 o IR: Ingress Replication 193 o Ethernet Segment (ES): When a customer site (device or network) is 194 connected to one or more PEs via a set of Ethernet links, then 195 that set of links is referred to as an 'Ethernet Segment'. 197 o Ethernet Segment Identifier (ESI): A unique non-zero identifier 198 that identifies an Ethernet Segment is called an 'Ethernet Segment 199 Identifier'. 201 o PE: Provider Edge. 203 o BD: Broadcast Domain. As per [RFC7432], an EVI consists of a 204 single or multiple BDs. In case of VLAN-bundle and VLAN-aware 205 bundle service model, an EVI contains multiple BDs. Also, in this 206 document, BD and subnet are equivalent terms. 208 o Ethernet Tag: An Ethernet tag identifies a particular broadcast 209 domain, e.g., a VLAN. An EVPN instance consists of one or more 210 broadcast domains. 212 o Single-Active Redundancy Mode: When only a single PE, among all 213 the PEs attached to an Ethernet segment, is allowed to forward 214 traffic to/from that Ethernet segment for a given VLAN, then the 215 Ethernet segment is defined to be operating in Single-Active 216 redundancy mode. 
218 o All-Active Redundancy Mode: When all PEs attached to an Ethernet 219 segment are allowed to forward known unicast traffic to/from that 220 Ethernet segment for a given VLAN, then the Ethernet segment is 221 defined to be operating in All-Active redundancy mode. 223 This document also assumes familiarity with the terminology of 224 [RFC7432]. Although this document mostly uses the term IGMP 225 membership request (Join), the text applies equally to MLD 226 membership requests. Similarly, text for IGMPv2 applies to MLDv1 227 and text for IGMPv3 applies to MLDv2. The IGMP/MLD version encoding in 228 the BGP update is stated in Section 9. 230 4. IGMP/MLD Proxy 232 The IGMP Proxy mechanism is used to reduce the flooding of IGMP 233 messages over an EVPN network, similar to the ARP proxy used to reduce 234 the flooding of ARP messages over EVPN. It also provides a 235 triggering mechanism for the PEs to set up their underlay multicast 236 tunnels. The IGMP Proxy mechanism consists of two components: 238 1. Proxy for IGMP Reports. 240 2. Proxy for IGMP Queries. 242 4.1. Proxy Reporting 244 When the IGMP protocol is used between hosts/VMs and their first hop EVPN 245 router (EVPN PE), Proxy-reporting is used by the EVPN PE to summarize 246 (when possible) reports received from downstream hosts and propagate 247 them in BGP to other PEs that are interested in the information. 248 This is done by terminating the IGMP Reports in the first hop PE, and 249 translating and exchanging the relevant information among EVPN BGP 250 speakers. The information is translated back to an IGMP message 251 at the recipient EVPN speaker. Thus it helps create an IGMP overlay 252 subnet using BGP. In order to facilitate such an overlay, this 253 document also defines a new EVPN route type NLRI, the EVPN Selective 254 Multicast Ethernet Tag route, along with its procedures to help 255 exchange and register IGMP multicast groups (see Section 9). 257 4.1.1.
IGMP/MLD Membership Report Advertisement in BGP 259 When a PE wants to advertise an IGMP membership report (Join) using 260 the BGP EVPN route, it follows the rules below (BGP encoding is 261 stated in Section 9): 263 1. When the first hop PE receives several IGMP membership reports 264 (Joins), belonging to the same IGMP version, from different 265 attached hosts/VMs for the same (*,G) or (S,G), it SHOULD 266 send only a single BGP message corresponding to the very first IGMP 267 Join (BGP update as soon as possible) for that (*,G) or (S,G). 268 This is because BGP is a stateful protocol and no further 269 transmission of the same report is needed. If the IGMP Join is 270 for (*,G), then the multicast group address MUST be sent along with 271 the corresponding version flag (v2 or v3) set. In case of 272 IGMPv3, the exclude flag MUST also be set to indicate 273 that no source IP address is to be excluded (include all 274 sources "*"). If the IGMP Join is for (S,G), then besides setting 275 the multicast group address along with the version flag v3, the 276 source IP address and the include/exclude flag MUST be set. It 277 should be noted that when advertising the EVPN route for (S,G), 278 the only valid version flag is v3 (v2 flags MUST be set to zero). 280 2. When the first hop PE receives an IGMPv3 Join for (S,G) on a 281 given BD, it SHOULD advertise the corresponding EVPN Selective 282 Multicast Ethernet Tag (SMET) route regardless of whether the 283 source (S) is attached to itself or not, in order to facilitate 284 a source move in the future. 286 3. When the first hop PE receives an IGMP version-X Join first for 287 (*,G) and then later it receives an IGMP version-Y Join for the 288 same (*,G), then it MUST re-advertise the same EVPN SMET route 289 with the flag for version-Y set in addition to any previously-set 290 version flag(s).
In other words, the first hop PE MUST NOT 291 withdraw the EVPN route before sending the new route because the 292 flag field is not part of BGP route key processing. 294 4. When the first hop PE receives an IGMP version-X Join first for 295 (*,G) and then later it receives an IGMPv3 Join for the same 296 multicast group address but for a specific source address S, then 297 the PE MUST advertise a new EVPN SMET route with the v3 flag set (and 298 v2 reset). The include/exclude flag also needs to be set 299 accordingly. Since the source IP address is used as part of BGP 300 route key processing, it is considered a new BGP route 301 advertisement. 303 5. When a PE receives an EVPN SMET route with more than one version 304 flag set, it will generate the corresponding IGMP report for 305 (*,G) for each version specified in the flags field. With 306 multiple version flags set, there MUST NOT be a source IP address 307 in the received EVPN route. If there is, then an error SHOULD be 308 logged. If the v3 flag is set (in addition to v2), then the 309 include/exclude flag MUST indicate "exclude". If not, then an 310 error SHOULD be logged. The PE MUST generate an IGMP membership 311 report (Join) for that (*,G) and each IGMP version in the version 312 flag. 314 6. When a PE receives a list of EVPN SMET NLRIs in its BGP update 315 message, each with a different source IP address and the same 316 multicast group address, and the version flag is set to v3, then 317 the PE generates an IGMPv3 membership report with a record 318 corresponding to the list of source IP addresses and the group 319 address along with the proper indication of inclusion/exclusion. 321 7. Upon receiving EVPN SMET route(s) and before generating the 322 corresponding IGMP Join(s), the PE checks to see whether it has 323 any CE multicast router for that BD on any of its ESes. The PE 324 performs this check by listening for PIM Hello messages on that 325 AC (i.e., ES,BD).
If the PE does have ACs leading to a multicast router, then the 326 generated IGMP Join(s) are sent to those ACs. If it doesn't have 327 any such AC, then no IGMP Join needs to be 328 generated. This is because sending IGMP Joins to other hosts can 329 result in unintentionally preventing a host from joining a 330 specific multicast group using IGMPv2 - i.e., if the PE does not 331 receive a join from the host it will not forward multicast data 332 to it. Per [RFC4541], when an IGMPv2 host receives a membership 333 report for a group address that it intends to join, the host will 334 suppress its own membership report for the same group, and if the 335 PE does not receive an IGMP Join from the host it will not forward 336 multicast data to it. In other words, an IGMPv2 Join MUST NOT be 337 sent on an AC that does not lead to a CE multicast router. This 338 message suppression is a requirement for IGMPv2 hosts. This is 339 not a problem for hosts running IGMPv3 because there is no 340 suppression of IGMP Membership reports. 342 4.1.2. IGMP/MLD Leave Group Advertisement in BGP 344 When a PE wants to withdraw an EVPN SMET route corresponding to an 345 IGMPv2 Leave Group (Leave) or IGMPv3 "Leave" equivalent message, it 346 follows the rules below: 348 1. When a PE receives an IGMPv2 Leave Group or its "Leave" 349 equivalent message for IGMPv3 from its attached host, it checks 350 to see if this host is the last host that is interested in this 351 multicast group by sending a query for the multicast group. If 352 the host was indeed the last one (i.e., no responses are received 353 for the query), then the PE MUST re-advertise the EVPN SMET 354 route with the corresponding version flag reset. If 355 this is the last version flag to be reset, then instead of 356 re-advertising the EVPN route with all version flags reset, the PE 357 MUST withdraw the EVPN route for that (*,G). 359 2.
When a PE receives an EVPN SMET route for a given (*,G), it 360 compares the received version flags from the route with its 361 per-PE stored version flags. If the PE finds that a version flag 362 associated with the (*,G) for the remote PE is reset, then the PE 363 MUST generate an IGMP Leave for that (*,G) toward its local 364 interface (if any) attached to the multicast router for that 365 multicast group. It should be noted that the received EVPN route 366 SHOULD have at least one version flag set. If all version flags 367 are reset, it is an error because the PE should have received an 368 EVPN route withdraw for the last version flag. This error MUST be 369 treated as a BGP error and SHOULD be handled as per [RFC7606]. 371 3. When a PE receives an EVPN SMET route withdraw, it removes the 372 remote PE from its OIF list for that multicast group and if there 373 are no more OIF entries for that multicast group (either locally 374 or remotely), then the PE MUST stop responding to queries from 375 the locally attached router (if any). If there is a source for 376 that multicast group, the PE stops sending multicast traffic for 377 that source. 379 4.2. Proxy Querier 381 As mentioned in the previous sections, each PE MUST have proxy 382 querier functionality for the following reasons: 384 1. To enable the collection of EVPN PEs providing L2VPN service to 385 act as a distributed multicast router with an anycast IP address for 386 all attached hosts/VMs in that subnet. 388 2. To enable suppression of IGMP membership reports and queries over 389 the MPLS/IP core. 391 5. Operation 393 Consider the EVPN network of Figure 1, where there is an EVPN 394 instance configured across the PEs shown in this figure (namely PE1, 395 PE2, and PE3). Let's consider that this EVPN instance consists of a 396 single bridge domain (single subnet) with all the hosts, sources, and 397 the multicast router connected to this subnet. PE1 only has hosts 398 connected to it.
PE2 has a mix of hosts and a multicast source. PE3 399 has a mix of hosts, a multicast source, and a multicast router. 400 Furthermore, let's consider that for (S1,G1), R1 is used as the 401 multicast router. The following subsections describe the IGMP proxy 402 operation in different PEs with regard to whether the locally 403 attached devices for that subnet are: 405 o only hosts/VMs 407 o mix of hosts/VMs and multicast source 409 o mix of hosts/VMs, multicast source, and multicast router
410                         +--------------+
411                         |              |
412                         |              |
413                  +----+ |              | +----+
414   H1:(*,G1)v2 ---|    | |              | |    |---- H6(*,G1)v2
415   H2:(*,G1)v2 ---| PE1| |   IP/MPLS    | | PE2|---- H7(S2,G2)v3
416   H3:(*,G1)v3 ---|    | |   Network    | |    |---- S2
417   H4:(S2,G2)v3 --|    | |              | |    |
418                  +----+ |              | +----+
419                         |              |
420                  +----+ |              |
421   H5:(S1,G1)v3 --|    | |              |
422            S1 ---| PE3| |              |
423            R1 ---|    | |              |
424                  +----+ |              |
425                         |              |
426                         +--------------+
428                      Figure 1: EVPN network
430 5.1. PE with only attached hosts/VMs for a given subnet 432 When PE1 receives an IGMPv2 Join Report from H1, it does not forward 433 this join to any of its other ports (for this subnet) because all 434 these local ports are associated with the hosts/VMs. PE1 sends an 435 EVPN SMET route corresponding to this join for (*,G1) with 436 the v2 flag set. This EVPN route is received by PE2 and PE3, which are 437 members of the same BD (i.e., same EVI in case of VLAN-based 438 service or EVI,VLAN in case of VLAN-aware bundle service). PE3 439 reconstructs the IGMPv2 Join Report from this EVPN BGP route and only 440 sends it to the port(s) with multicast routers attached to it (for 441 that subnet). In this example, PE3 sends the reconstructed IGMPv2 442 Join Report for (*,G1) only to R1. Furthermore, even though PE2 443 receives the EVPN BGP route, it does not send it to any of its ports 444 for that subnet; viz., the ports associated with H6 and H7.
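As a non-normative illustration of the behavior in the preceding paragraph, the following Python sketch shows how a PE might decide where to send IGMP Joins reconstructed from a received SMET route. The names SmetRoute, Port, has_mcast_router, and reconstruct_joins are hypothetical and are not defined by this document; how a port is marked as router-facing (for example, by listening for PIM Hello messages) is outside the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SmetRoute:
    """Hypothetical, simplified view of a received SMET route."""
    group: str                 # multicast group address G
    source: Optional[str]      # source S, or None for (*,G)
    versions: frozenset        # IGMP versions whose flags are set, e.g. {2}


@dataclass(frozen=True)
class Port:
    """Hypothetical local port (AC) in the BD."""
    name: str
    has_mcast_router: bool     # assumed to be learned elsewhere


def reconstruct_joins(route: SmetRoute,
                      ports: List[Port]) -> List[Tuple[str, int, str, str]]:
    """Return (port, igmp_version, source, group) for each Join to emit.

    Joins go only to router-facing ports; host-only ports get nothing,
    so a PE without an attached multicast router emits no Join at all.
    """
    router_ports = [p for p in ports if p.has_mcast_router]
    return [
        (p.name, version, route.source or "*", route.group)
        for p in router_ports
        for version in sorted(route.versions)
    ]


# Example from Figure 1: PE1 advertised (*,G1) with the v2 flag set.
route = SmetRoute(group="G1", source=None, versions=frozenset({2}))
pe3_ports = [Port("H5", False), Port("S1", False), Port("R1", True)]
pe2_ports = [Port("H6", False), Port("H7", False), Port("S2", False)]

pe3_joins = reconstruct_joins(route, pe3_ports)  # one IGMPv2 Join, to R1
pe2_joins = reconstruct_joins(route, pe2_ports)  # empty: no router port
```

In this sketch PE3 emits a single IGMPv2 Join toward R1, while PE2 emits nothing because none of its ports leads to a multicast router.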
446 When PE1 receives the second IGMPv2 Join from H2 for the same 447 multicast group (*,G1), it only adds that port to its OIF list but it 448 doesn't send any EVPN BGP route because there is no change in 449 information. However, when it receives the IGMPv3 Join from H3 for 450 the same (*,G1), besides adding the corresponding port to its OIF 451 list, it re-advertises the previously sent EVPN SMET route with the 452 v3 and exclude flags set. 454 Finally, when PE1 receives the IGMPv3 Join from H4 for (S2,G2), it 455 advertises a new EVPN SMET route corresponding to it. 457 5.2. PE with a mix of attached hosts/VMs and multicast source 459 The main difference in this case is that when PE2 receives the IGMPv3 460 Join from H7 for (S2,G2), it does advertise it in BGP to support 461 source move even though PE2 knows that S2 is attached to its local 462 AC. PE2 adds the port associated with H7 to its OIF list for 463 (S2,G2). The processing for the IGMPv2 Join received from H6 is the same as 464 described in the previous section. 466 5.3. PE with a mix of attached hosts/VMs, a multicast source and a 467 router 469 The main difference in this case relative to the previous two 470 sections is that IGMP v2/v3 Join messages received locally need to 471 be sent to the port associated with router R1. Furthermore, the 472 Joins received via BGP (SMET) need to be passed to the R1 port but 473 filtered for all other ports. 475 6. All-Active Multi-Homing 477 Because the LAG flow hashing algorithm used by the CE is unknown at 478 the PE, in an All-Active redundancy mode it must be assumed that the 479 CE can send a given IGMP message to any one of the multi-homed PEs, 480 either DF or non-DF; i.e., different IGMP Join messages can arrive at 481 different PEs in the redundancy group and furthermore their 482 corresponding Leave messages can arrive at PEs that are different 483 from the ones that received the Join messages.
Therefore, all PEs 484 attached to a given ES must coordinate IGMP Join and Leave Group 485 (x,G) state, where x may be either '*' or a particular source S, for 486 each BD on that ES. This allows the DF for that [ES,BD] to correctly 487 advertise or withdraw a Selective Multicast Ethernet Tag (SMET) route 488 for that (x,G) group in that BD when needed. All-Active multihoming 489 PEs for a given ES MUST support IGMP synchronization procedures 490 described in this section if they need to perform IGMP proxy for 491 hosts connected to that ES. 493 6.1. Local IGMP/MLD Join Synchronization 495 When a PE, either DF or non-DF, receives on a given multihomed ES 496 operating in All-Active redundancy mode, an IGMP Membership Report 497 for (x,G), it determines the BD to which the IGMP Membership Report 498 belongs. If the PE doesn't already have local IGMP Join (x,G) state 499 for that BD on that ES, it MUST instantiate local IGMP Join (x,G) 500 state and MUST advertise a BGP IGMP Join Synch route for that 501 [ES,BD]. Local IGMP Join (x, G) state refers to IGMP Join (x,G) 502 state that is created as a result of processing an IGMP Membership 503 Report for (x,G). 505 The IGMP Join Synch route MUST carry the ES-Import RT for the ES on 506 which the IGMP Membership Report was received. Thus it MUST only be 507 sent to the PEs attached to that ES and not any other PEs. 509 When a PE, either DF or non-DF, receives an IGMP Join Synch route it 510 installs that route and if it doesn't already have IGMP Join (x,G) 511 state for that [ES,BD], it MUST instantiate that IGMP Join (x,G) 512 state - i.e., IGMP Join (x,G) state is the union of the local IGMP 513 Join (x,G) state and the installed IGMP Join Synch route. If the DF 514 did not already advertise (originate) a SMET route for that (x,G) 515 group in that BD, it MUST do so now. 
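As a non-normative sketch of the state handling described above, the following Python fragment models IGMP Join (x,G) state for one [ES,BD] as the union of local state and installed Join Synch routes, and shows how the DF could decide whether an SMET route should be originated. The class EsBdJoinState, its methods, and the function df_should_originate_smet are hypothetical names introduced for illustration only; BGP encoding and route withdrawal are deliberately left out.

```python
from collections import defaultdict


class EsBdJoinState:
    """Hypothetical per-[ES,BD] IGMP Join state on one multihomed PE."""

    def __init__(self):
        # (x, G) keys for which a local IGMP Membership Report was processed.
        self.local = set()
        # (x, G) -> set of peer PEs whose Join Synch route is installed.
        self.synced = defaultdict(set)

    def on_local_report(self, xg):
        """Process a local Membership Report for (x,G).

        Returns True when local state is newly created, i.e. when a
        Join Synch route has to be advertised for this [ES,BD].
        """
        if xg in self.local:
            return False
        self.local.add(xg)
        return True

    def on_join_synch(self, xg, peer):
        """Install a Join Synch route received from another PE on the ES."""
        self.synced[xg].add(peer)

    def has_join(self, xg):
        """Join state is the union of local state and installed routes."""
        return xg in self.local or bool(self.synced.get(xg))


def df_should_originate_smet(df_states, xg):
    """True if the PE has (x,G) Join state on at least one ES in this BD
    for which it is DF. df_states holds the state of exactly those ESes."""
    return any(state.has_join(xg) for state in df_states)


# A non-DF peer (PE2) received the Join; the DF learns it via Join Synch.
es1 = EsBdJoinState()
es1.on_join_synch(("*", "G1"), "PE2")
originate = df_should_originate_smet([es1], ("*", "G1"))
```

Keeping the local set and the per-peer installed routes separate is one possible design choice; it makes the union rule explicit and leaves room for removing one contribution without disturbing the other.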
517 When a PE, either DF or non-DF, deletes its local IGMP Join (x, G) 518 state for that [ES,BD], it MUST withdraw its BGP IGMP Join Synch 519 route for that [ES,BD]. 521 When a PE, either DF or non-DF, receives the withdrawal of an IGMP 522 Join Synch route from another PE it MUST remove that route. When a 523 PE has no local IGMP Join (x,G) state and it has no installed IGMP 524 Join Synch routes, it MUST remove IGMP Join (x,G) state for that 525 [ES,BD]. If the DF no longer has IGMP Join (x,G) state for that BD 526 on any ES for which it is DF, it MUST withdraw its SMET route for 527 that (x,G) group in that BD. 529 In other words, a PE advertises an SMET route for that (x,G) group in 530 that BD when it has IGMP Join (x,G) state in that BD on at least one 531 ES for which it is DF and it withdraws that SMET route when it does 532 not have IGMP Join (x,G) state in that BD on any ES for which it is 533 DF. 535 6.2. Local IGMP/MLD Leave Group Synchronization 537 When a PE, either DF or non-DF, receives, on a given multihomed ES 538 operating in All-Active redundancy mode, an IGMP Leave Group message 539 for (x,G) from the attached CE, it determines the BD to which the 540 IGMPv2 Leave Group belongs. Regardless of whether it has IGMP Join 541 (x,G) state for that [ES,BD], it initiates the (x,G) leave group 542 synchronization procedure, which consists of the following steps: 544 1. It computes the Maximum Response Time, which is the duration of 545 (x,G) leave group synchronization procedure. This is the product 546 of two locally configured values, Last Member Query Count and 547 Last Member Query Interval (described in Section 3 of [RFC2236]), 548 plus a delta corresponding to the time it takes for a BGP 549 advertisement to propagate between the PEs attached to the 550 multihomed ES (delta is a consistently configured value on all 551 PEs attached to the multihomed ES). 553 2. It starts the Maximum Response Time timer. 
       Note that the receipt of subsequent IGMP Leave Group messages or
       BGP Leave Synch routes for (x,G) does not change the value of a
       currently running Maximum Response Time timer; such messages
       and routes are ignored by the PE.

   3.  It initiates the Last Member Query procedure described in
       Section 3 of [RFC2236]; viz, it sends a number of Group-Specific
       Query (x,G) messages (Last Member Query Count) at a fixed
       interval (Last Member Query Interval) to the attached CE.

   4.  It advertises an IGMP Leave Synch route for that [ES,BD].  This
       route notifies the other multihomed PEs attached to the given
       multihomed ES that it has initiated an (x,G) leave group
       synchronization procedure; i.e., it carries the ES-Import RT for
       the ES on which the IGMP Leave Group was received.  It also
       contains the Maximum Response Time and the Leave Group
       Synchronization Procedure Sequence number.  The latter
       identifies the specific (x,G) leave group synchronization
       procedure initiated by the advertising PE, which increments the
       value whenever it initiates a procedure.

   5.  When the Maximum Response Time timer expires, the PE that has
       advertised the IGMP Leave Synch route withdraws it.

6.2.1.  Remote Leave Group Synchronization

   When a PE, either DF or non-DF, receives an IGMP Leave Synch route,
   it installs that route and starts a timer for (x,G) on the specified
   [ES,BD] whose value is set to the Maximum Response Time in the
   received IGMP Leave Synch route.  Note that the receipt of
   subsequent IGMPv2 Leave Group messages or BGP Leave Synch routes for
   (x,G) does not change the value of a currently running Maximum
   Response Time timer; such messages and routes are ignored by the PE.

6.2.2.  Common Leave Group Synchronization

   If a PE attached to the multihomed ES receives an IGMP Membership
   Report for (x,G) before the Maximum Response Time timer expires, it
   advertises a BGP IGMP Join Synch route for that [ES,BD].
   If it doesn't already have local IGMP Join (x,G) state for that
   [ES,BD], it instantiates local IGMP Join (x,G) state.  If the DF is
   not currently advertising (originating) a SMET route for that (x,G)
   group in that BD, it does so now.

   If a PE attached to the multihomed ES receives an IGMP Join Synch
   route for (x,G) before the Maximum Response Time timer expires, it
   installs that route and, if it doesn't already have IGMP Join (x,G)
   state for that BD on that ES, it instantiates that IGMP Join (x,G)
   state.  If the DF has not already advertised (originated) a SMET
   route for that (x,G) group in that BD, it does so now.

   When the Maximum Response Time timer expires, a PE that has
   advertised an IGMP Leave Synch route withdraws it.  Any PE attached
   to the multihomed ES that started the Maximum Response Time timer
   and has no local IGMP Join (x,G) state and no installed IGMP Join
   Synch routes removes IGMP Join (x,G) state for that [ES,BD].  If the
   DF no longer has IGMP Join (x,G) state for that BD on any ES for
   which it is DF, it withdraws its SMET route for that (x,G) group in
   that BD.

6.3.  Mass Withdraw of Multicast Join Synch Route in Case of Failure

   A PE that has received an IGMP Join would have synced the IGMP Join
   by the procedure defined in Section 6.1.  If a PE with local join
   state goes down or the PE-to-CE link goes down, it would lead to a
   mass withdraw of multicast routes.  Remote PEs (PEs where these
   routes were remote IGMP Joins) SHOULD NOT remove the state
   immediately; instead, a General Query SHOULD be generated to refresh
   the states.  There are several ways to detect failure at a peer,
   e.g., using IGP next hop tracking or ES route withdraw.

7.  Single-Active Multi-Homing

   Note that to facilitate state synchronization after failover, the PEs
   attached to a multihomed ES operating in Single-Active redundancy
   mode SHOULD also coordinate IGMP Join (x,G) state.  In this case all
   IGMP Join messages are received by the DF and distributed to the
   non-DF PEs using the procedures described above.

8.  Selective Multicast Procedures for IR tunnels

   If an ingress PE uses ingress replication, then for a given (x,G)
   group in a given BD:

   1.  It sends (x,G) traffic to the set of PEs not supporting IGMP
       Proxy.  This set consists of any PE that has advertised an
       Inclusive Multicast Ethernet Tag route for the BD without the
       "IGMP Proxy Support" flag.

   2.  It sends (x,G) traffic to the set of PEs supporting IGMP Proxy
       and having listeners for that (x,G) group in that BD.  This set
       consists of any PE that has advertised an Inclusive Multicast
       Ethernet Tag route for the BD with the "IGMP Proxy Support" flag
       and that has advertised a SMET route for that (x,G) group in
       that BD.

   If an ingress PE's Selective P-Tunnel for a given BD uses P2MP and
   all of the PEs in the BD support that tunnel type and IGMP proxy,
   then for a given (x,G) group in a given BD it sends (x,G) traffic
   using the Selective P-Tunnel for that (x,G) group in that BD.  This
   tunnel includes those PEs that have advertised a SMET route for that
   (x,G) group on that BD (for Selective P-tunnel) but it may include
   other PEs as well (for Aggregate Selective P-tunnel).

9.  BGP Encoding

   This document defines three new BGP EVPN routes to carry IGMP
   membership reports.  The route types are known as:

   + 6 - Selective Multicast Ethernet Tag Route

   + 7 - Multicast Join Synch Route

   + 8 - Multicast Leave Synch Route

   The detailed encoding and procedures for these route types are
   described in subsequent sections.

9.1.  Selective Multicast Ethernet Tag Route

   A Selective Multicast Ethernet Tag route type specific EVPN NLRI
   consists of the following:

   +---------------------------------------+
   |  RD (8 octets)                        |
   +---------------------------------------+
   |  Ethernet Tag ID (4 octets)           |
   +---------------------------------------+
   |  Multicast Source Length (1 octet)    |
   +---------------------------------------+
   |  Multicast Source Address (variable)  |
   +---------------------------------------+
   |  Multicast Group Length (1 octet)     |
   +---------------------------------------+
   |  Multicast Group Address (Variable)   |
   +---------------------------------------+
   |  Originator Router Length (1 octet)   |
   +---------------------------------------+
   |  Originator Router Address (variable) |
   +---------------------------------------+
   |  Flags (1 octet)                      |
   +---------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the one-
   octet Flags field.  The Flags field is defined as follows:

     0  1  2  3  4  5  6  7
   +--+--+--+--+--+--+--+--+
   | reserved  |IE|v3|v2|v1|
   +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1.  Since IGMPv1 is being deprecated, the sender MUST set
      it to 0 for IGMP and the receiver MUST ignore it.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  This EVPN route type is used to carry tenant IGMP multicast group
      information.
      The Flags field assists in distributing IGMP membership interest
      of a given host/VM for a given multicast route.  The version bits
      help associate the IGMP version of receivers participating within
      the EVPN domain.

   o  The include/exclude bit helps in creating filters for a given
      multicast route.

   o  If the route is used for IPv6 (MLD), then bit 7 indicates support
      for MLD version 1.  The second least significant bit, bit 6,
      indicates support for MLD version 2.  Since there is no MLD
      version 3, in the case of an IPv6 route the third least
      significant bit MUST be 0.  In the case of IPv6 routes, the
      fourth least significant bit MUST be ignored if bit 6 is not set.

   o  Reserved bits SHOULD be set to 0 by the sender and SHOULD be
      ignored by the receiver.

9.1.1.  Constructing the Selective Multicast Ethernet Tag route

   This section describes the procedures used to construct the Selective
   Multicast Ethernet Tag (SMET) route.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364].  The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Tag ID MUST be set as follows:

   o  EVI is VLAN-Based or VLAN Bundle service - set to 0

   o  EVI is VLAN-Aware Bundle service without translation - set to the
      customer VID for that BD

   o  EVI is VLAN-Aware Bundle service with translation - set to the
      normalized Ethernet Tag ID - e.g., normalized VID

   The Multicast Source Length MUST be set to the length of the
   Multicast Source Address in bits.  If the Multicast Source Address
   field contains an IPv4 address, then the value of the Multicast
   Source Length field is 32.  If the Multicast Source Address field
   contains an IPv6 address, then the value of the Multicast Source
   Length field is 128.  In case of a (*,G) Join, the Multicast Source
   Length is set to 0.
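
   As an illustration of the Multicast Source Length rule above, the
   following sketch derives the length and address octets from a source
   given as text.  The function name is hypothetical, and representing
   a (*,G) source as None with zero address octets is an assumption of
   this sketch, not a statement about any particular implementation.

```python
import ipaddress
from typing import Optional, Tuple


def encode_multicast_source(source: Optional[str]) -> Tuple[int, bytes]:
    """Return (Multicast Source Length in bits, Multicast Source Address octets)."""
    if source is None:
        # (*,G) Join: the length is 0 and no address octets follow (assumed encoding).
        return 0, b""
    addr = ipaddress.ip_address(source)
    # max_prefixlen is 32 for an IPv4 address and 128 for an IPv6 address.
    return addr.max_prefixlen, addr.packed
```

   The same rule applies to the Multicast Group Length, except that a
   group address is always present.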

   The Multicast Source Address is the source IP address from the IGMP
   membership report.  In case of a (*,G) Join, this field is not used.

   The Multicast Group Length MUST be set to the length of the
   Multicast Group Address in bits.  If the Multicast Group Address
   field contains an IPv4 address, then the value of the Multicast
   Group Length field is 32.  If the Multicast Group Address field
   contains an IPv6 address, then the value of the Multicast Group
   Length field is 128.

   The Multicast Group Address is the Group address from the IGMP or MLD
   membership report.

   The Originator Router Length is the length of the Originator Router
   Address in bits.

   The Originator Router Address is the IP address of the router
   originating this route.  The SMET Originator Router IP address MUST
   match that of the IMET (or SPMSI AD) route originated for the same
   EVI by the same downstream PE.

   The Flags field indicates the version of the IGMP protocol from
   which the membership report was received.  It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

   Reserved bits MUST be set to 0.  They may be defined in the future
   by other documents.

   TORs use IGMP to receive group membership information from hosts/VMs.
   Upon receiving a host's/VM's expression of interest in a particular
   group membership, this information is then forwarded using a SMET
   route.  The NLRI also keeps track of the receiver's IGMP protocol
   version and any source filtering for a given group membership.  All
   EVPN SMET routes are announced with per-EVI Route Target extended
   communities.

   When a router receives a BGP Update that contains the Selective
   Multicast Route flag with its Partial bit set (not following this
   specification) and determines that the route is malformed, the
   router SHOULD treat this Update as malformed.  The error MUST be
   considered a BGP error, and the malformed route SHOULD be discarded
   as per [RFC7606].
   An implementation SHOULD provide debugging facilities to permit
   issues caused by such a malformed route to be diagnosed.  At a
   minimum, such facilities MUST include logging an error when such a
   route is detected.

9.1.2.  Default Selective Multicast Route

   If there is a multicast router connected behind the EVPN domain, the
   PE MAY originate a default SMET (*,*) route to get all multicast
   traffic in the domain.

                          +--------------+
                          |              |
                          |              |
                          |              |    +----+
                          |              |    |    |---- H1(*,G1)v2
                          |   IP/MPLS    |    | PE1|---- H2(S2,G2)v3
                          |   Network    |    |    |---- S2
                          |              |    |    |
                          |              |    +----+
                          |              |
                   +----+ |              |
   +----+          |    | |              |
   |    |    S1 ---| PE2| |              |
   |PIM |----R1 ---|    | |              |
   |ASM |          +----+ |              |
   |    |                 |              |
   +----+                 +--------------+

           Figure 2: Multicast Router behind EVPN domain

   Consider the EVPN network of Figure 2, where there is an EVPN
   instance configured across the PEs.  Suppose PE2 is connected to
   multicast router R1 and there is a network running PIM ASM behind
   R1.  If there are receivers behind the PIM ASM network, the PIM Join
   would be forwarded to the PIM RP (Rendezvous Point).  If receivers
   behind the PIM ASM network are interested in a multicast flow
   originated by multicast source S2 (behind PE1), it is necessary for
   PE2 to receive multicast traffic.  In this case PE2 MUST originate a
   (*,*) SMET route to receive all of the multicast traffic in the EVPN
   domain.

9.2.  Multicast Join Synch Route

   This EVPN route type is used to coordinate IGMP Join (x,G) state for
   a given BD between the PEs attached to a given ES operating in All-
   Active (or Single-Active) redundancy mode and it consists of the
   following:

   +--------------------------------------------------+
   |  RD (8 octets)                                   |
   +--------------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)         |
   +--------------------------------------------------+
   |  Ethernet Tag ID (4 octets)                      |
   +--------------------------------------------------+
   |  Multicast Source Length (1 octet)               |
   +--------------------------------------------------+
   |  Multicast Source Address (variable)             |
   +--------------------------------------------------+
   |  Multicast Group Length (1 octet)                |
   +--------------------------------------------------+
   |  Multicast Group Address (Variable)              |
   +--------------------------------------------------+
   |  Originator Router Length (1 octet)              |
   +--------------------------------------------------+
   |  Originator Router Address (variable)            |
   +--------------------------------------------------+
   |  Flags (1 octet)                                 |
   +--------------------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the one-
   octet Flags field, whose fields are defined as follows:

     0  1  2  3  4  5  6  7
   +--+--+--+--+--+--+--+--+
   | reserved  |IE|v3|v2|v1|
   +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  Reserved bits MUST be set to 0.  They may be defined in the
      future by other documents.

   The Flags field assists in distributing IGMP membership interest of a
   given host/VM for a given multicast route.  The version bits help
   associate the IGMP version of receivers participating within the
   EVPN domain.  The include/exclude bit helps in creating filters for
   a given multicast route.

   If the route is being prepared for IPv6 (MLD), then bit 7 indicates
   support for MLD version 1.  The second least significant bit, bit 6,
   indicates support for MLD version 2.  Since there is no MLD version
   3, in the case of an IPv6 route the third least significant bit MUST
   be 0.  In the case of an IPv6 route, the fourth least significant
   bit MUST be ignored if bit 6 is not set.

9.2.1.  Constructing the Multicast Join Synch Route

   This section describes the procedures used to construct the IGMP Join
   Synch route.  Support for this route type is optional.  If a PE does
   not support this route, then it MUST NOT indicate that it supports
   'IGMP proxy' in the Multicast Flag extended community for the EVIs
   corresponding to its multi-homed Ethernet Segments (ESs).

   An IGMP Join Synch route MUST carry exactly one ES-Import Route
   Target extended community, the one that corresponds to the ES on
   which the IGMP Join was received.  It MUST also carry exactly one
   EVI-RT EC, the one that corresponds to the EVI on which the IGMP Join
   was received.  See Section 9.5 for details on how to encode and
   construct the EVI-RT EC.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364].
   The value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet
   value defined for the ES.

   The Ethernet Tag ID MUST be set as follows:

   o  EVI is VLAN-Based or VLAN Bundle service - set to 0

   o  EVI is VLAN-Aware Bundle service without translation - set to the
      customer VID for the BD

   o  EVI is VLAN-Aware Bundle service with translation - set to the
      normalized Ethernet Tag ID - e.g., normalized VID

   The Multicast Source Length MUST be set to the length of the
   Multicast Source address in bits.  If the Multicast Source field
   contains an IPv4 address, then the value of the Multicast Source
   Length field is 32.  If the Multicast Source field contains an IPv6
   address, then the value of the Multicast Source Length field is 128.
   In case of a (*,G) Join, the Multicast Source Length is set to 0.

   The Multicast Source is the Source IP address of the IGMP membership
   report.  In case of a (*,G) Join, this field does not exist.

   The Multicast Group Length MUST be set to the length of the
   multicast group address in bits.  If the Multicast Group field
   contains an IPv4 address, then the value of the Multicast Group
   Length field is 32.  If the Multicast Group field contains an IPv6
   address, then the value of the Multicast Group Length field is 128.

   The Multicast Group is the Group address of the IGMP membership
   report.

   The Originator Router Length is the length of the Originator Router
   address in bits.

   The Originator Router Address is the IP address of the router
   originating the prefix.

   The Flags field indicates the version of the IGMP protocol from
   which the membership report was received.  It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

   Reserved bits MUST be set to 0.  They may be defined in the future
   by other documents.
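
   The Flags octet layout for IPv4 (IGMP) routes can be sketched as
   follows.  This is an illustration only: the constant and function
   names are hypothetical, the encoder accepts only IGMP versions 2 and
   3 (an assumption motivated by the deprecation of IGMPv1), and the
   MLD interpretation of the same bits is not covered.

```python
# Bit 7 is the least significant bit of the octet.
V1 = 0x01             # bit 7: IGMP version 1
V2 = 0x02             # bit 6: IGMP version 2
V3 = 0x04             # bit 5: IGMP version 3
IE = 0x08             # bit 4: 0 = Include Group type, 1 = Exclude Group type
RESERVED_MASK = 0xF0  # bits 0-3


def encode_flags(versions, exclude=False):
    """Build the Flags octet for the given set of IGMP versions (2 and/or 3)."""
    bit_for = {2: V2, 3: V3}
    flags = 0
    for version in versions:
        if version not in bit_for:
            raise ValueError(f"unsupported IGMP version: {version}")
        flags |= bit_for[version]
    if exclude:
        flags |= IE
    return flags      # reserved bits are left at 0


def decode_flags(octet):
    """Return (set of IGMP versions, exclude) with reserved bits ignored."""
    octet &= 0xFF & ~RESERVED_MASK
    versions = {v for v, bit in ((1, V1), (2, V2), (3, V3)) if octet & bit}
    # The Exclude Group type bit is ignored unless the v3 bit is set.
    exclude = bool(octet & IE) if octet & V3 else False
    return versions, exclude
```

   For example, a (*,G) route carrying both v2 and v3 interest in
   Exclude mode would be encoded by encode_flags({2, 3}, exclude=True).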

   When a multihomed router receives a BGP Update that contains the
   Multicast Join Sync Route flag with its Partial bit set (not
   following this specification) and determines that the route is
   malformed, the router SHOULD treat this Update as malformed.  The
   error MUST be considered a BGP error, and the malformed route SHOULD
   be discarded as per [RFC7606].  An implementation SHOULD provide
   debugging facilities to permit issues caused by a malformed join
   sync route to be diagnosed.  At a minimum, such facilities MUST
   include logging an error when such a route is detected.

9.3.  Multicast Leave Synch Route

   This EVPN route type is used to coordinate IGMP Leave Group (x,G)
   state for a given BD between the PEs attached to a given ES operating
   in All-Active (or Single-Active) redundancy mode and it consists of
   the following:

   +--------------------------------------------------+
   |  RD (8 octets)                                   |
   +--------------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)         |
   +--------------------------------------------------+
   |  Ethernet Tag ID (4 octets)                      |
   +--------------------------------------------------+
   |  Multicast Source Length (1 octet)               |
   +--------------------------------------------------+
   |  Multicast Source Address (variable)             |
   +--------------------------------------------------+
   |  Multicast Group Length (1 octet)                |
   +--------------------------------------------------+
   |  Multicast Group Address (Variable)              |
   +--------------------------------------------------+
   |  Originator Router Length (1 octet)              |
   +--------------------------------------------------+
   |  Originator Router Address (variable)            |
   +--------------------------------------------------+
   |  Reserved (4 octets)                             |
   +--------------------------------------------------+
   |  Maximum Response Time (1 octet)                 |
   +--------------------------------------------------+
   |  Flags (1 octet)                                 |
   +--------------------------------------------------+

   For the purpose of BGP route key processing, all the fields are
   considered to be part of the prefix in the NLRI except for the
   Reserved, Maximum Response Time and the one-octet Flags field, whose
   fields are defined as follows:

     0  1  2  3  4  5  6  7
   +--+--+--+--+--+--+--+--+
   | reserved  |IE|v3|v2|v1|
   +--+--+--+--+--+--+--+--+

   o  The least significant bit, bit 7, indicates support for IGMP
      version 1.

   o  The second least significant bit, bit 6, indicates support for
      IGMP version 2.

   o  The third least significant bit, bit 5, indicates support for
      IGMP version 3.

   o  The fourth least significant bit, bit 4, indicates whether the
      (S,G) information carried within the route-type is of an Include
      Group type (bit value 0) or an Exclude Group type (bit value 1).
      The Exclude Group type bit MUST be ignored if bit 5 is not set.

   o  Reserved bits MUST be set to 0.  They may be defined in the
      future by other documents.

   The Flags field assists in distributing IGMP membership interest of a
   given host/VM for a given multicast route.  The version bits help
   associate the IGMP version of receivers participating within the
   EVPN domain.  The include/exclude bit helps in creating filters for
   a given multicast route.

   If the route is being prepared for IPv6 (MLD), then bit 7 indicates
   support for MLD version 1.  The second least significant bit, bit 6,
   indicates support for MLD version 2.  Since there is no MLD version
   3, in the case of an IPv6 route the third least significant bit MUST
   be 0.  In the case of an IPv6 route, the fourth least significant
   bit MUST be ignored if bit 6 is not set.

   Reserved bits in the Flags field MUST be set to 0.  They may be
   defined in the future by other documents.

9.3.1.  Constructing the Multicast Leave Synch Route

   This section describes the procedures used to construct the IGMP
   Leave Synch route.  Support for this route type is optional.  If a PE
   does not support this route, then it MUST NOT indicate that it
   supports 'IGMP proxy' in the Multicast Flag extended community for
   the EVIs corresponding to its multi-homed Ethernet Segments.

   An IGMP Leave Synch route MUST carry exactly one ES-Import Route
   Target extended community, the one that corresponds to the ES on
   which the IGMP Leave was received.  It MUST also carry exactly one
   EVI-RT EC, the one that corresponds to the EVI on which the IGMP
   Leave was received.  See Section 9.5 for details on how to form the
   EVI-RT EC.

   The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364].  The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet
   value defined for the ES.

   The Ethernet Tag ID MUST be set as follows:

   o  EVI is VLAN-Based or VLAN Bundle service - set to 0

   o  EVI is VLAN-Aware Bundle service without translation - set to the
      customer VID for the BD

   o  EVI is VLAN-Aware Bundle service with translation - set to the
      normalized Ethernet Tag ID - e.g., normalized VID

   The Multicast Source Length MUST be set to the length of the
   multicast source address in bits.  If the Multicast Source field
   contains an IPv4 address, then the value of the Multicast Source
   Length field is 32.  If the Multicast Source field contains an IPv6
   address, then the value of the Multicast Source Length field is 128.
   In case of a (*,G) Join, the Multicast Source Length is set to 0.

   The Multicast Source is the Source IP address of the IGMP membership
   report.  In case of a (*,G) Join, this field does not exist.
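
   The Ethernet Tag ID rules listed in this section can be sketched as
   a small helper.  The enum members, function name, and parameter
   names are hypothetical; the only facts taken from the text are the
   three service cases and the 4-octet field width.  Network byte order
   is assumed, as is usual for BGP fields.

```python
import struct
from enum import Enum
from typing import Optional


class ServiceType(Enum):
    VLAN_BASED = "vlan-based"
    VLAN_BUNDLE = "vlan-bundle"
    VLAN_AWARE_BUNDLE = "vlan-aware-bundle"
    VLAN_AWARE_BUNDLE_TRANSLATED = "vlan-aware-bundle-translated"


def ethernet_tag_id(service: ServiceType,
                    customer_vid: Optional[int] = None,
                    normalized_tag: Optional[int] = None) -> bytes:
    """Return the 4-octet Ethernet Tag ID for the given service type."""
    if service in (ServiceType.VLAN_BASED, ServiceType.VLAN_BUNDLE):
        tag = 0                       # VLAN-Based or VLAN Bundle: always 0
    elif service is ServiceType.VLAN_AWARE_BUNDLE:
        tag = customer_vid            # no translation: customer VID of the BD
    else:
        tag = normalized_tag          # translation: normalized Ethernet Tag ID
    if tag is None:
        raise ValueError("a VID or normalized tag is required for this service")
    return struct.pack("!I", tag)     # 4 octets, network byte order (assumed)
```

   For a VLAN-Aware Bundle service without translation and customer VID
   100, ethernet_tag_id(ServiceType.VLAN_AWARE_BUNDLE, customer_vid=100)
   yields the octets 00 00 00 64.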

   The Multicast Group Length MUST be set to the length of the
   multicast group address in bits.  If the Multicast Group field
   contains an IPv4 address, then the value of the Multicast Group
   Length field is 32.  If the Multicast Group field contains an IPv6
   address, then the value of the Multicast Group Length field is 128.

   The Multicast Group is the Group address of the IGMP membership
   report.

   The Originator Router Length is the length of the Originator Router
   address in bits.

   The Originator Router Address is the IP address of the router
   originating the prefix.

   The Reserved field is not part of the route key.  The originator MUST
   set the Reserved field to zero, the receiver SHOULD ignore it, and if
   it needs to be propagated, it MUST be propagated unchanged.

   The Maximum Response Time is the value to be used while sending a
   query as defined in [RFC2236].

   The Flags field indicates the version of the IGMP protocol from
   which the membership report was received.  It also indicates whether
   the multicast group had the INCLUDE or EXCLUDE bit set.

   When a multihomed router receives a BGP Update that contains the
   Multicast Leave Sync Route flag with its Partial bit set (not
   following this specification) and determines that the route is
   malformed, the router SHOULD treat this Update as malformed.  The
   error MUST be considered a BGP error, and the malformed route SHOULD
   be discarded as per [RFC7606].  An implementation SHOULD provide
   debugging facilities to permit issues caused by such a malformed
   route to be diagnosed.  At a minimum, such facilities MUST include
   logging an error when such a route is detected.

9.4.  Multicast Flags Extended Community

   The 'Multicast Flags' extended community is a new EVPN extended
   community.  EVPN extended communities are transitive extended
   communities with a Type field value of 6.
   IANA will assign a Sub-Type from the 'EVPN Extended Community
   Sub-Types' registry.

   A PE that supports IGMP proxy on a given BD MUST attach this extended
   community to the Inclusive Multicast Ethernet Tag (IMET) route it
   advertises for that BD and it MUST set the IGMP Proxy Support flag to
   1.  Note that an [RFC7432] compliant PE will not advertise this
   extended community, so its absence indicates that the advertising PE
   does not support IGMP Proxy.

   The advertisement of this extended community enables more efficient
   multicast tunnel setup from the source PE, especially for ingress
   replication - i.e., if an egress PE supports IGMP proxy but doesn't
   have any interest in a given (x,G), it advertises its IGMP proxy
   capability using this extended community but it does not advertise
   any SMET route for that (x,G).  When the source PE (ingress PE)
   receives such advertisements from the egress PE, it does not
   replicate the multicast traffic to that egress PE; however, it does
   replicate the multicast traffic to the egress PEs that don't
   advertise such capability even if they don't have any interest in
   that (x,G).

   A Multicast Flags extended community is encoded as an 8-octet value,
   as follows:

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x09 |     Flags (2 Octets)      |M|I|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Reserved=0                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The low-order (least significant) two bits of the Flags field are
   defined as the "MLD Proxy Support" and "IGMP Proxy Support" bits.
   The absence of this extended community also means that the PE does
   not support IGMP proxy.  The fields are defined as follows:

   o  Type is 0x06 as registered with IANA for EVPN Extended
      Communities.

   o  Sub-Type is 0x09.

   o  Flags is a two-octet value.

      *  Bit 15 (shown as I) defines IGMP Proxy Support.  A value of 1
         for bit 15 means that the PE supports IGMP Proxy.  A value of
         0 for bit 15 means that the PE does not support IGMP Proxy.

      *  Bit 14 (shown as M) defines MLD Proxy Support.  A value of 1
         for bit 14 means that the PE supports MLD Proxy.  A value of 0
         for bit 14 means that the PE does not support MLD Proxy.

      *  Bits 0 to 13 are reserved for future use.  The sender MUST set
         them to 0 and the receiver MUST ignore them.

   o  Reserved bits: the sender MUST set them to 0 and the receiver
      MUST ignore them.

   If a router does not support this specification, it MUST NOT add the
   Multicast Flags extended community to a BGP route.  If a router
   receives a BGP Update in which both the M and I flags are zero (0),
   it MUST treat this Update as malformed.  The receiver of such an
   Update MUST ignore the extended community.

9.5.  EVI-RT Extended Community

   In EVPN, every EVI is associated with one or more Route Targets
   (RTs).  These Route Targets serve two functions:

   1.  Distribution control: RTs control the distribution of the routes.
       If a route carries the RT associated with a particular EVI, it
       will be distributed to all the PEs on which that EVI exists.

   2.  EVI identification: Once a route has been received by a
       particular PE, the RT is used to identify the EVI to which it
       applies.

   An IGMP Join Synch or IGMP Leave Synch route is associated with a
   particular combination of ES and EVI.  These routes need to be
   distributed only to PEs that are attached to the associated ES.
   Therefore these routes carry the ES-Import RT for that ES.

   Since an IGMP Join Synch or IGMP Leave Synch route does not need to
   be distributed to all the PEs on which the associated EVI exists,
   these routes cannot carry the RT associated with that EVI.
1232 Therefore, when such a route arrives at a particular PE, the route's 1233 RTs cannot be used to identify the EVI to which the route applies. 1234 Some other means of associating the route with an EVI must be used. 1236 This document specifies four new Extended Communities (EC) that can 1237 be used to identify the EVI with which a route is associated, but 1238 which do not have any effect on the distribution of the route. These 1239 new ECs are known as the "Type 0 EVI-RT EC", the "Type 1 EVI-RT EC", 1240 the "Type 2 EVI-RT EC", and the "Type 3 EVI-RT EC". 1242 1. A Type 0 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xA. 1244 2. A Type 1 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xB. 1246 3. A Type 2 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xC. 1248 4. A Type 3 EVI-RT EC is an EVPN EC (type 6) of sub-type 0xD 1250 Each IGMP Join Synch or IGMP Leave Synch route MUST carry exactly one 1251 EVI-RT EC. The EVI-RT EC carried by a particular route is 1252 constructed as follows. Each such route is the result of having 1253 received an IGMP Join or an IGMP Leave message from a particular BD. 1254 The route is said to be associated associated with that BD. For each 1255 BD, there is a corresponding RT that is used to ensure that routes 1256 "about" that BD are distributed to all PEs attached to that BD. So 1257 suppose a given IGMP Join Synch or Leave Synch route is associated 1258 with a given BD, say BD1, and suppose that the corresponding RT for 1259 BD1 is RT1. Then: 1261 o 0. If RT1 is a Transitive Two-Octet AS-specific EC, then the EVI- 1262 RT EC carried by the route is a Type 0 EVI-RT EC. The value field 1263 of the Type 0 EVI-RT EC is identical to the value field of RT1. 1265 o 1. If RT1 is a Transitive IPv4-Address-specific EC, then the EVI- 1266 RT EC carried by the route is a Type 1 EVI-RT EC. The value field 1267 of the Type 1 EVI-RT EC is identical to the value field of RT1. 1269 o 2. 
      If RT1 is a Transitive Four-Octet AS-specific EC, then the
      EVI-RT EC carried by the route is a Type 2 EVI-RT EC.  The value
      field of the Type 2 EVI-RT EC is identical to the value field of
      RT1.

   o  3.  If RT1 is a Transitive IPv6-Address-specific EC, then the
      EVI-RT EC carried by the route is a Type 3 EVI-RT EC.  The value
      field of the Type 3 EVI-RT EC is identical to the value field of
      RT1.

   An IGMP Join Synch or Leave Synch route MUST carry exactly one
   EVI-RT EC.

   Suppose a PE receives a particular IGMP Join Synch or IGMP Leave
   Synch route, say R1, and suppose that R1 carries an ES-Import RT
   that is one of the PE's Import RTs.  If R1 has no EVI-RT EC, or has
   more than one EVI-RT EC, the PE MUST apply the "treat-as-withdraw"
   procedure of [RFC7606].

   Note that an EVI-RT EC is not a Route Target Extended Community, is
   not visible to the RT Constrain mechanism [RFC4684], and is not
   intended to influence the propagation of routes by BGP.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=n    |       RT associated with EVI  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             RT associated with the EVI  (cont.)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Where the value of 'n' is 0x0A, 0x0B, 0x0C, or 0x0D, corresponding
   to EVI-RT type 0, 1, 2, or 3, respectively.

9.6.  Rewriting of RT ECs and EVI-RT ECs by ASBRs

   There are certain situations in which an ES is attached to a set of
   PEs that are not all in the same AS, or not all operated by the same
   provider.  In some such situations, the RT that corresponds to a
   particular EVI may be different in each AS.
   If a route is propagated from AS1 to AS2, an ASBR at the AS1/AS2
   border may be provisioned with a policy that removes the RTs that
   are meaningful in AS1 and replaces them with the corresponding RTs
   (i.e., the RTs corresponding to the same EVIs) that are meaningful
   in AS2.  This is known as RT-rewriting.

   Note that if a given route's RTs are rewritten, and the route
   carries an EVI-RT EC, the EVI-RT EC needs to be rewritten as well.

10.  IGMP/MLD Immediate Leave

   IGMP MAY be configured with the immediate leave option.  This allows
   the device to remove the group entry from the multicast routing
   table immediately upon receiving an IGMP Leave message for (x,G).
   In the case of all-active multi-homing, while synchronizing the
   IGMP Leave state to redundancy peers, the Maximum Response Time MAY
   be filled in as zero.

   Implementations SHOULD have identical configuration across
   multi-homed peers.  If an IGMP Leave Synch route is received with a
   Maximum Response Time of zero, it MAY be processed as an immediate
   leave, irrespective of the local IGMP configuration.

11.  IGMP Version 1 Membership Request

   This document does not provide any detail about IGMPv1 processing.
   The multicast working group is in the process of deprecating the
   use of IGMPv1.  Implementations MUST only use IGMPv2 and above for
   IPv4 and MLDv1 and above for IPv6.  IGMPv1 routes MUST be considered
   invalid and handled as per [RFC7606].

12.  Security Considerations

   The same security considerations as in [RFC7432], [RFC2236],
   [RFC3376], [RFC2710], and [RFC3810] apply.

13.  IANA Considerations

   IANA has allocated the following codepoints from the EVPN Extended
   Community sub-types registry.

      0x09     Multicast Flags Extended Community     [this document]
      0x0A     EVI-RT Type 0                          [this document]
      0x0B     EVI-RT Type 1                          [this document]
      0x0C     EVI-RT Type 2                          [this document]

   IANA is requested to allocate a new codepoint from the EVPN Extended
   Community sub-types registry for the following.

      0x0D     EVI-RT Type 3                          [this document]

   IANA has allocated the following EVPN route types from the EVPN
   Route Type registry.

      6 - Selective Multicast Ethernet Tag Route
      7 - Multicast Join Synch Route
      8 - Multicast Leave Synch Route

   The Multicast Flags Extended Community contains a 16-bit Flags
   field.  The bits are numbered 0-15, from high-order to low-order.

   The registry should be initialized as follows:

      Bit        Name                   Reference
      ------     ------------------     -------------
      0 - 13     Unassigned
      14         MLD Proxy Support      This document
      15         IGMP Proxy Support     This document

   The registration policy should be "First Come First Served".

14.  Acknowledgement

   The authors would like to thank Stephane Litkowski, Jorge Rabadan,
   Anoop Ghanwani, Jeffrey Haas, and Krishna Muddenahally Ananthamurthy
   for reviewing and providing valuable comments.

15.  Contributors

   Mankamana Mishra

   Cisco Systems

   Email: mankamis@cisco.com

   Derek Yeung

   Arrcus

   Email: derek@arrcus.com

16.  References

16.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2236]  Fenner, W., "Internet Group Management Protocol, Version
              2", RFC 2236, DOI 10.17487/RFC2236, November 1997.

   [RFC2710]  Deering, S., Fenner, W., and B. Haberman, "Multicast
              Listener Discovery (MLD) for IPv6", RFC 2710,
              DOI 10.17487/RFC2710, October 1999.

   [RFC3376]  Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
              Thyagarajan, "Internet Group Management Protocol, Version
              3", RFC 3376, DOI 10.17487/RFC3376, October 2002.

   [RFC3810]  Vida, R., Ed. and L. Costa, Ed., "Multicast Listener
              Discovery Version 2 (MLDv2) for IPv6", RFC 3810,
              DOI 10.17487/RFC3810, June 2004.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364,
              February 2006.

   [RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
              R., Patel, K., and J. Guichard, "Constrained Route
              Distribution for Border Gateway Protocol/MultiProtocol
              Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
              Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
              November 2006.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
              Patel, "Revised Error Handling for BGP UPDATE Messages",
              RFC 7606, DOI 10.17487/RFC7606, August 2015.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

16.2.  Informative References

   [RFC4541]  Christensen, M., Kimball, K., and F. Solensky,
              "Considerations for Internet Group Management Protocol
              (IGMP) and Multicast Listener Discovery (MLD) Snooping
              Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006.
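
   The EVI-RT EC construction rules and the receive-side check of
   Section 9.5 can be sketched in Python as follows.  This is a
   minimal illustration under stated assumptions, not part of the
   specification: the names RTKind, build_evi_rt_ec, is_evi_rt_ec, and
   check_synch_route are hypothetical, the RT form is passed in as a
   label rather than parsed from the wire, and only the 6-octet value
   field shown in the Section 9.5 figure is handled.  How a value
   taken from an IPv6-Address-specific RT fits that layout is not
   addressed here.

```python
import struct
from enum import Enum
from typing import Sequence

EVPN_EC_TYPE = 0x06  # "Type=0x06" in the Section 9.5 figure


class RTKind(Enum):
    """Hypothetical labels for the four RT forms listed in Section 9.5."""
    TWO_OCTET_AS = 0
    IPV4_ADDRESS = 1
    FOUR_OCTET_AS = 2
    IPV6_ADDRESS = 3


# EVI-RT type 0, 1, 2, 3 maps to EVPN EC sub-type 0x0A, 0x0B, 0x0C, 0x0D.
EVI_RT_SUBTYPE = {
    RTKind.TWO_OCTET_AS: 0x0A,
    RTKind.IPV4_ADDRESS: 0x0B,
    RTKind.FOUR_OCTET_AS: 0x0C,
    RTKind.IPV6_ADDRESS: 0x0D,
}


def build_evi_rt_ec(kind: RTKind, rt_value: bytes) -> bytes:
    """Build an EVI-RT EC whose value field is copied from RT1 unchanged."""
    # Assumption: the value field is 6 octets, as drawn in the figure.
    if len(rt_value) != 6:
        raise ValueError("this sketch only handles a 6-octet value field")
    return struct.pack("!BB", EVPN_EC_TYPE, EVI_RT_SUBTYPE[kind]) + rt_value


def is_evi_rt_ec(ec: bytes) -> bool:
    """True if the 8-octet EC is an EVPN EC with an EVI-RT sub-type."""
    return len(ec) == 8 and ec[0] == EVPN_EC_TYPE and 0x0A <= ec[1] <= 0x0D


def check_synch_route(ecs: Sequence[bytes]) -> str:
    """Apply the 'exactly one EVI-RT EC' rule to a received Synch route."""
    count = sum(1 for ec in ecs if is_evi_rt_ec(ec))
    return "accept" if count == 1 else "treat-as-withdraw"
```

   For example, taking a Two-Octet AS-specific RT with AS number 65000
   and local value 100 (illustrative numbers only),
   build_evi_rt_ec(RTKind.TWO_OCTET_AS, struct.pack("!HI", 65000, 100))
   yields the octets 06 0a fd e8 00 00 00 64.  A route carrying zero
   or two such ECs makes check_synch_route return
   "treat-as-withdraw".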

Authors' Addresses

   Ali Sajassi
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: sajassi@cisco.com

   Samir Thoria
   Cisco Systems
   821 Alder Drive,
   MILPITAS, CALIFORNIA 95035
   UNITED STATES

   Email: sthoria@cisco.com

   Keyur Patel
   Arrcus
   UNITED STATES

   Email: keyur@arrcus.com

   John Drake
   Juniper Networks

   Email: jdrake@juniper.net

   Wen Lin
   Juniper Networks

   Email: wlin@juniper.net