idnits 2.17.1 draft-skr-bess-evpn-pim-proxy-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([EVPN-IGMP-MLD-PROXY], [EVPN-PROXY-ARP-ND], [RFC7432]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (July 3, 2017) is 2489 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC7606' is mentioned on line 673, but not defined == Missing Reference: 'RFC2119' is mentioned on line 928, but not defined ** Obsolete normative reference: RFC 4601 (Obsoleted by RFC 7761) ** Downref: Normative reference to an Informational draft: draft-ietf-pals-vpls-pim-snooping (ref. 'VPLS-PIM-PROXY') == Outdated reference: A later version (-21) exists of draft-ietf-bess-evpn-igmp-mld-proxy-00 == Outdated reference: A later version (-16) exists of draft-ietf-bess-evpn-proxy-arp-nd-02 Summary: 3 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 BESS Workgroup J. Rabadan, Ed. 3 Internet Draft J. Kotalwar 4 Intended status: Standards Track S. Sathappan 5 Nokia 6 Z. Zhang 7 Juniper 8 A. Sajassi 9 Cisco 11 Expires: January 4, 2018 July 3, 2017 13 PIM Proxy in EVPN Networks 14 draft-skr-bess-evpn-pim-proxy-00 16 Abstract 18 Ethernet Virtual Private Networks [RFC7432] are becoming prevalent in 19 Data Centers, Data Center Interconnect (DCI) and Service Provider VPN 20 applications. One of the goals that EVPN pursues is the reduction of 21 flooding and the efficiency of CE-based control plane procedures in 22 Broadcast Domains. Examples of this are [EVPN-PROXY-ARP-ND] for 23 improving the efficiency of CE's ARP/ND protocols, and [EVPN-IGMP- 24 MLD-PROXY] for IGMP/MLD protocols. This document complements the 25 latter, describing the procedures required to minimize the flooding 26 of PIM messages in EVPN Broadcast Domains, and optimize the IP 27 Multicast delivery between PIM routers. 29 Status of this Memo 31 This Internet-Draft is submitted in full conformance with the 32 provisions of BCP 78 and BCP 79. 34 Internet-Drafts are working documents of the Internet Engineering 35 Task Force (IETF), its areas, and its working groups. Note that 36 other groups may also distribute working documents as Internet- 37 Drafts. 39 Internet-Drafts are draft documents valid for a maximum of six months 40 and may be updated, replaced, or obsoleted by other documents at any 41 time. It is inappropriate to use Internet-Drafts as reference 42 material or to cite them other than as "work in progress." 44 The list of current Internet-Drafts can be accessed at 45 http://www.ietf.org/ietf/1id-abstracts.txt 46 The list of Internet-Draft Shadow Directories can be accessed at 47 http://www.ietf.org/shadow.html 49 This Internet-Draft will expire on January 4, 2018. 
51 Copyright Notice 53 Copyright (c) 2017 IETF Trust and the persons identified as the 54 document authors. All rights reserved. 56 This document is subject to BCP 78 and the IETF Trust's Legal 57 Provisions Relating to IETF Documents 58 (http://trustee.ietf.org/license-info) in effect on the date of 59 publication of this document. Please review these documents 60 carefully, as they describe your rights and restrictions with respect 61 to this document. Code Components extracted from this document must 62 include Simplified BSD License text as described in Section 4.e of 63 the Trust Legal Provisions and are provided without warranty as 64 described in the Simplified BSD License. 66 Table of Contents 68 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 69 2. PIM Proxy Operation in EVPN Broadcast Domains . . . . . . . . . 4 70 2.1. Multicast Router Discovery Procedures in EVPN . . . . . . . 5 71 2.1.1. Discovering PIM Routers . . . . . . . . . . . . . . . . 5 72 2.1.2. Discovering IGMP Queriers . . . . . . . . . . . . . . . 7 73 2.2. PIM Join/Prune Proxy Procedures . . . . . . . . . . . . . . 7 74 2.3. PIM Assert Optimization . . . . . . . . . . . . . . . . . . 10 75 2.3.1 Assert Optimization Procedures in Downstream PEs . . . . 11 76 2.3.2 Assert Optimization Procedures in Upstream PEs . . . . . 12 77 2.4. EVPN Multi-Homing and State Synchronization . . . . . . . . 12 78 2.5. PIM Bootstrap and RP Discovery . . . . . . . . . . . . . . 13 79 2.6. PIM-DM (Dense Mode) Proxy Procedures . . . . . . . . . . . 13 80 3. Interaction with IGMP-snooping and Sources . . . . . . . . . . 13 81 4. BGP Information Model . . . . . . . . . . . . . . . . . . . . . 14 82 4.1 Multicast Router Discovery (MRD) Route . . . . . . . . . . . 15 83 4.2 Selective Multicast Ethernet Tag Route for PIM Proxy . . . . 16 84 4.3 PIM RPT-Prune Route . . . . . . . . . . . . . . . . . . . . 18 85 4.4 IGMP/PIM Join Synch Route for PIM Proxy . . . . . . . . . . 
19 86 4.5 IGMP/PIM RPT-Prune Synch Route for PIM Proxy . . . . . . . . 20 87 5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 21 88 6. Conventions used in this document . . . . . . . . . . . . . . . 21 89 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 21 90 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 21 91 9. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 22 92 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22 93 10.1 Normative References . . . . . . . . . . . . . . . . . . . 22 94 10.2 Informative References . . . . . . . . . . . . . . . . . . 23 95 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 23 96 12. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 23 97 13. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 23 99 1. Introduction 101 Ethernet Virtual Private Networks [RFC7432] are becoming prevalent in 102 Data Centers, Data Center Interconnect (DCI) and Service Provider VPN 103 applications. One of the goals that EVPN pursues is the reduction of 104 flooding and the efficiency of CE-based control plane procedures in 105 Broadcast Domains. Examples of this are [EVPN-PROXY-ARP-ND] for 106 improving the efficiency of CE's ARP/ND protocols, and [EVPN-IGMP- 107 MLD-PROXY] for IGMP/MLD protocols. 109 This document focuses on optimizing the behavior of PIM in EVPN 110 Broadcast Domains and re-uses some procedures of [EVPN-IGMP-MLD- 111 PROXY]. The reader is also advised to check out [VPLS-PIM-PROXY] to 112 understand certain aspects of the procedures of PIM Join/Prune 113 messages received on Attachment Circuits (ACs). 115 Section 2 describes the PIM Proxy procedures that the implementation 116 should follow, including: 118 o The use of EVPN to suppress the flooding of PIM Hello messages in 119 shared Broadcast Domains. 
The benefit of this is twofold: 120 - PIM Hello messages will be flooded ONLY to Attachment Circuits 121 that are connected to PIM routers, as opposed to all the CEs and 122 hosts in the Broadcast Domain. 123 - Soft-state PIM Hello messages will be replaced by hard-state BGP 124 messages that don't need to be refreshed periodically. 126 o The use of EVPN to discover IGMP Queriers, while avoiding the 127 flooding of IGMP Queries in the core. 129 o The procedures to proxy PIM Join/Prune messages and replace them with 130 hard-state EVPN routes that don't need to be refreshed 131 periodically. By using BGP EVPN to propagate both Hello and 132 Join/Prune messages, we also avoid out-of-order delivery between 133 both types of PIM messages. 135 o This document also describes an EVPN-based procedure so that the 136 PIM routers connected to the shared Broadcast Domain don't need to 137 run any PIM Assert procedure. PIM Assert procedures may be 138 expensive for PIM routers in terms of resource consumption. With 139 this procedure, no PIM Assert is needed on PIM routers. 141 o The use of procedures similar to the ones defined in [EVPN-IGMP- 142 MLD-PROXY] to synchronize multicast states among the PEs in the 143 same Ethernet Segment. 145 Section 3 describes the interaction of PIM Proxy with IGMP Proxy PEs 146 and Multicast Sources connected to the same EVPN Broadcast Domain. 148 Section 4 defines the BGP Information Model that this document 149 requires to address the PIM Proxy procedures. 151 This document assumes the reader is familiar with the PIM and IGMP 152 protocols. 154 2. PIM Proxy Operation in EVPN Broadcast Domains 156 This section describes the operation of PIM Proxy in EVPN Broadcast 157 Domains (BDs). Figure 1 depicts an EVPN Broadcast Domain defined in 158 four PEs that are connected to PIM routers.
This example will be used 159 throughout this section and assumes both R4 and R5 are PIM Upstream 160 Neighbors for PIM routers R1, R2 and R3 and multicast group G1. In 161 this situation, the PIM multicast traffic flows from R4 or R5 to R1, 162 R2 and R3. The PIM Join/Prune signaling will flow in the opposite 163 direction. From a terminology perspective, we consider PE1 and PE2 as 164 egress or downstream PEs, whereas PE3 and PE4 are ingress or upstream 165 PEs. 167 J(*,G1,IP5) 168 +--+ 169 |R1+------> XXXXXXXX 170 +--+ +-----+ XXXX XX XXXXX +-----+ +--+ 171 | PE1 |XXXXX XXXX XX| PE3 +----> |R4| 172 +--+ | | | | +--+ 173 |R2+-----> +-----+ +-----+ <---- 174 +--+ X XX multicast 175 J(*,G1,IP5) X XXX (S1,G1) 176 XXX EVPN Broadcast XX 177 X Domain X 178 +--+ +-----+ X RP 179 |R3+---> | PE2 | XX+-----+ +--+ 180 +--+ | | XXXX | PE4 +--> |R5| 181 +-----+XXXX XXXXX | | +--+ 182 J(S1,G1,IP4) X X X +-----+ 183 XX XXX XX XXX 184 XXXXXX XXXXX XXX 186 Figure 1 - PIM Routers connected by an EVPN Broadcast Domain 188 It is important to note that any Router's PIM message not explicitly 189 specified in this document will be forwarded by the PEs normally, in 190 the data path, as a unicast or multicast packet. 192 2.1. Multicast Router Discovery Procedures in EVPN 194 The procedures defined in this section make use of the Multicast 195 Router Discovery (MRD) route described in section 4 and are OPTIONAL. 196 An EVPN router not implementing this specification will transparently 197 flood PIM Hello messages and IGMP Queries to remote PEs. 199 2.1.1. Discovering PIM Routers 201 As described in [RFC4601] for shared LANs, an EVPN Broadcast Domain 202 may have multiple PIM routers connected to it and a single one of 203 these routers, the DR, will act on behalf of directly connected hosts 204 with respect to the PIM-SM protocol. The DR election, as well as 205 discovery and negotiation of options in PIM, is performed using Hello 206 messages. 
PIM Hello messages are periodically exchanged and flooded 207 in EVPN Broadcast Domains that don't follow this specification. 209 When PIM Proxy is enabled, an EVPN PE will snoop PIM Hello messages 210 and forward them only to local ACs where PIM routers have been 211 detected. This document assumes that all the procedures defined in 213 [VPLS-PIM-PROXY] to snoop PIM Hellos on local ACs and build the PIM 214 Neighbor DB on the PEs are followed. However, PIM Hello messages MUST NOT be 215 forwarded to remote EVPN PEs. 217 Using Figure 1 as an example, the PIM Proxy operation for Hello 218 messages is as follows: 220 1) The arrival of a new PIM Hello message at a PE, e.g., PE1, will trigger an 221 MRD route advertisement including: 222 o The IP address and length of the multicast router that issued 223 the Hello message, e.g., R1's IP address and length. 224 o The DR Priority copied from the Hello DR Priority TLV. 225 o Q flag set (if the multicast router is a Querier). 226 o P flag set, indicating that the router is PIM capable. 228 2) All other PEs import the MRD route and do the following: 229 o Add the multicast router address to the PIM Neighbor Database 230 (PIM Nbr DB) associated with the Originator Router Address. 231 o Generate a PIM hello where the IP Source Address is the 232 Multicast Router IP and the DR Priority is copied from the 233 route. This PIM hello is sent to all the local ACs connected to 234 a PIM router. For example, PE3 will send the generated hello 235 message to R4. 237 3) Each PE will build its PIM Nbr DB out of the local PIM hello 238 messages and/or remote MRD routes. The PIM hello timers and other 239 hello parameters are not propagated in the MRD routes. 241 o The timers are handled locally by the PE, as per [RFC4601]. 242 This is valid for the hold_time (when a PIM router or PE 243 receives a hello message, it resets the neighbor-expiry timer), and 244 other timers.
246 o The Generation ID option is also processed locally on the PE, as 247 well as the Generation ID changes for a given multicast router. 248 It is not propagated in the MRD route. 250 o Procedures described in [RFC4601] are used to remove a local AC 251 PIM router from the PIM Nbr DB. When a local router is removed 252 from the DB, the MRD route is withdrawn. If the local router is 253 still sending Queries, the route is updated with flags P=0 and 254 Q=1. Upon receiving the update, the other PEs will remove the 255 router from the PIM Nbr DB but not from the list of queriers. 257 4) Based on regular PIM DR election procedures (highest DR Priority 258 or highest IP), each PE is aware of who the DR is for the BD. For 259 more information, refer to section "3. Interaction with IGMP- 260 snooping and Sources". 262 2.1.2. Discovering IGMP Queriers 264 In (EVPN) Broadcast Domains that are shared among not only PIM 265 routers but also IGMP hosts, one or more PIM routers will also be 266 configured as IGMP Queriers. The proxy Querier mechanism described in 267 [EVPN-IGMP-MLD-PROXY] suppresses the flooding of queries on the 268 Broadcast Domain, by using PE generated Queries from an anycast IP 269 address. 271 While the proxy Querier mechanism works in most of the use-cases, 272 sometimes it is desired to have a more transparent behavior and 273 propagate existing multicast router IGMP Queries as opposed to 274 "blindly" querying all the hosts from the PEs. The MRD route defined 275 in section 4 can be used for that purpose. 277 When the discovered local PIM router is also sending IGMP Queries, 278 the PE will issue an MRD route for the multicast router with both Q 279 (IGMP Querier) and P (PIM router) flags set. Note that the PE may set 280 both flags or only one of them, depending on the capabilities of the 281 local router. 
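The flag handling described above, together with the withdraw/update rule of section 2.1.1, can be sketched as follows. This is only an illustration under stated assumptions: the helper name mrd_flags_for and the MrdFlags type are hypothetical and not defined by this document; the bit positions follow the Flags description in section 4.1 (Q is the least significant bit, P the second low-order bit).

```python
from enum import IntFlag
from typing import Optional


class MrdFlags(IntFlag):
    """Flags octet of the MRD route (hypothetical type name)."""
    Q = 0x01  # least significant bit: the router is an IGMP Querier
    P = 0x02  # second low-order bit: the router is a PIM router


def mrd_flags_for(is_pim_neighbor: bool, is_querier: bool) -> Optional[MrdFlags]:
    """Return the flags to advertise for a local multicast router.

    Returns None when the router is neither a PIM neighbor nor a Querier,
    in which case the PE has nothing to advertise and withdraws the route.
    """
    flags = MrdFlags(0)
    if is_pim_neighbor:
        flags |= MrdFlags.P
    if is_querier:
        flags |= MrdFlags.Q
    # An empty flag set is falsy: map it to "withdraw the MRD route".
    return flags if flags else None
```

With this sketch, a router that ages out of the PIM Nbr DB but keeps sending Queries yields mrd_flags_for(False, True), i.e. an update with P=0 and Q=1 rather than a withdrawal.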
283 A PE receiving an MRD route with Q=1 will generate IGMP Query 284 messages, using the multicast router IP address encoded in the 285 received MRD route. If more than one IGMP Querier exists in the EVI, 286 the PE receiving the MRD routes with Q=1 will select the lowest IP 287 address, as per [RFC2236]. Note that, upon receiving the MRD routes 288 with Q=1, the PE must generate IGMP Queries and forward them to all 289 the local ACs. Other Queriers listening to these received Query 290 messages will stop sending Queries if they are no longer the selected 291 Querier, as per [RFC2236]. 293 This procedure allows the EVPN PEs to act as proxy Queriers, but 294 using the IP address of the best existing IGMP Querier in the EVPN 295 Broadcast Domain. This can help IGMP hosts troubleshoot any issues on 296 the IGMP routers and check their connectivity to them. 298 2.2. PIM Join/Prune Proxy Procedures 300 This section describes the procedures associated with the PIM Proxy 301 function for Join and Prune messages. This document assumes that all 302 the procedures defined in [VPLS-PIM-PROXY] to build multicast states 303 on the PEs' local ACs are followed. Figure 2 illustrates a scenario 304 where PIM Proxy is enabled on the EVPN PEs.
306 J(*,G1,IP5) 307 +--+ J(*,G1,IP5) 308 |R1+------> XXXXXXXX P(S1,G1,IP5,rpt) 309 +--+ +-----+ XXXX XX XXXXX +-----+ +--+ 310 | PE1 |XXXXX XXXX XX| PE3 +----> |R4| 311 +--+ | | SMET | | +--+ 312 |R2+-----> +-----+ (*,G1,IP5) +-----+ 313 +--+ X +---------> XX 314 J(*,G1,IP5) X XXX 315 XX XX 316 X X J(*,G1,IP5) 317 +--+ +-----+ SMET X P(S1,G1,IP5,rpt) 318 |R3+---> | PE2 | (S1,G1,IP5,rpt) XX+-----+ +--+ 319 +--+ | | +--------> XXXX | PE4 +--> |R5| 320 +-----+XXXX XXXXX | | +--+ 321 P(S1,G1,IP5,rpt) X X X +-----+ RP 322 XX XXX XX XXX 323 XXXXXX XXXXX XXX 325 Figure 2 - Proxy PIM Join/Prune in EVPN 327 PIM J/P messages are sent by the routers towards upstream sources and 328 RPs: 329 o (*,G) is used in Join/Prune messages that are sent towards the RP 330 for the specified group. 331 o (S,G) used in Join/Prune messages sent towards the specified 332 source. 333 o (S,G,rpt) is used in Join/Prune messages sent towards the RP. We 334 refer to this as RPT message and the Prune message always precedes 335 the Join message. The typical sequence of PIM messages (for a 336 group) seen in a BD connecting PIM routers is the following: 338 a) (*,G) Join issued by a downstream router to the RP (to join the 339 RP Tree). 340 b) (S,G) Join issued by a downstream router switching to the SPT. 341 c) (S,G,rpt) Prune issued by a downstream router to the RP to prune 342 a specific source from the RPT. 343 d) (S,G) Prune issued by a downstream router no longer interested 344 in the SPT. 345 e) (S,G,rpt) Join issued by a downstream router interested (again) 346 in the RPT for (S,G). 348 The Proxy PIM procedures for Join/Prune messages are summarized as 349 follows: 351 1) Downstream PE procedures: 353 o A downstream PE will snoop PIM Join/Prune messages and won't 354 forward them to remote PEs. 
356 o Triggered by the reception of the PIM Join message, a downstream 357 PE will advertise an SMET route, including the source, group and 358 Upstream Neighbor as received from the PIM Join message. A 359 single SMET route is advertised per source and group, with the P 360 flag set. As an example, in Figure 2, PE1 receives two PIM Join 361 messages for the same source, group and Upstream Neighbor; 362 however, PE1 advertises a single SMET route. 364 o When the last connected router sends a PIM Prune message for a 365 given source, group and Upstream Neighbor and the state is 366 removed, the PE will withdraw the SMET route (note that the 367 state is removed once the prune-pend timer expires). 369 o SMET routes must always be generated upon receiving a PIM Join 370 message, irrespective of the location of the Upstream Neighbor 371 and even if the Upstream Neighbor is local to the PE. 373 o A downstream PE receiving a PIM Prune (S,G,rpt) message will 374 trigger an RPT-Prune route for the source and group. 375 Subsequently, if the downstream PE receives a PIM Join (S,G,rpt) 376 to cancel the previous Prune (S,G,rpt) and keep pulling the 377 multicast traffic from the RPT, the downstream PE will withdraw 378 the RPT-Prune route. 380 o PIM Timers are handled locally. If the holdtime expires for a 381 local Join, the PE withdraws the SMET route. 383 2) Upstream PE procedures: 385 o A received SMET route with P=1 will add state for the source and 386 group and will generate a PIM Join message for the source and group 387 that will be forwarded to all the local AC PIM routers. 389 o A received SMET route withdrawal will remove the state and 390 generate a PIM Prune message for the source, group and upstream 391 neighbor that will be forwarded to all the local AC PIM routers. 393 o A received RPT-Prune route for (S,G) will generate a PIM Prune 394 (S,G,rpt) message that will be forwarded to all the local AC PIM 395 routers.
397 o A received RPT-Prune withdrawal for (S,G) will generate a PIM 398 Join (S,G,rpt) message that will be forwarded to all the local 399 AC PIM routers. 401 It is important to note that, compared to a solution that does not 402 snoop PIM messages and does not use BGP to propagate states in the 403 core, this EVPN PIM Proxy solution will add some latency derived from 404 the procedures described in this document. 406 2.3. PIM Assert Optimization 408 The PIM Assert process described in [RFC4601] is intense in terms of 409 resource consumption in the PIM routers, however it is needed in case 410 PIM routers share a multi-access transit LAN. The use of PIM Proxy 411 for EVPN BDs can minimize and even suppress the need for PIM Assert 412 as described in this section. 414 As a refresher, the PIM Assert procedures are needed to prevent two 415 or more Upstream PIM routers from forwarding the same multicast 416 content to the group of Downstream PIM routers sharing the same 417 (EVPN) Broadcast Domain. This multicast packet duplication may happen 418 in any of the following cases: 420 o Two or more Downstream PIM routers on the BD may issue (*,G) Joins 421 to different upstream routers on the BD because they have 422 inconsistent MRIB entries regarding how to reach the RP. Both paths 423 on the RP tree will be set up, causing two copies of all the shared 424 tree traffic to appear on the EVPN Broadcast Domain. 426 o Two or more routers on the BD may issue (S,G) Joins to different 427 upstream routers on the BD because they have inconsistent MRIB 428 entries regarding how to reach source S. Both paths on the source- 429 specific tree will be set up, causing two copies of all the traffic 430 from S to appear on the BD. 432 o A router on the BD may issue a (*,G) Join to one upstream router on 433 the BD, and another router on the BD may issue an (S,G) Join to a 434 different upstream router on the same BD. Traffic from S may reach 435 the BD over both the RPT and the SPT. 
If the receiver behind the 436 downstream (*,G) router doesn't issue an (S,G,rpt) prune, then this 437 condition would persist. 439 PIM does not prevent such duplicate joins from occurring; instead, 440 when duplicate data packets appear on the same BD from different 441 routers, these routers notice this and then elect a single forwarder. 442 This election is performed using the PIM Assert procedure. 444 The issue is minimized or suppressed in this document by making sure 445 all the Upstream PEs select the same Upstream Neighbor for a given 446 (*,G) or (S,G) in any of the three above situations. If there is only 447 one upstream PIM router selected and the same multicast content is 448 not allowed to be flooded from more than one Upstream Neighbor, there 449 will not be multicast duplication or need for Assert procedures in 450 the EVPN Broadcast Domain. 452 Figure 3 illustrates an example of the PIM Assert Optimization in 453 EVPN. 455 J(*,G1,IP5) 456 +--+ J(*,G1,IP5) 457 |R1+------> XXXXXXXX J(S1,G1,IP4) 458 +--+ +-----+ XXXX XX XXXXX +-----+ +--+ 459 | PE1 |XXXXX XXXX XX| PE3 +----> |R4| 460 +--+ | | SMET | | +--+ 461 |R2+-----> +-----+ (*,G1,IP5) +-----+ 462 +--+ X +---------> XX 463 J(*,G1,IP4) X XXX 464 XX XX 465 X X J(*,G1,IP5) 466 +--+ +-----+ SMET X J(S1,G1,IP4) 467 |R3+---> | PE2 | (S1,G1,IP4) XX+-----+ +--+ 468 +--+ | | +--------> XXXX | PE4 +--> |R5| 469 +-----+XXXX XXXXX | | +--+ 470 J(S1,G1,IP4) X X X +-----+ RP 471 XX XXX XX XXX P(S1,G1,IP5,rpt)--> 472 XXXXXX XXXXX XXX 474 Figure 3 - Proxy PIM Assert Optimization in EVPN 476 2.3.1 Assert Optimization Procedures in Downstream PEs 478 The Downstream PEs will trigger SMET routes based on the received PIM 479 Join messages. This is their behavior when any of the three 480 situations described in section 2.3 occurs: 482 o If the Downstream PE receives two local (*,G) Joins to different 483 Upstream Neighbors, the PE will generate a single SMET route, 484 selecting the highest IP address. 
In Figure 3, if we assume R1 485 issues J(*,G1,IP5) and R2 J(*,G1,IP4), PE1 will advertise an SMET 486 route for (*,G1,IP5). If PE1 had already advertised (*,G1,IP4), it 487 would send an update with (*,G1,IP5). Note that the Upstream 488 Router IP address is not part of the SMET route key, hence there is 489 no need to withdraw the previous (*,G1,IP4). 491 o In the same way, if the Downstream PE receives two local (S,G) 492 Joins to different Upstream Neighbors, the PE will generate a 493 single SMET route, selecting the highest IP address. 495 o If the Downstream PE receives a local (S,G) Join and a local (*,G) Join 496 for the same group but to different Upstream Neighbors, the PE will 497 generate two different SMET routes (since *,G and S,G make two 498 different route keys), keeping the original Upstream Neighbors in 499 the SMET routes. 501 2.3.2 Assert Optimization Procedures in Upstream PEs 503 Upon receiving two or more SMET routes for the same group but 504 different Upstream Neighbors, the Upstream PEs will follow this 505 procedure: 507 1) The Upstream PE will select a unique Upstream Neighbor based on 508 the following rules: 510 a) The Upstream Neighbor encoded in an (S,G) SMET route has 511 precedence over the Upstream Neighbor in the (*,G) SMET route 512 for the same group. This is consistent with the Assert winner 513 election in [RFC4601]. In the example of Figure 3, PE3 and PE4 514 will select IP4 as the Upstream Neighbor for (S1,G1) and (*,G1). 516 b) In case the SMET routes have the same source (* or S), the 517 higher Upstream Neighbor IP Address wins. 519 2) After selecting the unique Upstream Neighbor, the PE will instruct 520 the data path to discard any ingress multicast stream for the 521 multicast group that arrives on an interface other than that of the 522 selected Upstream Neighbor. In the example in Figure 3, PE4 523 will not accept G1 multicast traffic from R5.
525 3) Then the PE will generate the corresponding local PIM messages as 526 usual. In the example, PE3 and PE4 generate PIM Join messages for 527 (S1,G1,IP4) and (*,G1,IP5). 529 4) The PE connected to the non-selected Upstream Neighbor will issue 530 a PIM (S,G)/(*,G) Prune or a PIM (S,G,rpt) Prune to make sure the 531 non-selected Upstream Router does not forward traffic for the 532 group anymore. In the example, PE4 will issue a local (S1,G1,rpt) 533 Prune message to R5, so that R5 does not forward G1 traffic. 535 In case of any change that affects the Upstream Neighbor selection 536 for a given group G1, the upstream PEs will simply update the 537 Upstream Neighbor selection and follow the above procedure. This 538 mechanism prevents multicast duplication in the EVPN Broadcast 539 Domain and avoids PIM Assert procedures among PIM routers in the BD. 541 2.4. EVPN Multi-Homing and State Synchronization 542 PIM Join/Prune States will be synchronized across all the PEs in an 543 Ethernet Segment by using the procedures described in [EVPN-IGMP-MLD- 544 PROXY] and the IGMP/PIM Join Synch Route with the corresponding Flag 545 P set. This document does not require the use of IGMP Leave Synch 546 Routes. 548 In the same way, RPT-Prune States can be synchronized by using the 549 PIM RPT-Prune Synch route. The generation and processing of this route 550 follow procedures similar to those for the IGMP/PIM Join Synch Route. 552 In order to synchronize the PIM Neighbors discovered on an Ethernet 553 Segment, the MRD route and its ESI value will be used. Upon receiving 554 a Hello message on a link that is part of a multi-homed Ethernet 555 Segment, the PE will issue an MRD route that encodes the ESI value of 556 the AC over which the Hello was received. Upon receiving the non-zero 557 ESI MRD route, the PEs in the same ES will add the router to their 558 PIM Neighbor DB, using their AC on the same ES as the PIM Neighbor 559 port.
This will allow the DF on the ES to generate Hello messages for 560 the local PIM router. 562 A PE that is not part of the ESI would normally receive a single non- 563 zero ESI MRD route per multicast router. In certain transient 564 situations the PE may receive more than one non-zero ESI MRD route 565 for the same multicast router. The PE should recognize this and not 566 generate additional PIM Hello messages for the local ACs. 568 2.5. PIM Bootstrap and RP Discovery 570 This section will be covered in future revisions of this document. 572 2.6. PIM-DM (Dense Mode) Proxy Procedures 574 This section will be covered in future revisions of this document. 576 3. Interaction with IGMP-snooping and Sources 578 Figure 4 illustrates an example with a multicast source, an IGMP host 579 and a PIM router in the same EVPN BD. 581 XXXXX J(*,G1) 582 XXXXXXX +-----+ +--+ 583 XXXX | PE3 | <---+H3| 584 X | | +--+ 585 +------+ X +--------> +-----+ +---> 586 |Source| +-----+ | S1,G1 X S1,G1 mcast 587 | S1 +---> | PE1 | + mcast XX 588 +------+ | | XX Hello 589 G1 +-----+ + S1,G1 X <---+ 590 XX | mcast +-----+ +--+ 591 X +---------> | PE4 +--> |R4| 592 X | | +--+ 593 XX XXX +-----+ DR 594 XXX XXX XXX 595 XXXXXXX S1,G1, mcast 597 Figure 4 - Proxy PIM interaction with local sources and hosts 599 When PIM routers, multicast sources and IGMP hosts coexist in the 600 same EVPN Broadcast domain, the PEs supporting both IGMP and PIM 601 proxy will provide the following optimizations in the EVPN BD: 603 o If an IGMP host and a PIM router are connected to the same BD on a 604 PE, the PE will advertise a single SMET route per (S,G) or (*,G) 605 irrespective of the received IGMP or PIM message. The IGMP flags 606 can be simultaneously set along with the P flag. 608 o In the same way, if IGMP hosts and PIM routers are connected to the 609 same MAC-VRF and Ethernet Segment, the IGMP/PIM Join Synch route 610 can be shared by a host and a router requesting the same multicast 611 source and group. 

   o A PE connected to a Source and using Ingress Replication will
     forward a multicast stream (S1,G1) to all the egress PEs that
     advertised an SMET route for (S1,G1) and all the egress PEs that
     advertised an MRD route for the EVPN BD.

4. BGP Information Model

   This document defines the following additional routes and requests
   IANA to allocate a type value for each of them in the EVPN route
   type registry:

   + Type TBD - Multicast Router Discovery (MRD) Route
   + Type TBD - PIM RPT-Prune Route
   + Type TBD - PIM RPT-Prune Join Synch Route

   In addition, the following routes defined in [EVPN-IGMP-MLD-PROXY]
   are re-used and extended in this document's procedures:

   + Type 6 - Selective Multicast Ethernet Tag Route
   + Type 7 - IGMP Join Synch Route

   Type 7 is requested to be renamed as IGMP/PIM Join Synch Route.

4.1 Multicast Router Discovery (MRD) Route

   Figure 5 shows the content of the MRD route:

   +-------------------------------------------------+
   |  RD (8 octets)                                  |
   +-------------------------------------------------+
   |  Ethernet Segment ID (10 octets)                |
   +-------------------------------------------------+
   |  Ethernet Tag ID (4 octets)                     |
   +-------------------------------------------------+
   |  Originator Router Length (1 octet)             |
   +-------------------------------------------------+
   |  Originator Router Address (variable)           |
   +-------------------------------------------------+
   |  Mcast Router Length (1 octet)                  |
   +-------------------------------------------------+
   |  Mcast Router Address 1 (variable)              |
   +-------------------------------------------------+
   |  Secondary Address List Length (1 octet)        |
   +-------------------------------------------------+
   |  Secondary Mcast Router Address 1 (variable)    |
   +-------------------------------------------------+
   |                        .                        |
   |                        .                        |
   |  Secondary Mcast Router Address n (variable)    |
   +-------------------------------------------------+
   |  DR Priority (4 octets)                         |
   +-------------------------------------------------+
   |  Flags (1 octet)                                |
   +-------------------------------------------------+

              Figure 5 Multicast Router Discovery Route

   Support for this new route type is OPTIONAL. An implementation that
   does not support it MUST ignore the route, based on the unknown
   route type value, as specified in Section 5.4 of [RFC7606].

   The encoding of this route is defined as follows:

   o RD, ESI and Ethernet Tag ID are defined as per [RFC7432] for MAC/IP
     routes.

   o The Originator Router Length and Address encode an IPv4 or IPv6
     address that belongs to the advertising PE.

   o The Multicast Router Length and Address fields encode the Primary
     IP address of the PIM neighbor added to the PE's DB.

   o The Secondary Address List Length encodes the number of Secondary
     IP addresses advertised by the PIM router in the PIM Hello message.
     If this field is zero, the NLRI will not include any Secondary
     Multicast Router Address. All the IP addresses will have the same
     Length, that is, they will all be either IPv4 or IPv6, but not a
     mix of both.

   o DR Priority is copied from the same field in Hello packets, as per
     [RFC4601].

   o Flags:
     - Q: Querier flag. Least significant bit. It indicates that the
       encoded multicast router is an IGMP Querier.
     - P: PIM router flag. Second low order bit in the Flags octet. It
       indicates that the multicast router is a PIM router.
     - Q and P may be set simultaneously.

   For BGP processing purposes, only the RD, Ethernet Tag ID, Originator
   Router Length and Address, and Multicast Router Length and Address
   are considered part of the route key. The Secondary Multicast Router
   Addresses and the rest of the fields are not part of the route key.
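
   As an illustration of the MRD route encoding described in Section
   4.1, the following non-normative Python sketch packs the fields in
   the order of Figure 5. It assumes that the Length fields are
   expressed in bits (32 or 128), which this document does not state.
   The names encode_mrd and route_key, and the default DR Priority of
   1, are hypothetical choices made for the example and are not defined
   by this document.

```python
import ipaddress
import struct

Q_FLAG = 0x01  # least significant bit: IGMP Querier
P_FLAG = 0x02  # second low-order bit: PIM router


def _addr(text):
    """Return (length, packed bytes). Length in bits is an assumption."""
    ip = ipaddress.ip_address(text)
    return ip.max_prefixlen, ip.packed


def encode_mrd(rd, esi, eth_tag, originator, mcast_router,
               secondaries=(), dr_priority=1, flags=P_FLAG):
    """Pack the MRD route fields in the order shown in Figure 5."""
    if len(rd) != 8 or len(esi) != 10:
        raise ValueError("RD must be 8 octets and ESI 10 octets")
    orig_len, orig = _addr(originator)
    rtr_len, rtr = _addr(mcast_router)
    sec = [_addr(s) for s in secondaries]
    # All multicast router addresses share one Length: no IPv4/IPv6 mix.
    if any(length != rtr_len for length, _ in sec):
        raise ValueError("secondary addresses must match the primary")
    out = rd + esi + struct.pack("!I", eth_tag)
    out += bytes([orig_len]) + orig
    out += bytes([rtr_len]) + rtr
    # The list length counts addresses; zero means no secondary address.
    out += bytes([len(sec)]) + b"".join(packed for _, packed in sec)
    out += struct.pack("!IB", dr_priority, flags)
    return out


def route_key(rd, eth_tag, originator, mcast_router):
    """Fields that the text lists as the BGP route key."""
    return (rd, eth_tag) + _addr(originator) + _addr(mcast_router)
```

   For example, encode_mrd(bytes(8), bytes(10), 100, "192.0.2.1",
   "10.0.0.4", ["10.0.1.4"], flags=P_FLAG | Q_FLAG) describes a PIM
   router that is also the IGMP Querier. Two MRD routes that differ
   only in secondary addresses, DR Priority or Flags yield the same
   route_key value.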

4.2 Selective Multicast Ethernet Tag Route for PIM Proxy

   The extended SMET route used in this document is shown in Figure 6.

   NOTE: this route may reuse the SMET route type, or may be defined as
   a different route type (a PIM SMET route). This is TBD.

   +---------------------------------------+
   |  RD (8 octets)                        |
   +---------------------------------------+
   |  Ethernet Tag ID (4 octets)           |
   +---------------------------------------+
   |  Multicast Source Length (1 octet)    |
   +---------------------------------------+
   |  Multicast Source Address (variable)  |
   +---------------------------------------+
   |  Multicast Group Length (1 octet)     |
   +---------------------------------------+
   |  Multicast Group Address (variable)   |
   +---------------------------------------+
   |  Originator Router Length (1 octet)   |
   +---------------------------------------+
   |  Originator Router Address (variable) |
   +---------------------------------------+
   |  Flags (1 octet) (optional)           |
   +---------------------------------------+
   |  Upstream Router Length (1B)(optional)|
   +---------------------------------------+
   |  Upstream Router Addr (variable)(opt) |
   +---------------------------------------+

   Flags:

       0  1  2  3  4  5  6  7
      +--+--+--+--+--+--+--+--+
      |  |  |  | P|IE|v3|v2|v1|
      +--+--+--+--+--+--+--+--+

      Figure 6 Selective Multicast Ethernet Tag Route and Flags

   As in the case of the MRD route, this route type is OPTIONAL.

   This route will be used as per [EVPN-IGMP-MLD-PROXY], with the
   following extra and optional fields:

   o Upstream Router Length and Address will contain the same
     information as received in a PIM Join/Prune message on a local AC.
     There is only one Upstream Router Address per route.

   o Flags: This field encodes Flags that are now relevant to IGMP and
     PIM. The following new Flag is defined:

     - Flag P: Indicates the SMET route is generated by a received PIM
       Join on a local AC. When P=1, the Upstream Router Length and
       Address fields are present in the route. Otherwise the two
       fields will not be present.

   Compared to [EVPN-IGMP-MLD-PROXY] there is no change in terms of
   fields considered part of the route key for BGP processing. The
   Upstream Router Length and Address are not considered part of the
   route key.

4.3 PIM RPT-Prune Route

   The RPT-Prune route is analogous to the SMET route but for PIM
   RPT-Prune messages. The SMET routes cannot be used to convey
   RPT-Prune messages because they are always triggered by IGMP or PIM
   Join messages. A PIM RPT-Prune message is used to Prune a specific
   (S,G) from the RP Tree by downstream routers. An RPT-Prune message is
   typically seen prior to an RPT-Join message for the (S,G), hence it
   requires its own BGP route.

   +---------------------------------------+
   |  RD (8 octets)                        |
   +---------------------------------------+
   |  Ethernet Tag ID (4 octets)           |
   +---------------------------------------+
   |  Multicast Source Length (1 octet)    |
   +---------------------------------------+
   |  Multicast Source Address (variable)  |
   +---------------------------------------+
   |  Multicast Group Length (1 octet)     |
   +---------------------------------------+
   |  Multicast Group Address (variable)   |
   +---------------------------------------+
   |  Originator Router Length (1 octet)   |
   +---------------------------------------+
   |  Originator Router Address (variable) |
   +---------------------------------------+
   |  Upstream Router Length (1B)          |
   +---------------------------------------+
   |  Upstream Router Addr (variable)      |
   +---------------------------------------+

                    Figure 7 PIM RPT-Prune Route

   Fields are defined in the same way as for the SMET route.
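
   The relationship between the P flag and the Upstream Router fields
   in Sections 4.2 and 4.3 can be sketched as follows. This
   non-normative Python example rests on several assumptions that this
   document does not confirm: the Flags diagram is read with bit 0 as
   the most significant bit (so P is 0x10 and v1 is 0x01), Length
   fields are in bits, the Flags octet is always emitted, and only
   (S,G) entries are handled. The names encode_smet and
   encode_rpt_prune are hypothetical.

```python
import ipaddress
import struct

# Assumed bit values: Flags diagram read with bit 0 as the MSB.
FLAG_V1, FLAG_V2, FLAG_V3, FLAG_IE, FLAG_P = 0x01, 0x02, 0x04, 0x08, 0x10


def _len_addr(text):
    """Length octet (in bits, an assumption) followed by the address."""
    ip = ipaddress.ip_address(text)
    return bytes([ip.max_prefixlen]) + ip.packed


def encode_smet(rd, eth_tag, source, group, originator,
                flags=0, upstream=None):
    """Pack an SMET route for PIM proxy in the order of Figure 6."""
    # Upstream Router Length/Address are present if and only if P=1.
    if bool(flags & FLAG_P) != (upstream is not None):
        raise ValueError("Upstream Router fields require P=1")
    out = rd + struct.pack("!I", eth_tag)
    out += _len_addr(source) + _len_addr(group) + _len_addr(originator)
    out += bytes([flags])
    if upstream is not None:
        out += _len_addr(upstream)
    return out


def encode_rpt_prune(rd, eth_tag, source, group, originator, upstream):
    """Pack a PIM RPT-Prune route (Figure 7): no Flags octet, and the
    Upstream Router fields are always present."""
    out = rd + struct.pack("!I", eth_tag)
    for address in (source, group, originator, upstream):
        out += _len_addr(address)
    return out
```

   A PE that has both an IGMPv2 host and a PIM router interested in the
   same (S,G) would, under these assumptions, advertise a single route
   with flags=FLAG_V2 | FLAG_P and the upstream address taken from the
   PIM Join.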

4.4 IGMP/PIM Join Synch Route for PIM Proxy

   This document renames the IGMP Join Synch Route as IGMP/PIM Join
   Synch Route and extends it with new fields and Flags as shown in
   Figure 8:

   NOTE: this route may use and extend the IGMP Join Synch Route, or may
   turn out to be a different route type in future revisions. This is
   TBD.

   +----------------------------------------------+
   |  RD (8 octets)                               |
   +----------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)     |
   +----------------------------------------------+
   |  Ethernet Tag ID (4 octets)                  |
   +----------------------------------------------+
   |  Multicast Source Length (1 octet)           |
   +----------------------------------------------+
   |  Multicast Source Address (variable)         |
   +----------------------------------------------+
   |  Multicast Group Length (1 octet)            |
   +----------------------------------------------+
   |  Multicast Group Address (variable)          |
   +----------------------------------------------+
   |  Originator Router Length (1 octet)          |
   +----------------------------------------------+
   |  Originator Router Address (variable)        |
   +----------------------------------------------+
   |  Flags (1 octet)                             |
   +----------------------------------------------+
   |  Upstream Router Length (1B)(optional)       |
   +----------------------------------------------+
   |  Upstream Router Addr (variable)(opt)        |
   +----------------------------------------------+

   Flags:

       0  1  2  3  4  5  6  7
      +--+--+--+--+--+--+--+--+
      |  |  |  | P|IE|v3|v2|v1|
      +--+--+--+--+--+--+--+--+

            Figure 8 IGMP/PIM Join Synch Route and Flags

   This route will be used as per [EVPN-IGMP-MLD-PROXY], with the
   following extra and optional fields:

   o Upstream Router Length and Address will contain the same
     information as received in a PIM Join/Prune message on a local AC.
     There is only one Upstream Router Address per route.

   o Flags: This field encodes Flags that are now relevant to IGMP and
     PIM. The following new Flag is defined:

     - Flag P: Indicates the Join Synch route is generated by a received
       PIM Join on a local AC. When P=1, the Upstream Router Length and
       Address fields are present in the route. Otherwise the two
       fields will not be present.

   Compared to [EVPN-IGMP-MLD-PROXY] there is no change in terms of
   fields considered part of the route key for BGP processing. The
   Upstream Router Length and Address are not considered part of the
   route key.

4.5 IGMP/PIM RPT-Prune Synch Route for PIM Proxy

   This new route is used to Synch RPT-Prune states among the PEs in the
   Ethernet Segment.

   +----------------------------------------------+
   |  RD (8 octets)                               |
   +----------------------------------------------+
   |  Ethernet Segment Identifier (10 octets)     |
   +----------------------------------------------+
   |  Ethernet Tag ID (4 octets)                  |
   +----------------------------------------------+
   |  Multicast Source Length (1 octet)           |
   +----------------------------------------------+
   |  Multicast Source Address (variable)         |
   +----------------------------------------------+
   |  Multicast Group Length (1 octet)            |
   +----------------------------------------------+
   |  Multicast Group Address (variable)          |
   +----------------------------------------------+
   |  Originator Router Length (1 octet)          |
   +----------------------------------------------+
   |  Originator Router Address (variable)        |
   +----------------------------------------------+
   |  Upstream Router Length (1B)(optional)       |
   +----------------------------------------------+
   |  Upstream Router Addr (variable)(opt)        |
   +----------------------------------------------+

              Figure 9 IGMP/PIM RPT-Prune Synch Route

   The RD, Ethernet Segment Identifier and other fields are defined as
   for the IGMP/PIM Join Synch Route.
   In addition, the Upstream Router Length and Address will contain the
   same information as received in a PIM RPT-Prune message on a local
   AC. The Upstream Router points at the RP for the source and group,
   and there is only one Upstream Router Address per route.

   The route key for BGP processing is defined as per the IGMP/PIM Join
   Synch route.

5. Conclusions

   This document extends the IGMP Proxy concept of [EVPN-IGMP-MLD-PROXY]
   to PIM, so that EVPN can also be used to minimize the flooding of PIM
   control messages and optimize the delivery of IP multicast traffic in
   EVPN Broadcast Domains that connect PIM routers.

   This specification describes procedures to discover new PIM routers
   in the BD, as well as to propagate PIM Join/Prune messages using EVPN
   SMET routes, along with other optimizations.

6. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.

   In this document, the characters ">>" preceding an indented line(s)
   indicates a compliance requirement statement using the key words
   listed above. This convention aids reviewers in quickly identifying
   or finding the explicit compliance requirements of this RFC.

7. Security Considerations

   This section will be added in future versions.

8. IANA Considerations

   This document requests IANA to allocate the following new EVPN route
   types in the corresponding registry:

   + Type TBD - Multicast Router Discovery (MRD) Route
   + Type TBD - PIM RPT-Prune Route
   + Type TBD - PIM RPT-Prune Join Synch Route

   In addition, the following route defined in [EVPN-IGMP-MLD-PROXY]
   should be renamed as follows:

   + Type 7 - IGMP/PIM Join Synch Route

9. Terminology

   o EVI: EVPN Instance.

   o EVPN Broadcast Domain: it refers to an EVI in case of VLAN-based
     and VLAN-bundle interfaces. It refers to a Bridge Domain identified
     by an Ethernet-Tag (in the control plane) in case of VLAN-Aware
     Bundle interfaces.

   o AC: Attachment Circuit.

   o PIM-DM: Protocol Independent Multicast - Dense Mode.

   o PIM-SM: Protocol Independent Multicast - Sparse Mode.

   o PIM-SSM: Protocol Independent Multicast - Source-Specific
     Multicast.

   o S: IP address of the multicast source.

   o G: IP address of the multicast group.

   o N: Upstream neighbor field in a Join/Prune/Graft message.

   o PIM J/P: PIM Join/Prune messages.

   o RP: PIM Rendezvous Point.

   o MRD route: Multicast Router Discovery route.

   o PIM Nbr: PIM Neighbor.

10. References

10.1 Normative References

   [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
   Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
   VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015.

   [RFC4601] Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
   "Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol
   Specification (Revised)", RFC 4601, DOI 10.17487/RFC4601, August
   2006.

   [RFC2236] Fenner, W., "Internet Group Management Protocol, Version
   2", RFC 2236, DOI 10.17487/RFC2236, November 1997.

   [VPLS-PIM-PROXY] Dornon, O. et al, "Protocol Independent Multicast
   (PIM) over Virtual Private LAN Service (VPLS)", June 2017,
   work-in-progress, draft-ietf-pals-vpls-pim-snooping-06.

   [EVPN-IGMP-MLD-PROXY] Sajassi, A. et al, "IGMP and MLD Proxy for
   EVPN", March 2017, work-in-progress,
   draft-ietf-bess-evpn-igmp-mld-proxy-00.

10.2 Informative References

   [EVPN-PROXY-ARP-ND] Rabadan, J. et al, "Operational Aspects of
   Proxy-ARP/ND in EVPN Networks", April 2017, work-in-progress,
   draft-ietf-bess-evpn-proxy-arp-nd-02.

11. Acknowledgments

12. Contributors

13. Authors' Addresses

   Jorge Rabadan
   Nokia
   777 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jorge.rabadan@nokia.com

   Senthil Sathappan
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: senthil.sathappan@nokia.com

   Jayant Kotalwar
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jayant.kotalwar@nokia.com

   Zhaohui Zhang
   Juniper Networks
   Email: zzhang@juniper.net

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com