2 INTERNET DRAFT Y. Serbest 3 Internet Engineering Task Force SBC 4 Document: Ray Qiu 5 draft-serbest-l2vpn-vpls-mcast-03.txt Venu Hemige 6 July 2005 Alcatel 7 Category: Informational Rob Nath 8 Expires: January 2006 Riverstone 10 Supporting IP Multicast over VPLS 12 Status of this memo 14 By submitting this Internet-Draft, we represent that any applicable 15 patent or other IPR claims of which we are aware have been disclosed, 16 or will be disclosed, and any of which we become aware will be 17 disclosed in accordance with RFC 3668. 19 This document is an Internet-Draft and is in full conformance with 20 Sections 5 and 6 of RFC 3667 and Section 5 of RFC 3668. 22 Internet-Drafts are working documents of the Internet Engineering 23 Task Force (IETF), its areas, and its working groups. Note that other 24 groups may also distribute working documents as Internet-Drafts. 
26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt. 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 IPR Disclosure Acknowledgement 39 By submitting this Internet-Draft, each author represents that any 40 applicable patent or other IPR claims of which he or she is aware 41 have been or will be disclosed, and any of which he or she becomes 42 aware will be disclosed, in accordance with Section 6 of BCP 79. 44 Abstract 46 In Virtual Private LAN Service (VPLS), the PE devices provide a 47 logical interconnect such that CE devices belonging to a specific 48 VPLS instance appear to be connected by a single LAN. A VPLS 49 solution performs replication for multicast traffic at the ingress PE 50 devices. When replicated at the ingress PE, multicast traffic wastes 51 bandwidth when 1. Multicast traffic is sent to sites with no members, 52 and 2. Pseudo wires to different sites go through a shared path. 53 This document addresses the former by means of IGMP and PIM snooping. 55 Conventions used in this document 57 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 58 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 59 document are to be interpreted as described in RFC 2119. 61 Table of Contents 63 1. Contributing Authors............................................3 64 2. Introduction....................................................3 65 3. Overview of VPLS................................................4 66 4. Multicast Traffic over VPLS.....................................5 67 5. Constraining of IP Multicast in a VPLS..........................6 68 5.1. 
IPv6 Considerations...........................................7 69 5.2. General Rules for IGMP/PIM Snooping in VPLS...................7 70 5.3. IGMP Snooping for VPLS........................................8 71 5.3.1. Discovering Multicast Routers...............................9 72 5.3.2. IGMP Snooping Protocol State................................9 73 5.3.3. IGMP Join..................................................10 74 5.3.4. IGMP Leave.................................................14 75 5.3.5. Failure Scenarios..........................................15 76 5.3.6. Scaling Considerations for IGMP Snooping...................16 77 5.3.7. Downstream Proxy Behavior..................................16 78 5.3.8. Upstream Proxy Behavior....................................17 79 5.4. PIM Snooping for VPLS........................................17 80 5.4.1. PIM Snooping State Summarization Macros....................18 81 5.4.2. PIM-DM.....................................................20 82 5.4.2.1. Discovering Multicast Routers............................20 83 5.4.2.2. PIM-DM Multicast Forwarding..............................21 84 5.4.2.3. PIM-DM Pruning...........................................21 85 5.4.2.4. PIM-DM Grafting..........................................22 86 5.4.2.5. Failure Scenarios........................................23 87 5.4.3. PIM-SM.....................................................23 88 5.4.3.1. Discovering Multicast Routers............................24 89 5.4.3.2. PIM-SM (*,G) Join........................................24 90 5.4.3.3. PIM-SM Pruning...........................................26 91 5.4.3.4. PIM-SM (S,G) Join........................................27 92 5.4.3.5. PIM-SM (S,G,rpt) Prunes..................................28 93 5.4.3.6. PIM-SM (*,*,RP) State....................................28 94 5.4.3.7. Failure Scenarios........................................28 95 5.4.3.8. 
Special Cases for PIM-SM Snooping........................28 96 5.4.4. PIM-SSM....................................................30 97 5.4.4.1. Discovering Multicast Routers............................31 98 5.4.4.2. Guidelines for PIM-SSM Snooping..........................31 99 5.4.4.3. PIM-SSM Join.............................................32 100 5.4.4.4. PIM-SSM Prune............................................33 101 5.4.4.5. Failure Scenarios........................................33 102 5.4.4.6. Special Cases for PIM-SSM Snooping.......................33 103 5.4.5. Bidirectional-PIM (BIDIR-PIM)..............................33 104 5.4.5.1. Discovering Multicast Routers............................34 105 5.4.5.2. Guidelines for BIDIR-PIM Snooping........................35 106 5.4.5.3. BIDIR-PIM Join...........................................35 107 5.4.5.4. BIDIR-PIM Prune..........................................36 108 5.4.5.5. Failure Scenarios........................................37 109 5.4.6. Multicast Source Directly Connected to the VPLS Instance...37 110 5.5. VPLS Multicast on the Upstream PE............................37 111 5.5.1. Negotiating PIM Multicast capability in LDP................38 112 5.5.2. Exchanging PIM Hellos......................................38 113 5.5.3. Exchanging PIM Join/Prune States...........................39 114 5.5.3.1. PIM Join Suppression Issues..............................39 115 5.5.3.2. Resiliency against soft-state failures...................40 116 5.5.3.2.1. Explicit Tracking of C-Joins at the downstream PE......40 117 5.5.3.2.2. Refreshing PIM Join TLVs on the PWs....................41 118 5.5.3.3. PIM-BIDIR Considerations.................................41 119 5.6. Data Forwarding Rules........................................41 120 6. Security Considerations........................................41 121 7. References.....................................................42 122 7.1. 
Normative References.........................................42 123 7.2. Informative References.......................................42 125 1. Contributing Authors 127 This document was the combined effort of several individuals. The 128 following are the authors, in alphabetical order, who contributed to 129 this document: 131 Suresh Boddapati 132 Venu Hemige 133 Sunil Khandekar 134 Vach Kompella 135 Marc Lasserre 136 Rob Nath 137 Ray Qiu 138 Yetik Serbest 139 Himanshu Shah 141 2. Introduction 143 In Virtual Private LAN Service (VPLS), the Provider Edge (PE) devices 144 provide a logical interconnect such that Customer Edge (CE) devices 145 belonging to a specific VPLS instance appear to be connected by a 146 single LAN. The forwarding information base for a particular VPLS 147 instance is populated dynamically by source MAC address learning. This is a 148 straightforward solution to support unicast traffic, with reasonable 149 flooding for unknown unicast traffic. Since a VPLS provides LAN 150 emulation for IEEE bridges as well as for routers, the unicast and 151 multicast traffic need to follow the same path for layer-2 protocols 152 to work properly. As such, multicast traffic is treated as broadcast 153 traffic and is flooded to every site in the VPLS instance. 155 VPLS solutions (i.e., [VPLS-LDP] and [VPLS-BGP]) perform replication 156 for multicast traffic at the ingress PE devices. When replicated at 157 the ingress PE, multicast traffic wastes bandwidth when: 1. Multicast 158 traffic is sent to sites with no members, 2. Pseudo wires to 159 different sites go through a shared path, and 3. Multicast traffic is 160 forwarded along a shortest path tree as opposed to the minimum cost 161 spanning tree. This document addresses the first problem by means of 162 IGMP and PIM snooping. 
Using VPLS in conjunction with IGMP and/or PIM 163 snooping has the following advantages: 164 - It enables VPLS to support IP multicast efficiently (not 165 necessarily optimally, as there can still be bandwidth waste), 166 - It prevents sending multicast traffic to sites with no 167 members, 168 - It keeps P routers in the core stateless, 169 - The Service Provider (SP) does not need to perform the tasks 170 to provide multicast service (e.g., running PIM, managing P-group 171 addresses, managing multicast tunnels), 172 - The SP does not need to maintain PIM adjacencies with the 173 customers. 175 In this document, we describe the procedures for Internet Group 176 Management Protocol (IGMP) and Protocol Independent Multicast (PIM) 177 snooping over VPLS for efficient distribution of IP multicast 178 traffic. 180 3. Overview of VPLS 182 In the case of VPLS, the PE devices provide a logical interconnect such 183 that CE devices belonging to a specific VPLS appear to be connected 184 by a single LAN. End-to-end VPLS consists of a bridge module and a 185 LAN emulation module ([L2VPN-FR]). 187 In a VPLS, a customer site receives Layer-2 service from the SP. The 188 PE is attached via an access connection to one or more CEs. The PE 189 performs forwarding of user data packets based on information in the 190 Layer-2 header, that is, the destination MAC address. The CE sees a 191 bridge. 193 The details of the VPLS reference model, which we summarize here, can be 194 found in [L2VPN-FR]. In VPLS, the PE can be viewed as containing a 195 Virtual Switching Instance (VSI) for each L2VPN that it serves. A CE 196 device attaches, possibly through an access network, to a bridge 197 module of a PE. Within the PE, the bridge module attaches, through 198 an Emulated LAN Interface, to an Emulated LAN. For each VPLS, there 199 is an Emulated LAN instance. 
The Emulated LAN consists of VPLS 200 Forwarder modules (one per PE per VPLS service instance) connected by 201 pseudo wires (PWs), where the PWs may travel through Packet 202 Switched Network (PSN) tunnels over a routed backbone. A VSI is a 203 logical entity that contains a VPLS forwarder module and the part of the 204 bridge module relevant to the VPLS service instance [L2VPN-FR]. 205 Hence, the VSI terminates PWs for interconnection with other VSIs and 206 also terminates attachment circuits (ACs) for accommodating CEs. A 207 VSI includes the forwarding information base for an L2VPN [L2VPN-FR], 208 which is the set of information regarding how to forward Layer-2 209 frames received over the AC from the CE to VSIs in other PEs 210 supporting the same L2VPN service (and/or to other ACs), and contains 211 information regarding how to forward Layer-2 frames received from PWs 212 to ACs. Forwarding information bases can be populated dynamically 213 (such as by source MAC address learning) or statically (e.g., by 214 configuration). Each PE device is responsible for proper forwarding 215 of the customer traffic to the appropriate destination(s) based on 216 the forwarding information base of the corresponding VSI. 218 4. Multicast Traffic over VPLS 220 In VPLS, if a PE receives a frame from an Attachment Circuit (AC) 221 with no matching entry in the forwarding information base for that 222 particular VPLS instance, it floods the frame to all other PEs (which 223 are part of this VPLS instance) and to directly connected ACs (other 224 than the one from which the frame was received). The flooding of a 225 frame occurs when: 226 - The destination MAC address has not been learned, 227 - The destination MAC address is a broadcast address, 228 - The destination MAC address is a multicast address. 230 Malicious attacks (e.g., receiving unknown frames constantly) aside, 231 the first situation is handled by VPLS solutions as long as the 232 destination MAC address can be learned. 
From that point on, the 233 frames will not be flooded. A PE is REQUIRED to have safeguards, 234 such as unknown unicast limiting and MAC table limiting, against 235 malicious unknown unicast attacks. 237 There is no way around flooding broadcast frames. To prevent runaway 238 broadcast traffic from adversely affecting the VPLS service and the 239 SP network, a PE is REQUIRED to have tools to rate limit the 240 broadcast traffic as well. 242 Similar to broadcast frames, multicast frames are flooded as well, as 243 a PE cannot know where multicast members reside. Rate limiting 244 multicast traffic, while possible, should be done carefully 245 since several network control protocols rely on multicast. For one 246 thing, layer-2 and layer-3 protocols utilize multicast for their 247 operation. For instance, Bridge Protocol Data Units (BPDUs) use an 248 IEEE-assigned all-bridges multicast MAC address, and OSPF packets are 249 multicast to the all-OSPF-routers multicast MAC address. If the rate- 250 limiting of multicast traffic is not done properly, the customer 251 network will experience instability and poor performance. For 252 another, it is not straightforward to determine the right rate limiting 253 parameters for multicast. 255 A VPLS solution MUST NOT affect the operation of customer layer-2 256 protocols (e.g., BPDUs). Additionally, a VPLS solution MUST NOT 257 affect the operation of layer-3 protocols. 259 In the following section, we describe procedures to constrain the 260 flooding of IP multicast traffic in a VPLS. 262 5. Constraining of IP Multicast in a VPLS 264 The objective of improving the efficiency of VPLS for multicast 265 traffic has the following 266 constraints: 267 - The service is VPLS, i.e., a layer-2 VPN, 268 - In VPLS, ingress replication is required, 269 - There is no layer-3 adjacency (e.g., PIM) between a CE and a 270 PE. 
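The three flooding conditions listed in Section 4 above can be sketched as a simple decision function. This is a minimal illustration only: the forwarding information base is modeled as a dictionary from MAC address to interface, and all names are ours, not the draft's.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

def is_multicast(mac: str) -> bool:
    # A MAC address is multicast when the least significant bit of its
    # first octet (the I/G bit) is set.
    return int(mac.split(":")[0], 16) & 0x01 == 1

def must_flood(fib: dict, dest_mac: str) -> bool:
    # Flood when the destination is broadcast, multicast, or has not
    # been learned in the forwarding information base (unknown unicast).
    if dest_mac == BROADCAST:
        return True
    if is_multicast(dest_mac):
        return True   # absent snooping state, multicast is flooded
    return dest_mac not in fib
```

Snooping, as described in the following sections, refines only the multicast branch of this decision; broadcast and unknown unicast are always flooded.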
272 Under these circumstances, the most obvious approach is 273 implementation of IGMP and PIM snooping in VPLS. Other multicast 274 routing protocols such as DVMRP and MOSPF are outside the scope of 275 this document. 277 Another approach to constrain multicast traffic in a VPLS is to 278 utilize point-to-multipoint LSPs (e.g., [PMP-RSVP-TE]). In such a case, 279 one has to establish a point-to-multipoint LSP from a source PE (i.e., 280 the PE to which the source router is connected) to all other PEs 281 participating in the VPLS instance. In this case, if nothing is 282 done, all PEs will receive multicast traffic even if they do not have 283 any members hanging off of them. One can apply IGMP/PIM snooping, 284 but this time IGMP/PIM snooping should be done in P routers as well. 285 One can propose a dynamic way of establishing point-to-multipoint LSPs, 286 for instance by mapping IGMP/PIM messages to RSVP-TE signaling. One 287 should consider the effect of such an approach on the signaling load and 288 on the delay between the time the join request is received and the time the 289 traffic is received (this is important for IPTV applications, for 290 instance). This approach is outside the scope of this document. 292 In some extremely controlled cases, such as a ring topology 293 of PE routers with no P routers or a tree topology, the efficiency of 294 the replication of IP multicast can be improved. For instance, spoke 295 PWs of a hierarchical VPLS can be daisy-chained together and some 296 replication rules can be devised. These cases are not expected to be 297 common and will not be considered in this document. 299 In the following sub-sections, we provide some guidelines for the 300 implementation of IGMP and PIM snooping in VPLS. Snooping techniques 301 need to be employed on ACs at the downstream PEs. Snooping techniques 302 can also be employed on PWs at the upstream PEs. This may work well 303 for small to medium scale deployments. 
However, if there are a large 304 number of VPLS instances with a large number of PEs per instance, 305 then the amount of snooping required at the upstream PEs can 306 overwhelm the upstream PEs. In Section 5.5, we provide an 307 alternative approach using LDP to build multicast replication states 308 on the upstream PEs. Using a reliable mechanism like LDP allows the 309 upstream PEs to eliminate the requirement to snoop on PWs. It also 310 eliminates the need to refresh multicast states on the upstream PEs. 312 5.1. IPv6 Considerations 314 In VPLS, PEs forward Ethernet frames received from CEs and as such 315 are agnostic of the layer-3 protocol used by the CEs. However, as an 316 IGMP and PIM snooping switch, the PE would have to look deeper into 317 the IP and IGMP/PIM packets and build snooping state based on that. 318 As already stated, the scope of this document is limited to snooping 319 IGMP/PIM packets. So, we are concerned with snooping specific IP 320 payloads. Nonetheless, there are two IP versions a PE would have to 321 be able to interpret. IGMP is the group management protocol that 322 applies only to IPv4. MLD [MLD] is the equivalent of IGMPv2 defined 323 for IPv6. MLDv2 [MLDv2] is the equivalent of IGMPv3 defined for 324 IPv6. PIM runs on top of both IPv4 and IPv6. 326 This document mandates that a PE MUST be able to snoop IGMP and PIM 327 encapsulated as IPv4 payloads. The PE SHOULD also be capable of 328 snooping MLD/MLDv2 packets and PIM packets encapsulated as IPv6 329 payloads. If the PE cannot snoop IPv6 payloads, then it MUST NOT 330 build any snooping state for such multicast groups and MUST simply 331 flood any data traffic sent to such groups. This allows an IPv6- 332 unaware PE to perform the snooping function only on IPv4 multicast 333 groups. This is possible because an IPv4 multicast address and an 334 IPv6 multicast address never share the same MAC address. 
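The address-to-MAC mappings behind this observation can be illustrated with a short sketch (function names are ours): an IPv4 group maps to the 01:00:5E prefix plus the low 23 bits of the group address, while an IPv6 group maps to the 33:33 prefix plus the low 32 bits, so the two ranges can never collide.

```python
import ipaddress

def ipv4_mcast_mac(addr: str) -> str:
    # 01:00:5e prefix + low 23 bits of the IPv4 group address
    low23 = int(ipaddress.IPv4Address(addr)) & 0x7FFFFF
    return "01:00:5e:%02x:%02x:%02x" % (
        (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF)

def ipv6_mcast_mac(addr: str) -> str:
    # 33:33 prefix + low 32 bits of the IPv6 group address
    low32 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFFFF
    return "33:33:%02x:%02x:%02x:%02x" % (
        (low32 >> 24) & 0xFF, (low32 >> 16) & 0xFF,
        (low32 >> 8) & 0xFF, low32 & 0xFF)
```

For example, 224.0.0.1 maps to 01:00:5e:00:00:01, matching the all-hosts MAC used later in this document, while any IPv6 group yields a 33:33 MAC.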
336 To avoid confusion, this document describes the procedures for 337 IGMP/PIM snooping for IPv4. The procedures described for IGMP can 338 also be applied to MLD and MLDv2. Please refer to Section 3 of 339 [MAGMA-SNOOP] for a list of IPv4/IPv6 differences an IGMP/MLD 340 snooping switch has to be aware of. In addition to those 341 differences, some of the other differences of interest are: 343 - IPv4 multicast addresses map to multicast MAC addresses 344 starting with 01:00:5E and IPv6 multicast addresses map to 345 multicast MAC addresses starting with 33:33. So the MAC addresses 346 used for IPv4 and IPv6 never overlap. 348 5.2. General Rules for IGMP/PIM Snooping in VPLS 350 The following rules for the correct operation of IGMP/PIM snooping 351 MUST be followed. 353 Rule 1: IGMP and PIM messages forwarded by PEs MUST follow the split- 354 horizon rule for mesh PWs as defined in [VPLS-LDP]. 356 Rule 2: IGMP/PIM snooping states in a PE MUST be per VPLS instance. 358 Rule 3: If a PE does not have any entry in an IGMP/PIM snooping state 359 for multicast group (*,G) or (S,G), the multicast traffic to that 360 group in the VPLS instance MUST be flooded. 362 Rule 4: A PE MUST support PIM mode selection per VPLS instance via 363 CLI and/or EMS. Another option could be to deduce the PIM mode from 364 the RP address for a specific multicast group. For instance, an RP address 365 can be learned during the Designated Forwarder (DF) election stage 366 for Bidirectional-PIM. 368 5.3. IGMP Snooping for VPLS 370 IGMP is a mechanism to inform the routers on a subnet of a host's 371 request to become a member of a particular multicast group. IGMP is 372 a stateful protocol. The router (i.e., the querier) regularly 373 verifies that the hosts want to continue to participate in the 374 multicast groups by sending periodic queries, transmitted to the 375 all-hosts multicast group (IP:224.0.0.1, MAC:01-00-5E-00-00-01) on the 376 subnet. 
If the hosts are still interested in that particular 377 multicast group, they respond with a membership report message, 378 transmitted to the multicast group of which they are members. In 379 IGMPv1 [RFC1112], the hosts simply stop responding to IGMP queries 380 with membership reports, when they want to leave a multicast group. 381 IGMPv2 [RFC2236] adds a leave message that a host will use when it 382 needs to leave a particular multicast group. IGMPv3 [RFC3376] 383 extends the report/leave mechanism beyond multicast group to permit 384 joins and leaves to be issued for specific source/group (S,G) pairs. 386 In IGMP snooping, a PE snoops on the IGMP protocol exchange between 387 hosts and routers, and based on that restricts the flooding of IP 388 multicast traffic. In the following, we explore the mechanisms 389 involved in implementing IGMP snooping for VPLS. Please refer to 390 Figure 1 as an example of VPLS with IGMP snooping. In the figure, 391 Router 1 is the Querier. If multiple routers exist on a single 392 subnet (basically that is what a VPLS instance is), they can mutually 393 elect a designated router (DR) that will manage all of the IGMP 394 messages for that subnet. 396 VPLS Instance 397 +------+ AC1 +------+ +------+ AC4 +------+ 398 | Host |-----| PE |-------------| PE |-----|Router| 399 | 1 | | 1 |\ PW1to3 /| 3 | | 1 | 400 +------+ +------+ \ / +------+ +------+ 401 | \ / | 402 | \ / | 403 | \ /PW2to3 | 404 | \ / | 405 PW1to2| \ |PW3to4 406 | / \ | 407 | / \PW1to4 | 408 | / \ | 409 | / \ | 410 +------+ +------+ / \ +------+ +------+ 411 | Host | | PE |/ PW2to4 \| PE | |Router| 412 | 2 |-----| 2 |-------------| 4 |-----| 2 | 413 +------+ AC2 +------+ +------+ AC5 +------+ 414 | 415 |AC3 416 +------+ 417 | Host | 418 | 3 | 419 +------+ 421 Figure 1 Reference Diagram for IGMP Snooping for VPLS 423 5.3.1. Discovering Multicast Routers 425 A PE needs to discover the multicast routers in VPLS instances. 
This 426 is necessary because: 427 - The Designated Router can be different from the Querier on a LAN. 428 - It is not always the Querier that initiates PIM joins. 429 - Multicast traffic to the LAN could arrive from a non-querying 430 router because it could be the closest to the source. 432 As recommended in [MAGMA-SNOOP], the PEs can discover multicast 433 routers using the Multicast Router Discovery Protocol, or they can be 434 statically configured. Since multicast routing protocols other than 435 PIM are out of scope, multicast routers can also be discovered by 436 snooping PIM Hello packets as described in Section 5.4.2. 438 5.3.2. IGMP Snooping Protocol State 440 The IGMP snooping mechanism described here builds the following state 441 on the PEs. 443 For each VPLS Instance 444 o Set of Multicast Routers (McastRouters) in the VPLS instance 445 using mechanisms listed in Section 5.3.1. 446 o The IGMP Querying Router (Querier) in the VPLS instance. 448 For each Group entry (*,G) or Source Filtering entry (S,G) in a VPLS 449 instance 451 o Set of interfaces (ACs and/or PWs) from which IGMP membership 452 reports were received. For (*,G) entries, we will call this 453 set igmp_include(*,G). For (S,G) entries, we will call this 454 set igmp_include(S,G). 455 o Set of interfaces from which IGMPv3 hosts have requested to 456 not receive traffic from the specified sources. We will call 457 this set igmp_exclude(S,G). 459 On each interface I, for each (*,G) or (S,G) entry 460 o A Group Timer (GroupTimer(*,G,I)) representing the hold-time 461 for each downstream (*,G) report received on interface I. 462 o A Source Timer (SrcTimer(S,G,I)) representing the hold-time 463 for each downstream (S,G) report received on interface I. 465 5.3.3. 
IGMP Join 467 The IGMP snooping mechanism for joining a multicast group (for all 468 IGMP versions) works as follows: 470 - The PE does querier election (by tracking query messages and 471 the source IP addresses) to determine the Querier when there are 472 multiple routers present. Additionally, the query must be received 473 with a non-zero source IP address to perform the Querier election. 474 - At this point all PEs learn the place of the Querier. For 475 instance, for PE 1 it is behind PW1to3, for PE 2 behind PW2to3, 476 for PE 3 behind AC4, for PE 4 behind PW3to4. 477 - The Querier sends a membership query on the LAN. The 478 membership query can be either a general query or a group-specific 479 query. 480 - PE 3 replicates the query message and forwards it to all PEs 481 participating in the VPLS instance (i.e., PE 1, PE 2, PE 4). 482 - PE 3 keeps a state of {[McastRouters: AC4, PW3to4], [Querier: 483 AC4]}. 484 - All PEs then forward the query to ACs which are part of the 485 VPLS instance. 486 - Suppose that all hosts (Host 1, Host 2, and Host 3) want to 487 participate in the multicast group. 489 - Host 2 first (for the sake of the example) sends a membership 490 report to the multicast group (e.g., IP: 239.1.1.1, MAC: 01-00-5E- 491 01-01-01), of which Host 2 wants to be a member. 493 - PE 2 replicates the membership report message and forwards it 494 to all PEs participating in the VPLS instance (i.e., PE 1, PE 3, PE 495 4). 496 - PE 2 notes that there is a directly connected host, which is 497 willing to participate in the multicast group and updates its 498 state to {[McastRouters: PW2to3, PW2to4], [Querier: PW2to3], 499 [igmp_include(*,G): AC2], [GroupTimer(*,G,AC2)=GMI]}. 501 Guideline 1: A PE MUST forward a membership report message to ACs 502 that are part of "McastRouters" state. This is necessary to avoid 503 report suppression for other members in order for the PEs to 504 construct correct states and to not have any orphan receiver 505 hosts. 
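The (*,G) bookkeeping and Guideline 1 above can be sketched as follows. This is a simplified illustration with names of our choosing (VplsSnooper, on_report); timer expiry is omitted, as is the normal VPLS flooding of a report arriving on an AC toward the mesh PWs.

```python
GMI = 260  # Group Membership Interval in seconds (IGMPv2 default:
           # 2 * 125s query interval + 10s query response interval)

class VplsSnooper:
    """Per-VPLS-instance IGMP snooping state on one PE (a sketch)."""

    def __init__(self, mcast_routers):
        self.mcast_routers = set(mcast_routers)  # ACs/PWs toward routers
        self.igmp_include = {}   # group -> set of member interfaces
        self.group_timer = {}    # (group, interface) -> hold time

    def on_report(self, group, iface):
        # Record the member interface and (re)start its hold timer.
        self.igmp_include.setdefault(group, set()).add(iface)
        self.group_timer[(group, iface)] = GMI
        # Per Guideline 1: forward the report only toward multicast
        # routers, never to other member hosts (avoids report
        # suppression hiding receivers from the PEs).
        return [i for i in self.mcast_routers if i != iface]
```

Replaying the example, PE 2 with McastRouters = {PW2to3, PW2to4} processes Host 2's report on AC2, installs igmp_include(*,G) = {AC2} with GroupTimer = GMI, and forwards the report toward both router-facing PWs but not toward AC3.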
507 There are still some scenarios that can result in orphan receivers. 508 For instance, a multicast router and some hosts could be connected to 509 a customer layer-2 switch, and that layer-2 switch can be connected 510 to a PE via an AC. In such a scenario, the customer layer-2 switch 511 MUST perform IGMP snooping as well, and it MUST NOT forward the IGMP 512 report messages coming from the PE to the hosts directly connected to 513 it. There can be cases in which the layer-2 switch does not 514 have IGMP snooping capability or the device is a dumb hub/bridge. 515 In such cases, one can statically configure the AC, through which the 516 IGMP-incapable layer-2 device is connected, to be a (S,G)/(*,G) 517 member on the PE. This way, multicast traffic will always be sent to 518 the hosts connected to that layer-2 device, even if they do not send 519 joins because of join suppression. 521 Continuing with the example: 523 - PE 2 does not forward the membership report of Host 2 to Host 524 3. 525 - Per the guideline above, PE 1 does not forward the membership 526 report of Host 2 to Host 1. 527 - Per the guideline above, PE 3 does forward the membership 528 report of Host 2 to Router 1 (the Querier). 529 - PE 3 notes that there is a host in the VPLS instance, which 530 is willing to participate in the multicast group and updates its 531 state to {[McastRouters: AC4, PW3to4], [Querier: AC4], 532 [igmp_include(*,G): PW2to3], [GroupTimer(*,G,PW2to3)=GMI]} 533 regardless of the type of the query. 534 - Let us assume that Host 1 subsequently sends a membership 535 report to the same multicast group. 537 - PE 1 replicates the membership report message and forwards it 538 to all PEs participating in the VPLS instance (i.e., PE 2, PE 3, 539 PE 4). 541 - PE 1 notes that there is a directly connected host, which is 542 willing to participate in the multicast group. 
Basically, it 543 keeps a state of {[McastRouters: PW1to3, PW1to4], [Querier: 544 PW1to3], [igmp_include(*,G): AC1,PW1to2], 545 [GroupTimer(*,G,AC1)=GMI]}. 546 - Per Guideline 1, PE 2 does not forward the membership report 547 of Host 1 to Host 2 and Host 3. 548 - PE 3 and PE 4 receive the membership report message of Host 1 549 and check their states. Per Guideline 1, they send the report to 550 Router 1 and Router 2, respectively. They also update their states 551 to reflect Host 1. 552 - Now, Host 3 sends a membership report to the same multicast 553 group. 554 - PE 2 updates its state to {[McastRouters: PW2to3, PW2to4], 555 [Querier: PW2to3], [igmp_include(*,G): AC2,AC3,PW1to2], 556 [GroupTimer(*,G,AC3)=GMI]}. It then floods the report message to 557 all PEs participating in the VPLS instance. Per Guideline 1, PE 3 558 forwards the membership report of Host 3 to Router 1, and PE 4 559 forwards the membership report of Host 3 to Router 2. 561 At this point, all PEs have the necessary states to ensure that no 562 multicast traffic will be sent to sites with no members. 564 The previous steps work the same way for IGMPv1 and IGMPv2, whether 565 the query is general or group specific. 567 The group and source specific query of IGMPv3 is considered 568 separately below. In IGMPv3, there is no simple membership join or 569 leave report. IGMPv3 reports are one of IS_INCLUDE, IS_EXCLUDE, 570 ALLOW, BLOCK, TO_INCLUDE, TO_EXCLUDE. The PEs MUST implement the 571 "router behavior" portion of the state machine defined in Section 6 572 of [RFC3376]. 574 The IGMP snooping mechanism for joining a multicast group (for 575 IGMPv3) works as follows: 577 - The Querier sends a membership query to the LAN. The 578 membership query is a group and source specific query with a list of 579 sources (e.g., S1, S2, .., Sn). 580 - PE 3 replicates the query message and forwards it to all PEs 581 participating in the VPLS instance (i.e., PE 1, PE 2, PE 4).
582 - PE 3 keeps a state of {[McastRouters: AC4, PW3to4], [Querier: 583 AC4]}. 585 - All PEs then forward the query to ACs which are part of the 586 VPLS instance. 587 - Suppose that all hosts (Host 1, Host 2, and Host 3) want to 588 participate in the multicast group. Host 1 and Host 2 want to 589 subscribe to (Sn,G), and Host 3 wants to subscribe to (S3,G). 590 - Host 2 first (for the sake of the example) sends a membership 591 report message with group record type IS_INCLUDE for (Sn,G). 592 - PE 2 replicates the membership report message and forwards it 593 to all PEs participating in the VPLS instance (i.e., PE 1, PE 3, 594 PE 4). 595 - PE 2 notes that there is a directly connected host, which is 596 willing to participate in the multicast group and updates its 597 state to {[McastRouters: PW2to3, PW2to4], [Querier: PW2to3], 598 [igmp_include(Sn,G): AC2], [SrcTimer(Sn,G,AC2)=GMI]}. 599 - Per Guideline 1, PE 2 does not forward the membership report 600 of Host 2 to Host 3. 601 - Per Guideline 1, PE 1 does not forward the membership report 602 of Host 2 to Host 1. 603 - Per Guideline 1, PE 3 does forward the membership report of 604 Host 2 to Router 1 (the Querier). 605 - Per Guideline 1, PE 4 does forward the membership report of 606 Host 2 to Router 2. 607 - PE 3 notes that there is a host in the VPLS instance, which 608 is willing to participate in the multicast group. Basically, it 609 updates its state to {[McastRouters: AC4, PW3to4], [Querier: AC4], 610 [igmp_include(Sn,G): PW2to3], [SrcTimer(Sn,G,PW2to3)=GMI]}. 611 - Likewise, PE 4 updates its state to {[McastRouters: PW3to4, 612 AC5], [Querier: PW3to4], [igmp_include(Sn,G):PW2to4], 613 [SrcTimer(Sn,G,PW2to4)=GMI]}. 614 - Let us say Host 1 now sends a membership report message with 615 group record type IS_INCLUDE for (Sn,G). 616 - Similar procedures are followed by PEs as explained in the 617 previous steps. 
For instance, PE 1 updates its state to 618 {[McastRouters: PW1to3, PW1to4], [Querier: PW1to3], 619 [igmp_include(Sn,G): PW1to2, AC1], [SrcTimer(Sn,G,AC1)=GMI]}. PE 3 620 updates its state to {[McastRouters: AC4, PW3to4], [Querier: AC4], 621 [igmp_include(Sn,G): PW2to3, PW1to3], 622 [SrcTimer(Sn,G,PW1to3)=GMI]}. 623 - Finally, Host 3 sends a membership report message with group 624 record type IS_INCLUDE for (S3,G). 625 - PE 2 replicates the membership report message and forwards it 626 to all PEs participating in the VPLS instance (i.e., PE 1, PE 3, 627 PE 4). 628 - Per Guideline 1, PE 2 does not forward the membership report 629 of Host 3 to Host 2. 631 - Per Guideline 1, PE 1 does not forward the membership report 632 of Host 3 to Host 1. 633 - Per Guideline 1, PE 3 does forward the membership report of 634 Host 3 to Router 1. 635 - Per Guideline 1, PE 4 does forward the membership report of 636 Host 3 to Router 2. 637 - All PEs update their states accordingly. For instance, PE 2 638 updates its state to {[McastRouters: PW2to3, PW2to4], [Querier: 639 PW2to3], [igmp_include(S3,G): AC3], [igmp_include(Sn,G): PW1to2, 640 AC2], [SrcTimer(S3,G,AC3)=GMI]}. PE 4 updates its state to 641 {[McastRouters: AC5, PW3to4], [Querier: PW3to4], 642 [igmp_include(S3,G): PW2to4], [igmp_include(Sn,G): PW1to4, 643 PW2to4], [SrcTimer(S3,G,PW2to4)=GMI]}. 645 At this point, all PEs have the necessary states to not send multicast 646 traffic to sites with no members. 648 Based on the above description of IGMPv3-based snooping for VPLS, one may 649 conclude that the PEs MUST have the capability to store (S,G) state 650 and MUST forward/replicate traffic accordingly. This is, however, 651 not mandatory. A PE MAY keep only (*,G) based states rather than 652 state on a per (S,G) basis, with the understanding that this will result in 653 less efficient IP multicast forwarding within each VPLS instance.
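The trade-off described above, a PE that cannot hold per-source state falling back to coarser (*,G) state, can be sketched as follows. All structure and function names here are hypothetical, not from the draft.

```python
class ForwardingState:
    """Illustrative (S,G) snooping state with optional (*,G) fallback."""

    def __init__(self, per_source_capable=True):
        self.per_source_capable = per_source_capable
        self.include_sg = {}  # (source, group) -> set of member ports
        self.include_g = {}   # group -> set of member ports

    def learn(self, source, group, port):
        """Record a snooped membership for (source, group) on port."""
        if self.per_source_capable and source is not None:
            self.include_sg.setdefault((source, group), set()).add(port)
        else:
            # Fallback: collapse every (S,G) report into (*,G). Still
            # correct, but traffic for group G now reaches members that
            # subscribed to *any* source of G, i.e. less efficient.
            self.include_g.setdefault(group, set()).add(port)

    def out_ports(self, source, group):
        """Ports to replicate a frame from (source, group) towards."""
        # Prefer the exact (S,G) match; otherwise use the (*,G) entry.
        if (source, group) in self.include_sg:
            return self.include_sg[(source, group)]
        return self.include_g.get(group, set())
```

With per-source state, traffic from S3 reaches only the (S3,G) member ports; with the fallback, members of (Sn,G) receive S3's traffic too, which is exactly the loss of efficiency the text describes.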
655 Guideline 2: If a PE receives an unsolicited report message and 656 does not possess state for that particular multicast group, it MUST 657 flood that unsolicited membership report message to all PEs 658 participating in the VPLS instance, as well as to the multicast 659 router if one is locally attached. 661 5.3.4. IGMP Leave 663 The IGMP snooping mechanism for leaving a multicast group works as 664 follows: 666 - In the case of IGMPv2, when a PE receives a leave (*,G) 667 message from a host via its AC, it lowers the corresponding 668 GroupTimer(*,G,AC) to "Last Member Query Time" (LMQT). 669 - In the case of IGMPv3, when a PE receives a membership report 670 message with group record type of IS_EXCLUDE or TO_EXCLUDE or 671 BLOCK for (S,G) from a host via its AC, it lowers the 672 corresponding SrcTimer(S,G,AC) for all affected (S,G)s to LMQT. 674 In the following guidelines, a "leave (*,G)/(S,G) message" also means 675 an IGMPv3 membership report message with group record type of IS_EXCLUDE 676 or TO_EXCLUDE or BLOCK for (S,G). 678 Guideline 3: A PE MUST NOT forward a leave (*,G)/(S,G) message to 679 ACs participating in the VPLS instance, if the PE still has 680 locally connected hosts, or hosts connected over an H-VPLS spoke, in 681 its state. 683 Guideline 4: A PE MUST forward a leave (*,G)/(S,G) message to all 684 PEs participating in the VPLS instance. A PE MAY forward the 685 leave (*,G)/(S,G) message to the "McastRouters" ONLY, if there are 686 no member hosts in its state. 688 Guideline 5: If a PE does not receive a (*,G) membership report 689 from an AC before GroupTimer(*,G,AC) expires, the PE MUST remove 690 the AC from its state. In the case of IGMPv3, if a PE does not 691 receive a (S,G) membership report from an AC before the 692 SrcTimer(S,G,AC) expires, the PE MUST remove the AC from its 693 state. 695 5.3.5. Failure Scenarios 697 Up to now, we have not considered failures; they are the focus of 698 this section.
700 - In case the Querier fails (e.g., the AC to the Querier fails), 701 another router in the VPLS instance will be selected as the 702 Querier. The new Querier will be sending queries. In such 703 circumstances, the IGMP snooping states in the PEs will be 704 updated/overwritten by the same procedure explained above. 705 - In case a multicast router fails, the IGMP snooping states in 706 the PEs will be updated/overwritten by the multicast router 707 discovery procedures provided in Section 5.3.1. 708 - In case a host fails (e.g., the AC to the host fails), a PE 709 removes the host from its IGMP snooping state for that particular 710 multicast group. Guidelines 3, 4 and 5 still apply here. 711 - In case a PW (which is in IGMP snooping state) fails, the PEs 712 will remove the PW from their IGMP snooping state. For instance, 713 if PW1to3 fails, then PE 1 will remove PW1to3 from its state as 714 the Querier connection, and PE 3 will remove PW1to3 from its state 715 as one of the host connections. Guidelines 3, 4 and 5 still apply 716 here. After the PW is restored, the IGMP snooping states in the PEs 717 will be updated/overwritten by the same procedure explained above. 718 One can implement a dead timer before making any changes to IGMP 719 snooping states upon PW failure. In that case, IGMP snooping 720 states will be altered only if the PW cannot be restored before the 721 dead timer expires. 723 5.3.6. Scaling Considerations for IGMP Snooping 725 In scenarios where there are multiple ACs connected to a PE, it is 726 quite likely that IGMP membership reports for the same group are 727 received from multiple ACs. The normal behavior would be to have each 728 of the membership reports sent to McastRouters. But in scenarios 729 where many ACs send IGMP membership reports for the same groups, the 730 burden on all the other PEs can be overwhelming.
To make matters 731 worse, there can be a large number of hosts on the same AC that all 732 send IGMP membership reports for the same group. While [IGMPv2] 733 suggests the use of report suppression, [IGMPv3] does not. 734 Regardless, if hosts do not implement report suppression, this can be 735 a scaling issue on the PEs. This section outlines the optimization 736 suggested in [MAGMA-SNOOP] to perform a proxy-querying and proxy- 737 reporting function on the PEs to avoid report explosion. 739 For this optimization, we separate the IGMP group state on the 740 PEs into downstream state and upstream state. 742 Note that the following sub-sections describe the procedures for 743 (*,G). The same procedures must be extended to (S,G)s. Furthermore, 744 the behavior described is for a downstream PE. While it is very 745 important for downstream PEs to implement the proxy behavior 746 described here, the scalability issues are not as severe on upstream 747 PEs. Optimizing upstream PEs mainly alleviates the 748 burden on the upstream CEs. Nevertheless, the same procedures can be 749 applied to upstream PEs as well as an added optimization. The only 750 difference would be that ACs would be the upstream interface(s) and PWs 751 would be the downstream interface(s) for such PEs. 753 5.3.7. Downstream Proxy Behavior 755 When an IGMP membership report for a group is received on an AC, the 756 PE adds the AC to the corresponding igmp_include set and resets the 757 GrpTimer to GMI. 759 When an IGMP membership leave for a group is received on an AC, the 760 PE lowers the corresponding GrpTimer to LMQT and sends out a proxy 761 group-specific query on that AC. When sending the group-specific 762 query, the PE encodes 0.0.0.0 (or :: in the case of IPv6) in the source- 763 ip address field. If there is no other host interested in that group, 764 then the AC is removed from the corresponding igmp_include set after 765 the GrpTimer expires. 767 5.3.8.
Upstream Proxy Behavior 769 When the igmp_include set for a group becomes non-null, the PE sends 770 out a proxy IGMP Join report for that group to McastRouters. When the 771 igmp_include set for a group becomes empty, the PE sends out a proxy 772 IGMP Leave report for that group to McastRouters. 774 When the PE receives a general query, it replies with its current 775 snooping state for all groups and group-sources. It also forwards the 776 general query to all ACs thus removing the need for proxy general 777 queries. When the PE receives a group-specific or group-source 778 specific query, the PE does not forward such queries to the ACs. 779 Instead it replies with a proxy report if it has snooping state for 780 that group or group-source. When sending the proxy report, the PE 781 encodes 0.0.0.0 (or :: in the case of IPv6) in the source-ip address 782 field. 784 5.4. PIM Snooping for VPLS 786 IGMP snooping procedures described above provide efficient delivery 787 of IP multicast traffic in a given VPLS service when end stations are 788 connected to the VPLS. However, when VPLS is offered as a WAN 789 service it is likely that the CE devices are routers and would run 790 PIM between them. To provide efficient IP multicasting in such 791 cases, it is necessary that the PE routers offering the VPLS service 792 do PIM snooping. This section describes the procedures for PIM 793 snooping. 795 PIM is a multicast routing protocol, which runs exclusively between 796 routers. PIM shares many of the common characteristics of a routing 797 protocol, such as discovery messages (e.g., neighbor discovery using 798 Hello messages), topology information (e.g., multicast tree), and 799 error detection and notification (e.g., dead timer and designated 800 router election). On the other hand, PIM does not participate in any 801 kind of exchange of databases, as it uses the unicast routing table 802 to provide reverse path information for building multicast trees. 
803 There are a few variants of PIM. In PIM-DM ([PIM-DM]), multicast 804 data is pushed towards the members, similar to a broadcast mechanism. 805 PIM-DM constructs a separate delivery tree for each multicast group. 806 As opposed to PIM-DM, the other PIM variants (PIM-SM [RFC2362], PIM-SSM 807 [PIM-SSM], and BIDIR-PIM [BIDIR-PIM]) invoke a pull methodology 808 instead of a push technique. 810 PIM routers periodically exchange Hello messages to discover and 811 maintain stateful sessions with neighbors. After neighbors are 812 discovered, PIM routers can signal their intentions to join/prune 813 specific multicast groups. This is accomplished by having downstream 814 routers send an explicit join message (for the sake of 815 generalization, consider Graft messages for PIM-DM as join messages) 816 to the upstream routers. The join/prune message can be group 817 specific (*,G) or group and source specific (S,G). 819 In PIM snooping, a PE snoops on the PIM message exchange between 820 routers, and builds its multicast states. Based on the multicast 821 states, it forwards IP multicast traffic accordingly to avoid 822 unnecessary flooding. 824 5.4.1. PIM Snooping State Summarization Macros 826 The following sets are defined to help build the forwarding state on 827 a PE. Some sets may apply only to a subset of the PIM protocol 828 variants, as noted along with the definition of the sets. 830 pim_joins(*,G) = 831 Set of all downstream interfaces on which PIM (*,G) Joins are 832 received. This set applies only to PIM-SM, PIM-SSM, PIM-BIDIR. 834 pim_joins(S,G) = 835 Set of all downstream interfaces on which PIM (S,G) Joins are 836 received. This set applies only to PIM-SM, PIM-SSM. 838 All_Pim_DM_OifList = 839 If the upstream interface (the interface towards the upstream PIM 840 neighbor) is a PW, then this set is the set of all ACs on which there 841 are PIM neighbors.
If the upstream interface is an AC, then this is 842 the set of all interfaces (both AC and PW) on which there are PIM 843 neighbors. This set applies only to PIM-DM. 845 pim_prunes(S,G) = 846 Set of all downstream interfaces on which PIM (S,G) prunes are 847 received. This set applies only to PIM-DM. 849 pim_prunes(S,G,rpt) = 850 Set of all downstream interfaces on which PIM (S,G,rpt) prunes are 851 received. This set applies only to PIM-SM. 853 Pim_oiflist(*,G) = 854 Set of interfaces that PIM contributes to the list of outgoing 855 interfaces to which data traffic must be forwarded on a (*,G) match. 857 Pim_oiflist(S,G) = 858 Set of interfaces that PIM contributes to the list of outgoing 859 interfaces to which data traffic must be forwarded on an (S,G) match. 861 Note that pim_oiflist is not the complete list of outgoing interfaces 862 (oiflist). IGMP/MLD also contribute to this list. 864 For PIM-DM, 866 pim_oiflist(S,G) = All_Pim_DM_OifList (-) pim_prunes(S,G) 868 For PIM-SM and PIM-SSM, 870 Pim_inherited_oiflist(S,G,rpt) = pim_joins(*,G) (-) 871 pim_prunes(S,G,rpt) 873 pim_oiflist(*,G) = pim_joins(*,G) 875 pim_oiflist(S,G) = pim_inherited_oiflist(S,G,rpt) (+) 876 pim_joins(S,G) 878 For PIM-BIDIR, 880 Pim_oiflist(*,G) = DF(RP(G)) + pim_joins(*,G) 881 Where DF(RP(G)) is the AC/PW towards the router that is the 882 designated forwarder for RP(G). 884 In the following, the mechanisms involved for implementing PIMv2 885 ([RFC2362]) snooping in VPLS are specified. PIMv1 is out of the 886 scope of this document. Please refer to Figure 2 as an example of 887 VPLS with PIM snooping. 
889                          VPLS Instance
 890    +------+ AC1 +------+             +------+ AC4 +------+
 891    |Router|-----| PE   |-------------| PE   |-----|Router|
 892    |  1   |     |  1   |\   PW1to3  /|  3   |     |  4   |
 893    +------+     +------+ \         / +------+     +------+
 894                    |      \       /     |
 895                    |       \     /      |
 896                    |        \   /PW2to3 |
 897                    |         \ /        |
 898              PW1to2|          \         |PW3to4
 899                    |         / \        |
 900                    |        /   \PW1to4 |
 901                    |       /     \      |
 902                    |      /       \     |
 903    +------+     +------+ /         \ +------+     +------+
 904    |Router|     | PE   |/  PW2to4   \| PE   |     |Router|
 905    |  2   |-----|  2   |-------------|  4   |-----|  5   |
 906    +------+ AC2 +------+             +------+ AC5 +------+
 907                    |
 908                    |AC3
 909                 +------+
 910                 |Router|
 911                 |  3   |
 912                 +------+
 914          Figure 2 Reference Diagram for PIM Snooping for VPLS
 916 In the following sub-sections, snooping mechanisms for each variety 917 of PIM are specified. 919 5.4.2. PIM-DM 921 The key characteristic of PIM-DM is its flood-and-prune behavior. Shortest 922 path trees are built as a multicast source starts transmitting. 924 In Figure 2, the multicast source is behind Router 4, and all routers 925 have at least one receiver except Router 3 and Router 5. 927 5.4.2.1. Discovering Multicast Routers 929 The PIM-DM snooping mechanism for neighbor discovery works as 930 follows: 932 - To establish PIM neighbor adjacencies, PIM multicast routers 933 (all routers in this example) send PIM Hello messages to the ALL 934 PIM Routers group address (IPv4: 224.0.0.13, MAC: 01-00-5E-00-00- 935 0D) on every PIM-enabled interface. The IPv6 ALL PIM Routers 936 group is "ff02::d". In addition, PIM Hello messages are used to 937 elect the Designated Router (DR) for a multi-access network. In PIM-DM, 938 the DR acts as the Querier if IGMPv1 is used. Otherwise, the DR has no 939 function in PIM-DM. 941 Guideline 6: PIM Hello messages MUST be flooded in the VPLS 942 instance. A PE MUST populate its "PIM Neighbors" list according 943 to the snooping results. This is a general PIM snooping guideline 944 and applies to all variants of PIM snooping. 946 Guideline 7: For PIM-DM only.
pim_oiflist(S,G) is populated with 947 All_Pim_DM_OifList (the ACs/PWs in the "PIM Neighbors" list). 948 Changes to the "PIM Neighbors" list MUST be reflected in 949 All_Pim_DM_OifList. 951 - Every router starts sending PIM Hello messages. Per 952 Guideline 6, every PE replicates Hello messages and forwards them 953 to all PEs participating in the VPLS instance. 954 - Based on PIM Hello exchanges, PE routers populate PIM snooping 955 states as follows. PE 1: {[(,); Source:; Flood to: AC1, PW1to2, 956 PW1to3, PW1to4], [PIM Neighbors: (Router 1,AC1), (Router 2,Router 957 3,PW1to2), (Router 4,PW1to3), (Router 5,PW1to4)] }, PE 2: {[(,); 958 Source:; Flood to: AC2, AC3, PW1to2, PW2to3, PW2to4], [PIM 959 Neighbors: (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), 960 (Router 4,PW2to3), (Router 5,PW2to4)]}, PE 3: {[(,); Source:; 961 Flood to: AC4, PW1to3, PW2to3, PW3to4], [PIM Neighbors: (Router 962 1,PW1to3), (Router 2,Router 3,PW2to3), (Router 4,AC4), (Router 963 5,PW3to4)]}, PE 4: {[(,); Source:; Flood to: AC5, PW1to4, PW2to4, 964 PW3to4], [PIM Neighbors: (Router 1,PW1to4), (Router 2,Router 965 3,PW2to4), (Router 4,PW3to4), (Router 5,AC5)]}. The initial 966 pim_oiflist(S,G) is populated with the ACs/PWs in the PIM neighbor 967 list per Guideline 7. 968 - PIM Hello messages contain a Holdtime value, which tells the 969 receiver when to expire the neighbor adjacency (which is three 970 times the Hello period). 972 Guideline 8: If a PE does not receive a Hello message from a 973 router within its Holdtime, the PE MUST remove that router from 974 the PIM snooping state. If a PE receives a Hello message from a 975 router with the Holdtime value set to zero, the PE MUST remove that 976 router from the PIM snooping state immediately. PEs MUST track 977 the Hello Holdtime value per PIM neighbor. 979 5.4.2.2.
PIM-DM Multicast Forwarding 981 The PIM-DM snooping mechanism for multicast forwarding works as 982 follows: 984 - When the source starts sending traffic to multicast group 985 (S,G), PE 3 updates its state to PE 3: {[(S,G) ; Source: (Router 986 4,AC4); pim_oiflist(S,G): PW1to3, PW2to3, PW3to4], [PIM Neighbors: 987 (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router 4,AC4), 988 (Router 5,PW3to4)]}. AC4 is removed from the pim_oiflist list for 989 (S,G), since it is where the multicast traffic comes from. 991 Guideline 9: Multicast traffic MUST be replicated on a per-PW and 992 per-AC basis, i.e., even if there is more than one PIM neighbor behind a 993 PW/AC, only one copy MUST be sent to that PW/AC. 995 - PE 3 replicates the multicast traffic and sends it to the 996 other PE routers in its pim_oiflist(S,G). 997 - Consequently, all PEs update their states as follows. PE 1: 998 {[(S,G); Source: (Router 4,PW1to3); pim_oiflist(S,G): AC1], [PIM 999 Neighbors: (Router 1,AC1), (Router 2,Router 3,PW1to2), (Router 1000 4,PW1to3), (Router 5,PW1to4)]}, PE 2: {[(S,G); Source: (Router 1001 4,PW2to3); pim_oiflist(S,G): AC2, AC3], [PIM Neighbors: (Router 1002 1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router 4,PW2to3), 1003 (Router 5,PW2to4)]}, PE 4: {[(S,G); Source: (Router 4,PW3to4); 1004 pim_oiflist(S,G): AC5], [PIM Neighbors: (Router 1,PW1to4), (Router 1005 2,Router 3,PW2to4), (Router 4,PW3to4), (Router 5,AC5)]}. 1007 5.4.2.3. PIM-DM Pruning 1008 At this point all the routers (Router 1, Router 2, Router 3, Router 5) 1009 receive the multicast traffic. 1011 - However, Router 3 and Router 5 do not have any members for 1012 that multicast group, so they send prune messages for the 1013 multicast group to the ALL PIM Routers group address. PE 2 updates its 1014 state to PE 2: {[(S,G); Source: (Router 4,PW2to3); 1015 pim_prunes(S,G): AC3, pim_oiflist(S,G): AC2], [PIM Neighbors: 1016 (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router 1017 4,PW2to3), (Router 5,PW2to4)]}.
PE 4 removes Router 3 and 1018 Router 5 from its state as well. 1020 Guideline 10: The PIM-DM prune message MUST be forwarded towards 1021 the upstream PE only if pim_oiflist(S,G) became empty as a result 1022 of the received prune message. If pim_oiflist(S,G) was already 1023 null when the PIM-DM prune was received, then the prune MUST NOT 1024 be forwarded upstream. 1026 - PE 2 does not forward the prune message per Guideline 10. PE 1027 4 updates its state to PE 4: {[(S,G); Source: (Router 4,PW3to4); 1028 pim_prunes(S,G): AC5, pim_oiflist(S,G):], [PIM Neighbors: 1029 (Router 1,PW1to4), (Router 2,Router 3,PW2to4), (Router 4,PW3to4), 1030 (Router 5,AC5)]}. - PIM-DM prune messages contain a Holdtime value, which 1031 specifies how many seconds the prune state should last. 1033 Guideline 11: For PIM-DM only. A PE MUST keep the prune state for 1034 a PW/AC according to the Holdtime in the prune message, unless a 1035 corresponding Graft message is received. 1037 - Upon receiving the prune messages, PE 3 updates its 1038 state to PE 3: {[(S,G); Source: (Router 4,AC4); 1039 pim_prunes(S,G): PW3to4, pim_oiflist(S,G): PW1to3, PW2to3], 1040 [PIM Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), 1041 (Router 4,AC4), (Router 5, PW3to4)]}. 1043 Guideline 12: For PIM-DM only. To avoid overriding joins, a PE 1044 SHOULD suppress the PIM prune messages to directly connected 1045 routers (i.e., ACs), as long as there is a PW/AC in its 1046 corresponding pim_oiflist(S,G). 1048 - In this case, PE 1, PE 2, and PE 3 do not forward the prune 1049 messages to their directly connected routers. 1051 5.4.2.4. PIM-DM Grafting 1053 The multicast traffic is now flowing only to points in the network 1054 where receivers are present. 1056 Guideline 13: For PIM-DM only. A PE MUST remove the AC/PW from 1057 its corresponding prune state (pim_prunes(S,G)) when it receives a 1058 graft message from the AC/PW.
That is, the corresponding AC/PW 1059 MUST be added to the pim_oiflist(S,G) list. 1061 Guideline 14: For PIM-DM only. PIM-DM graft messages MUST be 1062 forwarded based on the destination MAC address. If the 1063 destination MAC address is 01-00-5E-00-00-0D, then the graft 1064 message MUST be flooded in the VPLS instance. PIM-DM graft 1065 messages MUST NOT be flooded if pim_oiflist is non-null. 1067 - For the sake of the example, suppose now Router 3 has a receiver 1068 for the multicast group (S,G). Router 3 sends a graft 1069 message, in IP unicast, to Router 4 to restart the flow of multicast 1070 traffic. PE 2 updates its state to PE 2: {[(S,G); Source: (Router 1071 4,PW2to3); pim_prunes(S,G): , pim_oiflist(S,G): AC2, AC3], [PIM 1072 Neighbors: (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), 1073 (Router 4,PW2to3), (Router 5,PW2to4)]}. PE 2 then forwards the 1074 graft message to PE 3 according to Guideline 14. 1075 - Upon receiving the graft message, PE 3 updates its state 1076 accordingly to PE 3: {[(S,G); Source: (Router 4,AC4); 1077 pim_prunes(S,G): PW3to4, pim_oiflist(S,G): PW1to3, PW2to3], [PIM 1078 Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router 1079 4,AC4), (Router 5,PW3to4)]}. 1081 5.4.2.5. Failure Scenarios 1083 Guideline 15: PIM Assert messages MUST be flooded in the VPLS 1084 instance. 1086 Guideline 16: If an AC/PW goes down, a PE MUST remove it from its 1087 PIM snooping state. 1089 Failures can be easily handled in PIM-DM snooping, as it uses a push 1090 technique. If an AC or a PW goes down, PEs in the VPLS instance will 1091 remove it from their snooping state (if the AC/PW is not already 1092 pruned). After the AC/PW comes back up, it will be automatically 1093 added to the snooping state by PE routers, as all PWs/ACs MUST be in 1094 the snooping state, unless they are pruned later on. 1096 5.4.3. PIM-SM 1098 The key characteristic of PIM-SM is its explicit-join behavior.
In this 1099 model, the multicast traffic is only sent to locations that 1100 specifically request it. The root node of a tree is the Rendezvous 1101 Point (RP) in the case of a shared tree, or the first hop router that is 1102 directly connected to the multicast source in the case of a shortest 1103 path tree. 1105 In Figure 2, the RP is behind Router 4, and all routers have at least 1106 one member except Router 3 and Router 5. 1108 As in the case with IGMPv3 snooping, we assume that the PEs have the 1109 capability to store (S,G) states for PIM-SM snooping and 1110 forward/replicate traffic accordingly. This is not mandatory. An 1111 implementation can fall back to (*,G) states if its hardware cannot 1112 support this. In such a case, multicast forwarding 1113 will be less efficient. 1115 5.4.3.1. Discovering Multicast Routers 1117 The PIM-SM snooping mechanism for neighbor discovery works the same 1118 way as the procedure defined in the PIM-DM section, with the exception of 1119 the PIM-DM-only guidelines. 1121 - Based on PIM Hello exchanges, PE routers populate PIM snooping 1122 states as follows. PE 1: {[(,); Flood to:], [PIM Neighbors: 1123 (Router 1,AC1), (Router 2,Router 3,PW1to2), (Router 4,PW1to3), 1124 (Router 5,PW1to4)]}, PE 2: {[(,); Flood to:], [PIM Neighbors: 1125 (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router 1126 4,PW2to3), (Router 5,PW2to4)]}, PE 3: {[(,); Flood to:], [PIM 1127 Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router 1128 4,AC4), (Router 5,PW3to4)]}, PE 4: {[(,); Flood to:], [PIM 1129 Neighbors: (Router 1,PW1to4), (Router 2,Router 3,PW2to4), (Router 1130 4,PW3to4), (Router 5,AC5)]}. 1132 To reduce the amount of PIM Join/Prune traffic in the VPLS network, 1133 it is important that the Explicit-Tracking capability be disabled between 1134 the CEs.
If a CE advertises tracking support, it is recommended that 1135 the PEs modify the tracking-support option in CE Hello packets before 1136 forwarding them to ensure that tracking support is disabled between 1137 the CEs. Otherwise, the mechanism listed for "JP_Optimization" 1138 throughout the PIM-SM and PIM-SSM sections of this document MUST NOT 1139 be employed. 1141 NOTE: The examples below are for scenarios where JP_Optimization is 1142 not employed. 1144 For PIM-SM to work properly, all routers within the domain must use 1145 the same mappings of group addresses to RP addresses. Currently, 1146 there are three methods for RP discovery: 1. Static RP configuration, 1147 2, Auto-RP, and 3. PIMv2 Bootstrap Router mechanism. 1149 5.4.3.2. PIM-SM (*,G) Join 1150 The PIM-SM snooping mechanism for joining a multicast group (*,G) 1151 works as follows: 1153 Guideline 18: PIM-SM join messages MUST be sent only to the remote 1154 PE, which is connected to the router to which the Join is 1155 addressed. 1156 JP_Optimization: The PIM-SM join message MUST be forwarded towards 1157 the upstream CE only if pim_joins(*,G) became non-empty as a 1158 result of the received join message. If pim_joins(*,G) was already 1159 non-null when the PIM-SM join was received, then the join MUST NOT 1160 be forwarded upstream. 1162 PIM-SM join messages MUST be sent only to the remote PE, which is 1163 connected to the router to which the Join is addressed. The remote 1164 PE can be determined by the "Upstream Neighbor Address" field of the 1165 Join message. The "Upstream Neighbor Address" can be correlated to a 1166 PW or an AC in the "PIM Neighbors" state. By Guideline 18, we are 1167 ensuring that the other routers that are part of the VPLS instance do 1168 not receive the PIM join messages and will initiate their own join 1169 messages if they are interested in receiving that particular 1170 multicast traffic. 
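Guideline 18 can be sketched as a lookup in the snooped neighbor table: the Join's "Upstream Neighbor Address" selects exactly one PW/AC, and the Join is sent only there. The table layout and function name below are hypothetical, illustrating the rule rather than a mandated implementation.

```python
def target_port_for_join(upstream_neighbor, pim_neighbors):
    """Map a Join's Upstream Neighbor Address to a single PW/AC.

    pim_neighbors: dict of router address -> PW/AC it was learned on,
    built from snooped PIM Hello messages (Guideline 6). Returns the
    one port to send the Join on, or None if the neighbor is unknown.
    """
    try:
        return pim_neighbors[upstream_neighbor]
    except KeyError:
        # Unknown upstream neighbor: with no snooping state to go on,
        # the safe default is to flood the Join in the VPLS instance.
        return None

# PE 1's neighbor table from the Figure 2 example:
pe1_neighbors = {
    "Router1": "AC1",
    "Router2": "PW1to2",
    "Router3": "PW1to2",
    "Router4": "PW1to3",
    "Router5": "PW1to4",
}
```

With this table, a (*,G) Join addressed to Router 4 leaves PE 1 only on PW1to3, so the other routers in the VPLS instance never see it and will originate their own Joins if interested, exactly as the text explains.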
1172 - Assume Router 1 wants to join the multicast group (*,G) and sends 1173 a join message for that group. PE 1 sends the join 1174 message to PE 3 per Guideline 18. 1176 Guideline 19: A PE MUST add a PW/AC to its pim_joins(*,G) list, if 1177 it receives a (*,G) join message from the PW/AC. 1179 - PE 1 updates its state as follows: PE 1: {[pim_joins(*,G): 1180 AC1], [PIM Neighbors: (Router 1,AC1), (Router 2,Router 3,PW1to2), 1181 (Router 4,PW1to3), (Router 5,PW1to4)]}. 1183 A periodic refresh mechanism is used in PIM-SM to maintain the proper 1184 state. PIM-SM join messages contain a Holdtime value, which 1185 specifies for how many seconds the join state should be kept. 1187 Guideline 20: If a PE does not receive a refresh join message from 1188 a PW/AC within its Holdtime, the PE MUST remove the PW/AC from its 1189 pim_joins(*,G) list. 1191 - All PEs update their states accordingly as follows: PE 1: 1192 {[pim_joins(*,G): AC1], [PIM Neighbors: (Router 1,AC1), (Router 1193 2,Router 3,PW1to2), (Router 4,PW1to3), (Router 5,PW1to4)]}, PE 2: 1194 {[(,); Flood to: ], [PIM Neighbors: (Router 1,PW1to2), (Router 1195 2,AC2), (Router 3,AC3), (Router 4,PW2to3), (Router 5,PW2to4)]}, PE 1196 3: {[pim_joins(*,G): PW1to3], [PIM Neighbors: (Router 1,PW1to3), 1197 (Router 2,Router 3,PW2to3), (Router 4,AC4), (Router 5,PW3to4)]}, 1198 PE 4: {[(,); Flood to: ], [PIM Neighbors: (Router 1,PW1to4), 1199 (Router 2,Router 3,PW2to4), (Router 4,PW3to4), (Router 5,AC5)]}.
1200 - After Router 2 joins the same multicast group, the states 1201 become as follows: PE 1: {[pim_joins(*,G): AC1], [PIM Neighbors: 1202 (Router 1,AC1), (Router 2,Router 3,PW1to2), (Router 4,PW1to3), 1203 (Router 5,PW1to4)]}, PE 2: {[pim_joins(*,G): AC2], [PIM Neighbors: 1204 (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router 1205 4,PW2to3), (Router 5,PW2to4)]}, PE 3: {[pim_joins(*,G): PW1to3, 1206 PW2to3], [PIM Neighbors: (Router 1,PW1to3), (Router 2,Router 1207 3,PW2to3), (Router 4,AC4), (Router 5,PW3to4)]}, PE 4: {[(,); Flood 1208 to: ], [PIM Neighbors: (Router 1,PW1to4), (Router 2,Router 1209 3,PW2to4), (Router 4,PW3to4), (Router 5,AC5)]}. 1210 - For the sake of example, Router 3 joins the multicast group. 1211 PE 2 sends the join message to PE 3. 1212 - Next Router 5 joins the group, and the states are updated 1213 accordingly: PE 1: {[pim_joins(*,G): AC1], [PIM Neighbors: (Router 1214 1,AC1), (Router 2,Router 3,PW1to2), (Router 4,PW1to3), (Router 1215 5,PW1to4)]}, PE 2: {[pim_joins(*,G): AC2, AC3], [PIM Neighbors: 1216 (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router 1217 4,PW2to3), (Router 5,PW2to4)]}, PE 3: {[pim_joins(*,G): PW1to3, 1218 PW2to3, PW3to4], [PIM Neighbors: (Router 1,PW1to3), (Router 1219 2,Router 3,PW2to3), (Router 4,AC4), (Router 5,PW3to4)]}, PE 4: 1220 {[pim_joins(*,G): AC5],[PIM Neighbors: (Router 1,PW1to4), (Router 1221 2,Router 3,PW2to4), (Router 4,PW3to4), (Router 5,AC5)]} 1223 At this point, all PEs have necessary states to not send multicast 1224 traffic to sites with no members. 1226 5.4.3.3. PIM-SM Pruning 1228 The PIM-SM snooping mechanism for leaving a multicast group works as 1229 follows: 1230 - Assume Router 5 sends a prune message. 1232 Guideline 21: PIM-SM prune messages MUST be flooded in the VPLS 1233 instance. 
   JP_Optimization: Instead of the above guideline, a PE MUST
   forward prune messages only towards the upstream CE, and only if
   pim_joins(*,G) becomes empty as a result of the received prune
   message.  If pim_joins(*,G) is non-empty after receiving the
   prune message, the PE MUST NOT forward the prune message.

   Guideline 22: A PE MUST remove a PW/AC from its pim_joins(*,G)
   list if it receives a (*,G) prune message from the PW/AC.  A
   prune-delay timer SHOULD be implemented to support prune
   override.  However, the prune-delay timer is not required if
   there is only one PIM neighbor on the AC/PW on which the prune
   was received.

   - PE 4 floods the (*,G) prune to the VPLS instance per Guideline
   21.  PE routers participating in the VPLS instance also forward
   the (*,G) prune to the ACs connected to the VPLS instance.  The
   states are updated as follows: PE 1: {[pim_joins(*,G): AC1], [PIM
   Neighbors: (Router 1,AC1), (Router 2,Router 3,PW1to2), (Router
   4,PW1to3), (Router 5,PW1to4)]}, PE 2: {[pim_joins(*,G): AC2,
   AC3], [PIM Neighbors: (Router 1,PW1to2), (Router 2,AC2), (Router
   3,AC3), (Router 4,PW2to3), (Router 5,PW2to4)]}, PE 3:
   {[pim_joins(*,G): PW1to3, PW2to3], [PIM Neighbors: (Router
   1,PW1to3), (Router 2,Router 3,PW2to3), (Router 4,AC4), (Router
   5,PW3to4)]}, PE 4: {[(,); Flood to: ], [PIM Neighbors: (Router
   1,PW1to4), (Router 2,Router 3,PW2to4), (Router 4,PW3to4), (Router
   5,AC5)]}.

   In PIM-SM snooping, prune messages are flooded by PE routers.  In
   such an implementation, PE routers may receive overriding join
   messages, which have no effect on the snooping state.

5.4.3.4. PIM-SM (S,G) Join

   The PIM-SM snooping mechanism for source- and group-specific
   joins works as follows:

   Guideline 23: A PE MUST add a PW/AC to its pim_joins(S,G) list if
   it receives a (S,G) join message from the PW/AC.
   The PE MUST forward the received join message towards the
   upstream CE.
   JP_Optimization: The PE MUST forward the Join message towards the
   upstream neighbor only if the pim_joins(S,G) list becomes non-
   empty as a result of the received join.  If the pim_joins(S,G)
   list was non-empty prior to receiving the join message, then the
   PE MUST NOT forward the join message.

   Guideline 24: A PE MUST remove a PW/AC from its pim_joins(S,G)
   list if it receives a (S,G) prune message from the PW/AC.  The PE
   MUST flood the prune message in the VPLS instance.  A prune-delay
   timer SHOULD be implemented to support prune override on the
   downstream AC/PW.  However, the prune-delay timer is not required
   if there is only one PIM neighbor on the AC/PW on which the prune
   was received.
   JP_Optimization: Instead of flooding the prune message in the
   VPLS instance, the PE MUST forward the prune message towards the
   upstream neighbor only if the pim_joins(S,G) list becomes empty
   as a result of the received prune.  If the pim_joins(S,G) list
   remains non-empty after receiving the prune message, then the PE
   MUST NOT forward the prune message.

   Guideline 25: A PE MUST prefer (S,G) state to (*,G) state if both
   S and G match.

5.4.3.5. PIM-SM (S,G,rpt) Prunes

   Guideline 26: When a PE receives a Prune(S,G,rpt) on an AC/PW, it
   MUST add the AC/PW to the pim_prunes(S,G,rpt) list.
   Additionally, if pim_snoop_inherited_olist(S,G,rpt) becomes
   empty, the PE MUST forward the Prune(S,G,rpt) towards the
   upstream neighbor.  If pim_snoop_inherited_olist(S,G,rpt) is
   still non-empty, then the PE MUST NOT forward the Prune(S,G,rpt).

5.4.3.6. PIM-SM (*,*,RP) State

   PIM-SM defines a (*,*,RP) state that is used when traffic needs
   to cross multicast domains.  A (*,*,RP) receiver requests that
   all multicast traffic within a PIM domain be sent to it.
   If the two multicast domains are both PIM-SM, they can use MSDP
   to leak multicast routes.  However, if one is PIM-SM and the
   other is PIM-DM (hence, MSDP cannot be used), then the border
   router would initiate a (*,*,RP) join to all RPs in the PIM-SM
   domain.

   If customers configure multiple, distinct PIM domains, PIM-SM
   snooping MUST support the (*,*,RP) state as well.  Depending on
   how likely this scenario is, future versions of this document may
   include (*,*,RP) states.

5.4.3.7. Failure Scenarios

   Failures can be easily handled in PIM-SM snooping, as it employs
   a state-refresh technique.  PEs in the VPLS instance will remove
   any entry for non-refreshing routers from their states.

5.4.3.8. Special Cases for PIM-SM Snooping

   There are some special cases to consider for PIM-SM snooping.
   The first is RP-on-a-stick.  The RP-on-a-stick scenario may occur
   when the Shortest Path Tree and the Shared Tree share a common
   Ethernet segment, as all routers will be connected over a
   multicast access network (i.e., the VPLS).  Such a scenario is
   handled cleanly by the PIM-SM rules (in particular, the incoming
   interface cannot also appear in the outgoing interface list).
   The second is the turnaround-router scenario, which occurs when
   the shortest path tree and the shared tree share a common path.
   The router at which these trees merge is the turnaround router.
   PIM-SM handles this case by having the turnaround router
   implement a proxy (S,G) join.

   There are some scenarios in which CE routers can receive
   duplicate multicast traffic.  Let us consider the scenario in
   Figure 3.
                              +------+  AC3  +------+
                              | PE2  |-------| R3   |
                             /|      |       |      |
                            / +------+       +------+
                           /      |              |
                          /       |              |
                  PW1to2 /        |              |
                        /         |           +-----+
                       /          | PW2to3    | Src |
                      /           |           +-----+
                     /            |              |
                    /             |              |
                   /              |              |
   +------+     +------+  PW1to3  +------+     +------+
   | R1   |     | PE1  |----------| PE3  |     | R4   |
   |      |-----|      |          |      |-----|      |
   +------+ AC1 +------+          +------+ AC4 +------+
                   |                  |
                   |AC2               |AC5
                +------+           +------+
                | R2   |           | R5   |   +---+
                |      |           |      |---|RP |
                +------+           +------+   +---+

                Figure 3: CE Routers Receive Duplicate Traffic

   In the scenario depicted in Figure 3, both R1 and R2 have two
   ECMP routes to reach the source Src.  Hence, R1 may pick R3 as
   its next hop ("Upstream Neighbor"), and R2 may pick R4 as its
   next hop.  As a result, both R1 and R2 will receive duplicate
   traffic.

   This issue can be solved as follows.  In addition to the PW/AC on
   which the join message is received (the downstream PW/AC), PEs
   can keep the PW/AC to which the join message is forwarded (the
   upstream PW/AC) in the "Flood to" list.  If the traffic arrives
   from a different PW/AC, that traffic is not forwarded downstream.
   Hence, in the example depicted in Figure 3, where the source is
   dual-homed to R3 and R4, R1 will receive (S,G) traffic only if it
   comes from PW1to2, and R2 will receive (S,G) traffic only if it
   comes from PW1to3.

   Again, in Figure 3, R1 may send a (S,G) join to R3, and R2 may
   send a (*,G) join to the RP behind R5.  In this scenario as well,
   both R1 and R2 will receive duplicate traffic, as Guideline 25
   does not help to prevent it.

   In this case, where R1 joins for (S,G) and R2 joins for (*,G), we
   can do the following.  The PEs SHOULD keep the upstream PW/AC in
   the state as described above.  In addition, the PEs need to act
   on (S,G,rpt) prunes and remove the related upstream PW/AC from
   the "Flood to" list of the (S,G) state copied from the (*,G)
   state.
   As a result, CEs will not receive duplicate traffic.

   However, there will still be wasted bandwidth, since the egress
   PE takes care of the duplicate-traffic problem.  We can further
   enhance the proposal by triggering the Assert mechanism in the CE
   routers.  The PE that detects the duplicate-traffic problem can
   simply remove the snooping state for that particular multicast
   group and send a "flush" message to the other PEs participating
   in the VPLS instance.  In turn, the other PEs also flush their
   snooping state for that multicast group.  As a result, all the
   PEs will flood the multicast traffic in the VPLS instance (by
   Rule 3).  Consequently, the CEs will perform Assert.  The flush
   message TLV can be sent over the targeted LDP sessions running
   among the PEs.  Future versions will include the details.

5.4.4. PIM-SSM

   The key characteristic of PIM-SSM is that it retains the explicit
   join behavior of PIM-SM but eliminates the shared tree and the
   rendezvous point.  In this model, a shortest path tree for each
   (S,G) is built with the first-hop router (the router directly
   connected to the multicast source) being the root node.  PIM-SSM
   is ideal for one-to-many multicast services.

   In Figure 2, S1 is behind Router 1, and S4 is behind Router 4.
   Routers 2 and 4 want to join (S1,G), while Router 5 wants to join
   (S4,G).

   We assume that the PEs have the capability to store (S,G) states
   for PIM-SSM snooping and to constrain the multicast flooding
   scope accordingly.  An implementation can fall back to (*,G)
   states if its hardware cannot support (S,G) states, at the cost
   of less efficient multicast forwarding.

5.4.4.1. Discovering Multicast Routers

   The PIM-SSM snooping mechanism for neighbor discovery works the
   same way as the procedure defined in the PIM-DM section, with the
   exception of the PIM-DM-only guidelines.
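The neighbor-discovery bookkeeping described above can be sketched as follows; the table mirrors the (Router, AC/PW) pairs shown in the PIM Neighbors state dumps. The class and method names are illustrative only, not taken from the draft.

```python
class NeighborTable:
    """Illustrative sketch: PIM neighbor discovery by Hello snooping."""

    def __init__(self):
        # C-router address -> AC/PW on which its Hellos are seen
        self.neighbors = {}

    def on_hello(self, router, port):
        # A snooped Hello binds the C-router to the AC/PW it arrived
        # on; the Hello itself is still flooded in the VPLS instance.
        self.neighbors[router] = port

    def port_towards(self, router):
        # Later used to constrain join forwarding, e.g. sending an
        # (S,G) join only towards the addressed upstream neighbor.
        return self.neighbors.get(router)
```

For example, after snooping Hellos, PE 1 would map Router 4 to PW1to3, matching the state dump in the walkthrough below.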
   - Based on PIM Hello exchanges, PE routers populate PIM snooping
   states as follows: PE 1: {[(,); Flood to:], [PIM Neighbors:
   (Router 1,AC1), (Router 2,Router 3,PW1to2), (Router 4,PW1to3),
   (Router 5,PW1to4)]}, PE 2: {[(,); Flood to:], [PIM Neighbors:
   (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router
   4,PW2to3), (Router 5,PW2to4)]}, PE 3: {[(,); Flood to:], [PIM
   Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router
   4,AC4), (Router 5,PW3to4)]}, PE 4: {[(,); Flood to:], [PIM
   Neighbors: (Router 1,PW1to4), (Router 2,Router 3,PW2to4), (Router
   4,PW3to4), (Router 5,AC5)]}.

5.4.4.2. Guidelines for PIM-SSM Snooping

   PIM-SSM snooping is actually simpler than PIM-SM snooping, and
   only the following guidelines (some of which repeat guidelines
   from the PIM-SM section) apply.

   Guideline 28: A PE MUST add a PW/AC to its pim_joins(S,G) list if
   it receives a (S,G) join message from the PW/AC.

   Guideline 29: PIM-SSM join messages MUST be sent only to the
   remote PE that is connected to the router to which the Join is
   addressed.
   JP_Optimization: The PE MUST forward the Join message towards the
   upstream neighbor only if the pim_joins(S,G) list becomes non-
   empty as a result of the received join.  If the pim_joins(S,G)
   list was non-empty prior to receiving the join message, then the
   PE MUST NOT forward the join message.

   Guideline 30: PIM prune messages MUST be flooded in the VPLS
   instance.  A prune-delay timer SHOULD be implemented to support
   prune override on the downstream AC/PW.  However, the prune-delay
   timer is not required if there is only one PIM neighbor on the
   AC/PW on which the prune was received.
   JP_Optimization: Instead of flooding the prune message in the
   VPLS instance, the PE MUST forward the prune message towards the
   upstream neighbor only if the pim_joins(S,G) list becomes empty
   as a result of the received prune.  If the pim_joins(S,G) list
   remains non-empty after receiving the prune message, then the PE
   MUST NOT forward the prune message.

   Guideline 31: If a PE does not receive a refresh join message
   from a PW/AC within its Holdtime, the PE MUST remove the PW/AC
   from its pim_joins(S,G) list.

   Guideline 32: A PE MUST remove a PW/AC from its pim_joins(S,G)
   list if it receives a (S,G) prune message from the PW/AC.  A
   prune-delay timer SHOULD be implemented to support prune
   override.

5.4.4.3. PIM-SSM Join

   The PIM-SSM snooping mechanism for joining a multicast group
   works as follows:

   - Assume Router 2 requests to join the multicast group (S1,G).
   - PE 2 updates its state and then sends the join message to PE 1.
   - All PEs update their states as follows: PE 1:
   {[pim_joins(S1,G): PW1to2], [PIM Neighbors: (Router 1,AC1),
   (Router 2,Router 3,PW1to2), (Router 4,PW1to3), (Router
   5,PW1to4)]}, PE 2: {[pim_joins(S1,G): AC2], [PIM Neighbors:
   (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router
   4,PW2to3), (Router 5,PW2to4)]}, PE 3: {[(,); Flood to: ], [PIM
   Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router
   4,AC4), (Router 5,PW3to4)]}, PE 4: {[(,); Flood to: ], [PIM
   Neighbors: (Router 1,PW1to4), (Router 2,Router 3,PW2to4), (Router
   4,PW3to4), (Router 5,AC5)]}.
   - Next, assume Router 4 sends an (S1,G) join message.
Following 1499 the same procedures, all PEs update their states as follows: PE 1: 1500 {[pim_joins(S1,G): PW1to2, PW1to3], [PIM Neighbors: (Router 1501 1,AC1), (Router 2,Router 3,PW1to2), (Router 4,PW1to3), (Router 1502 5,PW1to4)]}, PE 2: {[pim_joins(S1,G): AC2], [PIM Neighbors: 1503 (Router 1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router 1504 4,PW2to3), (Router 5,PW2to4)]}, PE 3: {[pim_joins(S1,G): AC4], 1505 [PIM Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), 1506 (Router 4,AC4), (Router 5,PW3to4)]}, PE 4: {[(,); Flood to: ], 1507 [PIM Neighbors: (Router 1,PW1to4), (Router 2,Router 3,PW2to4), 1508 (Router 4,PW3to4), (Router 5,AC5)]}. 1509 - Then, assume Router 5 requests to join the multicast group 1510 (S4,G). After the same procedures are applied, all PEs update 1511 their states as follows: PE 1: {[pim_joins(S1,G): PW1to2, PW1to3], 1512 [PIM Neighbors: (Router 1,AC1), (Router 2,Router 3,PW1to2), 1513 (Router 4,PW1to3), (Router 5,PW1to4)]}, PE 2: {[pim_joins(S1,G): 1514 AC2], [PIM Neighbors: (Router 1,PW1to2), (Router 2,AC2), (Router 1515 3,AC3), (Router 4,PW2to3), (Router 5,PW2to4)]}, PE 3: 1516 {[pim_joins(S1,G): AC4], [pim_joins(S4,G): PW3to4], [PIM 1517 Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router 1518 4,AC4), (Router 5,PW3to4)]}, PE 4: {[pim_joins(S4,G): AC5], [PIM 1519 Neighbors: (Router 1,PW1to4), (Router 2,Router 3,PW2to4), (Router 1520 4,PW3to4), (Router 5,AC5)]}. 1522 5.4.4.4. PIM-SSM Prune 1524 At this point, all PEs have necessary states to not send multicast 1525 traffic to sites with no members. 1527 The PIM-SSM snooping mechanism for leaving a multicast group works as 1528 follows: 1530 Assume Router 2 sends a (S1,G) prune message to leave the multicast 1531 group. The prune message gets flooded in the VPLS instance. 
   All PEs update their states as follows: PE 1: {[pim_joins(S1,G):
   PW1to3], [PIM Neighbors: (Router 1,AC1), (Router 2,Router
   3,PW1to2), (Router 4,PW1to3), (Router 5,PW1to4)]}, PE 2: {[deletes
   its (S1,G) state], [PIM Neighbors: (Router 1,PW1to2), (Router
   2,AC2), (Router 3,AC3), (Router 4,PW2to3), (Router 5,PW2to4)]},
   PE 3: {[pim_joins(S1,G): AC4], [pim_joins(S4,G): PW3to4], [PIM
   Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router
   4,AC4), (Router 5,PW3to4)]}, PE 4: {[pim_joins(S4,G): AC5], [PIM
   Neighbors: (Router 1,PW1to4), (Router 2,Router 3,PW2to4), (Router
   4,PW3to4), (Router 5,AC5)]}.

   In PIM-SSM snooping, prune messages are flooded by PE routers.
   In such an implementation, PE routers may receive overriding join
   messages, which have no effect on the snooping state.

5.4.4.5. Failure Scenarios

   Similar to PIM-SM snooping, failures can be easily handled in
   PIM-SSM snooping, as it employs a state-refresh technique.  The
   PEs in the VPLS instance will remove the entry for non-refreshing
   routers from their states.

5.4.4.6. Special Cases for PIM-SSM Snooping

   The duplicate-traffic scenarios depicted in Figure 3 apply to
   PIM-SSM snooping as well.  Again, the issue can be solved by the
   method described in Section 5.4.3.8.

5.4.5. Bidirectional PIM (BIDIR-PIM)

   BIDIR-PIM is a variation of PIM-SM.  The main differences between
   PIM-SM and BIDIR-PIM are as follows:
   - There are no source-based trees, and source-specific multicast
   is not supported (i.e., there are no (S,G) states) in BIDIR-PIM.
   - Multicast traffic can flow up the shared tree in BIDIR-PIM.
   - To avoid forwarding loops, one router on each link is elected
   as the Designated Forwarder (DF) for each RP in BIDIR-PIM.

   The main advantage of BIDIR-PIM is that it scales well for
   many-to-many applications.  However, the lack of source-based
   trees means that multicast traffic is forced to remain on the
   shared tree.

   In Figure 2, the RP for (*,G4) is behind Router 4, and the RP for
   (*,G1) is behind Router 1.  Router 2 and Router 4 want to join
   (*,G1), whereas Router 5 wants to join (*,G4).  On the VPLS
   instance, Router 4 is the DF for the RP of (*,G4), and Router 1
   is the DF for the RP of (*,G1).

5.4.5.1. Discovering Multicast Routers

   The BIDIR-PIM snooping mechanism for neighbor discovery works the
   same way as the procedure defined in the PIM-DM section, with the
   exception of the PIM-DM-only guidelines.
   - Based on PIM Hello exchanges, PE routers populate PIM snooping
   states as follows: PE 1: {[PIM Neighbors: (Router 1,AC1), (Router
   2,Router 3,PW1to2), (Router 4,PW1to3), (Router 5,PW1to4)]}, PE 2:
   {[PIM Neighbors: (Router 1,PW1to2), (Router 2,AC2), (Router
   3,AC3), (Router 4,PW2to3), (Router 5,PW2to4)]}, PE 3: {[PIM
   Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router
   4,AC4), (Router 5,PW3to4)]}, PE 4: {[PIM Neighbors: (Router
   1,PW1to4), (Router 2,Router 3,PW2to4), (Router 4,PW3to4), (Router
   5,AC5)]}.

   For BIDIR-PIM to work properly, all routers within the domain
   must know the address of the RP.  There are three methods to do
   this: 1. static RP configuration, 2. Auto-RP, and 3. the PIMv2
   Bootstrap mechanism.  Guideline 17 applies here as well.

   At RP discovery time, PIM routers elect a DF per subnet for each
   RP.  The DF election algorithm is as follows: all PIM neighbors
   in a subnet advertise their unicast route to the RP, and the
   router with the best route is elected DF.

   Guideline 33: All PEs MUST snoop the DF election messages and
   determine the DF for each RP.  The AC/PW towards the DF (DF(RP))
   MUST be added to the oiflist for each (*,G) whose RP(G) is RP.
   When DF(RP) changes, the oiflist MUST be updated accordingly.

   - In Figure 2, there is one RP (call it RPA) behind Router 4.
   Based on DF election messages, PE routers populate PIM snooping
   states as follows: PE 1: {[PIM Neighbors: (Router 1,AC1), (Router
   2,Router 3,PW1to2), (Router 4,PW1to3), (Router 5,PW1to4)],
   [DF(RPA): PW1to3]}, PE 2: {[PIM Neighbors: (Router 1,PW1to2),
   (Router 2,AC2), (Router 3,AC3), (Router 4,PW2to3), (Router
   5,PW2to4)], [DF(RPA): PW2to3]}, PE 3: {[PIM Neighbors: (Router
   1,PW1to3), (Router 2,Router 3,PW2to3), (Router 4,AC4), (Router
   5,PW3to4)], [DF(RPA): AC4]}, PE 4: {[PIM Neighbors: (Router
   1,PW1to4), (Router 2,Router 3,PW2to4), (Router 4,PW3to4), (Router
   5,AC5)], [DF(RPA): PW3to4]}.

5.4.5.2. Guidelines for BIDIR-PIM Snooping

   BIDIR-PIM snooping for Join and Prune messages is similar to
   PIM-SM snooping, and the following guidelines (some of which
   repeat guidelines from the PIM-SM section) apply.

   Guideline 34: A PE MUST add a PW/AC to its pim_joins(*,G) list if
   it receives a (*,G) join message from the PW/AC.

   Guideline 35: BIDIR-PIM join messages MUST be flooded to all PEs
   in the VPLS instance.  BIDIR-PIM join messages received on remote
   PEs MUST be forwarded only towards the router to which the Join
   is addressed.

   Guideline 36: BIDIR-PIM prune messages MUST be flooded in the
   VPLS instance.

   Guideline 37: If a PE does not receive a refresh join message
   from a PW/AC within its Holdtime, the PE MUST remove the PW/AC
   from its pim_joins(*,G) list.

   Guideline 38: A PE MUST remove a PW/AC from its pim_joins(*,G)
   list if it receives a (*,G) prune message from the PW/AC.  A
   prune-delay timer SHOULD be implemented to support prune
   override.

5.4.5.3.
BIDIR-PIM Join

   The BIDIR-PIM snooping mechanism for joining a multicast group
   works as follows:
   - As before, assume the RP for both G1 and G4 (RPA) is behind
   Router 4.  Assume Router 2 wants to join the multicast group
   (*,G1).  PE 2 sends the join message to the other PEs.  All PEs
   update their states as follows: PE 1: {[PIM Neighbors: (Router
   1,AC1), (Router 2,Router 3,PW1to2), (Router 4,PW1to3), (Router
   5,PW1to4)], [DF(RPA): PW1to3], [pim_joins(*,G1): PW1to2]}, PE 2:
   {[PIM Neighbors: (Router 1,PW1to2), (Router 2,AC2), (Router
   3,AC3), (Router 4,PW2to3), (Router 5,PW2to4)], [DF(RPA): PW2to3],
   [pim_joins(*,G1): AC2]}, PE 3: {[PIM Neighbors: (Router
   1,PW1to3), (Router 2,Router 3,PW2to3), (Router 4,AC4), (Router
   5,PW3to4)], [DF(RPA): AC4], [pim_joins(*,G1): PW2to3]}, PE 4:
   {[PIM Neighbors: (Router 1,PW1to4), (Router 2,Router 3,PW2to4),
   (Router 4,PW3to4), (Router 5,AC5)], [DF(RPA): PW3to4],
   [pim_joins(*,G1): PW2to4]}.
   - Next, assume Router 4 wants to join the multicast group (*,G1).
   All PEs update their states as follows: PE 1: {[PIM Neighbors:
   (Router 1,AC1), (Router 2,Router 3,PW1to2), (Router 4,PW1to3),
   (Router 5,PW1to4)], [DF(RPA): PW1to3], [pim_joins(*,G1): PW1to2,
   PW1to3]}, PE 2: {[PIM Neighbors: (Router 1,PW1to2), (Router
   2,AC2), (Router 3,AC3), (Router 4,PW2to3), (Router 5,PW2to4)],
   [DF(RPA): PW2to3], [pim_joins(*,G1): AC2, PW2to3]}, PE 3: {[PIM
   Neighbors: (Router 1,PW1to3), (Router 2,Router 3,PW2to3), (Router
   4,AC4), (Router 5,PW3to4)], [DF(RPA): AC4], [pim_joins(*,G1):
   PW2to3, AC4]}, PE 4: {[PIM Neighbors: (Router 1,PW1to4), (Router
   2,Router 3,PW2to4), (Router 4,PW3to4), (Router 5,AC5)], [DF(RPA):
   PW3to4], [pim_joins(*,G1): PW2to4, PW3to4]}.
   - Then, assume Router 5 wants to join the multicast group (*,G4).
   Following the same procedures, all PEs update their states as
   follows: PE 1: {[PIM Neighbors: (Router 1,AC1), (Router 2,Router
   3,PW1to2), (Router 4,PW1to3), (Router 5,PW1to4)], [DF(RPA):
   PW1to3], [pim_joins(*,G1): PW1to2, PW1to3], [pim_joins(*,G4):
   PW1to4]}, PE 2: {[PIM Neighbors: (Router 1,PW1to2), (Router
   2,AC2), (Router 3,AC3), (Router 4,PW2to3), (Router 5,PW2to4)],
   [DF(RPA): PW2to3], [pim_joins(*,G1): AC2, PW2to3],
   [pim_joins(*,G4): PW2to4]}, PE 3: {[PIM Neighbors: (Router
   1,PW1to3), (Router 2,Router 3,PW2to3), (Router 4,AC4), (Router
   5,PW3to4)], [DF(RPA): AC4], [pim_joins(*,G1): PW2to3, AC4],
   [pim_joins(*,G4): PW3to4]}, PE 4: {[PIM Neighbors: (Router
   1,PW1to4), (Router 2,Router 3,PW2to4), (Router 4,PW3to4), (Router
   5,AC5)], [DF(RPA): PW3to4], [pim_joins(*,G1): PW2to4, PW3to4],
   [pim_joins(*,G4): AC5]}.

5.4.5.4. BIDIR-PIM Prune

   At this point, all PEs have the necessary state to avoid sending
   multicast traffic to sites with no members.

   One example of the BIDIR-PIM snooping mechanism for leaving a
   multicast group works as follows:
   - Assume Router 2 wants to leave the multicast group (*,G1) and
   sends a (*,G1) prune message.  The prune message gets flooded in
   the VPLS instance.
   All PEs update their states as follows: PE 1: {[PIM Neighbors:
   (Router 1,AC1), (Router 2,Router 3,PW1to2), (Router 4,PW1to3),
   (Router 5,PW1to4)], [DF(RPA): PW1to3], [pim_joins(*,G1): PW1to3],
   [pim_joins(*,G4): PW1to4]}, PE 2: {[PIM Neighbors: (Router
   1,PW1to2), (Router 2,AC2), (Router 3,AC3), (Router 4,PW2to3),
   (Router 5,PW2to4)], [DF(RPA): PW2to3], [pim_joins(*,G1): PW2to3],
   [pim_joins(*,G4): PW2to4]}, PE 3: {[PIM Neighbors: (Router
   1,PW1to3), (Router 2,Router 3,PW2to3), (Router 4,AC4), (Router
   5,PW3to4)], [DF(RPA): AC4], [pim_joins(*,G1): AC4],
   [pim_joins(*,G4): PW3to4]}, PE 4: {[PIM Neighbors: (Router
   1,PW1to4), (Router 2,Router 3,PW2to4), (Router 4,PW3to4), (Router
   5,AC5)], [DF(RPA): PW3to4], [pim_joins(*,G1): PW3to4],
   [pim_joins(*,G4): AC5]}.

5.4.5.5. Failure Scenarios

   Once again, failures can be easily handled in BIDIR-PIM snooping,
   as it employs a state-refresh technique.  PEs in the VPLS
   instance will remove the entry for non-refreshing routers from
   their states.

5.4.6. Multicast Source Directly Connected to the VPLS Instance

   If there is a source in the CE network that connects directly
   into the VPLS instance, then multicast traffic from that source
   MUST be sent to all PIM routers on the VPLS instance, regardless
   of the outgoing interface list of the corresponding snooping
   state.  If (S,G)/(*,G) snooping state has already been formed on
   any PE, this will not happen under the current forwarding rules
   and guidelines: the (S,G)/(*,G) state may not send traffic
   towards all the routers.  So, in order to determine whether
   traffic needs to be flooded to all routers, a PE must be able to
   determine whether the traffic came from a host on that LAN.
   There are three ways to address this problem:
   - The PE would have to do ARP snooping to determine if a source
   is directly connected.
   - Another option is to configure all PEs to indicate that there
   are CE sources directly connected to the VPLS instance, and to
   disallow snooping for the groups to which such a source is going
   to send traffic.  This way, traffic from that source to those
   groups will always be flooded within the provider network.
   - A third option is to require that CE multicast sources always
   be placed behind a router.

5.5. VPLS Multicast on the Upstream PE

   An implementation MAY use native PIM snooping procedures on the
   upstream PE to build multicast state as described in the previous
   sections.  However, snooping on PWs may overwhelm the upstream
   PEs.  In this section, we propose an alternative approach that
   builds multicast state on the upstream PE and thus avoids
   snooping on the PWs.

   As discussed in the previous sections, VPLS multicast for the
   various flavors of PIM requires only the downstream PE(s) and the
   upstream PE to build multicast state for a given multicast flow.
   Join/Prune messages need only be sent towards the upstream CE,
   and it is wasteful to distribute and/or build multicast state on
   all PEs.

   Unless otherwise noted, an upstream PE in this section refers to
   the PE to which the upstream RPF neighbor in the C-instance is
   connected.

   [VPLS-LDP] already uses LDP as the infrastructure to build PWs
   between the PEs and to exchange VPLS information.  It suits VPLS
   multicast very well to leverage this existing infrastructure to
   send PIM multicast state information from the downstream PE to
   the upstream PE.  While the procedures described here are
   extensible to other IP multicast protocols as well, this draft
   defines the procedures for PIM.  The rules and procedures are
   described below.

5.5.1.
Negotiating PIM Multicast Capability in LDP

   When a PE is capable of exchanging PIM multicast states using
   LDP, it signals this capability to its peers.  When the two ends
   of a PW are both capable of exchanging PIM control information
   using LDP, the procedures described in the following subsections
   are employed.  Otherwise, the following procedures are simply
   skipped.

   If an LDP-multicast-capable PE determines that the other end of a
   PW is also LDP-multicast capable, it SHOULD turn off native PIM
   snooping procedures on that PW.  Otherwise, it MAY employ PIM
   snooping procedures to build multicast states.

   The packet format for exchanging PIM multicast capability
   information in LDP will be defined in a future revision of this
   draft.  One option is to define a bit in the LDP Hello Message to
   signal this capability.

5.5.2. Exchanging PIM Hellos

   We introduce a new PIM Hello TLV to carry the PIM Hellos received
   at the downstream PEs to all the other PEs using LDP.  This TLV
   is sent in the LDP Address Message to all other PEs.  The scope
   of this TLV is the VPLS instance specified in the FEC TLV in the
   LDP Address Message.

   When a new PIM Hello is received at the downstream PE, it is sent
   to all other PEs using the PIM Hello TLV.  Subsequently, if a PIM
   Hello received from a C-router is the same as the previous Hello
   received from that C-router, it SHOULD NOT be sent in the PIM
   Hello TLV.  If a PIM Hello received from a C-router is different
   from the previously received Hello from that C-router, the PE
   MUST send it in the PIM Hello TLV to all PEs.  If the downstream
   PE ages out a PIM Hello, it MUST send a PIM Hello TLV with a zero
   hold time to remove the Hello state on all other PEs.

   The downstream PE MUST also forward the PIM Hello on all PWs.
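The downstream PE's Hello-relay behavior described above (send a PIM Hello TLV only for new or changed Hellos, and a zero-hold-time TLV on age-out) can be sketched as follows. The class name and the callback-based framing are assumptions made for the example, not part of the draft.

```python
class HelloTlvSender:
    """Illustrative sketch of the downstream PE's Hello-to-LDP relay."""

    def __init__(self, send_tlv):
        self.send_tlv = send_tlv    # callback: (c_router, hello) -> None
        self.last_hello = {}        # C-router -> last Hello contents seen

    def on_hello(self, router, hello):
        # Relay only new or changed Hellos; a refresh identical to the
        # previous Hello is suppressed (SHOULD NOT be resent in the TLV).
        if self.last_hello.get(router) != hello:
            self.last_hello[router] = hello
            self.send_tlv(router, hello)

    def on_age_out(self, router):
        # An aged-out neighbor is advertised with a zero hold time so
        # remote PEs remove the Hello state; they never age it out
        # themselves.
        if router in self.last_hello:
            del self.last_hello[router]
            self.send_tlv(router, {"holdtime": 0})
```

The suppression of identical refreshes keeps the LDP-side signaling proportional to actual Hello changes rather than to the periodic Hello rate.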
   When a PE receives a PIM Hello TLV for a C-router, it replaces
   the old PIM Hello information with the new information received
   in the TLV.  The upstream PE never ages out PIM Hello state.  It
   MUST remove a PIM Hello state only when it receives a PIM Hello
   TLV with a zero hold time or when the PW is torn down.

   When a PE receives a new PIM Hello TLV, all multicast states with
   that PW as the RPF interface MUST be refreshed with PIM
   Join/Prune TLVs.

   The packet format for the PIM Hello TLV will be defined in a
   future revision of this draft.

5.5.3. Exchanging PIM Join/Prune States

   We introduce a new PIM Join/Prune TLV to advertise C-Join/Prune
   messages received at a downstream PE to the upstream PE.  This
   TLV is sent in the LDP Address Message to the upstream PE.  The
   scope of this TLV is the VPLS instance specified in the FEC TLV
   in the LDP Address Message.

   When a downstream PE receives a new C-Join/Prune message for a
   multicast state, it MUST send the C-Join/Prune message in a PIM
   Join/Prune TLV to the upstream PE.  If it receives C-Join/Prune
   information different from what was received before (e.g., a new
   hold time, a change in Join/Prune state, or a different RPF-
   neighbor field), the PE MUST send the Join/Prune message in a PIM
   Join/Prune TLV to the new upstream PE.  Note that a C-Join
   received for a new upstream PE does not necessarily imply that
   the state on the old upstream PE needs to be torn down; different
   C-routers may be sending C-Joins to different upstream RPF
   neighbors.

   If the downstream PE cannot determine which PE is the upstream
   PE, it SHOULD save the Join/Prune state and send the PIM
   Join/Prune TLV when it is able to make that determination, i.e.,
   when it receives a PIM Hello TLV on a PW from the corresponding
   C-RPF neighbor.

   The downstream PE MUST also forward the C-Join/Prune message.
   These packets will not be snooped at the upstream PE(s) and are
   intended only for the upstream C-router(s).

   The packet format for the PIM Join/Prune TLV will be defined in a
   future revision of this draft.

5.5.3.1. PIM Join Suppression Issues

   For VPLS multicast to work, the C-routers MUST disable PIM Join
   suppression.  However, it is our understanding that existing
   deployments from several vendors do not support the capability to
   disable PIM Join suppression.  If that is so, then VPLS multicast
   simply does not work if we multicast the C-Joins to all
   C-routers.  Moreover, the provider has no control over the
   configuration on a C-router (to ensure that C-Join suppression is
   disabled).

   If the downstream PE determines that PIM Join suppression is
   active in a VPLS instance, then it MUST unicast-forward the
   C-Joins towards the RPF-neighbor field in the C-Join.  This
   prevents the C-Join from being seen by the other C-routers.
   Since we recommend that the PE unicast-forward the C-Join/Prune
   packets, it is important to ensure that the PIM control packets
   are received in order at the upstream C-router.  To achieve this,
   the same ordering restrictions that apply to broadcast and
   unknown frames apply to PIM control packets.

5.5.3.2. Resiliency Against Soft-State Failures

   PIM is a soft-state protocol, so it is possible for packets to
   get dropped.  Even though the Join/Prune exchange between the PEs
   is reliable, if certain packets are not received at the
   downstream PE, stale state can be left on the upstream PE.
   Specifically, an RPF change on a downstream C-router results in a
   C-Prune message being sent to the old RPF neighbor and a C-Join
   being sent to the new RPF neighbor.  If the C-Prune message is
   not received at the downstream PE, then the downstream PE will
   not be able to forward that message to the upstream PE.
This will result in stale state on the upstream PE.

An implementation MUST implement one of the following two procedures
to handle this.

5.5.3.2.1. Explicit Tracking of C-Joins at the Downstream PE

This method achieves resiliency against soft-state failures without
requiring refreshes on PWs. If this method is used, the hold-time
encoded in the PIM Join/Prune TLVs MUST be set to infinity. This is
the recommended method, since it eliminates refreshes on PWs.

For each C-Join(S,G) received at the downstream PE on an AC, the
downstream PE MUST keep the following state:

   - The set of Upstream C-Router Addresses
     o Per Upstream C-Router, a set of:
       . Downstream C-Router Address
       . Downstream Join Expiry Timer

When a C-Join received at the downstream PE causes the set of
Downstream C-Routers for a (C-Source, C-Group, C-RPF-Neighbor) to
become non-empty, a PIM Join TLV MUST be sent to the upstream PE.

If the set of Downstream C-Routers for a (C-Source, C-Group,
C-RPF-Neighbor) becomes empty, a PIM Prune TLV MUST be sent to the
corresponding upstream PE. This may happen as a result of a received
C-Prune message or as a result of the Downstream Join Expiry Timer
firing.

5.5.3.2.2. Refreshing PIM Join TLVs on the PWs

If this method is employed, the downstream PE MUST forward received
C-Joins in the form of PIM Join TLVs on the PWs at periodic
intervals. The refresh interval across the PWs should be configurable
in multiples of the C-Join refresh interval. If this refresh multiple
is N, then every Nth C-Join refresh for a given multicast state MUST
also be sent as a PIM Join TLV to the upstream PE. The HoldTime field
in the PIM Join TLV MUST be set to ((N * HoldTime in C-Join) + 20)
seconds.

5.5.3.3. PIM-BIDIR Considerations

Unlike other PIM variants, in PIM-BIDIR a traffic source need not be
behind the RPF-neighbor. Traffic can come from any AC/PW, and it MUST
be forwarded by the switches. The following are the deviations from
the procedures defined earlier to handle PIM-BIDIR.

PIM-BIDIR Join/Prune TLVs MUST be forwarded to all PEs instead of
just the upstream PE towards the RP. PIM-BIDIR Join/Prune packets
MUST also be multicast-forwarded as is on all PWs.

5.6. Data Forwarding Rules

The final list of outgoing interfaces for a given (S,G) or (*,G) is
computed by combining the IGMP and PIM state summarization macros:

   OifList(*,G) = igmp_include(*,G) (+) pim_oiflist(*,G)

   OifList(S,G) = igmp_include(*,G) (-) igmp_exclude(S,G) (+)
                  igmp_include(S,G) (+) pim_oiflist(S,G)

If PIM Snooping is active for a given (*,G) or (S,G), then the PE
also tracks the upstream AC/PW as the RPF interface. Data traffic
MUST be forwarded ONLY IF it arrives on the RPF interface. If data
traffic arrives on any other interface, the following rules apply:

   - If the traffic arrives on an AC and the PE determines that the
     traffic is coming from a directly connected source, then the
     rules described in Section 5.4.6 apply.
   - Otherwise, it could be a PIM Assert scenario, and the rules
     described in Section 5.4.3.8 apply.

In the presence of only IGMP Snooping state, there is no RPF
interface to remember. In such a scenario, traffic should simply be
forwarded to the OifList after performing source interface pruning.

6. Security Considerations

Security considerations provided in the VPLS solution documents
(i.e., [VPLS-LDP] and [VPLS-BGP]) apply to this document as well.

7. References

7.1. Normative References

7.2. Informative References

[VPLS-LDP]    Lasserre, M., et al.,
"Virtual Private LAN Services over MPLS", work in progress.

[VPLS-BGP]    Kompella, K., et al., "Virtual Private LAN Service",
              work in progress.

[L2VPN-FR]    Andersson, L., et al., "L2VPN Framework", work in
              progress.

[PMP-RSVP-TE] Aggarwal, R., et al., "Extensions to RSVP-TE for
              Point to Multipoint TE LSPs", work in progress.

[RFC1112]     Deering, S., "Host Extensions for IP Multicasting",
              RFC 1112, August 1989.

[RFC2236]     Fenner, W., "Internet Group Management Protocol,
              Version 2", RFC 2236, November 1997.

[RFC3376]     Cain, B., et al., "Internet Group Management
              Protocol, Version 3", RFC 3376, October 2002.

[MAGMA-SNOOP] Christensen, M., et al., "Considerations for IGMP
              and MLD Snooping Switches", work in progress.

[PIM-DM]      Deering, S., et al., "Protocol Independent Multicast
              Version 2 Dense Mode Specification",
              draft-ietf-pim-v2-dm-03.txt, June 1999.

[RFC2362]     Estrin, D., et al., "Protocol Independent Multicast-
              Sparse Mode (PIM-SM): Protocol Specification",
              RFC 2362, June 1998.

[PIM-SSM]     Holbrook, H., et al., "Source-Specific Multicast for
              IP", work in progress.

[BIDIR-PIM]   Handley, M., et al., "Bi-directional Protocol
              Independent Multicast (BIDIR-PIM)", work in progress.

Authors' Addresses

Yetik Serbest
SBC Labs
9505 Arboretum Blvd.
Austin, TX 78759
Yetik_serbest@labs.sbc.com

Ray Qiu
Alcatel North America
701 East Middlefield Rd.
Mountain View, CA 94043
Ray.Qiu@alcatel.com

Venu Hemige
Alcatel North America
701 East Middlefield Rd.
Mountain View, CA 94043
Venu.hemige@alcatel.com

Rob Nath
Riverstone Networks
5200 Great America Parkway
Santa Clara, CA 95054
Rnath@riverstonenet.com

Suresh Boddapati
Alcatel North America
701 East Middlefield Rd.
Mountain View, CA 94043
Suresh.boddapati@alcatel.com

Sunil Khandekar
Alcatel North America
701 East Middlefield Rd.
Mountain View, CA 94043
Sunil.khandekar@alcatel.com

Vach Kompella
Alcatel North America
701 East Middlefield Rd.
Mountain View, CA 94043
Vach.kompella@alcatel.com

Marc Lasserre
Riverstone Networks
Marc@riverstonenet.com

Himanshu Shah
Ciena
hshah@ciena.com

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.

Full Copyright Statement

Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.