Network Working Group                                      Eric C. Rosen
Internet Draft                                                 Yiqun Cai
Expiration Date: February 2003                                Dan Tappan
                                                       IJsbrand Wijnands
                                                     Cisco Systems, Inc.

                                                           Yakov Rekhter
                                                  Juniper Networks, Inc.

                                                          Dino Farinacci
                                                  Procket Networks, Inc.

                                                             August 2002


                      Multicast in MPLS/BGP VPNs

                     draft-rosen-vpn-mcast-04.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.
   Note that other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   [RFC2547bis] describes a method of providing a VPN service. It
   specifies the protocols and procedures which must be implemented in
   order for a Service Provider to provide a unicast VPN. This document
   extends that specification by describing the protocols and procedures
   which a Service Provider must implement in order to support multicast
   traffic in a VPN, assuming that PIM [PIMv2] is the multicast routing
   protocol used within the VPN, and that the SP network can provide PIM
   as well.

Table of Contents

    1        Introduction
    2        Multicast Domains
    2.1      Multicast VRFs
    2.2      Multicast Tunnels
    2.3      PIM across the MD
    2.4      RPF Determination
    2.5      Avoiding Conflict with Internet Multicast
    2.6      Dense Mode
    2.7      Forwarding
    2.8      Scalability
    2.9      Increasing the Optimality
    2.10     Inter-Provider Considerations
    3        VPN-IP PIM-SM
    3.1      Multicast VRFs
    3.2      Use of VPN-IP addresses in PIM
    3.3      Forwarding
    3.4      Associating VPN-IP PIM Messages with VRFs
    3.5      The RPF Hint
    3.6      When a PE Sends a PIM Message to the Backbone
    3.7      PIM Bootstrap Messages
    4        Multicast Domains Using PIM NBMA Techniques
    5        Summary for Sub-IP Area
    5.1      Related Documents
    5.2      Where it Fits in the Picture of the Sub-IP Work
    5.3      Why is it Targeted at this WG
    5.4      Justification
    6        Intellectual Property Considerations
    7        Acknowledgments
    8        References
    9        Authors' Addresses

1. Introduction

   [RFC2547bis] describes a method of providing a VPN service. It
   specifies the protocols and procedures which must be implemented in
   order for a Service Provider to provide a unicast VPN. This document
   extends that specification by describing the protocols and procedures
   which a Service Provider must implement in order to support multicast
   traffic in a VPN, assuming that PIM [PIMv2] is the multicast routing
   protocol used within the VPN, and that the SP network can provide PIM
   as well. Familiarity with the terminology and procedures of
   [RFC2547bis] is presupposed. Familiarity with [PIMv2] is also
   presupposed.

   The discussion here must not be confused with discussions elsewhere
   of Internet multicast. What we are considering here is primarily
   Enterprise multicast; our goal is to allow an Enterprise which has a
   VPN service as defined in [RFC2547bis] to implement Enterprise
   multicasts using PIM-SM or PIM-DM.

   VPNs which are constructed according to [RFC2547bis] obtain optimal
   unicast routing through the SP backbone, even though:

     - the P routers do not maintain any routing information for the
       VPNs, or indeed, any per-VPN state at all, and

     - the CE routers at different sites do not maintain routing
       adjacencies with each other.

   Unfortunately, one cannot do quite so well with multicast routing.
   For optimal multicast routing, when a PE router receives a multicast
   data packet of a particular multicast group from a CE router, the
   packet must get to every other PE router which is on the path to a
   receiver of that group. It must not get to any other PEs. And it
   must not be unnecessarily replicated. Optimal routing requires a
   source-tree for the multicast group, which would mean that the P
   routers would have to maintain state for each transmitter of each
   multicast group in each VPN.

   While this would provide optimal multicast routing, it also requires
   an unbounded amount of state in the P routers, since the SP has no
   control whatsoever of the number of multicast groups in the VPNs that
   it supports. Nor does it have any control over the number of
   transmitters in each group, nor of the distribution of the receivers.

   In short, true optimal routing of VPN multicasts in the SP network
   does not appear to be scalable. For completeness, we do specify, in
   section 3 ("VPN-IP PIM-SM"), how one could provide true optimal
   routing of Sparse Mode VPN multicasts in the SP network. While we do
   not propose to adopt this solution, it is instructive to compare it
   with the solution we do propose in section 2.

   We also include, in section 4, a very brief description of a scheme
   which does not require any multicast routing state at all to be kept
   in the P routers, but in which all the replication is done in the PE
   routers.

   If we are willing to send multicasts along paths on which there are
   no receivers, then it is possible to support VPN multicasts, using
   exactly one multicast distribution tree for each VPN, and without
   requiring that all replication is done by the PE routers. If more
   than one site in a VPN may have multicast transmitters, it is best
   for this single tree to be a bidirectional tree [BIDIR]. (In
   environments, such as an ATM-LSR backbone, where bidirectional trees
   cannot be supported, a single shared tree can be used.)

   We describe in section 2 the procedures and protocols used to
   implement this solution, which we dub "Multicast Domains". It is
   this procedure which we propose for adoption.

   In unicast routing, a CE router is an adjacency of a PE router, and
   CE routers at different sites do NOT become adjacencies of each
   other. We retain this characteristic for multicast routing -- a CE
   router becomes a PIM adjacency of a PE router, and CE routers at
   different sites do NOT become adjacencies of each other.

   An Enterprise which uses PIM multicasting in its network before
   adopting the VPN service can transition to the VPN service while
   continuing to use whatever multicast router configurations it was
   previously using; no changes need be made to CE routers or to other
   routers at customer sites. Any dynamic RP-discovery procedures that
   are already in use may be left in place.

   The notion of a "VRF", defined in [RFC2547bis], is extended to
   include multicast routing entries as well as unicast routing entries.

2. Multicast Domains

   In this section, we describe the solution we are proposing for
   adoption.

   A "Multicast Domain" is essentially a set of VRFs associated with
   interfaces that can send multicast traffic to each other.

2.1. Multicast VRFs

   Each VRF has its own multicast routing table.
   When a multicast data or control packet is received from a particular
   CE device, multicast routing is done in the associated VRF.

   Each PE router runs a number of instances of PIM-SM, as many as one
   per VRF. In each instance of PIM-SM, the PE maintains a PIM
   adjacency with each of the PIM-capable CE routers associated with
   that VRF. The multicast routing table created by each instance is
   specific to the corresponding VRF. We will refer to these PIM
   instances as "VPN-specific PIM instances".

   Each PE router also runs a "provider-wide" instance of PIM-SM, in
   which it has a PIM adjacency with each of its IGP neighbors (i.e.,
   with P routers), but NOT with any CE routers, and not with other PE
   routers (unless they happen to be adjacent in the SP's network).

   In order to help clarify when we are speaking of the provider-wide
   PIM instance and when we are speaking of a VPN-specific PIM instance,
   we will use the prefixes "P-" and "C-" respectively. Thus a P-Join
   would be a PIM Join which is processed by the provider-wide PIM
   instance, and a C-Join would be a PIM Join which is processed by a
   VPN-specific PIM instance. A P-group address would be a group
   address in the SP's address space, and a C-group address would be a
   group address in a VPN's address space.

2.2. Multicast Tunnels

   Each multicast VRF is assigned to one or more multicast domains
   (MDs). Each such MD is configured with a multicast P-group address.
   As part of the configuration of the provider-wide PIM instance, an RP
   address (in the address space of the P network) is configured for
   each such P-group address. (Or the RP addresses may be discovered by
   any other acceptable procedure, such as PIM Bootstrap messages.)

   Each MD has a distinct P-group address. For each MD, a "Multicast
   Tunnel" (MT) is created in the provider-wide PIM instance, using
   ordinary PIM-SM techniques.
   The various PEs in the MD discover each other by joining the shared
   tree rooted at the RP. For best scalability, this should be a
   bidirectional tree [BIDIR].

   (Strictly speaking, the scheme works even if the MTs are realized by
   PIM source trees. However, this could result in large numbers of
   multicast distribution trees per MD, which would severely reduce the
   scalability of the scheme.)

   The MT is used to carry multicast C-packets, both data and control
   packets, among the PE routers in a common MD.

   To send a packet through an MT the packet must of course be
   encapsulated. This could be done either with MPLS or with GRE. If
   it is done with MPLS, then the "MPLS label distribution via PIM"
   procedures [MPLS-PIM] must be supported.

   When a packet is received from an MT, the receiving PE must be able
   to determine the MT (and hence the MD) from which the packet was
   received. (In the case of MPLS encapsulation, this will be
   determined from the incoming MPLS label; penultimate hop popping must
   not be performed.) The packet is then passed to the corresponding
   Multicast VRF and VPN-specific PIM instance for further processing.

2.3. PIM across the MD

   If a particular VRF is in a particular MD, the corresponding MT is
   treated by that VRF's VPN-specific PIM instance as a LAN interface.
   The PEs which are adjacent on the MT must execute the PIM LAN
   procedures, including the generation and processing of Assert
   packets. This allows VPN-specific PIM routes to be extended from
   site to site, without appearing in the P routers.

   If a PE in a particular MD transmits a C-multicast data packet to the
   backbone, by transmitting it through an MT, every other PE in that MD
   will receive it.
   Any of those PEs which are not on a C-multicast distribution tree for
   the packet's C-multicast destination address (as determined by
   applying ordinary PIM procedures to the corresponding multicast VRF)
   will have to discard the packet.

2.4. RPF Determination

   Although the MT is treated as a PIM-enabled interface, unicast
   routing is NOT run over it, and there are no unicast routing
   adjacencies over it.

   If a VRF is in a single MD:

     - a C-packet received over an MT is considered to pass the RPF
       check if the IGP next hop to its source address, according to the
       associated VRF, is not one of the interfaces associated with that
       VRF;

     - a C-Join/Prune message from a CE router needs to be forwarded
       over the MT if the next hop interface to the root of the
       corresponding multicast tree is not one of the interfaces
       associated with that VRF.

   If a VRF is in more than one MD, then the PE must be able to
   determine which MT is the RPF for a particular C-address. This can
   be done by means of BGP Extended Community attributes. Each MD can
   be associated with a BGP Extended Community attribute, into which the
   MD's group address is encoded. When a unicast VPN-IP address is
   distributed from a VRF which is in an MD, the address can carry
   Extended Community attributes which identify the MDs that the VRF
   belongs to. Then the PE can find the MT which is the RPF for a given
   address by looking at the Extended Community attributes of the
   corresponding route.

   The above specifies how to determine the RPF interface. To determine
   the RPF neighbor for a particular C-address, we need to first
   determine the BGP next hop for the corresponding VPN-IP address, then
   verify that the BGP next hop is a PIM neighbor on the RPF interface.

2.5. Avoiding Conflict with Internet Multicast

   If the SP is providing Internet multicast, distinct from its VPN
   multicast services, it must ensure that the P-group addresses which
   correspond to its MDs are distinct from any of the group addresses of
   the Internet multicasts it supports. This is best done by using
   administratively scoped addresses [ADMIN-ADDR].

   The C-group addresses need not be distinct from either the P-group
   addresses or the Internet multicast addresses.

2.6. Dense Mode

   Dense mode multicasts via PIM-DM are easily supported using MDs. The
   MT is still created using PIM-SM, and the PEs simply use PIM-DM
   procedures as necessary when transmitting C-data and C-control
   packets across the MT. Thus an Enterprise which uses dense mode
   multicasting can use the VPN service without changing its native
   multicasting techniques. The P routers are not aware of whether the
   Enterprise is using dense mode or sparse mode.

2.7. Forwarding

   The P routers will not be able to tell, from the contents of the
   C-packet as sent from CE to PE, which MT the packet should be sent
   along. Therefore the packets need to be encapsulated.

   If MPLS multicast [MPLS-PIM] is supported, then MPLS can be used for
   the encapsulation. This would require only a single MPLS label.
   Penultimate hop popping would not be used (otherwise the egress PE
   could not tell which MD the packet belongs to).

   Other encapsulations are also possible. For example, one could use a
   GRE encapsulation, with the MD's P-group address appearing in the IP
   destination address field. In this case, the SP must filter, at the
   edges of its network, all non-VPN packets carrying any of these
   P-group addresses in their destination address fields.
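
   The GRE alternative and the edge filter just described can be
   sketched as follows. This is only an illustration under stated
   assumptions, not part of the proposal: the function names, the TTL
   value, the use of IPv4, and the zeroed IP checksum are all
   hypothetical choices made to keep the example short.

```python
import ipaddress
import struct

GRE_PROTOCOL = 47          # IP protocol number assigned to GRE
GRE_PAYLOAD_IPV4 = 0x0800  # GRE protocol type for an IPv4 payload


def gre_encapsulate(c_packet: bytes, pe_address: str, p_group: str) -> bytes:
    """Wrap a C-packet so the MD's P-group is the outer IP destination."""
    gre_header = struct.pack("!HH", 0, GRE_PAYLOAD_IPV4)  # no optional fields
    total_length = 20 + len(gre_header) + len(c_packet)
    outer_ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_length,   # IPv4, 20-byte header, TOS 0
        0, 0,                    # identification, flags/fragment offset
        64, GRE_PROTOCOL, 0,     # TTL, protocol, checksum (left 0 in this sketch)
        ipaddress.IPv4Address(pe_address).packed,
        ipaddress.IPv4Address(p_group).packed,
    )
    return outer_ip + gre_header + c_packet


def edge_must_drop(outer_dst: str, md_p_groups: set, from_vpn_tunnel: bool) -> bool:
    """Edge filter: drop non-VPN packets addressed to any MD's P-group."""
    return (not from_vpn_tunnel) and outer_dst in md_p_groups
```

   Because the P-group address is the outer destination, the P routers
   forward on the MT without inspecting the C-packet, and the egress PE
   can recover the MD from that address.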

   If a PIM shared tree (RP-tree) is being used, rather than a bidir
   tree, and if MPLS encapsulation is being used, then Register packets
   must themselves be encapsulated in GRE before being encapsulated in
   MPLS. This is necessary in order to carry the MT's P-group address
   corresponding to the RP. Note that the RP cannot remove the GRE
   header before forwarding the packet, since the RP has no way of
   knowing that a particular packet is a tunneled VPN multicast packet,
   rather than an "ordinary" multicast. As a result, the GRE header
   would have to be used for all tunneled VPN multicast packets carried
   within MPLS, even if those packets are sent down a source tree.

2.8. Scalability

   While this procedure requires the P routers to maintain multicast
   state, the amount of state is bounded by the number of supported
   VPNs. The P routers do NOT run any VPN-specific PIM instances.

   The multicast routing provided by this scheme is not optimal, in that
   a packet of a particular multicast group may be forwarded to PE
   routers which have no downstream receivers for that group, and hence
   which may need to discard the packet.

   The use of a single bidirectional tree per VPN scales well as the
   number of transmitters and receivers increases, but not so well as
   the amount of multicast traffic per VPN increases.

2.9. Increasing the Optimality

   Suppose that for each MT we create not one, but a number of multicast
   distribution trees. One of these trees, the default tree, is joined
   by all PEs in the MD. PIM control messages from the CEs are
   forwarded along the default tree. However, multicast data messages
   are mapped to particular distribution trees depending on the source
   and group addresses that appear in them.
   The assignment of an (S,G) pair (or, in our terminology, a (C-S, C-G)
   pair) to a particular distribution tree would be done by the PE which
   receives the data from a CE (i.e., by the transmitting side), and
   indicated to the other PEs by a special PIM message sent on the
   default distribution tree.

   While every PE in an MD joins the default distribution tree for the
   corresponding MT, a PE does not join a non-default distribution tree
   unless it is connected to a VPN site which needs to receive traffic
   from a group which has been assigned to that tree.

   If it were known that certain C-groups have receivers at many VPN
   sites, but others have receivers only at a few VPN sites, the former
   could be mapped to the default tree, and the latter could be mapped
   to one or more non-default distribution trees. This could
   significantly reduce the amount of multicast data traffic that gets
   sent to PEs that do not need to receive it.

   Another, perhaps more feasible, approach is to keep all the low
   throughput groups on the default distribution tree, and to distribute
   the high throughput groups among the other distribution trees.

   Of course, any scheme like this requires still more state in the P
   routers, so again presents a trade-off between state and optimality.

2.10. Inter-Provider Considerations

   If there are multi-provider VPNs which require multicast, then an MD
   will cross provider boundaries. The multicast group address
   associated with the MT must then be agreed upon by the providers.

   [RFC2547bis] describes three methods for creating inter-provider
   VPNs:

     1. VRF-to-VRF connections at the AS border routers.

     2. EBGP redistribution of labeled VPN-IP routes from AS to
        neighboring AS.

     3. Multihop EBGP redistribution of labeled VPN-IP routes between
        source and destination ASes, with EBGP redistribution of
        labeled IP routes from AS to neighboring AS.

   The use of MDs for interprovider VPN multicast is compatible with
   methods 1 and 3, but not with method 2.

3. VPN-IP PIM-SM

   In this section, we present for completeness' sake a solution that,
   while it provides optimal multicast routing, we must deprecate due to
   its scalability problems.

   In this solution, PIM-SM is used to extend an (S,G) or (*,G)
   multicast distribution tree from a set of customer sites, through the
   SP backbone, to a set of customer sites.

   This solution must solve the following basic problems:

     - It extends the multicast routing of a VPN into the backbone,
       despite the facts that:

         * the unicast routing of that VPN is NOT extended into the
           backbone, and

         * PIM-SM assumes the presence of the unicast routing in order
           to determine the RPF interface for a multicast distribution
           tree.

     - It ensures that multicast routing entries of different VPNs are
       kept distinct in the backbone, even if the IP addresses
       corresponding to the respective S and G values of the (S,G)
       entries are not unique across VPNs.

     - It ensures that when a PE router receives a PIM Join/Prune
       message from the backbone, it associates that message with the
       proper VRF.

     - It properly handles PIM-SM Bootstrap Messages, which must be
       flooded along an RPF-tree away from the unicast route to the
       origin of the message.

3.1. Multicast VRFs

   Each PE router runs a number of instances of PIM-SM, as many as one
   per VRF. In each instance of PIM-SM, the PE maintains a PIM
   adjacency with each of the PIM-capable CE routers associated with
   that VRF. The multicast routing table created by each instance is
   specific to the corresponding VRF.

   Each PE router also runs a "provider-wide" instance of PIM-SM, in
   which it has a PIM adjacency with each of its IGP neighbors.
   Multicast routing data from a particular VRF is leaked into the
   provider-wide multicast routing table, but the address of the root of
   each multicast tree is first translated from an IP address to a
   VPN-IP address.

3.2. Use of VPN-IP addresses in PIM

   When multicast routing entries are distributed from a VRF to the
   provider-wide routing table, they are modified as follows. Consider
   an (S,G) entry in a VRF multicast routing table. This entry can only
   exist if there is a route to S in the corresponding unicast VRF. If
   there is a route to S in the unicast VRF, it will correspond to a
   VPN-IP route RD:S (where RD is the Route Distinguisher, see
   [RFC2547bis]). So the (S,G) entry in the multicast VRF becomes an
   (RD:S,G) entry in the provider-wide multicast routing table. This
   distinguishes the multicast distribution tree from any other (S,G)
   tree which comes from a different VPN.

   In the case of a shared tree, a (*,G) entry becomes an (RD:*,G)
   entry, where RD is the Route Distinguisher of the VPN-IP address of
   the tree's RP.

   P routers have only provider-wide multicast routing tables.
   Join/Prune messages are sent between P routers and between P and PE
   routers using the PIMv2 address family extensions, which allow the
   VPN-IP addresses to be encoded.

   In short, if two trees have the same value of G, but are in different
   VPNs, they are distinguished by means of the RD of the root of the
   tree.

3.3. Forwarding

   The contents of the IP header of a multicast packet are insufficient
   to determine the multicast tree that a particular packet is traveling
   on. So the packets must be encapsulated while being forwarded
   through the backbone, where the encapsulation can be used to uniquely
   associate each packet with an (RD:S,G) or (RD:*,G) entry. This is
   best done using MPLS multicast labels, and the MPLS label
   distribution technique specified in [MPLS-PIM].
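
   The translation of section 3.2, which produces the (RD:S,G) and
   (RD:*,G) entries referred to above, can be sketched as follows. The
   function name, the dictionary standing in for the unicast VRF, and
   the textual "RD:address" rendering are assumptions made for
   illustration only; they are not an encoding defined by this
   document.

```python
from typing import Dict, Optional, Tuple


def provider_key(group: str, root: str, unicast_rd: Dict[str, str],
                 shared: bool = False) -> Optional[Tuple[str, str]]:
    """Translate a VRF (S,G) or (*,G) entry into a provider-wide key.

    `root` is S for a source tree, or the RP address for a shared tree.
    `unicast_rd` maps an address reachable in the unicast VRF to the RD
    of its VPN-IP route (a hypothetical stand-in for a route lookup).
    """
    rd = unicast_rd.get(root)
    if rd is None:
        return None  # no unicast route to the root: the entry cannot exist
    source_part = "*" if shared else root
    return (f"{rd}:{source_part}", group)


# Two VPNs using the same S and G yield distinct provider-wide entries.
vpn_a = {"10.0.0.1": "65000:1"}
vpn_b = {"10.0.0.1": "65000:2"}
key_a = provider_key("239.1.1.1", "10.0.0.1", vpn_a)
key_b = provider_key("239.1.1.1", "10.0.0.1", vpn_b)
```

   A label table keyed by these tuples would then give each MPLS
   multicast label a unique tree, which is the association the
   encapsulation is required to provide.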

   Whereas unicast VPN packets generally carry two MPLS labels,
   multicast VPN packets would carry only one. When the egress PE
   receives a labeled multicast packet from the backbone, the top label
   tells it which CEs to send the packet to after the label is popped.

3.4. Associating VPN-IP PIM Messages with VRFs

   How does a PE, when it receives a PIM message from the backbone,
   associate it with a particular VRF?

   If the PIM message references a source tree, then the VPN-IP address
   of the source (RD:S) is in the PIM message. The PE finds the VRF
   from which the route to RD:S was exported, and associates the PIM
   message with the (S,G) entry of that VRF.

   If the PIM message references a shared tree, then the entries in
   the join and/or prune lists will each have an RD. The PE looks
   for a (*,G) entry whose RP has that RD. The presence of more than
   one such entry is considered a configuration error.

3.5. The RPF Hint

   When a P router receives a PIM Join/Prune message corresponding to a
   VPN-IP PIM message, it must be able to determine the IGP next hop
   towards the root of the specified multicast distribution tree.

   In ordinary PIM, determination of the next hop is easily done, since
   the IP address of the root of each multicast tree is known, and the
   backbone routers know the unicast route towards the root of the tree.
   However, in the case of a BGP/MPLS VPN the root of the multicast
   distribution tree will be within a VPN, and hence will not have a
   unique IP address. VPN-IP PIM must therefore use the VPN-IP address
   of the root of the multicast distribution tree. But the backbone
   routers do not know unicast routes for VPN-IP addresses.

   To solve this, the PE router, before sending a PIM Join/Prune
   message to a backbone router, must insert the address of its BGP
   next hop towards the root of the tree. Call this the RPF hint.
   This is generally the address of the PE router which attaches to the
   site containing the root. Backbone routers always have IGP routes
   to the PE routers' BGP next hops, and the IGP next hop towards the
   root of a tree is always the same as the IGP next hop towards the PE
   router which attaches to the site containing the root.

   When a P router receives one of these Join/Prune messages,
   instead of looking up the IGP next hop to the root of the
   specified multicast distribution tree, it looks up the IGP next hop
   of the RPF hint.

   The RPF hint is not an essential part of the identification of the
   multicast distribution tree. A change in the value of the RPF hint is
   regarded simply as an RPF change, which changes the shape of the
   tree, but which does not necessarily require construction of an
   entirely new tree.

3.6. When a PE Sends a PIM Message to the Backbone

   A PE router only sends a VPN-IP PIM Join to the backbone if it
   receives a PIM Join from a CE router. That Join will contain the IP
   address of the root of a particular multicast distribution tree. The
   PE looks up the unicast route to this address in the VRF associated
   with the CE. If this route exists in the VRF as a result of a VPN-IP
   route's having been imported from BGP, the corresponding VPN-IP route
   is identified. The VPN-IP address of the root of the tree can then be
   formed by appending its IP address to the RD of that VPN-IP route.

   This VPN-IP address, rather than the IP address, will be placed in
   the VPN-IP PIM Join List. (This applies in the case where the WC bit
   and the RPT bit are both 1, as well as the case where they are both
   0.)

   With respect to the (S,G) state maintained by a PE router, the "S"
   will be a VPN-IP address rather than an IP address.
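
   The lookup a PE performs on receiving a C-Join, together with the
   RPF hint of section 3.5, might be sketched like this. The VrfRoute
   record, the longest-prefix match over a plain list, and the textual
   "RD:address" form are hypothetical simplifications; a route with no
   RD stands for one learned from an attached CE rather than imported
   from BGP.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VrfRoute:
    prefix: str                  # unicast prefix installed in the VRF
    rd: Optional[str]            # RD of the imported VPN-IP route, if any
    bgp_next_hop: Optional[str]  # BGP next hop of that VPN-IP route


def vpn_ip_root(root_ip: str, vrf: List[VrfRoute]) -> Optional[Tuple[str, str]]:
    """Return (VPN-IP address of the tree root, RPF hint), or None.

    None means the root is not reached via an imported VPN-IP route, so
    no VPN-IP PIM Join is sent to the backbone for it.
    """
    address = ipaddress.ip_address(root_ip)
    matches = [r for r in vrf if address in ipaddress.ip_network(r.prefix)]
    best = max(matches,
               key=lambda r: ipaddress.ip_network(r.prefix).prefixlen,
               default=None)
    if best is None or best.rd is None or best.bgp_next_hop is None:
        return None
    # Append the IP address to the RD; the BGP next hop is the RPF hint.
    return (f"{best.rd}:{root_ip}", best.bgp_next_hop)


vrf_routes = [
    VrfRoute("10.1.0.0/16", "65000:1", "192.0.2.9"),  # imported from BGP
    VrfRoute("10.2.0.0/16", None, None),              # learned from a CE
]
```

   With these example routes, a Join for root 10.1.2.3 would carry
   65000:1:10.1.2.3 with RPF hint 192.0.2.9, while a Join for 10.2.0.5
   would be handled as an ordinary PIM Join towards the attached CE.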

   With respect to the (*,G) state maintained by a PE router, the
   address of the RP corresponding to the (*,G) tree will be maintained
   as a VPN-IP address.

   Similarly, if a CE prunes itself from a tree, and as a result the PE
   must prune itself from the tree, the VPN-IP address of the root of
   the tree will appear in the Prune List of the VPN-IP PIM Join/Prune
   message sent to the backbone. (This applies in the case where the WC
   bit and the RPT bit are both 1, as well as the case where they are
   both 0.)

   If a PE must prune a particular source from a (*,G) tree whose input
   interface leads to the backbone, then the prune list in the VPN-IP
   PIM Join/Prune message will contain a VPN-IP address whose RD is
   taken from the VPN-IP address of the (*,G) tree's RP, and whose IP
   address part is the IP address of the source being pruned. (This
   applies in the case where the WC bit is 0 and the RPT bit is 1.)

   When a router receives a VPN-IP PIM Join/Prune message which requests
   that a VPN-IP source be pruned off a shared tree, it identifies the
   shared tree by looking for a (*,G) entry with the specified value of
   G, and whose RP has a VPN-IP address with the specified RD.

   Note that all the sources transmitting to a particular group
   MUST have mutually unique source addresses, so it is not necessary
   to use an RD to identify the source when pruning the source off the
   shared tree. (Of course it is necessary to use an RD to identify
   the source when operating on a source tree.) It is however
   necessary to use an RD to identify the RP that corresponds to the
   shared tree.

   Of course, a PE router may receive a PIM Join/Prune from a CE
   router, and find that the RPF leads to a directly attached CE
   router, rather than to a PE router. In this case, an "ordinary" PIM
   Join/Prune message is just sent to the CE router.
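
   The three WC/RPT combinations discussed in this section determine
   which VPN-IP address goes into a Join or Prune list. A compact
   summary, written as a hypothetical helper (the function name,
   parameters, and the textual "RD:address" form are illustrative
   assumptions):

```python
def list_entry_address(wc: bool, rpt: bool,
                       source_ip: str, source_rd: str,
                       rp_ip: str, rp_rd: str) -> str:
    """Pick the VPN-IP address for a Join/Prune list entry."""
    if wc and rpt:
        # Shared tree: the root is the RP, identified by its own RD.
        return f"{rp_rd}:{rp_ip}"
    if not wc and not rpt:
        # Source tree: the root is the source, identified by its own RD.
        return f"{source_rd}:{source_ip}"
    if not wc and rpt:
        # Source pruned off a shared tree: RD of the RP, IP of the source.
        return f"{rp_rd}:{source_ip}"
    raise ValueError("WC=1 with RPT=0 is not a case described in the text")
```

   The third branch reflects the observation above that sources within
   one group have mutually unique addresses, so the RD only needs to
   identify the shared tree's RP.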

   If multicast is being done by a multi-provider VPN, the VPN-IP PIM
   messages have to be processed by and forwarded by the BGP border
   routers. Further, the RPF hint put in the VPN-IP PIM message by the
   ingress PE will be the address of a border router, rather than the
   address of the egress PE. Thus a border router processing a VPN-IP
   PIM message has to replace the RPF hint with the address of its own
   BGP next hop towards the VPN-IP address of the root of the multicast
   distribution tree which the VPN-IP PIM message references.

3.7. PIM Bootstrap Messages

   If a particular VPN uses PIM Bootstrap Messages to do auto-discovery
   of RPs, and the SP is providing VPN multicast service via the VPN-IP
   PIM scheme, then the bootstrap messages will need to be flooded
   throughout the backbone. Suppose a PE receives a Bootstrap Message
   from a CE, and the interface to the CE is the RPF interface to the
   source of the Bootstrap Message. Then the PE router must flood the
   Bootstrap Message to all its P router PIM neighbors.

   However, the PE should first modify the Bootstrap Message as
   follows:

   - It should replace the source address with its own address.

   - The original source address of the Bootstrap Message must be
     modified to be a VPN-IP address, and placed in a newly defined
     "origin" field within the Bootstrap Message. The VPN-IP address
     is formed by using the RD which is used when the route to the
     origin is exported.

   We can call these "VPN-IP PIM Bootstrap Messages".

   P routers that receive VPN-IP PIM Bootstrap Messages must flood them
   normally, but should not maintain the RP/Group mappings from these
   messages.
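
   The rewrite that a PE applies before flooding a Bootstrap Message
   into the backbone might be sketched as follows. This document does
   not define an encoding for the "origin" field, so the
   BootstrapMessage type, its field names, and the (RD, address) tuple
   used for the origin are all hypothetical. The RPF check and the
   flooding itself are not modelled.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class BootstrapMessage:
    """Simplified Bootstrap Message (hypothetical representation)."""
    source: str                                # source address of the message
    rp_mappings: Tuple[Tuple[str, str], ...]   # (group range, RP address) pairs
    # "origin" field: (RD, original source address); None in an
    # ordinary PIM Bootstrap Message received from a CE.
    origin: Optional[Tuple[str, str]] = None


def to_vpn_ip_bootstrap(msg: BootstrapMessage, pe_address: str,
                        export_rd: str) -> BootstrapMessage:
    """Turn a CE's Bootstrap Message into a VPN-IP PIM Bootstrap Message.

    The PE substitutes its own address as the source, and records the
    original source as a VPN-IP address (using the RD with which the
    route to the origin is exported) in the origin field.
    """
    return replace(msg, source=pe_address, origin=(export_rd, msg.source))
```

   The RP/Group mappings are carried through unchanged, which is
   consistent with P routers flooding the message without maintaining
   those mappings.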

   When a PE router receives a VPN-IP PIM Bootstrap Message on the RPF
   interface to the message's source address, then in addition to
   forwarding it as necessary to other backbone routers, it extracts
   the origin field, and checks to see if that VPN-IP address (or less
   specific prefix) has been imported into one or more of its VRFs. If
   so, it translates the message back into the original PIM Bootstrap
   Message and forwards it to the CEs associated with those VRFs.

4. Multicast Domains Using PIM NBMA Techniques

   In the solution of section 2, the PEs in a common Multicast Domain
   are attached to a common Multicast Tunnel, which is treated as a
   LAN-like interface, and which is instantiated as one or more
   multicast distribution trees. It is possible instead to think of the
   Multicast Tunnel as an NBMA-like interface. Then one doesn't need to
   instantiate the tunnel as a multicast distribution tree at all.
   Rather, C-multicast packets are simply unicast (tunneled) from one
   PE router to the other PE routers which need to receive those
   packets.

   This has the advantages of keeping all multicast routing state out
   of the P routers, and of not delivering multicast traffic to any PE
   routers that don't need to receive it. It has the disadvantage of
   requiring the transmitting PE router to replicate the multicast
   packets, along with the consequent disadvantage of sending more
   packets through the core. While multicast routers must always be
   able to replicate packets, generally the number of replicas that
   need to be created is bounded by the number of outgoing interfaces;
   in this case, it would be bounded only by the number of other PE
   routers containing sites in the same VPN. So the characteristics of
   this solution seem unfavorable.

   This solution could be implemented with a two-layer MPLS stack, very
   similar to the handling of unicast.
   Each PE router would distribute, via BGP, a list of the Multicast
   Domains in which it has VRFs, along with an MPLS label for each one.
   This would enable the PEs in a common Multicast Domain to
   auto-discover each other, as well as providing the bottom label of
   the two-label MPLS label stack.

5. Summary for Sub-IP Area

   The base specification for RFC2547 VPNs, i.e.,
   draft-rosen-rfc2547bis-03.txt, does not specify the procedures
   necessary for a service provider to support multicast within the
   VPNs that it provides to its customers. That specification is
   contained in this document, along with discussion of some
   alternative procedures.

5.1. Related Documents

   draft-ietf-ppvpn-requirements-00.txt
   draft-ietf-ppvpn-framework-00.txt
   draft-rosen-rfc2547bis-03.txt
   draft-rosen-vpns-ospf-bgp-mpls-02.txt
   draft-rosen-ppvpn-ipsec-2547-00.txt
   draft-farinacci-mpls-multicast-03.txt

5.2. Where it Fits in the Picture of the Sub-IP Work

   This work fits squarely in the PPVPN box.

5.3. Why is it Targeted at this WG

   It addresses the following work item from the PPVPN WG charter:

      The working group is expected to consider at least three
      specific approaches, including BGP-VPNs (e.g. RFC 2547).

   It extends the base specification for RFC2547 VPNs by adding
   procedures to enable multicasts used within the VPN to traverse the
   backbone.

5.4. Justification

   The WG should consider this document as it specifically addresses
   one of the work items called out in the charter.

6. Intellectual Property Considerations

   Cisco Systems may seek patent or other intellectual property
   protection for some or all of the technologies disclosed in this
   document. If any standards arising from this document are or become
   protected by one or more patents assigned to Cisco Systems, Cisco
   intends to disclose those patents and license them on reasonable and
   non-discriminatory terms.

7. Acknowledgments

   The authors wish to thank Tony Speakman and Ted Qian for their help
   and their ideas.

8. References

   [ADMIN-ADDR] "Administratively Scoped IP Multicast", Meyer, July
   1998, RFC 2365

   [BIDIR] "Bi-directional Protocol Independent Multicast", Handley,
   Kouvelas, Speakman, Vicisano, June 2002,

   [MPLS-PIM] "Using PIM to Distribute MPLS Labels for Multicast
   Routes", Farinacci, Rekhter, Rosen, Qian, November 2000,

   [PIMv2] "Protocol Independent Multicast - Sparse Mode (PIM-SM)",
   Fenner, Handley, Holbrook, Kouvelas, March 2002,
   draft-ietf-pim-sm-v2-new-05.txt

   [RFC2547bis] "BGP/MPLS VPNs", Rosen, et al., July 2002,
   draft-ietf-ppvpn-rfc2547bis-02.txt

9. Authors' Addresses

   Yiqun Cai
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: ycai@cisco.com

   Dino Farinacci
   Procket Networks, Inc.
   3850 North First Street
   San Jose, CA, 95134
   E-mail: dino@procket.com

   Yakov Rekhter
   Juniper Networks
   1194 N. Mathilda Avenue
   Sunnyvale, CA 94089
   E-mail: yakov@juniper.net

   Eric C. Rosen
   Cisco Systems, Inc.
   250 Apollo Drive
   Chelmsford, MA, 01824
   E-mail: erosen@cisco.com

   Dan Tappan
   Cisco Systems, Inc.
   250 Apollo Drive
   Chelmsford, MA, 01824
   E-mail: tappan@cisco.com

   IJsbrand Wijnands
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: ice@cisco.com