Network Working Group                                      Eric C. Rosen
Internet Draft                                                 Yiqun Cai
Expiration Date: October 2003                                 Dan Tappan
                                                       IJsbrand Wijnands
                                                     Cisco Systems, Inc.

                                                           Yakov Rekhter
                                                  Juniper Networks, Inc.

                                                          Dino Farinacci
                                                  Procket Networks, Inc.

                                                              April 2003


                       Multicast in MPLS/BGP VPNs

                      draft-rosen-vpn-mcast-05.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.
   Note that other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   [RFC2547bis] describes a method of providing a VPN service. It
   specifies the protocols and procedures which must be implemented in
   order for a Service Provider to provide a unicast VPN. This document
   extends that specification by describing the protocols and procedures
   which a Service Provider must implement in order to support multicast
   traffic in a VPN, assuming that PIM [PIMv2] is the multicast routing
   protocol used within the VPN, and that the SP network can provide PIM
   as well.

Table of Contents

    1      Introduction
    2      Multicast Domains
    2.1    Multicast VRFs
    2.2    Multicast Tunnels
    2.3    PIM across the MD
    2.4    RPF Determination
    2.5    Avoiding Conflict with Internet Multicast
    2.6    Dense Mode
    2.7    Forwarding
    2.8    Scalability
    2.9    Increasing the Optimality
    2.10   Inter-Provider Considerations
    3      VPN-IP PIM-SM
    3.1    Multicast VRFs
    3.2    Use of VPN-IP addresses in PIM
    3.3    Forwarding
    3.4    Associating VPN-IP PIM Messages with VRFs
    3.5    The RPF Hint
    3.6    When a PE Sends a PIM Message to the Backbone
    3.7    PIM Bootstrap Messages
    4      Multicast Domains Using PIM NBMA Techniques
    5      Intellectual Property Considerations
    6      Acknowledgments
    7      References
    8      Authors' Addresses

1. Introduction

   [RFC2547bis] describes a method of providing a VPN service. It
   specifies the protocols and procedures which must be implemented in
   order for a Service Provider to provide a unicast VPN. This document
   extends that specification by describing the protocols and procedures
   which a Service Provider must implement in order to support multicast
   traffic in a VPN, assuming that PIM [PIMv2] is the multicast routing
   protocol used within the VPN, and that the SP network can provide PIM
   as well. Familiarity with the terminology and procedures of
   [RFC2547bis] is presupposed. Familiarity with [PIMv2] is also
   presupposed.

   The discussion here must not be confused with discussions elsewhere
   of Internet multicast. What we are considering here is primarily
   Enterprise multicast; our goal is to allow an Enterprise which has a
   VPN service as defined in [RFC2547bis] to implement Enterprise
   multicasts using PIM-SM or PIM-DM.

   VPNs which are constructed according to [RFC2547bis] obtain optimal
   unicast routing through the SP backbone, even though:

     - the P routers do not maintain any routing information for the
       VPNs, or indeed, any per-VPN state at all, and

     - the CE routers at different sites do not maintain routing
       adjacencies with each other.

   Unfortunately, one cannot do quite so well with multicast routing.
   For optimal multicast routing, when a PE router receives a multicast
   data packet of a particular multicast group from a CE router, the
   packet must get to every other PE router which is on the path to a
   receiver of that group. It must not get to any other PEs. And it
   must not be unnecessarily replicated. Optimal routing requires a
   source-tree for the multicast group, which would mean that the P
   routers would have to maintain state for each transmitter of each
   multicast group in each VPN.

   While this would provide optimal multicast routing, it also requires
   an unbounded amount of state in the P routers, since the SP has no
   control whatsoever of the number of multicast groups in the VPNs that
   it supports. Nor does it have any control over the number of
   transmitters in each group, nor of the distribution of the receivers.

   In short, true optimal routing of VPN multicasts in the SP network
   does not appear to be scalable. For completeness, we do specify, in
   section 3 ("VPN-IP PIM-SM"), how one could provide true optimal
   routing of Sparse Mode VPN multicasts in the SP network. While we do
   not propose to adopt this solution, it is instructive to compare it
   with the solution we do propose in section 2.

   We also include, in section 4, a very brief description of a scheme
   which does not require any multicast routing state at all to be kept
   in the P routers, but in which all the replication is done in the PE
   routers.

   If we are willing to send multicasts along paths on which there are
   no receivers, then it is possible to support VPN multicasts, using
   exactly one multicast distribution tree for each VPN, and without
   requiring that all replication be done by the PE routers. If more
   than one site in a VPN may have multicast transmitters, it is best
   for this single tree to be a bidirectional tree [BIDIR]. (In
   environments, such as an ATM-LSR backbone, where bidirectional trees
   cannot be supported, a single shared tree can be used.)

   We describe in section 2 the procedures and protocols used to
   implement this solution, which we dub "Multicast Domains". It is
   this procedure which we propose for adoption.

   In unicast routing, a CE router is an adjacency of a PE router, and
   CE routers at different sites do NOT become adjacencies of each
   other. We retain this characteristic for multicast routing -- a CE
   router becomes a PIM adjacency of a PE router, and CE routers at
   different sites do NOT become adjacencies of each other.

   An Enterprise which uses PIM multicasting in its network before
   adopting the VPN service can transition to the VPN service while
   continuing to use whatever multicast router configurations it was
   previously using; no changes need be made to CE routers or to other
   routers at customer sites. Any dynamic RP-discovery procedures that
   are already in use may be left in place.

   The notion of a "VRF", defined in [RFC2547bis], is extended to
   include multicast routing entries as well as unicast routing entries.

2. Multicast Domains

   In this section, we describe the solution we are proposing for
   adoption.

   A "Multicast Domain" is essentially a set of VRFs associated with
   interfaces that can send multicast traffic to each other.

2.1. Multicast VRFs

   Each VRF has its own multicast routing table.
   When a multicast data or control packet is received from a
   particular CE device, multicast routing is done in the associated
   VRF.

   Each PE router runs a number of instances of PIM-SM, as many as one
   per VRF. In each instance of PIM-SM, the PE maintains a PIM
   adjacency with each of the PIM-capable CE routers associated with
   that VRF. The multicast routing table created by each instance is
   specific to the corresponding VRF. We will refer to these PIM
   instances as "VPN-specific PIM instances".

   Each PE router also runs a "provider-wide" instance of PIM-SM, in
   which it has a PIM adjacency with each of its IGP neighbors (i.e.,
   with P routers), but NOT with any CE routers, and not with other PE
   routers (unless they happen to be adjacent in the SP's network).

   In order to help clarify when we are speaking of the provider-wide
   PIM instance and when we are speaking of a VPN-specific PIM instance,
   we will use the prefixes "P-" and "C-" respectively. Thus a P-Join
   would be a PIM Join which is processed by the provider-wide PIM
   instance, and a C-Join would be a PIM Join which is processed by a
   VPN-specific PIM instance. A P-group address would be a group
   address in the SP's address space, and a C-group address would be a
   group address in a VPN's address space.

2.2. Multicast Tunnels

   Each multicast VRF is assigned to one or more multicast domains.
   Each such MD is configured with a multicast P-group address. As
   part of the configuration of the provider-wide PIM instance, an RP
   address (in the address space of the P network) is configured for
   each such P-group address. (Or the RP addresses may be discovered by
   any other acceptable procedure, such as PIM Bootstrap messages.)

   Each MD has a distinct P-group address. For each MD, a "Multicast
   Tunnel" (MT) is created in the provider-wide PIM instance, using
   ordinary PIM-SM techniques.
   The various PEs in the MD discover each other by joining the shared
   tree rooted at the RP. For best scalability, this should be a
   bidirectional tree [BIDIR].

   (Strictly speaking, the scheme works even if the MTs are realized by
   PIM source trees. However, this could result in large numbers of
   multicast distribution trees per MD, which would severely reduce the
   scalability of the scheme.)

   The MT is used to carry multicast C-packets, both data and control
   packets, among the PE routers in a common MD.

   To send a packet through an MT the packet must of course be
   encapsulated. This could be done either with MPLS or with GRE. If
   it is done with MPLS, then the "MPLS label distribution via PIM"
   procedures [MPLS-PIM] must be supported.

   When a packet is received from an MT, the receiving PE must be able
   to determine the MT (and hence the MD) from which the packet was
   received. (In the case of MPLS encapsulation, this will be
   determined from the incoming MPLS label; penultimate hop popping must
   not be performed.) The packet is then passed to the corresponding
   Multicast VRF and VPN-specific PIM instance for further processing.

2.3. PIM across the MD

   If a particular VRF is in a particular MD, the corresponding MT is
   treated by that VRF's VPN-specific PIM instance as a LAN interface.
   The PEs which are adjacent on the MT must execute the PIM LAN
   procedures, including the generation and processing of Assert
   packets. This allows VPN-specific PIM routes to be extended from
   site to site, without appearing in the P routers.

   If a PE in a particular MD transmits a C-multicast data packet to the
   backbone, by transmitting it through an MT, every other PE in that MD
   will receive it.
   Any of those PEs which are not on a C-multicast distribution tree
   for the packet's C-multicast destination address (as determined by
   applying ordinary PIM procedures to the corresponding multicast VRF)
   will have to discard the packet.

2.4. RPF Determination

   Although the MT is treated as a PIM-enabled interface, unicast
   routing is NOT run over it, and there are no unicast routing
   adjacencies over it.

   If a VRF is in a single MD:

     - a C-packet received over an MT is considered to pass the RPF
       check if the IGP next hop to its source address, according to the
       associated VRF, is not one of the interfaces associated with that
       VRF;

     - a C-Join/Prune message from a CE router needs to be forwarded
       over the MT if the next hop interface to the root of the
       corresponding multicast tree is not one of the interfaces
       associated with that VRF.

   If a VRF is in more than one MD, then the PE must be able to
   determine which MT is the RPF for a particular C-address. This can
   be done by means of BGP Extended Community attributes. Each MD can
   be associated with a BGP Extended Community attribute, into which the
   MD's group address is encoded. When a unicast VPN-IP address is
   distributed from a VRF which is in an MD, the address can carry
   Extended Community attributes which identify the MDs that the VRF
   belongs to. Then the PE can find the MT which is the RPF for a given
   address by looking at the Extended Community attributes of the
   corresponding route.

   The above specifies how to determine the RPF interface. To determine
   the RPF neighbor for a particular C-address, we need to first
   determine the BGP next hop for the corresponding VPN-IP address, then
   verify that the BGP next hop is a PIM neighbor on the RPF interface.

2.5. Avoiding Conflict with Internet Multicast

   If the SP is providing Internet multicast, distinct from its VPN
   multicast services, it must ensure that the P-group addresses which
   correspond to its MDs are distinct from any of the group addresses of
   the Internet multicasts it supports. This is best done by using
   administratively scoped addresses [ADMIN-ADDR].

   The C-group addresses need not be distinct from either the P-group
   addresses or the Internet multicast addresses.

2.6. Dense Mode

   Dense mode multicasts via PIM-DM are easily supported using MDs. The
   MT is still created using PIM-SM, and the PEs simply use PIM-DM
   procedures as necessary when transmitting C-data and C-control
   packets across the MT. Thus an Enterprise which uses dense mode
   multicasting can use the VPN service without changing its native
   multicasting techniques. The P routers are not aware of whether the
   Enterprise is using dense mode or sparse mode.

2.7. Forwarding

   The P routers will not be able to tell, from the contents of the
   C-packet as sent from CE to PE, which MT the packet should be sent
   along. Therefore the packets need to be encapsulated.

   If MPLS multicast [MPLS-PIM] is supported, then MPLS can be used for
   the encapsulation. This would require only a single MPLS label.
   Penultimate hop popping would not be used (otherwise the egress PE
   could not tell which MD the packet belongs to).

   Other encapsulations are also possible. For example, one could use a
   GRE encapsulation, with the MD's P-group address appearing in the IP
   destination address field. In this case, the SP must filter, at the
   edges of its network, all non-VPN packets carrying any of these
   P-group addresses in their destination address fields.
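   The edge filtering required for the GRE case can be sketched as
   follows. This is only an illustration under stated assumptions: the
   P-group addresses, the function name drop_at_edge, and the flag
   is_vpn_tunnel_packet (standing in for however an implementation
   recognizes its own MT traffic) are hypothetical and are not defined
   by this document.

```python
import ipaddress

# Hypothetical P-group addresses configured for the SP's MDs, taken here
# from the administratively scoped range 239.0.0.0/8 (an assumption).
MD_P_GROUPS = frozenset(
    ipaddress.ip_address(a) for a in ("239.1.1.1", "239.1.1.2")
)


def drop_at_edge(dst_addr: str, is_vpn_tunnel_packet: bool) -> bool:
    """Return True if an edge router should discard the packet.

    A non-VPN packet is discarded when its IP destination address is one
    of the P-group addresses reserved for Multicast Domains.
    """
    if is_vpn_tunnel_packet:
        # Packets that belong to an MT are forwarded normally.
        return False
    return ipaddress.ip_address(dst_addr) in MD_P_GROUPS
```

   With this rule, a packet arriving from outside the SP network cannot
   be injected into an MT merely by carrying an MD's P-group address in
   its destination address field.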

   If a PIM shared tree (RP-tree) is being used, rather than a bidir
   tree, and if MPLS encapsulation is being used, then Register packets
   must themselves be encapsulated in GRE before being encapsulated in
   MPLS. This is necessary in order to carry the MT's P-group address
   corresponding to the RP. Note that the RP cannot remove the GRE
   header before forwarding the packet, since the RP has no way of
   knowing that a particular packet is a tunneled VPN multicast packet,
   rather than an "ordinary" multicast. As a result, the GRE header
   would have to be used for all tunneled VPN multicast packets carried
   within MPLS, even if those packets are sent down a source tree.

2.8. Scalability

   While this procedure requires the P routers to maintain multicast
   state, the amount of state is bounded by the number of supported
   VPNs. The P routers do NOT run any VPN-specific PIM instances.

   The multicast routing provided by this scheme is not optimal, in that
   a packet of a particular multicast group may be forwarded to PE
   routers which have no downstream receivers for that group, and hence
   which may need to discard the packet.

   The use of a single bidirectional tree per VPN scales well as the
   number of transmitters and receivers increases, but not so well as
   the amount of multicast traffic per VPN increases.

2.9. Increasing the Optimality

   Suppose that for each MT we create not one, but a number of multicast
   distribution trees. One of these trees, the default tree, is joined
   by all PEs in the MD. PIM control messages from the CEs are
   forwarded along the default tree. However, multicast data messages
   are mapped to particular distribution trees depending on the source
   and group addresses that appear in them.
   The assignment of an (S,G) pair (or, in our terminology, a
   (C-S, C-G) pair) to a particular distribution tree would be done by
   the PE which receives the data from a CE (i.e., by the transmitting
   side), and indicated to the other PEs by a special PIM message sent
   on the default distribution tree.

   While every PE in an MD joins the default distribution tree for the
   corresponding MT, a PE does not join a non-default distribution tree
   unless it is connected to a VPN site which needs to receive traffic
   from a group which has been assigned to that tree.

   If it were known that certain C-groups have receivers at many VPN
   sites, but others have receivers only at a few VPN sites, the former
   could be mapped to the default tree, and the latter could be mapped
   to one or more non-default distribution trees. This could
   significantly reduce the amount of multicast data traffic that gets
   sent to PEs that do not need to receive it.

   Another, perhaps more feasible, approach is to keep all the low
   throughput groups on the default distribution tree, and to distribute
   the high throughput groups among the other distribution trees.

   Of course, any scheme like this requires still more state in the P
   routers, so again presents a trade-off between state and optimality.

2.10. Inter-Provider Considerations

   If there are multi-provider VPNs which require multicast, then an MD
   will cross provider boundaries. The multicast group address
   associated with the MT must then be agreed upon by the providers.

   [RFC2547bis] describes three methods for creating inter-provider
   VPNs:

      1. VRF-to-VRF connections at the AS border routers.

      2. EBGP redistribution of labeled VPN-IP routes from AS to
         neighboring AS.

      3. Multihop EBGP redistribution of labeled VPN-IP routes between
         source and destination ASes, with EBGP redistribution of
         labeled IP routes from AS to neighboring AS.

   The use of MDs for interprovider VPN multicast is compatible with
   methods 1 and 3, but not with method 2.

3. VPN-IP PIM-SM

   In this section, we present for completeness' sake a solution that,
   while it provides optimal multicast routing, we must deprecate due to
   its scalability problems.

   In this solution, PIM-SM is used to extend an (S,G) or (*,G)
   multicast distribution tree from a set of customer sites, through the
   SP backbone, to another set of customer sites.

   This solution must solve the following basic problems:

     - It extends the multicast routing of a VPN into the backbone,
       despite the facts that:

          * the unicast routing of that VPN is NOT extended into the
            backbone, and

          * PIM-SM assumes the presence of the unicast routing in order
            to determine the RPF interface for a multicast distribution
            tree.

     - It ensures that multicast routing entries of different VPNs are
       kept distinct in the backbone, even if the IP addresses
       corresponding to the respective S and G values of the (S,G)
       entries are not unique across VPNs.

     - It ensures that when a PE router receives a PIM Join/Prune
       message from the backbone, it associates that message with the
       proper VRF.

     - It properly handles PIM-SM Bootstrap Messages, which must be
       flooded along an RPF-tree away from the unicast route to the
       origin of the message.

3.1. Multicast VRFs

   Each PE router runs a number of instances of PIM-SM, as many as one
   per VRF. In each instance of PIM-SM, the PE maintains a PIM
   adjacency with each of the PIM-capable CE routers associated with
   that VRF. The multicast routing table created by each instance is
   specific to the corresponding VRF.

   Each PE router also runs a "provider-wide" instance of PIM-SM, in
   which it has a PIM adjacency with each of its IGP neighbors.
   Multicast routing data from a particular VRF is leaked into the
   provider-wide multicast routing table, but the address of the root of
   each multicast tree is first translated from an IP address to a
   VPN-IP address.

3.2. Use of VPN-IP addresses in PIM

   When multicast routing entries are distributed from a VRF to the
   provider-wide routing table, they are modified as follows. Consider
   an (S,G) entry in a VRF multicast routing table. This entry can only
   exist if there is a route to S in the corresponding unicast VRF. If
   there is a route to S in the unicast VRF, it will correspond to a
   VPN-IP route RD:S (where RD is the Route Distinguisher, see
   [RFC2547bis]).

   So the (S,G) entry in the multicast VRF becomes an (RD:S,G) entry in
   the provider-wide multicast routing table. This distinguishes the
   multicast distribution tree from any other (S,G) tree which comes
   from a different VPN.

   In the case of a shared tree, a (*,G) entry becomes an (RD:*,G)
   entry, where RD is the Route Distinguisher of the VPN-IP address of
   the tree's RP.

   P routers have only provider-wide multicast routing tables.
   Join/Prune messages are sent between P routers and between P and PE
   routers using the PIMv2 address family extensions, which allow the
   VPN-IP addresses to be encoded.

   In short, if two trees have the same value of G, but are in different
   VPNs, they are distinguished by means of the RD of the root of the
   tree.

3.3. Forwarding

   The contents of the IP header of a multicast packet are insufficient
   to determine the multicast tree that a particular packet is traveling
   on. So the packets must be encapsulated while being forwarded
   through the backbone, where the encapsulation can be used to uniquely
   associate each packet with an (RD:S,G) or (RD:*,G) entry. This is
   best done using MPLS multicast labels, and the MPLS label
   distribution technique specified in [MPLS-PIM].
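   The translation of section 3.2 can be sketched as follows. This is a
   minimal illustration, not a specified encoding: the VrfEntry type,
   the function to_provider_entry, and the textual "RD:address" form are
   assumptions made for readability. The RD passed in is assumed to be
   that of the VPN-IP route to S for a source tree, or that of the
   VPN-IP address of the RP for a shared tree.

```python
from typing import NamedTuple, Optional, Tuple


class VrfEntry(NamedTuple):
    """A multicast routing entry in a VRF; source is None for (*,G)."""
    source: Optional[str]
    group: str


def to_provider_entry(entry: VrfEntry, rd_of_root: str) -> Tuple[str, str]:
    """Map a VRF (S,G) or (*,G) entry to its provider-wide form.

    (S,G) becomes (RD:S,G); (*,G) becomes (RD:*,G).
    """
    root = entry.source if entry.source is not None else "*"
    return (f"{rd_of_root}:{root}", entry.group)


# Two VPNs using the same S and G yield distinct provider-wide entries.
vpn_a = to_provider_entry(VrfEntry("10.0.0.1", "232.1.1.1"), "100:1")
vpn_b = to_provider_entry(VrfEntry("10.0.0.1", "232.1.1.1"), "100:2")
shared = to_provider_entry(VrfEntry(None, "232.1.1.1"), "100:1")
```

   Because the RD is part of the key, overlapping customer address
   spaces do not collide in the provider-wide table.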

   Whereas unicast VPN packets generally carry two MPLS labels,
   multicast VPN packets would carry only one. When the egress PE
   receives a labeled multicast packet from the backbone, the top label
   tells it which CEs to send the packet to after the label is popped.

3.4. Associating VPN-IP PIM Messages with VRFs

   How does a PE, when it receives a PIM message from the backbone,
   associate it with a particular VRF?

   If the PIM message references a source tree, then the VPN-IP address
   of the source (RD:S) is in the PIM message. The PE finds the VRF
   from which the route to RD:S was exported, and associates the PIM
   message with the (S,G) entry of that VRF.

   If the PIM message references a shared tree, then the entries in
   the join and/or prune lists will each have an RD. The PE looks
   for a (*,G) entry whose RP has that RD. The presence of more than
   one such entry is considered a configuration error.

3.5. The RPF Hint

   When a P router receives a PIM Join/Prune message corresponding to a
   VPN-IP PIM message, it must be able to determine the IGP next hop
   towards the root of the specified multicast distribution tree.

   In ordinary PIM, determination of the next hop is easily done, since
   the IP address of the root of each multicast tree is known, and the
   backbone routers know the unicast route towards the root of the tree.
   However, in the case of a BGP/MPLS VPN the root of the multicast
   distribution tree will be within a VPN, and hence will not have a
   unique IP address. VPN-IP PIM must therefore use the VPN-IP address
   of the root of the multicast distribution tree. But the backbone
   routers do not know unicast routes for VPN-IP addresses.

   To solve this, the PE router, before sending a PIM Join/Prune
   message to a backbone router, must insert the address of its BGP
   next hop towards the root of the tree. Call this the RPF hint.
   This is generally the address of the PE router which attaches to the
   site containing the root. Backbone routers always have IGP routes
   to the PE routers' BGP next hops, and the IGP next hop towards the
   root of a tree is always the same as the IGP next hop towards the PE
   router which attaches to the site containing the root.

   When a P router receives one of these Join/Prune messages,
   instead of looking up the IGP next hop to the root of the
   specified multicast distribution tree, it looks up the IGP next hop
   of the RPF hint.

   The RPF hint is not an essential part of the identification of the
   multicast distribution tree. A change in the value of the RPF hint is
   regarded simply as an RPF change, which changes the shape of the
   tree, but which does not necessarily require construction of an
   entirely new tree.

3.6. When a PE Sends a PIM Message to the Backbone

   A PE router only sends a VPN-IP PIM Join to the backbone if it
   receives a PIM Join from a CE router. That Join will contain the IP
   address of the root of a particular multicast distribution tree. The
   PE looks up the unicast route to this address in the VRF associated
   with the CE. If this route exists in the VRF as a result of a VPN-IP
   route's having been imported from BGP, the corresponding VPN-IP route
   is identified. The VPN-IP address of the root of the tree can then be
   formed by appending its IP address to the RD of that VPN-IP route.

   This VPN-IP address, rather than the IP address, will be placed in
   the VPN-IP PIM Join List. (This applies in the case where the WC bit
   and the RPT bit are both 1, as well as the case where they are both
   0.)

   With respect to the (S,G) state maintained by a PE router, the "S"
   will be a VPN-IP address rather than an IP address.
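   The procedure by which a PE forms the root address and the RPF hint
   for a backbone Join can be sketched as follows. This is a sketch
   under stated assumptions: the VrfRoute record, the dictionary used to
   represent a Join, and the rule of returning None when no imported
   VPN-IP route matches are all hypothetical simplifications, not
   message formats defined by this document.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class VrfRoute:
    """A unicast route in a VRF (hypothetical representation)."""
    prefix: str
    rd: Optional[str]            # RD if imported from BGP, else None
    bgp_next_hop: Optional[str]  # set only for routes imported from BGP


def lookup(vrf: List[VrfRoute], addr: str) -> Optional[VrfRoute]:
    """Longest-prefix match of addr against the routes of one VRF."""
    ip = ipaddress.ip_address(addr)
    matches = [r for r in vrf if ip in ipaddress.ip_network(r.prefix)]
    return max(
        matches,
        key=lambda r: ipaddress.ip_network(r.prefix).prefixlen,
        default=None,
    )


def build_backbone_join(
    vrf: List[VrfRoute], root: str, group: str
) -> Optional[Dict[str, Optional[str]]]:
    """Form the VPN-IP root (RD + IP address) and the RPF hint.

    Returns None when the route to the root was not imported from BGP,
    in which case no VPN-IP PIM Join is sent to the backbone.
    """
    route = lookup(vrf, root)
    if route is None or route.rd is None:
        return None
    return {
        "root": f"{route.rd}:{root}",
        "group": group,
        "rpf_hint": route.bgp_next_hop,
    }
```

   A P router receiving such a Join would then look up the IGP next hop
   of the rpf_hint value, since it has no unicast route for the VPN-IP
   root itself.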

   With respect to the (*,G) state maintained by a PE router, the
   address of the RP corresponding to the (*,G) tree will be maintained
   as a VPN-IP address.

   Similarly, if a CE prunes itself from a tree, and as a result the PE
   must prune itself from the tree, the VPN-IP address of the root of
   the tree will appear in the Prune List of the VPN-IP PIM Join/Prune
   message sent to the backbone. (This applies in the case where the WC
   bit and the RPT bit are both 1, as well as the case where they are
   both 0.)

   If a PE must prune a particular source from a (*,G) tree whose input
   interface leads to the backbone, then the prune list in the VPN-IP
   PIM Join/Prune message will contain a VPN-IP address whose RD is
   taken from the VPN-IP address of the (*,G) tree's RP, and whose IP
   address part is the IP address of the source being pruned. (This
   applies in the case where the WC bit is 0 and the RPT bit is 1.)

   When a router receives a VPN-IP PIM Join/Prune message which requests
   that a VPN-IP source be pruned off a shared tree, it identifies the
   shared tree by looking for a (*,G) entry with the specified value of
   G, and whose RP has a VPN-IP address with the specified RD.

   Note that all the sources transmitting to a particular group
   MUST have mutually unique source addresses, so it is not necessary
   to use an RD to identify the source when pruning the source off the
   shared tree. (Of course it is necessary to use an RD to identify
   the source when operating on a source tree.) It is however
   necessary to use an RD to identify the RP that corresponds to the
   shared tree.

   Of course, a PE router may receive a PIM Join/Prune from a CE
   router, and find that the RPF leads to a directly attached CE
   router, rather than to a PE router. In this case, an "ordinary" PIM
   Join/Prune message is just sent to the CE router.

   If multicast is being done by a multi-provider VPN, the VPN-IP PIM
   messages have to be processed and forwarded by the BGP border
   routers. Further, the RPF hint put in the VPN-IP PIM message by the
   ingress PE will be the address of a border router, rather than the
   address of the egress PE. Thus a border router processing a VPN-IP
   PIM message has to replace the RPF hint with the address of its own
   BGP next hop towards the VPN-IP address of the root of the multicast
   distribution tree which the VPN-IP PIM message references.

3.7. PIM Bootstrap Messages

   If a particular VPN uses PIM Bootstrap Messages to do auto-discovery
   of RPs, and the SP is providing VPN multicast service via the VPN-IP
   PIM scheme, then the bootstrap messages will need to be flooded
   throughout the backbone. Suppose a PE receives a Bootstrap Message
   from a CE, and the interface to the CE is the RPF interface to the
   source of the Bootstrap Message. Then the PE router must flood the
   Bootstrap Message to all its P router PIM neighbors.

   However, the PE should first modify the Bootstrap Message as follows:

     - It should replace the source address with its own address.

     - The original source address of the Bootstrap Message must be
       modified to be a VPN-IP address, and placed in a newly defined
       "origin" field within the Bootstrap Message. The VPN-IP address
       is formed by using the RD which is used when the route to the
       origin is exported.

   We can call these "VPN-IP PIM Bootstrap Messages".

   P routers that receive VPN-IP PIM Bootstrap Messages must flood them
   normally, but should not maintain the RP/Group mappings from these
   messages.
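
   The two modifications that the PE makes before flooding can be
   sketched as follows. This is only an illustration: the draft does
   not define an encoding for the "origin" field, so the record below
   is not a wire format, and the names BootstrapMessage, rp_set and
   to_vpn_ip_bootstrap are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

VpnIpAddress = Tuple[str, str]  # (RD, IP address); illustrative only


@dataclass(frozen=True)
class BootstrapMessage:
    source: str                            # source address of the message
    rp_set: Tuple[str, ...] = ()           # opaque stand-in for the contents
    origin: Optional[VpnIpAddress] = None  # the newly defined "origin" field


def to_vpn_ip_bootstrap(msg: BootstrapMessage, pe_address: str,
                        export_rd: str) -> BootstrapMessage:
    """Rewrite a Bootstrap Message received from a CE before the PE
    floods it to its P router PIM neighbors."""
    return replace(
        msg,
        source=pe_address,               # the PE's own address
        origin=(export_rd, msg.source),  # original source as a VPN-IP address
    )


from_ce = BootstrapMessage(source="10.1.1.1", rp_set=("10.1.1.9",))
flooded = to_vpn_ip_bootstrap(from_ce, pe_address="192.0.2.1",
                              export_rd="65000:1")
print(flooded.source, flooded.origin)
```

   The remaining contents of the message are carried through
   unchanged in this sketch; only the source address and the origin
   field differ from the message received from the CE.
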

   When a PE router receives a VPN-IP PIM Bootstrap Message on the RPF
   interface to the message's source address, then in addition to
   forwarding it as necessary to other backbone routers, it extracts the
   origin field, and checks to see if that VPN-IP address (or less
   specific prefix) has been imported into one or more of its VRFs. If
   so, it translates the message back into the original PIM Bootstrap
   Message and forwards it to the CEs associated with those VRFs.

4. Multicast Domains Using PIM NBMA Techniques

   In the solution of section 2, the PEs in a common Multicast Domain
   are attached to a common Multicast Tunnel, which is treated as a
   LAN-like interface, and which is instantiated as one or more
   multicast distribution trees. It is possible instead to think of the
   Multicast Tunnel as an NBMA-like interface. Then one doesn't need to
   instantiate the tunnel as a multicast distribution tree at all.
   Rather, C-multicast packets are simply unicast (tunneled) from one PE
   router to the other PE routers which need to receive those packets.

   This has the advantages of keeping all multicast routing state out of
   the P routers, and of not delivering multicast traffic to any PE
   routers that don't need to receive it. It has the disadvantage of
   requiring the transmitting PE router to replicate the multicast
   packets, along with the consequent disadvantage of sending more
   packets through the core. While multicast routers must always be
   able to replicate packets, generally the number of replicas that need
   to be created is bounded by the number of outgoing interfaces; in
   this case, it would be bounded only by the number of other PE routers
   containing sites in the same VPN. So the characteristics of this
   solution seem unfavorable.

   This solution could be implemented with a two-layer MPLS stack, very
   similar to the handling of unicast.
   Each PE router would distribute, via BGP, a list of the Multicast
   Domains in which it has VRFs, along with an MPLS label for each one.
   This would enable the PEs in a common Multicast Domain to
   auto-discover each other, and would also provide the bottom label of
   the two-label MPLS stack.

5. Intellectual Property Considerations

   Cisco Systems may seek patent or other intellectual property
   protection for some or all of the technologies disclosed in this
   document. If any standards arising from this document are or become
   protected by one or more patents assigned to Cisco Systems, Cisco
   intends to disclose those patents and license them on reasonable and
   non-discriminatory terms.

6. Acknowledgments

   The authors wish to thank Tony Speakman and Ted Qian for their help
   and their ideas.

7. References

   [ADMIN-ADDR] "Administratively Scoped IP Multicast", Meyer, July
   1998, RFC 2365

   [BIDIR] "Bi-directional Protocol Independent Multicast", Handley,
   Kouvelas, Speakman, Vicisano, June 2002,

   [MPLS-PIM] "Using PIM to Distribute MPLS Labels for Multicast
   Routes", Farinacci, Rekhter, Rosen, Qian, November 2000,

   [PIMv2] "Protocol Independent Multicast - Sparse Mode (PIM-SM)",
   Fenner, Handley, Holbrook, Kouvelas, December 2002,
   draft-ietf-pim-sm-v2-new-06.txt

   [RFC2547bis] "BGP/MPLS VPNs", Rosen, et al., November 2002,
   draft-ietf-ppvpn-rfc2547bis-03.txt

8. Authors' Addresses

   Yiqun Cai
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: ycai@cisco.com

   Dino Farinacci
   Procket Networks, Inc.
   3850 North First Street
   San Jose, CA, 95134
   E-mail: dino@procket.com

   Yakov Rekhter
   Juniper Networks
   1194 N. Mathilda Avenue
   Sunnyvale, CA 94089
   E-mail: yakov@juniper.net

   Eric C. Rosen
   Cisco Systems, Inc.
   250 Apollo Drive
   Chelmsford, MA, 01824
   E-mail: erosen@cisco.com

   Dan Tappan
   Cisco Systems, Inc.
   250 Apollo Drive
   Chelmsford, MA, 01824
   E-mail: tappan@cisco.com

   IJsbrand Wijnands
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: ice@cisco.com