2 Network Working Group Eric C. Rosen (Editor) 3 Internet Draft Cisco Systems, Inc. 4 Intended Status: Standards Track 5 Expires: July 14, 2008 Rahul Aggarwal (Editor) 6 Juniper Networks 8 January 14, 2008 10 Multicast in MPLS/BGP IP VPNs 12 draft-ietf-l3vpn-2547bis-mcast-06.txt 14 Status of this Memo 16 By submitting this Internet-Draft, each author represents that any 17 applicable patent or other IPR claims of which he or she is aware 18 have been or will be disclosed, and any of which he or she becomes 19 aware will be disclosed, in accordance with Section 6 of BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF), its areas, and its working groups. Note that 23 other groups may also distribute working documents as Internet- 24 Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. 
It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt. 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 Abstract 39 In order for IP multicast traffic within a BGP/MPLS IP VPN (Virtual 40 Private Network) to travel from one VPN site to another, special 41 protocols and procedures must be implemented by the VPN Service 42 Provider. These protocols and procedures are specified in this 43 document. 45 Table of Contents 47 1 Specification of requirements ......................... 5 48 2 Introduction .......................................... 5 49 2.1 Optimality vs Scalability ............................. 5 50 2.1.1 Multicast Distribution Trees .......................... 7 51 2.1.2 Ingress Replication through Unicast Tunnels ........... 8 52 2.2 Overview .............................................. 8 53 2.2.1 Multicast Routing Adjacencies ......................... 8 54 2.2.2 MVPN Definition ....................................... 9 55 2.2.3 Auto-Discovery ........................................ 10 56 2.2.4 PE-PE Multicast Routing Information ................... 11 57 2.2.5 PE-PE Multicast Data Transmission ..................... 11 58 2.2.6 Inter-AS MVPNs ........................................ 12 59 2.2.7 Optionally Eliminating Shared Tree State .............. 12 60 3 Concepts and Framework ................................ 13 61 3.1 PE-CE Multicast Routing ............................... 13 62 3.2 P-Multicast Service Interfaces (PMSIs) ................ 14 63 3.2.1 Inclusive and Selective PMSIs ......................... 15 64 3.2.2 Tunnels Instantiating PMSIs ........................... 16 65 3.3 Use of PMSIs for Carrying Multicast Data .............. 18 66 3.3.1 MVPNs with MI-PMSIs ................................... 
18 67 3.3.2 When MI-PMSIs are Required ............................ 19 68 3.3.3 MVPNs That Do Not Use MI-PMSIs ........................ 19 69 3.4 PE-PE Transmission of C-Multicast Routing ............. 19 70 3.4.1 PIM Peering ........................................... 20 71 3.4.1.1 Full Per-MVPN PIM Peering Across a MI-PMSI ............ 20 72 3.4.1.2 Lightweight PIM Peering Across a MI-PMSI .............. 20 73 3.4.1.3 Unicasting of PIM C-Join/Prune Messages ............... 21 74 3.4.2 Using BGP to Carry C-Multicast Routing ................ 21 75 4 BGP-Based Autodiscovery of MVPN Membership ............ 22 76 5 PE-PE Transmission of C-Multicast Routing ............. 25 77 5.1 Selecting the Upstream Multicast Hop (UMH) ............ 25 78 5.1.1 Eligible Routes for UMH Selection ..................... 26 79 5.1.2 Information Carried by Eligible UMH Routes ............ 26 80 5.1.3 Selecting the Upstream PE ............................. 27 81 5.1.4 Selecting the Upstream Multicast Hop .................. 29 82 5.2 Details of Per-MVPN Full PIM Peering over MI-PMSI ..... 29 83 5.2.1 PIM C-Instance Control Packets ........................ 30 84 5.2.2 PIM C-instance RPF Determination ...................... 30 85 5.2.3 Backwards Compatibility ............................... 31 86 5.3 Use of BGP for Carrying C-Multicast Routing ........... 31 87 5.3.1 Sending BGP Updates ................................... 31 88 5.3.2 Explicit Tracking ..................................... 33 89 5.3.3 Withdrawing BGP Updates ............................... 33 90 6 I-PMSI Instantiation .................................. 33 91 6.1 MVPN Membership and Egress PE Auto-Discovery .......... 34 92 6.1.1 Auto-Discovery for Ingress Replication ................ 34 93 6.1.2 Auto-Discovery for P-Multicast Trees .................. 34 94 6.2 C-Multicast Routing Information Exchange .............. 35 95 6.3 Aggregation ........................................... 
35 96 6.3.1 Aggregate Tree Leaf Discovery ......................... 35 97 6.3.2 Aggregation Methodology ............................... 36 98 6.3.3 Encapsulation of the Aggregate Tree ................... 37 99 6.3.4 Demultiplexing C-multicast traffic .................... 37 100 6.4 Mapping Received Packets to MVPNs ..................... 38 101 6.4.1 Unicast Tunnels ....................................... 38 102 6.4.2 Non-Aggregated P-Multicast Trees ...................... 39 103 6.4.3 Aggregate P-Multicast Trees ........................... 39 104 6.5 I-PMSI Instantiation Using Ingress Replication ........ 40 105 6.6 Establishing P-Multicast Trees ........................ 41 106 6.7 RSVP-TE P2MP LSPs ..................................... 42 107 6.7.1 P2MP TE LSP Tunnel - MVPN Mapping ..................... 42 108 6.7.2 Demultiplexing C-Multicast Data Packets ............... 42 109 7 Optimizing Multicast Distribution via S-PMSIs ......... 43 110 7.1 S-PMSI Instantiation Using Ingress Replication ........ 44 111 7.2 Protocol for Switching to S-PMSIs ..................... 44 112 7.2.1 A UDP-based Protocol for Switching to S-PMSIs ......... 44 113 7.2.1.1 Binding a Stream to an S-PMSI ......................... 45 114 7.2.1.2 Packet Formats and Constants .......................... 46 115 7.2.2 A BGP-based Protocol for Switching to S-PMSIs ......... 48 116 7.2.2.1 Advertising C-(S, G) Binding to a S-PMSI using BGP .... 48 117 7.2.2.2 Explicit Tracking ..................................... 49 118 7.2.2.3 Switching to S-PMSI ................................... 50 119 7.3 Aggregation ........................................... 50 120 7.4 Instantiating the S-PMSI with a PIM Tree .............. 51 121 7.5 Instantiating S-PMSIs using RSVP-TE P2MP Tunnels ...... 52 122 8 Inter-AS Procedures ................................... 52 123 8.1 Non-Segmented Inter-AS Tunnels ........................ 53 124 8.1.1 Inter-AS MVPN Auto-Discovery .......................... 
53 125 8.1.2 Inter-AS MVPN Routing Information Exchange ............ 53 126 8.1.3 Inter-AS P-Tunnels .................................... 54 127 8.1.3.1 PIM-Based Inter-AS P-Multicast Trees .................. 54 128 8.2 Segmented Inter-AS Tunnels ............................ 55 129 8.2.1 Inter-AS MVPN Auto-Discovery Routes ................... 55 130 8.2.1.1 Originating Inter-AS MVPN A-D Information ............. 56 131 8.2.1.2 Propagating Inter-AS MVPN A-D Information ............. 57 132 8.2.1.2.1 Inter-AS Auto-Discovery Route received via EBGP ....... 57 133 8.2.1.2.2 Leaf Auto-Discovery Route received via EBGP ........... 58 134 8.2.1.2.3 Inter-AS Auto-Discovery Route received via IBGP ....... 58 135 8.2.2 Inter-AS MVPN Routing Information Exchange ............ 60 136 8.2.3 Inter-AS I-PMSI ....................................... 60 137 8.2.3.1 Support for Unicast VPN Inter-AS Methods .............. 61 138 8.2.4 Inter-AS S-PMSI ....................................... 61 139 9 Duplicate Packet Detection and Single Forwarder PE .... 62 140 9.1 Multihomed C-S or C-RP ................................ 63 141 9.1.1 Single forwarder PE selection ......................... 64 142 9.2 Switching from the C-RP tree to C-S tree .............. 65 143 10 Eliminating PE-PE Distribution of (C-*,C-G) State ..... 66 144 10.1 Co-locating C-RPs on a PE ............................. 67 145 10.1.1 Initial Configuration ................................. 68 146 10.1.2 Anycast RP Based on Propagating Active Sources ........ 68 147 10.1.2.1 Receiver(s) Within a Site ............................. 68 148 10.1.2.2 Source Within a Site .................................. 68 149 10.1.2.3 Receiver Switching from Shared to Source Tree ......... 69 150 10.2 Using MSDP between a PE and a Local C-RP .............. 69 151 11 Encapsulations ........................................ 70 152 11.1 Encapsulations for Single PMSI per Tunnel ............. 
70 153 11.1.1 Encapsulation in GRE .................................. 70 154 11.1.2 Encapsulation in IP ................................... 72 155 11.1.3 Encapsulation in MPLS ................................. 72 156 11.2 Encapsulations for Multiple PMSIs per Tunnel .......... 73 157 11.2.1 Encapsulation in GRE .................................. 73 158 11.2.2 Encapsulation in IP ................................... 73 159 11.3 Encapsulations Identifying a Distinguished PE ......... 74 160 11.3.1 For MP2MP LSP P-tunnels ............................... 74 161 11.3.2 For Support of PIM-BIDIR C-Groups ..................... 74 162 11.4 Encapsulations for Unicasting PIM Control Messages .... 75 163 11.5 General Considerations for IP and GRE Encaps .......... 75 164 11.5.1 MTU ................................................... 75 165 11.5.2 TTL ................................................... 76 166 11.5.3 Avoiding Conflict with Internet Multicast ............. 76 167 11.6 Differentiated Services ............................... 76 168 12 Support for PIM-BIDIR C-Groups ........................ 77 169 12.1 The VPN Backbone Becomes the RPL ...................... 78 170 12.1.1 Control Plane ......................................... 78 171 12.1.2 Data Plane ............................................ 79 172 12.2 Partitioned Sets of PEs ............................... 79 173 12.2.1 Partitions ............................................ 79 174 12.2.2 Using PE Labels ....................................... 80 175 12.2.3 Mesh of MP2MP P-Tunnels ............................... 81 176 13 Security Considerations ............................... 81 177 14 IANA Considerations ................................... 82 178 15 Other Authors ......................................... 82 179 16 Other Contributors .................................... 82 180 17 Authors' Addresses .................................... 82 181 18 Normative References .................................. 
84 182 19 Informative References ................................ 85 183 20 Full Copyright Statement .............................. 85 184 21 Intellectual Property ................................. 86 186 1. Specification of requirements 188 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 189 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 190 document are to be interpreted as described in [RFC2119]. 192 2. Introduction 194 [RFC4364] specifies the set of procedures which a Service Provider 195 (SP) must implement in order to provide a particular kind of VPN 196 service ("BGP/MPLS IP VPN") for its customers. The service described 197 therein allows IP unicast packets to travel from one customer site to 198 another, but it does not provide a way for IP multicast traffic to 199 travel from one customer site to another. 201 This document extends the service defined in [RFC4364] so that it 202 also includes the capability of handling IP multicast traffic. This 203 requires a number of different protocols to work together. The 204 document provides a framework describing how the various protocols 205 fit together, and also provides detailed specification of some of the 206 protocols. The detailed specification of some of the other protocols 207 is found in pre-existing documents or in companion documents. 209 2.1. Optimality vs Scalability 211 In a "BGP/MPLS IP VPN" [RFC4364], unicast routing of VPN packets is 212 achieved without the need to keep any per-VPN state in the core of 213 the SP's network (the "P routers"). Routing information from a 214 particular VPN is maintained only by the Provider Edge routers (the 215 "PE routers", or "PEs") that attach directly to sites of that VPN. 216 Customer data travels through the P routers in tunnels from one PE to 217 another (usually MPLS Label Switched Paths, LSPs), so to support the 218 VPN service the P routers only need to have routes to the PE routers. 
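The state-scaling property just described can be illustrated with a toy model. This is purely illustrative and not part of the specification; the function and its parameters are invented here:

```python
# Toy model (invented for illustration, not normative): forwarding state
# held by a single P router under unicast BGP/MPLS IP VPN [RFC4364].
def p_router_state(num_pes: int, num_vpns: int) -> int:
    pe_routes = num_pes   # routes to the PE loopbacks (the tunnel endpoints)
    per_vpn_routes = 0    # per-VPN routes live only in PE VRFs, never in P routers
    return pe_routes + per_vpn_routes

# Growing the number of VPNs leaves P-router state unchanged:
assert p_router_state(num_pes=100, num_vpns=1) == \
       p_router_state(num_pes=100, num_vpns=10_000) == 100
```

The contrast drawn in the following paragraphs is that per-flow multicast state would add a term proportional to the number of customer flows, which the SP cannot bound.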
220 The PE-to-PE routing is optimal, but the amount of associated state 221 in the P routers depends only on the number of PEs, not on the number 222 of VPNs. 224 However, in order to provide optimal multicast routing for a 225 particular multicast flow, the P routers through which that flow 226 travels have to hold state which is specific to that flow. A 227 multicast flow is identified by the (source, group) tuple where the 228 source is the IP address of the sender and the group is the IP 229 multicast group address of the destination. Scalability would be 230 poor if the amount of state in the P routers were proportional to the 231 number of multicast flows in the VPNs. Therefore, when supporting 232 multicast service for a BGP/MPLS IP VPN, the optimality of the 233 multicast routing must be traded off against the scalability of the P 234 routers. We explain this below in more detail. 236 If a particular VPN is transmitting "native" multicast traffic over 237 the backbone, we refer to it as an "MVPN". By "native" multicast 238 traffic, we mean packets that a CE sends to a PE, such that the IP 239 destination address of the packets is a multicast group address, or 240 the packets are multicast control packets addressed to the PE router 241 itself, or the packets are IP multicast data packets encapsulated in 242 MPLS. 244 We say that the backbone multicast routing for a particular multicast 245 group in a particular VPN is "optimal" if and only if all of the 246 following conditions hold: 248 - When a PE router receives a multicast data packet of that group 249 from a CE router, it transmits the packet in such a way that the 250 packet is received by every other PE router which is on the path 251 to a receiver of that group; 253 - The packet is not received by any other PEs; 255 - While in the backbone, no more than one copy of the packet ever 256 traverses any link. 
258 - While in the backbone, if bandwidth usage is to be optimized, the 259 packet traverses minimum cost trees rather than shortest path 260 trees. 262 Optimal routing for a particular multicast group requires that the 263 backbone maintain one or more source-trees which are specific to that 264 flow. Each such tree requires that state be maintained in all the P 265 routers that are in the tree. 267 This would potentially require an unbounded amount of state in the P 268 routers, since the SP has no control of the number of multicast 269 groups in the VPNs that it supports. Nor does the SP have any control 270 over the number of transmitters in each group, nor of the 271 distribution of the receivers. 273 The procedures defined in this document allow an SP to provide 274 multicast VPN service without requiring the amount of state 275 maintained by the P routers to be proportional to the number of 276 multicast data flows in the VPNs. The amount of state is traded off 277 against the optimality of the multicast routing. Enough flexibility 278 is provided so that a given SP can make his own tradeoffs between 279 scalability and optimality. An SP can even allow some multicast 280 groups in some VPNs to receive optimal routing, while others do not. 281 Of course, the cost of this flexibility is an increase in the number 282 of options provided by the protocols. 284 The basic technique for providing scalability is to aggregate a 285 number of customer multicast flows onto a single multicast 286 distribution tree through the P routers. A number of aggregation 287 methods are supported. 289 The procedures defined in this document also accommodate the SP that 290 does not want to build multicast distribution trees in his backbone 291 at all; the ingress PE can replicate each multicast data packet and 292 then unicast each replica through a tunnel to each egress PE that 293 needs to receive the data. 295 2.1.1. 
Multicast Distribution Trees 297 This document supports the use of a single multicast distribution 298 tree in the backbone to carry all the multicast traffic from a 299 specified set of one or more MVPNs. Such a tree is referred to as an 300 "Inclusive Tree". An Inclusive Tree which carries the traffic of more 301 than one MVPN is an "Aggregate Inclusive Tree". An Inclusive Tree 302 contains, as its members, all the PEs that attach to any of the MVPNs 303 using the tree. 305 With this option, even if each tree supports only one MVPN, the upper 306 bound on the amount of state maintained by the P routers is 307 proportional to the number of VPNs supported, rather than to the 308 number of multicast flows in those VPNs. If the trees are 309 unidirectional, it would be more accurate to say that the state is 310 proportional to the product of the number of VPNs and the average 311 number of PEs per VPN. The amount of state maintained by the P 312 routers can be further reduced by aggregating more MVPNs onto a 313 single tree. If each such tree supports a set of MVPNs, (call it an 314 "MVPN aggregation set"), the state maintained by the P routers is 315 proportional to the product of the number of MVPN aggregation sets 316 and the average number of PEs per MVPN. Thus the state does not grow 317 linearly with the number of MVPNs. 319 However, as data from many multicast groups is aggregated together 320 onto a single "Inclusive Tree", it is likely that some PEs will 321 receive multicast data for which they have no need, i.e., some degree 322 of optimality has been sacrificed. 324 This document also provides procedures which enable a single 325 multicast distribution tree in the backbone to be used to carry 326 traffic belonging only to a specified set of one or more multicast 327 groups, from one or more MVPNs. 
Such a tree is referred to as a 328 "Selective Tree" and more specifically as an "Aggregate Selective 329 Tree" when the multicast groups belong to different MVPNs. By 330 default, traffic from most multicast groups could be carried by an 331 Inclusive Tree, while traffic from, e.g., high bandwidth groups could 332 be carried in one of the "Selective Trees". When setting up the 333 Selective Trees, one should include only those PEs which need to 334 receive multicast data from one or more of the groups assigned to the 335 tree. This provides more optimal routing than can be obtained by 336 using only Inclusive Trees, though it requires additional state in 337 the P routers. 339 2.1.2. Ingress Replication through Unicast Tunnels 341 This document also provides procedures for carrying MVPN data traffic 342 through unicast tunnels from the ingress PE to each of the egress 343 PEs. The ingress PE replicates the multicast data packet received 344 from a CE and sends it to each of the egress PEs using the unicast 345 tunnels. This requires no multicast routing state in the P routers 346 at all, but it puts the entire replication load on the ingress PE 347 router, and makes no attempt to optimize the multicast routing. 349 2.2. Overview 351 2.2.1. Multicast Routing Adjacencies 353 In BGP/MPLS IP VPNs [RFC4364], each CE ("Customer Edge") router is a 354 unicast routing adjacency of a PE router, but CE routers at different 355 sites do not become unicast routing adjacencies of each other. This 356 important characteristic is retained for multicast routing -- a CE 357 router becomes a multicast routing adjacency of a PE router, but CE 358 routers at different sites do not become multicast routing 359 adjacencies of each other. 361 The multicast routing protocol on the PE-CE link is presumed to be 362 PIM ("Protocol Independent Multicast") [PIM-SM]. The Sparse Mode, 363 Dense Mode, Single Source Mode, and Bidirectional Modes are 364 supported. 
A CE router exchanges "ordinary" PIM control messages with 365 the PE router to which it is attached. 367 The PEs attaching to a particular MVPN then have to exchange the 368 multicast routing information with each other. Two basic methods for 369 doing this are defined: (1) PE-PE PIM, and (2) BGP. In the former 370 case, the PEs need to be multicast routing adjacencies of each other. 371 In the latter case, they do not. For example, each PE may be a BGP 372 adjacency of a Route Reflector (RR), and not of any other PEs. 374 To support the "Carrier's Carrier" model of [RFC4364], mLDP or BGP 375 can be used on the PE-CE interface. This will be described in 376 subsequent versions of this document. 378 2.2.2. MVPN Definition 380 An MVPN is defined by two sets of sites, Sender Sites set and 381 Receiver Sites set, with the following properties: 383 - Hosts within the Sender Sites set could originate multicast 384 traffic for receivers in the Receiver Sites set. 386 - Receivers not in the Receiver Sites set should not be able to 387 receive this traffic. 389 - Hosts within the Receiver Sites set could receive multicast 390 traffic originated by any host in the Sender Sites set. 392 - Hosts within the Receiver Sites set should not be able to receive 393 multicast traffic originated by any host that is not in the 394 Sender Sites set. 396 A site could be both in the Sender Sites set and Receiver Sites set, 397 which implies that hosts within such a site could both originate and 398 receive multicast traffic. An extreme case is when the Sender Sites 399 set is the same as the Receiver Sites set, in which case all sites 400 could originate and receive multicast traffic from each other. 402 Sites within a given MVPN may be either within the same, or in 403 different organizations, which implies that an MVPN can be either an 404 Intranet or an Extranet. 406 A given site may be in more than one MVPN, which implies that MVPNs 407 may overlap. 
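The two-set definition above can be sketched as follows. This is a hypothetical model; the class and the site names are invented, and nothing here is normative:

```python
# Illustrative sketch (not normative): an MVPN as a pair of site sets
# with the delivery policy stated in the definition above.
from dataclasses import dataclass

@dataclass
class MVPN:
    sender_sites: set
    receiver_sites: set

    def may_deliver(self, src_site: str, dst_site: str) -> bool:
        # Multicast traffic may flow only from a Sender site to a Receiver site.
        return src_site in self.sender_sites and dst_site in self.receiver_sites

# Extreme case from the text: Sender Sites set == Receiver Sites set.
intranet = MVPN(sender_sites={"A", "B"}, receiver_sites={"A", "B"})
assert intranet.may_deliver("A", "B") and intranet.may_deliver("B", "A")

# Asymmetric case: only "HQ" may send (site names invented).
extranet = MVPN(sender_sites={"HQ"}, receiver_sites={"HQ", "Partner"})
assert extranet.may_deliver("HQ", "Partner")
assert not extranet.may_deliver("Partner", "HQ")
```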
409 Not all sites of a given MVPN have to be connected to the same 410 service provider, which implies that an MVPN can span multiple 411 service providers. 413 Another way to look at an MVPN is to say that an MVPN is defined by a 414 set of administrative policies. Such policies determine both the 415 Sender Sites set and the Receiver Sites set. Such policies are established by 416 MVPN customers, but implemented/realized by MVPN Service Providers 417 using the existing BGP/MPLS VPN mechanisms, such as Route Targets, 418 with extensions, as necessary. 420 2.2.3. Auto-Discovery 422 In order for the PE routers attaching to a given MVPN to exchange 423 MVPN control information with each other, each one needs to discover 424 all the other PEs that attach to the same MVPN. (Strictly speaking, 425 a PE in the receiver sites set need only discover the other PEs in 426 the sender sites set and a PE in the sender sites set need only 427 discover the other PEs in the receiver sites set.) This is referred 428 to as "MVPN Auto-Discovery". 430 This document discusses two ways of providing MVPN autodiscovery: 432 - BGP can be used for discovering and maintaining MVPN membership. 433 The PE routers advertise their MVPN membership to other PE 434 routers using BGP. A PE is considered to be a "member" of a 435 particular MVPN if it contains a VRF (Virtual Routing and 436 Forwarding table, see [RFC4364]) which is configured to contain 437 the multicast routing information of that MVPN. This auto- 438 discovery option does not make any assumptions about the methods 439 used for transmitting MVPN multicast data packets through the 440 backbone. 
442 - If it is known that the multicast data packets of a particular 443 MVPN are to be transmitted (at least, by default) through a non- 444 aggregated Inclusive Tree which is to be set up by PIM-SM or 445 BIDIR-PIM, and if the PEs attaching to that MVPN are configured 446 with the group address corresponding to that tree, then the PEs 447 can auto-discover each other simply by joining the tree and then 448 multicasting PIM Hellos over the tree. 450 2.2.4. PE-PE Multicast Routing Information 452 The BGP/MPLS IP VPN [RFC4364] specification requires a PE to maintain 453 at most one BGP peering with every other PE in the network. This 454 peering is used to exchange VPN routing information. The use of Route 455 Reflectors further reduces the number of BGP adjacencies maintained 456 by a PE to exchange VPN routing information with other PEs. This 457 document describes various options for exchanging MVPN control 458 information between PE routers based on the use of PIM or BGP. These 459 options have different overheads with respect to the number of 460 routing adjacencies that a PE router needs to maintain to exchange 461 MVPN control information with other PE routers. Some of these options 462 allow the retention of the unicast BGP/MPLS VPN model letting a PE 463 maintain at most one BGP routing adjacency with other PE routers to 464 exchange MVPN control information. BGP also provides reliable 465 transport and uses incremental updates. Another option is the use of 466 the currently existing, "soft state" PIM standard [PIM-SM] that uses 467 periodic complete updates. 469 2.2.5. PE-PE Multicast Data Transmission 471 Like [RFC4364], this document decouples the procedures for exchanging 472 routing information from the procedures for transmitting data 473 traffic. Hence a variety of transport technologies may be used in the 474 backbone. 
For inclusive trees, these transport technologies include 475 unicast PE-PE tunnels (using MPLS or IP/GRE encapsulation), multicast 476 distribution trees created by PIM-SSM, PIM-SM, or BIDIR-PIM (using 477 IP/GRE encapsulation), point-to-multipoint LSPs created by RSVP-TE or 478 mLDP, and multipoint-to-multipoint LSPs created by mLDP. (However, 479 techniques for aggregating the traffic of multiple MVPNs onto a 480 single multipoint-to-multipoint LSP or onto a single bidirectional 481 multicast distribution tree are for further study.) For selective 482 trees, only unicast PE-PE tunnels (using MPLS or IP/GRE 483 encapsulation) and unidirectional single-source trees are supported, 484 and the supported tree creation protocols are PIM-SSM (using IP/GRE 485 encapsulation), RSVP-TE, and mLDP. 487 In order to aggregate traffic from multiple MVPNs onto a single 488 multicast distribution tree, it is necessary to have a mechanism to 489 enable the egresses of the tree to demultiplex the multicast traffic 490 received over the tree and to associate each received packet with a 491 particular MVPN. This document specifies a mechanism whereby 492 upstream label assignment [MPLS-UPSTREAM-LABEL] is used by the root 493 of the tree to assign a label to each flow. This label is used by 494 the receivers to perform the demultiplexing. This document also 495 describes procedures based on BGP that are used by the root of an 496 Aggregate Tree to advertise the Inclusive and/or Selective binding 497 and the demultiplexing information to the leaves of the tree. 499 This document also describes the data plane encapsulations for 500 supporting the various SP multicast transport options. 502 This document assumes that when SP multicast trees are used, traffic 503 for a particular multicast group is transmitted by a particular PE on 504 only one SP multicast tree. 
The use of multiple SP multicast trees 505 for transmitting traffic belonging to a particular multicast group is 506 for further study. 508 2.2.6. Inter-AS MVPNs 510 [RFC4364] describes different options for supporting BGP/MPLS IP 511 unicast VPNs whose provider backbones contain more than one 512 Autonomous System (AS). These are known as Inter-AS VPNs. In an 513 Inter-AS VPN, the ASes may belong to the same provider or to 514 different providers. This document describes how Inter-AS MVPNs can 515 be supported for each of the unicast BGP/MPLS VPN Inter-AS options. 516 This document also specifies a model where Inter-AS MVPN service can 517 be offered without requiring a single SP multicast tree to span 518 multiple ASes. In this model, an inter-AS multicast tree consists of 519 a number of "segments", one per AS, which are stitched together at AS 520 boundary points. These are known as "segmented inter-AS trees". Each 521 segment of a segmented inter-AS tree may use a different multicast 522 transport technology. 524 It is also possible to support Inter-AS MVPNs with non-segmented 525 source trees that extend across AS boundaries. 527 2.2.7. Optionally Eliminating Shared Tree State 529 The document also discusses some options and protocol extensions 530 which can be used to eliminate the need for the PE routers to 531 distribute to each other the (*, G) and (*, G, RPT-bit) states when 532 there are PIM Sparse Mode multicast groups in the VPNs. 534 3. Concepts and Framework 536 3.1. PE-CE Multicast Routing 538 Support of multicast in BGP/MPLS IP VPNs is modeled closely after 539 support of unicast in BGP/MPLS IP VPNs. That is, a multicast routing 540 protocol will be run on the PE-CE interfaces, such that PE and CE are 541 multicast routing adjacencies on that interface. CEs at different 542 sites do not become multicast routing adjacencies of each other. 
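The adjacency model just described can be sketched as a toy example. The router names and the helper function are invented; this merely restates the rule that each CE peers with its PE and never with a remote CE:

```python
# Illustrative sketch (invented names, not normative): multicast routing
# adjacencies mirror the unicast model -- they exist only on PE-CE links.
def multicast_adjacencies(attachments):
    """attachments: iterable of (pe, ce) PE-CE links; returns the adjacency set."""
    return {frozenset((pe, ce)) for pe, ce in attachments}

adj = multicast_adjacencies([("PE1", "CE-a"), ("PE1", "CE-b"), ("PE2", "CE-c")])
assert frozenset(("PE1", "CE-a")) in adj
# CEs at different sites never become adjacencies of each other:
assert frozenset(("CE-a", "CE-c")) not in adj
```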
544 If a PE attaches to n VPNs for which multicast support is provided 545 (i.e., to n "MVPNs"), the PE will run n independent instances of a 546 multicast routing protocol. We will refer to these multicast routing 547 instances as "VPN-specific multicast routing instances", or more 548 briefly as "multicast C-instances". The notion of a "VRF" ("Virtual 549 Routing and Forwarding Table"), defined in [RFC4364], is extended to 550 include multicast routing entries as well as unicast routing entries. 551 Each multicast routing entry is thus associated with a particular 552 VRF. 554 Whether a particular VRF belongs to an MVPN or not is determined by 555 configuration. 557 In this document, we will not attempt to provide support for every 558 possible multicast routing protocol that could possibly run on the 559 PE-CE link. Rather, we consider multicast C-instances only for the 560 following multicast routing protocols: 562 - PIM Sparse Mode (PIM-SM) 564 - PIM Single Source Mode (PIM-SSM) 566 - PIM Bidirectional Mode (BIDIR-PIM) 568 - PIM Dense Mode (PIM-DM) 570 In order to support the "Carrier's Carrier" model of [RFC4364], mLDP 571 or BGP will also be supported on the PE-CE interface. The use of mLDP 572 on the PE-CE interface is described in [MVPN-BGP]. The use of BGP on 573 the PE-CE interface is not described in this revision. 575 As the document only supports PIM-based C-instances, we will 576 generally use the term "PIM C-instances" to refer to the multicast C- 577 instances. 579 A PE router may also be running a "provider-wide" instance of PIM (a 580 "PIM P-instance"), in which it has a PIM adjacency with, e.g., each 581 of its IGP neighbors (i.e., with P routers), but NOT with any CE 582 routers, and not with other PE routers (unless another PE router 583 happens to be an IGP neighbor). In this case, P routers would also 584 run the P-instance of PIM, but NOT a C-instance.
If there is a PIM 585 P-instance, it may or may not have a role to play in support of VPN 586 multicast; this is discussed in later sections. However, in no case 587 will the PIM P-instance contain VPN-specific multicast routing 588 information. 590 In order to help clarify when we are speaking of the PIM P-instance 591 and when we are speaking of a PIM C-instance, we will also apply the 592 prefixes "P-" and "C-" respectively to control messages, addresses, 593 etc. Thus a P-Join would be a PIM Join which is processed by the PIM 594 P-instance, and a C-Join would be a PIM Join which is processed by a 595 C-instance. A P-group address would be a group address in the SP's 596 address space, and a C-group address would be a group address in a 597 VPN's address space. 599 3.2. P-Multicast Service Interfaces (PMSIs) 601 Multicast data packets received by a PE over a PE-CE interface must 602 be forwarded to one or more of the other PEs in the same MVPN for 603 delivery to one or more other CEs. 605 We define the notion of a "P-Multicast Service Interface" (PMSI). If 606 a particular MVPN is supported by a particular set of PE routers, 607 then there will be a PMSI connecting those PE routers. A PMSI is a 608 conceptual "overlay" on the P network with the following property: a 609 PE in a given MVPN can give a packet to the PMSI, and the packet will 610 be delivered to some or all of the other PEs in the MVPN, such that 611 any PE receiving such a packet will be able to tell which MVPN the 612 packet belongs to. 614 As we discuss below, a PMSI may be instantiated by a number of 615 different transport mechanisms, depending on the particular 616 requirements of the MVPN and of the SP. We will refer to these 617 transport mechanisms as "tunnels". 619 For each MVPN, there are one or more PMSIs that are used for 620 transmitting the MVPN's multicast data from one PE to others. We 621 will use the term "PMSI" such that a single PMSI belongs to a single 622 MVPN. 
However, the transport mechanism which is used to instantiate 623 a PMSI may allow a single "tunnel" to carry the data of multiple 624 PMSIs. 626 In this document we make a clear distinction between the multicast 627 service (the PMSI) and its instantiation. This allows us to separate 628 the discussion of different services from the discussion of different 629 instantiations of each service. The term "tunnel" is used to refer 630 only to the transport mechanism that instantiates a service. 632 3.2.1. Inclusive and Selective PMSIs 634 We will distinguish between three different kinds of PMSI: 636 - "Multidirectional Inclusive" PMSI (MI-PMSI) 638 A Multidirectional Inclusive PMSI is one which enables ANY PE 639 attaching to a particular MVPN to transmit a message such that it 640 will be received by EVERY other PE attaching to that MVPN. 642 There is at most one MI-PMSI per MVPN. (Though the tunnel or 643 tunnels that instantiate an MI-PMSI may actually carry the data 644 of more than one PMSI.) 646 An MI-PMSI can be thought of as an overlay broadcast network 647 connecting the set of PEs supporting a particular MVPN. 649 - "Unidirectional Inclusive" PMSI (UI-PMSI) 651 A Unidirectional Inclusive PMSI is one which enables a particular 652 PE, attached to a particular MVPN, to transmit a message such 653 that it will be received by all the other PEs attaching to that 654 MVPN. There is at most one UI-PMSI per PE per MVPN, though the 655 tunnel which instantiates a UI-PMSI may in fact carry the data of 656 more than one PMSI. 658 - "Selective" PMSI (S-PMSI). 660 A Selective PMSI is one which provides a mechanism wherein a 661 particular PE in an MVPN can multicast messages so that they will 662 be received by a subset of the other PEs of that MVPN. There may 663 be an arbitrary number of S-PMSIs per PE per MVPN. Again, the 664 tunnel which instantiates a given S-PMSI may carry data from 665 multiple S-PMSIs. 
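Since a single tunnel may carry the data of several PMSIs, the egress PE needs a demultiplexing step to associate each received packet with the right MVPN. A minimal sketch of such a table, assuming an upstream-assigned MPLS label is used as the demultiplexor (all identifiers below are hypothetical):

```python
# Hypothetical demultiplexing table at an egress PE. A single P-tunnel
# may carry several PMSIs, so the (tunnel, demultiplexor) pair -- here,
# an upstream-assigned MPLS label -- identifies the MVPN.
demux_table: dict[tuple[str, int], str] = {}

def bind_pmsi(tunnel_id: str, label: int, mvpn: str) -> None:
    """Record the binding advertised by the tunnel's root."""
    demux_table[(tunnel_id, label)] = mvpn

def classify(tunnel_id: str, label: int) -> str:
    """Return the MVPN that a packet received on this tunnel belongs to."""
    return demux_table[(tunnel_id, label)]

bind_pmsi("p2mp-lsp-7", 1001, "mvpn-red")
bind_pmsi("p2mp-lsp-7", 1002, "mvpn-blue")  # same tunnel, different PMSI
```

The bindings themselves would be learned through the signaling procedures discussed in later sections, not configured statically as in this sketch.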
667 We will see in later sections the role played by these different 668 kinds of PMSI. We will use the term "I-PMSI" when we are not 669 distinguishing between "MI-PMSIs" and "UI-PMSIs". 671 3.2.2. Tunnels Instantiating PMSIs 673 The tunnels which are used to instantiate PMSIs will be referred to 674 as "P-tunnels". A number of different tunnel setup techniques can be 675 used to create the P-tunnels that instantiate the PMSIs. Among these 676 are: 678 - PIM 680 A PMSI can be instantiated as (a set of) Multicast Distribution 681 Trees created by the PIM P-instance ("P-trees"). 683 PIM-SSM, BIDIR-PIM, or PIM-SM can be used to create P-trees. 684 (PIM-DM is not supported for this purpose.) 686 A single MI-PMSI can be instantiated by a single shared P-tree, 687 or by a number of source P-trees (one for each PE of the MI- 688 PMSI). P-trees may be shared by multiple MVPNs (i.e., a given P- 689 tree may be the instantiation of multiple PMSIs), as long as the 690 encapsulation provides some means of demultiplexing the data 691 traffic by MVPN. 693 Selective PMSIs are instantiated by source P-trees, and are most 694 naturally created by PIM-SSM, since by definition only one PE is 695 the source of the multicast data on a Selective PMSI. 697 - MLDP 699 A PMSI may be instantiated as one or more mLDP Point-to- 700 Multipoint (P2MP) LSPs, or as an mLDP Multipoint-to- 701 Multipoint (MP2MP) LSP. A Selective PMSI or a Unidirectional 702 Inclusive PMSI would be instantiated as a single mLDP P2MP LSP, 703 whereas a Multidirectional Inclusive PMSI could be instantiated 704 either as a set of such LSPs (one for each PE in the MVPN) or as 705 a single MP2MP LSP. 707 MLDP P2MP LSPs can be shared across multiple MVPNs. 709 - RSVP-TE 711 A PMSI may be instantiated as one or more RSVP-TE Point-to- 712 Multipoint (P2MP) LSPs.
A Selective PMSI or a Unidirectional 713 Inclusive PMSI would be instantiated as a single RSVP-TE P2MP 714 LSP, whereas a Multidirectional Inclusive PMSI would be 715 instantiated as a set of such LSPs, one for each PE in the MVPN. 716 RSVP-TE P2MP LSPs can be shared across multiple MVPNs. 718 - A Mesh of Unicast Tunnels. 720 If a PMSI is implemented as a mesh of unicast tunnels, a PE 721 wishing to transmit a packet through the PMSI would replicate the 722 packet, and send a copy to each of the other PEs. 724 An MI-PMSI for a given MVPN can be instantiated as a full mesh of 725 unicast tunnels among that MVPN's PEs. A UI-PMSI or an S-PMSI 726 can be instantiated as a partial mesh. 728 - Unicast Tunnels to the Root of a P-Tree. 730 Any type of PMSI can be instantiated through a method in which 731 there is a single P-tree (created, for example, via PIM-SSM or 732 via RSVP-TE), and a PE transmits a packet to the PMSI by sending 733 it in a unicast tunnel to the root of that P-tree. All PEs in 734 the given MVPN would need to be leaves of the tree. 736 When this instantiation method is used, the transmitter of the 737 multicast data may receive its own data back. Methods for 738 avoiding this are for further study. 740 It can be seen that each method of implementing PMSIs has its own 741 area of applicability. This specification therefore allows for the 742 use of any of these methods. At first glance, this may seem like an 743 overabundance of options. However, the history of multicast 744 development and deployment should make it clear that there is no one 745 option which is always acceptable. The use of segmented inter-AS 746 trees does allow each SP to select the option which it finds most 747 applicable in its own environment, without forcing any other SP to 748 choose that same option. 750 Specifying the conditions under which a particular tree building 751 method is applicable is outside the scope of this document.
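As an illustration of the mesh-of-unicast-tunnels option above, the transmitting PE simply replicates each packet toward every other PE of the MVPN. This is only a sketch; the send primitive and PE names are hypothetical:

```python
def ingress_replicate(packet: bytes, local_pe: str, mvpn_pes: set[str],
                      send_unicast) -> int:
    """Instantiate a PMSI as a mesh of unicast tunnels: the transmitting
    PE replicates the packet and sends one copy to each other PE."""
    copies = 0
    for pe in sorted(mvpn_pes):
        if pe == local_pe:        # never send a copy back to ourselves
            continue
        send_unicast(pe, packet)  # hypothetical per-tunnel transmit primitive
        copies += 1
    return copies

sent = []
n = ingress_replicate(b"c-multicast-data", "PE1",
                      {"PE1", "PE2", "PE3", "PE4"},
                      lambda pe, pkt: sent.append(pe))
# Three copies are sent: one each toward PE2, PE3, and PE4.
```

The replication cost grows linearly with the number of PEs, which is one reason a multicast P-tree may be preferred for MVPNs with many sites.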
753 The choice of the tunnel technique belongs to the sender router and 754 is a local policy decision of the router. The procedures defined 755 throughout this document do not mandate that the same tunnel 756 technique be used for all PMSI tunnels going through a given provider 757 backbone. It is, however, expected that any tunnel technique that can 758 be used by a PE for a particular MVPN is also supported by the other 759 PEs having VRFs for the MVPN. Moreover, the use of ingress replication 760 by any PE for an MVPN implies that all other PEs MUST use ingress 761 replication for this MVPN. 763 3.3. Use of PMSIs for Carrying Multicast Data 765 Each PE supporting a particular MVPN must have a way of discovering: 767 - The set of other PEs in its AS that are attached to sites of that 768 MVPN, and the set of other ASes that have PEs attached to sites 769 of that MVPN. However, if segmented inter-AS trees are not used 770 (see section 8.2), then each PE needs to know the entire set of 771 PEs attached to sites of that MVPN. 773 - If segmented inter-AS trees are to be used, the set of border 774 routers in its AS that support inter-AS connectivity for that 775 MVPN. 777 - If the MVPN is configured to use an MI-PMSI, the information 778 needed to set up and to use the tunnels instantiating the default 779 MI-PMSI. 781 - For each other PE, whether the PE supports Aggregate Trees for 782 the MVPN, and if so, the demultiplexing information which must be 783 provided so that the other PE can determine whether a packet 784 which it received on an aggregate tree belongs to this MVPN. 786 In some cases this information is provided by means of the BGP-based 787 auto-discovery procedures detailed in section 4. In other cases, 788 this information is provided after discovery is complete, by means of 789 procedures defined in section 6.1.2.
In either case, the information 790 which is provided must be sufficient to enable the PMSI to be bound 791 to the identified tunnel, to enable the tunnel to be created if it 792 does not already exist, and to enable the different PMSIs which may 793 travel on the same tunnel to be properly demultiplexed. 795 3.3.1. MVPNs with MI-PMSIs 797 If an MVPN uses an MI-PMSI, then the MI-PMSI for that MVPN will be 798 created as soon as the necessary information has been obtained. 799 Creating a PMSI means creating the tunnel which carries it (unless 800 that tunnel already exists), as well as binding the PMSI to the 801 tunnel. The MI-PMSI for that MVPN is then used as the default method 802 of transmitting multicast data packets for that MVPN. In effect, all 803 the multicast streams for the MVPN are, by default, aggregated onto 804 the MI-PMSI. 806 If a particular multicast stream from a particular source PE has 807 certain characteristics, it can be desirable to migrate it from the 808 MI-PMSI to an S-PMSI. These characteristics and procedures for 809 migrating a stream from an MI-PMSI to an S-PMSI are discussed in 810 section 7. 812 3.3.2. When MI-PMSIs are Required 814 MI-PMSIs are required under the following conditions: 816 - The MVPN is using PIM-DM, or some other protocol (such as BSR) 817 which relies upon flooding. Only with an MI-PMSI can the C-data 818 (or C-control-packets) received from any CE be flooded to all 819 PEs. 821 - If the procedure for carrying C-multicast routes from PE to PE 822 involves the multicasting of P-PIM control messages among the PEs 823 (see sections 3.4.1.1, 3.4.1.2, and 5.2). 825 3.3.3. MVPNs That Do Not Use MI-PMSIs 827 If a particular MVPN does not use an MI-PMSI, then its multicast data 828 may be sent on a set of UI-PMSIs. 830 It is also possible to send all the multicast data on a set of S- 831 PMSIs, omitting any usage of I-PMSIs.
This prevents PEs from 832 receiving data which they don't need, at the cost of requiring 833 additional tunnels. However, cost-effective instantiation of S-PMSIs 834 is likely to require Aggregate P-trees, which in turn makes it 835 necessary for the transmitting PE to know which PEs need to receive 836 which multicast streams. This is known as "explicit tracking", and 837 the procedures to enable explicit tracking may themselves impose a 838 cost. This is further discussed in section 7.2.2.2. 840 3.4. PE-PE Transmission of C-Multicast Routing 842 As a PE attached to a given MVPN receives C-Join/Prune messages from 843 its CEs in that MVPN, it must convey the information contained in 844 those messages to other PEs that are attached to the same MVPN. 846 There are several different methods for doing this. As these methods 847 are not interoperable, the method to be used for a particular MVPN 848 must either be configured, or discovered as part of the auto- 849 discovery process. 851 3.4.1. PIM Peering 853 3.4.1.1. Full Per-MVPN PIM Peering Across an MI-PMSI 855 If the set of PEs attached to a given MVPN are connected via an MI- 856 PMSI, the PEs can form "normal" PIM adjacencies with each other. 857 Since the MI-PMSI functions as a broadcast network, the standard PIM 858 procedures for forming and maintaining adjacencies over a LAN can be 859 applied. 861 As a result, the C-Join/Prune messages which a PE receives from a CE 862 can be multicast to all the other PEs of the MVPN. PIM "join 863 suppression" can be enabled and the PEs can send Asserts as needed. 865 This procedure is fully specified in section 5.2. 867 3.4.1.2. Lightweight PIM Peering Across an MI-PMSI 869 The procedure of the previous section has the following 870 disadvantages: 872 - Periodic Hello messages must be sent by all PEs. 874 Standard PIM procedures require that each PE in a particular MVPN 875 periodically multicast a Hello to all the other PEs in that MVPN.
876 If the number of MVPNs becomes very large, sending and receiving 877 these Hellos can become a substantial overhead for the PE 878 routers. 880 - Periodic retransmission of C-Join/Prune messages. 882 PIM is a "soft-state" protocol, in which reliability is assured 883 through frequent retransmissions (refresh) of control messages. 884 This too can begin to impose a large overhead on the PE routers 885 as the number of MVPNs grows. 887 The first of these disadvantages is easily remedied. The reason for 888 the periodic PIM Hellos is to ensure that each PIM speaker on a LAN 889 knows who all the other PIM speakers on the LAN are. However, in the 890 context of MVPN, PEs in a given MVPN can learn the identities of all 891 the other PEs in the MVPN by means of the BGP-based auto-discovery 892 procedure of section 4. In that case, the periodic Hellos would 893 serve no function, and could simply be eliminated. (Of course, this 894 does imply a change to the standard PIM procedures.) 896 When Hellos are suppressed, we may speak of "lightweight PIM 897 peering". 899 The periodic refresh of the C-Join/Prunes is not as simple to 900 eliminate. If and when "refresh reduction" procedures are specified 901 for PIM, it may be useful to incorporate them, so as to make the 902 lightweight PIM peering procedures even more lightweight. 904 Lightweight PIM peering is not specified in this document. 906 3.4.1.3. Unicasting of PIM C-Join/Prune Messages 908 PIM does not require that the C-Join/Prune messages which a PE 909 receives from a CE be multicast to all the other PEs; it allows 910 them to be unicast to a single PE, the one which is upstream on the 911 path to the root of the multicast tree mentioned in the Join/Prune 912 message. Note that when the C-Join/Prune messages are unicast, there 913 is no such thing as "join suppression". Therefore PIM Refresh 914 Reduction may be considered to be a prerequisite for the procedure 915 of unicasting the C-Join/Prune messages.
917 When the C-Join/Prunes are unicast, they are not transmitted on a 918 PMSI at all. Note that the procedure of unicasting the C-Join/Prunes 919 is different from the procedure of transmitting the C-Join/Prunes on 920 an MI-PMSI which is instantiated as a mesh of unicast tunnels. 922 If there are multiple PEs that can be used to reach a given C-source, 923 procedures described in section 9 MUST be used to ensure that, at 924 least within a single AS, all PEs choose the same PE to reach the C- 925 source. 927 Procedures for unicasting the PIM control messages are not further 928 specified in this document. 930 3.4.2. Using BGP to Carry C-Multicast Routing 932 It is possible to use BGP to carry C-multicast routing information 933 from PE to PE, dispensing entirely with the transmission of C- 934 Join/Prune messages from PE to PE. This is specified in section 5.3. 935 Inter-AS procedures are described in section 8. 937 4. BGP-Based Autodiscovery of MVPN Membership 939 BGP-based autodiscovery is done by means of a new address family, the 940 MCAST-VPN address family. (This address family also has other uses, 941 as will be seen later.) Any PE which attaches to an MVPN must issue 942 a BGP update message containing an NLRI in this address family, along 943 with a specific set of attributes. In this document, we specify the 944 information which must be contained in these BGP updates in order to 945 provide auto-discovery. The encoding details, along with the 946 complete set of detailed procedures, are specified in a separate 947 document [MVPN-BGP]. 949 This section specifies the intra-AS BGP-based autodiscovery 950 procedures. When segmented inter-AS trees are used, additional 951 procedures are needed, as specified in section 8. Further detail may 952 be found in [MVPN-BGP]. (When segmented inter-AS trees are not used, 953 the inter-AS procedures are almost identical to the intra-AS 954 procedures.)
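The membership-discovery outcome of these procedures can be sketched as follows: a PE treats the originator of any A-D route carrying one of a VRF's import Route Targets as a member of the corresponding MVPN. The types, addresses, and Route Target strings below are hypothetical; the actual NLRI and attribute encodings are specified in [MVPN-BGP]:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntraASADRoute:
    """Minimal model of an Intra-AS A-D route: the originating PE's
    address, a locally configured RD, and the Route Targets carried."""
    originating_pe: str
    rd: str
    route_targets: frozenset

def discover_members(vrf_import_rts: set, ad_routes: list) -> set:
    """Return the PEs whose A-D routes carry at least one of the
    VRF's import Route Targets."""
    return {r.originating_pe for r in ad_routes
            if r.route_targets & vrf_import_rts}

routes = [
    IntraASADRoute("192.0.2.1", "65000:100", frozenset({"RT:65000:100"})),
    IntraASADRoute("192.0.2.2", "65000:100", frozenset({"RT:65000:100"})),
    IntraASADRoute("192.0.2.9", "65000:200", frozenset({"RT:65000:200"})),
]
members = discover_members({"RT:65000:100"}, routes)
# members contains 192.0.2.1 and 192.0.2.2, but not 192.0.2.9
```

In practice the import filtering is performed by BGP itself, and sender and receiver sites sets may import different Route Targets, as described below.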
956 BGP-based autodiscovery uses a particular kind of MCAST-VPN route 957 known as an "auto-discovery route", or "A-D route". In particular, 958 it uses two kinds of "A-D routes", the "Intra-AS A-D Route" and the 959 "Inter-AS A-D Route". (There are also additional kinds of A-D 960 routes, such as the Source Active A-D routes, which are used for 961 purposes that go beyond auto-discovery. These are discussed in 962 subsequent sections.) 964 The Inter-AS A-D Route is used only when segmented inter-AS tunnels 965 are used, as specified in section 8. 967 The "Intra-AS A-D route" is originated by the PEs that are (directly) 968 connected to the site(s) of an MVPN. It is distributed to other PEs 969 that attach to sites of the MVPN. If segmented Inter-AS Tunnels are 970 used, then the Intra-AS A-D routes are not distributed outside the AS 971 where they originate; if segmented Inter-AS Tunnels are not used, 972 then the Intra-AS A-D routes are, despite their name, distributed to 973 all PEs attached to the VPN, no matter what AS the PEs are in. 975 The NLRI of an Intra-AS A-D route must contain the following 976 information: 978 - The route type (i.e., Intra-AS A-D route) 980 - The IP address of the originating PE 981 - An RD configured locally for the MVPN. This is an RD which can 982 be prepended to that IP address to form a globally unique VPN-IP 983 address of the PE. 985 The A-D route must also carry the following attributes: 987 - One or more Route Target attributes. If any other PE has one of 988 these Route Targets configured for import into a VRF, it treats 989 the advertising PE as a member in the MVPN to which the VRF 990 belongs. This allows each PE to discover the PEs that belong to a 991 given MVPN. More specifically, it allows a PE in the receiver 992 sites set to discover the PEs in the sender sites set of the MVPN 993 and the PEs in the sender sites set of the MVPN to discover the 994 PEs in the receiver sites set of the MVPN.
The PEs in the 995 receiver sites set would be configured to import the Route 996 Targets advertised in the BGP Auto-Discovery routes by PEs in the 997 sender sites set. The PEs in the sender sites set would be 998 configured to import the Route Targets advertised in the BGP 999 Auto-Discovery routes by PEs in the receiver sites set. 1001 - PMSI tunnel attribute. This attribute is present if and only if 1002 either MI-PMSI is to be used for the MVPN, or UI-PMSI is to be 1003 used for the MVPN on the PE that originates the intra-AS A-D 1004 route. It contains the following information: 1006 * whether the MI-PMSI is instantiated by 1008 + a BIDIR-PIM tree, 1010 + a set of PIM-SSM trees, 1012 + a set of PIM-SM trees 1014 + a set of RSVP-TE point-to-multipoint LSPs 1016 + a set of mLDP point-to-multipoint LSPs 1018 + an mLDP multipoint-to-multipoint LSP 1020 + a set of unicast tunnels 1022 + a set of unicast tunnels to the root of a shared tree (in 1023 this case the root must be identified) 1025 * If the PE wishes to set up a tunnel to instantiate the I-PMSI, 1026 a unique identifier for the tunnel used to instantiate the I- 1027 PMSI. This identifier depends on the tunnel technology used. 1029 All the PEs attaching to a given MVPN (within a given AS) 1030 must have been configured with the same PMSI tunnel attribute 1031 for that MVPN. They are also expected to know the 1032 encapsulation to use. 1034 Note that a tunnel can be identified at discovery time only 1035 if the tunnel already exists (e.g., it was constructed by 1036 means of configuration), or if it can be constructed without 1037 each PE knowing the identities of all the others. This is 1038 obviously the case when the tunnel is constructed by a 1039 receiver-initiated join technique such as PIM or mLDP. It is 1040 also the case when the tunnel is an RSVP-TE P2MP LSP as the 1041 tunnel identifier can be constructed without the head end 1042 learning the identities of the other PEs.
1044 In other cases, a tunnel cannot be identified until the PE 1045 has discovered one or more of the other PEs. In these cases, 1046 a PE will first send an A-D route without a tunnel 1047 identifier, and then will send another one with a tunnel 1048 identifier after discovering one or more of the other PEs. 1050 All the PEs attaching to a given MVPN must be configured with 1051 information specifying the encapsulation to use. 1053 * Whether the tunnel used to instantiate the I-PMSI for this 1054 MVPN is aggregating I-PMSIs from multiple MVPNs. This will 1055 affect the encapsulation used. If aggregation is to be used, 1056 a demultiplexor value to be carried by packets for this 1057 particular MVPN must also be specified. The demultiplexing 1058 mechanism and signaling procedures are described in section 1059 6. 1061 Further details of the use of this information are provided in 1062 subsequent sections. 1064 Sometimes it is necessary for one PE to advertise an upstream- 1065 assigned MPLS label that identifies another PE. Under certain 1066 circumstances to be discussed later, a PE which is the root of a 1067 multicast P-tunnel will bind an MPLS label value to one or more 1068 of the PEs that belong to the P-tunnel, and will distribute these 1069 label bindings using A-D routes. The precise details of this 1070 label distribution will be included in the next revision of this 1071 document. We will refer to these as "PE Labels". A packet 1072 traveling on the P-tunnel may carry one of these labels as an 1073 indication that the PE corresponding to that label is special. 1074 See section 11.3 for more details. 1076 5. PE-PE Transmission of C-Multicast Routing 1078 As a PE attached to a given MVPN receives C-Join/Prune messages from 1079 its CEs in that MVPN, it must convey the information contained in 1080 those messages to other PEs that are attached to the same MVPN. This 1081 is known as the "PE-PE transmission of C-multicast routing 1082 information". 
1084 This section specifies the procedures used for PE-PE transmission of 1085 C-multicast routing information. Not every procedure mentioned in 1086 section 3.4 is specified here. Rather, this section focuses on two 1087 particular procedures: 1089 - Full PIM Peering. 1091 This procedure is fully specified herein. 1093 - Use of BGP to distribute C-multicast routing 1095 This procedure is described herein, but the full specification 1096 appears in [MVPN-BGP]. 1098 Those aspects of the procedures that apply to both of the above are 1099 also specified fully herein. 1101 Specification of other procedures is for future study. 1103 5.1. Selecting the Upstream Multicast Hop (UMH) 1105 When a PE receives a C-Join/Prune message from a CE, the message 1106 identifies a particular multicast flow as belonging either to a 1107 source tree (S,G) or to a shared tree (*,G). Throughout this 1108 section, we use the term C-source to refer to S, in the case of a 1109 source tree, or to the Rendezvous Point (RP) for G, in the case of 1110 (*,G). If the route to the C-source is across the VPN backbone, then 1111 the PE needs to find the "upstream multicast hop" (UMH) for the (S,G) 1112 or (*,G) flow. The "upstream multicast hop" is either the PE at which 1113 (S,G) or (*,G) data packets enter the VPN backbone, or else is the 1114 Autonomous System Border Router (ASBR) at which those data packets 1115 enter the local AS when traveling through the VPN backbone. The 1116 process of finding the upstream multicast hop for a given C-source is 1117 known as "upstream multicast hop selection". 1119 5.1.1. Eligible Routes for UMH Selection 1121 In the simplest case, the PE does the upstream hop selection by 1122 looking up the C-source in the unicast VRF associated with the PE-CE 1123 interface over which the C-Join/Prune was received. The route that 1124 matches the C-source will contain the information needed to select 1125 the upstream multicast hop.
1127 However, in some cases, the CEs may be distributing to the PEs a 1128 special set of routes that are to be used exclusively for the purpose 1129 of upstream multicast hop selection, and not used for unicast routing 1130 at all. For example, when BGP is the CE-PE unicast routing protocol, 1131 the CEs may be using SAFI 2 to distribute a special set of routes 1132 that are to be used for, and only for, upstream multicast hop 1133 selection. When OSPF is the CE-PE routing protocol, the CE may use 1134 an MT-ID of 1 to distribute a special set of routes that are to be 1135 used for, and only for, upstream multicast hop selection. When a CE 1136 uses one of these mechanisms to distribute to a PE a special set of 1137 routes to be used exclusively for upstream multicast hop selection, 1138 these routes are distributed among the PEs using SAFI 129, as 1139 described in [MVPN-BGP]. 1141 Whether the routes used for upstream multicast hop selection are (a) 1142 the "ordinary" unicast routes or (b) a special set of routes that are 1143 used exclusively for upstream multicast hop selection, is a matter of 1144 policy. How that policy is chosen, deployed, or implemented is 1145 outside the scope of this document. In the following, we will simply 1146 refer to the set of routes that are used for upstream multicast hop 1147 selection as the "Eligible UMH routes", with no presumptions about 1148 the policy by which this set of routes was chosen. 1150 5.1.2. Information Carried by Eligible UMH Routes 1152 Every route which is eligible for UMH selection MUST carry a VRF 1153 Route Import Extended Community [MVPN-BGP]. This attribute 1154 identifies the PE that originated the route. 1156 If BGP is used for carrying C-multicast routes, OR if "Segmented 1157 Inter-AS Tunnels" (see section 8.2) are used, then every UMH route 1158 MUST also carry a Source AS Extended Community [MVPN-BGP].
1160 These two attributes are used in the upstream multicast hop selection 1161 procedures described below. 1163 5.1.3. Selecting the Upstream PE 1165 The first step in selecting the upstream multicast hop for a given C- 1166 source is to select the upstream PE router for that C-source. 1168 The PE that received the C-Join message from a CE looks in the VRF 1169 corresponding to the interface over which the C-Join was received. 1170 It finds the Eligible UMH route which is the best match for the C- 1171 source specified in that C-Join. Call this the "Installed UMH 1172 Route". 1174 Note that the outgoing interface of the Installed UMH Route may be 1175 one of the interfaces associated with the VRF, in which case the 1176 upstream multicast hop is a CE and the route to the C-source is not 1177 across the VPN backbone. 1179 Consider the set of all VPN-IP routes that are: (a) eligible to be 1180 imported into the VRF (as determined by their Route Targets), (b) 1181 eligible to be used for upstream multicast hop selection, and (c) 1182 have exactly the same IP prefix (not necessarily the same RD) as the 1183 installed UMH route. 1185 For each route in this set, determine the corresponding upstream PE 1186 and upstream RD. If a route has a VRF Route Import Extended 1187 Community, the route's upstream PE is determined from it. If a route 1188 does not have a VRF Route Import Extended Community, the route's 1189 upstream PE is determined from the route's BGP next hop attribute. 1190 In either case, the upstream RD is taken from the route's NLRI. 1192 This results in a set of pairs of <upstream PE, upstream RD>. 1194 Call this the "UMH Route Candidate Set." Then the PE MUST select a 1195 single route from the set to be the "Selected UMH Route". The 1196 corresponding upstream PE is known as the "Selected Upstream PE", and 1197 the corresponding upstream RD is known as the "Selected Upstream RD".
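The default and hash-based selection rules specified below can be sketched as follows, assuming IPv4 addresses (the function names and sample addresses are hypothetical):

```python
import ipaddress

def default_upstream_pe(candidates: list[str]) -> str:
    """Default rule: pick the candidate upstream PE whose address is
    numerically highest, treating it as a 32-bit unsigned integer."""
    return max(candidates, key=lambda a: int(ipaddress.IPv4Address(a)))

def hashed_upstream_pe(candidates: list[str],
                       c_source: str, c_group: str) -> str:
    """Alternative rule (disabled by default): number the candidates in
    ascending address order, XOR all bytes of the C-source and C-G
    addresses, and take the result modulo the number of candidates."""
    ordered = sorted(candidates, key=lambda a: int(ipaddress.IPv4Address(a)))
    acc = 0
    for b in (ipaddress.IPv4Address(c_source).packed
              + ipaddress.IPv4Address(c_group).packed):
        acc ^= b
    return ordered[acc % len(ordered)]

pes = ["192.0.2.1", "192.0.2.2"]
# The default selection is the numerically highest PE address.
# The hash selection can place (C-S, C-G1) and (C-S, C-G2) on
# different upstream PEs, giving the load balancing described below.
```

Since every PE computes the same function over the same stable routing information, all PEs arrive at the same choice for a given (C-source, C-G), outside of routing transients.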
1199 There are several possible procedures that can be used by a PE to 1200 select a single route from the candidate set. 1202 The default procedure, which MUST be implemented, is to select the 1203 route whose corresponding upstream PE address is numerically highest, 1204 where a 32-bit IP address is treated as a 32-bit unsigned integer. 1205 Call this the "default upstream PE selection". For a given C-source, 1206 provided that the routing information used to create the candidate 1207 set is stable, all PEs will have the same default upstream PE 1208 selection. (Though different default upstream PE selections may be 1209 chosen during a routing transient.) 1210 An alternative procedure, which MUST be implemented, but which is 1211 disabled by default, is the following. This procedure ensures that, 1212 except during a routing transient, each PE chooses the same upstream 1213 PE for a given combination of C-source and C-G. 1215 1. The PEs in the candidate set are numbered from lowest to highest 1216 IP address, starting from 0. 1218 2. The following hash is performed: 1220 - A bytewise exclusive-or of all the bytes in the C-source 1221 address and the C-G address is performed. 1223 - The result is taken modulo n, where n is the number of PEs 1224 in the candidate set. Call this result N. 1226 The selected upstream PE is then the one that appears in position N 1227 in the list of step 1. 1229 Other hashing algorithms are allowed as well, but not required. 1231 The alternative procedure allows a form of "equal cost load 1232 balancing". Suppose, for example, that from egress PEs PE3 and PE4, 1233 source C-S can be reached, at equal cost, via ingress PE PE1 or 1234 ingress PE PE2. The load balancing procedure makes it possible for 1235 PE1 to be the ingress PE for (C-S, C-G1) data traffic while PE2 is 1236 the ingress PE for (C-S, C-G2) data traffic. 1238 Another procedure, which SHOULD be implemented, is to use the 1239 Installed UMH Route as the Selected UMH Route.
If this procedure is 1240 used, the result is likely to be that a given PE will choose the 1241 upstream PE that is closest to it, according to the routing in the SP 1242 backbone. As a result, for a given C-source, different PEs may 1243 choose different upstream PEs. This is useful if the C-source is an 1244 anycast address, and can also be useful if the C-source is in a 1245 multihomed site (i.e., a site that is attached to multiple PEs). 1246 However, this procedure is more likely to lead to steady-state 1247 duplication of traffic unless (a) PEs discard data traffic which 1248 arrives from the "wrong" upstream PE, or (b) data traffic is carried 1249 only in non-aggregated S-PMSIs. This issue is discussed at length 1250 in section 9. 1252 General policy-based procedures for selecting the UMH route are 1253 allowed but not required, and are not further discussed in this 1254 specification. 1256 5.1.4. Selecting the Upstream Multicast Hop 1258 In certain cases, the selected upstream multicast hop is the same as 1259 the selected upstream PE. In other cases, the selected upstream 1260 multicast hop is the ASBR which is the "BGP next hop" of the Selected 1261 UMH Route. 1263 If the selected upstream PE is in the local AS, then the selected 1264 upstream PE is also the selected upstream multicast hop. This is the 1265 case if any of the following conditions holds: 1267 - The selected UMH route has a Source AS Extended Community, and 1268 the Source AS is the same as the local AS; or 1270 - The selected UMH route does not have a Source AS Extended 1271 Community, but the route's BGP next hop is the same as the 1272 upstream PE. 1274 Otherwise, the selected upstream multicast hop is an ASBR. The 1275 method of determining just which ASBR it is depends on the particular 1276 inter-AS signaling method being used (PIM or BGP), and on whether 1277 segmented or non-segmented inter-AS tunnels are used. These details 1278 are presented in later sections. 1280 5.2.
Details of Per-MVPN Full PIM Peering over MI-PMSI 1282 In this section, we assume that inter-AS MVPNs will be supported by 1283 means of non-segmented inter-AS trees. Support for segmented inter- 1284 AS trees with PIM peering is for further study. 1286 When an MVPN uses an MI-PMSI, the C-instances of that MVPN can treat 1287 the MI-PMSI as a LAN interface, and form full PIM adjacencies 1288 with each other over that "LAN interface". 1290 To form a full PIM adjacency, the PEs execute the PIM LAN procedures, 1291 including the generation and processing of PIM Hello, Join/Prune, 1292 Assert, DF election, and other PIM control packets. These are 1293 executed independently for each C-instance. PIM "join suppression" 1294 SHOULD be enabled. 1296 5.2.1. PIM C-Instance Control Packets 1298 All PIM C-Instance control packets of a particular MVPN are addressed 1299 to the ALL-PIM-ROUTERS (224.0.0.13) IP destination address, and 1300 transmitted over the MI-PMSI of that MVPN. While in transit in the 1301 P-network, the packets are encapsulated as required for the 1302 particular kind of tunnel that is being used to instantiate the MI- 1303 PMSI. Thus the C-instance control packets are not processed by the P 1304 routers, and MVPN-specific PIM routes can be extended from site to 1305 site without appearing in the P routers. 1307 As specified in section 5.1.2, when a PE distributes VPN-IP routes 1308 which are eligible for use as UMH routes, the PE MUST include a VRF 1309 Route Import Extended Community with each route. For a given MVPN, a 1310 single IP address MUST be used in these communities, and that same IP address MUST be 1311 used as the source address in all PIM control packets for that MVPN. 1313 5.2.2. PIM C-instance RPF Determination 1315 Although the MI-PMSI is treated by PIM as a LAN interface, unicast 1316 routing is NOT run over it, and there are no unicast routing 1317 adjacencies over it.
It is therefore necessary to specify special 1318 procedures for determining when the MI-PMSI is to be regarded as the 1319 "RPF Interface" for a particular C-address. 1321 The PE follows the procedures of section 5.1 to determine the 1322 selected UMH route. If that route is NOT a VPN-IP route learned from 1323 BGP as described in [RFC4364], or if that route's outgoing interface 1324 is one of the interfaces associated with the VRF, then ordinary PIM 1325 procedures for determining the RPF interface apply. 1327 However, if the selected UMH route is a VPN-IP route whose outgoing 1328 interface is not one of the interfaces associated with the VRF, then 1329 PIM will consider the RPF interface to be the MI-PMSI associated with 1330 the VPN-specific PIM instance. 1332 Once PIM has determined that the RPF interface for a particular C- 1333 source is the MI-PMSI, it is necessary for PIM to determine the "RPF 1334 neighbor" for that C-source. This will be one of the other PEs that 1335 is a PIM adjacency over the MI-PMSI. In particular, it will be the 1336 "selected upstream PE" as defined in section 5.1. 1338 5.2.3. Backwards Compatibility 1340 There are older implementations which do not use the VRF Route Import 1341 Extended Community or any explicit mechanism for carrying information 1342 to identify the originating PE of a selected UMH route. 1344 For backwards compatibility, when the selected UMH route does not 1345 have any such mechanism, the IP address from the "BGP Next Hop" field 1346 of the selected UMH route will be used as the selected UMH address, 1347 and will be treated as the address of the upstream PE. There is no 1348 selected upstream RD in this case. However, use of this backwards 1349 compatibility technique presupposes that: 1351 - The PE which originated the selected UMH route placed the same IP 1352 address in the BGP Next Hop field that it is using as the source 1353 address of the PE-PE PIM control packets for this MVPN. 
1355 - The MVPN is not an Inter-AS MVPN that uses option b from section 1356 10 of [RFC4364]. 1358 Should either of these conditions fail, interoperability with the 1359 older implementations will not be achieved. 1361 5.3. Use of BGP for Carrying C-Multicast Routing 1363 It is possible to use BGP to carry C-multicast routing information 1364 from PE to PE, dispensing entirely with the transmission of C- 1365 Join/Prune messages from PE to PE. This section describes the 1366 procedures for carrying intra-AS multicast routing information. 1367 Inter-AS procedures are described in section 8. The complete 1368 specification of both sets of procedures and of the encodings can be 1369 found in [MVPN-BGP]. 1371 5.3.1. Sending BGP Updates 1373 The MCAST-VPN address family is used for this purpose. MCAST-VPN 1374 routes used for the purpose of carrying C-multicast routing 1375 information are distinguished from those used for the purpose of 1376 carrying auto-discovery information by means of a "route type" field 1377 which is encoded into the NLRI. The following information is 1378 required in BGP to advertise the MVPN routing information. The NLRI 1379 contains: 1381 - The type of C-multicast route. 1383 There are two types: 1385 * source tree join 1387 * shared tree join 1389 - The RD configured for the MVPN on the PE that is advertising 1390 the information. The RD is required in order to uniquely 1391 identify the <C-Source, C-Group> when different MVPNs have 1392 overlapping address spaces. 1394 - The C-Group address. 1396 - The C-Source address. 1398 This field is omitted if the route type is "shared tree join". 1399 In the case of a shared tree join, the C-source is a C-RP. The 1400 address of the C-RP corresponding to the C-group address is 1401 presumed to be already known (or automatically determinable) by 1402 the other PEs, through means that are outside the scope of this 1403 specification.
1405 - The Selected Upstream RD corresponding to the C-source address 1406 (determined by the procedures of section 5.1). 1408 Whenever a C-multicast route is sent, it must also carry the Selected 1409 Upstream Multicast Hop corresponding to the C-source address 1410 (determined by the procedures of section 5.1). The selected upstream 1411 multicast hop must be encoded as part of a Route Target Extended 1412 Community, to facilitate the optional use of filters which can 1413 prevent the distribution of the update to BGP speakers other than the 1414 upstream multicast hop. See section 10.1.3 of [MVPN-BGP] for the 1415 details. 1417 There is no C-multicast route corresponding to the PIM function of 1418 pruning a source off the shared tree when a PE switches from a <C-*, C-G> tree to a <C-S, C-G> tree. Section 9 of this document specifies 1420 a mandatory procedure that ensures that if any PE joins a <C-S, C-G> 1421 source tree, all other PEs that have joined or will join the <C-*, C-G> shared tree will also join the <C-S, C-G> source tree. This 1423 eliminates the need for a C-multicast route that prunes C-S off the 1424 shared tree when switching from the <C-*, C-G> tree to the <C-S, C-G> 1425 tree. 1427 5.3.2. Explicit Tracking 1429 Note that the upstream multicast hop is NOT part of the NLRI in the 1430 C-multicast BGP routes. This means that if several PEs join the same 1431 C-tree, the BGP routes they distribute to do so are regarded by BGP 1432 as comparable routes, and only one will be installed. If a route 1433 reflector is being used, this further means that the PE which is used 1434 to reach the C-source will know only that one or more of the other 1435 PEs have joined the tree, but it will not know which ones. That is, this 1436 BGP update mechanism does not provide "explicit tracking". Explicit 1437 tracking is not provided by default because it increases the amount 1438 of state needed and thus decreases scalability.
Also, as 1439 constructing the C-PIM messages to send "upstream" for a given tree 1440 does not depend on knowing all the PEs that are downstream on that 1441 tree, there is no reason for the C-multicast route type updates to 1442 provide explicit tracking. 1444 There are some cases in which explicit tracking is necessary in order 1445 for the PEs to set up certain kinds of P-trees. There are other 1446 cases in which explicit tracking is desirable in order to determine 1447 how to optimally aggregate multicast flows onto a given aggregate 1448 tree. As these functions have to do with the setting up of 1449 infrastructure in the P-network, rather than with the dissemination 1450 of C-multicast routing information, any explicit tracking that is 1451 necessary is handled by sending the "source active" A-D routes, which 1452 are described in sections 9 and 10. Detailed procedures for turning 1453 on explicit tracking can be found in [MVPN-BGP]. 1455 5.3.3. Withdrawing BGP Updates 1457 A PE removes itself from a C-multicast tree (shared or source) by 1458 withdrawing the corresponding BGP update. 1460 If a PE has pruned a C-source from a shared C-multicast tree, and it 1461 needs to "unprune" that source from that tree, it does so by 1462 withdrawing the route that pruned the source from the tree. 1464 6. I-PMSI Instantiation 1466 This section describes how tunnels in the SP network can be used to 1467 instantiate an I-PMSI for an MVPN on a PE. When C-multicast data is 1468 delivered on an I-PMSI, the data will go to all PEs that are on the 1469 path to receivers for that C-group, but may also go to PEs that are 1470 not on the path to receivers for that C-group. 1472 The tunnels which instantiate I-PMSIs can be either PE-PE unicast 1473 tunnels or P-multicast trees. When PE-PE unicast tunnels are used, the 1474 PMSI is said to be instantiated using ingress replication.
The 1475 instantiation of a tunnel for an I-PMSI is a local policy 1476 decision and is not mandatory. Even for a site attached to multicast 1477 sources, transport of customer multicast traffic can be accommodated 1478 with S-PMSI-bound tunnels only. 1480 6.1. MVPN Membership and Egress PE Auto-Discovery 1482 As described in section 4, a PE discovers the MVPN membership 1483 information of other PEs using BGP auto-discovery mechanisms or using 1484 a mechanism that instantiates an MI-PMSI interface. When a PE supports 1485 only a UI-PMSI service for an MVPN, it MUST rely on the BGP auto- 1486 discovery mechanisms for discovering this information. This 1487 information also results in a PE in the sender sites set discovering 1488 the leaves of the P-multicast tree, which are the egress PEs that 1489 have sites in the receiver sites set in one or more MVPNs mapped onto 1490 the tree. 1492 6.1.1. Auto-Discovery for Ingress Replication 1494 In order for a PE to use Unicast Tunnels to send a C-multicast data 1495 packet for a particular MVPN to a set of remote PEs, the remote PEs 1496 must be able to correctly decapsulate such packets and to assign each 1497 one to the proper MVPN. This requires that the encapsulation used for 1498 sending packets through the tunnel have demultiplexing information 1499 which the receiver can associate with a particular MVPN. 1501 If ingress replication is being used for an MVPN, the PEs announce 1502 this as part of the BGP based MVPN membership auto-discovery process, 1503 described in section 4. The PMSI tunnel attribute specifies ingress 1504 replication. The demultiplexing value is a downstream-assigned MPLS 1505 label (i.e., assigned by the PE that originated the A-D route, to be 1506 used by other PEs when they send multicast packets on a unicast 1507 tunnel to that PE). 1509 Other demultiplexing procedures for unicast are under consideration. 1511 6.1.2.
Auto-Discovery for P-Multicast Trees 1513 A PE announces the P-multicast technology it supports for a specified 1514 MVPN, as part of the BGP MVPN membership discovery. This allows other 1515 PEs to determine the P-multicast technology they can use for building 1516 P-multicast trees to instantiate an I-PMSI. If a PE has a tree 1517 instantiation of an I-PMSI, it also announces the tree identifier as 1518 part of the auto-discovery, as well as announcing its aggregation 1519 capability. 1521 The announcement of a tree identifier at discovery time is only 1522 possible if the tree already exists (e.g., a preconfigured "traffic 1523 engineered" tunnel), or if the tree can be constructed dynamically 1524 without any PE having to know in advance all the other PEs on the 1525 tree (e.g., the tree is created by receiver-initiated joins). 1527 6.2. C-Multicast Routing Information Exchange 1529 When a PE does not support the use of an MI-PMSI for a given MVPN, it 1530 MUST either unicast MVPN routing information using PIM or else use 1531 BGP for exchanging the MVPN routing information. 1533 6.3. Aggregation 1535 A P-multicast tree can be used to instantiate a PMSI service for only 1536 one MVPN or for more than one MVPN. When a P-multicast tree is shared 1537 across multiple MVPNs, it is termed an "Aggregate Tree". The 1538 procedures described in this document allow a single SP multicast 1539 tree to be shared across multiple MVPNs. The procedures that are 1540 specific to aggregation are optional and are explicitly pointed out. 1541 Unless otherwise specified, a P-multicast tree technology supports 1542 aggregation. 1544 Aggregate Trees allow a single P-multicast tree to be used across 1545 multiple MVPNs, and hence state in the SP core grows per-set-of-MVPNs 1546 and not per MVPN. Depending on the congruence of the aggregated 1547 MVPNs, this may result in trading off optimality of multicast 1548 routing.
1550 An Aggregate Tree can be used by a PE to provide a UI-PMSI or MI- 1551 PMSI service for more than one MVPN. When this is the case, the 1552 Aggregate Tree is said to have an inclusive mapping. 1554 6.3.1. Aggregate Tree Leaf Discovery 1556 BGP MVPN membership discovery allows a PE to determine the different 1557 Aggregate Trees that it should create and the MVPNs that should be 1558 mapped onto each such tree. The leaves of an Aggregate Tree are 1559 determined by the PEs that support aggregation and belong to the 1560 MVPNs that are mapped onto the tree. 1562 If an Aggregate Tree is used to instantiate one or more S-PMSIs, then 1563 it may be desirable for the PE at the root of the tree to know which 1564 PEs (in its MVPN) are receivers on that tree. This enables the PE to 1565 decide when to aggregate two S-PMSIs, based on congruence (as 1566 discussed in the next section). Thus explicit tracking may be 1567 required. Since the procedures for disseminating C-multicast routes 1568 do not provide explicit tracking, a type of A-D route known as a 1569 "Leaf A-D Route" is used. The PE which wants to assign a particular 1570 C-multicast flow to a particular Aggregate Tree can send an A-D route 1571 which elicits Leaf A-D routes from the PEs that need to receive that 1572 C-multicast flow. This provides the explicit tracking information 1573 needed to support the aggregation methodology discussed in the next 1574 section. For more details on Leaf A-D routes, please refer to [MVPN- 1575 BGP]. 1577 6.3.2. Aggregation Methodology 1579 This document does not specify the mandatory implementation of any 1580 particular set of rules for determining whether or not the PMSIs of 1581 two particular MVPNs are to be instantiated by the same Aggregate 1582 Tree. This determination can be made by implementation-specific 1583 heuristics, by configuration, or even perhaps by the use of offline 1584 tools.
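As one hypothetical example of such an implementation-specific heuristic (nothing here is mandated by this document; the function names and the default threshold are illustrative only), two MVPNs might be aggregated onto one tree only when the fraction of PEs they have in common is high enough:

```python
# Hypothetical aggregation heuristic: two MVPNs share an Aggregate Tree only
# when the fraction of common PEs in their combined membership meets a
# configured threshold.  This is an illustrative knob, not a normative rule.

def congruence(pes_a: set, pes_b: set) -> float:
    # Fraction of the combined membership common to both MVPNs;
    # 1.0 means perfectly congruent (identical memberships).
    union = pes_a | pes_b
    return len(pes_a & pes_b) / len(union) if union else 1.0

def may_aggregate(pes_a: set, pes_b: set, threshold: float = 0.8) -> bool:
    # Aggregate only when the overlap is high enough; a lower threshold
    # trades more unwanted traffic at egress PEs for less P-core state.
    return congruence(pes_a, pes_b) >= threshold
```

A lower threshold reduces P-router state at the cost of delivering traffic to PEs with no receivers, which is exactly the tradeoff discussed in the next subsection.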
1586 It is the intention of this document that the control procedures will 1587 always result in all the PEs of an MVPN agreeing on the PMSIs which 1588 are to be used and on the tunnels used to instantiate those PMSIs. 1590 This section discusses potential methodologies with respect to 1591 aggregation. 1593 The "congruence" of aggregation is defined by the amount of overlap 1594 in the leaves of the customer trees that are aggregated on an SP tree. 1595 For Aggregate Trees with an inclusive mapping, the congruence depends 1596 on the overlap in the membership of the MVPNs that are aggregated on 1597 the tree. If there is complete overlap, i.e., all MVPNs have exactly 1598 the same sites, aggregation is perfectly congruent. As the overlap 1599 between the MVPNs that are aggregated reduces, i.e., the number of 1600 sites that are common across all the MVPNs reduces, the congruence 1601 reduces. 1603 If aggregation is done such that it is not perfectly congruent, a PE 1604 may receive traffic for MVPNs to which it does not belong. As the 1605 amount of multicast traffic in these unwanted MVPNs increases, 1606 aggregation becomes less optimal with respect to delivered traffic. 1607 Hence there is a tradeoff between reducing state and delivering 1608 unwanted traffic. 1610 An implementation should provide knobs to control the congruence of 1611 aggregation. These knobs are implementation dependent. Configuring 1612 the percentage of sites that MVPNs must have in common to be 1613 aggregated is an example of such a knob. This will allow an SP to 1614 deploy aggregation depending on the MVPN membership and traffic 1615 profiles in its network. If different PEs or servers are setting up 1616 Aggregate Trees, this will also allow a service provider to engineer 1617 the maximum number of unwanted MVPNs that a particular PE may receive 1618 traffic for. 1620 6.3.3.
Encapsulation of the Aggregate Tree 1622 An Aggregate Tree may use an IP/GRE encapsulation or an MPLS 1623 encapsulation. The protocol type in the IP/GRE header in the former 1624 case and the protocol type in the data link header in the latter need 1625 further explanation. This will be specified in a separate document. 1627 6.3.4. Demultiplexing C-multicast traffic 1629 When multiple MVPNs are aggregated onto one P-Multicast tree, 1630 determining the tree over which the packet is received is not 1631 sufficient to determine the MVPN to which the packet belongs. The 1632 packet must also carry some demultiplexing information to allow the 1633 egress PEs to determine the MVPN to which the packet belongs. Since 1634 the packet has been multicast through the P network, any given 1635 demultiplexing value must have the same meaning to all the egress 1636 PEs. The demultiplexing value is an MPLS label that corresponds to 1637 the multicast VRF to which the packet belongs. This label is placed 1638 by the ingress PE immediately beneath the P-Multicast tree header. 1639 Each of the egress PEs must be able to associate this MPLS label with 1640 the same MVPN. If downstream label assignment were used, this would 1641 require all the egress PEs in the MVPN to agree on a common label for 1642 the MVPN. Instead, the MPLS label is upstream-assigned [MPLS-UPSTREAM- 1643 LABEL]. The label bindings are advertised via BGP updates originated by 1644 the ingress PEs. 1646 This procedure requires each egress PE to support a separate label 1647 space for every other PE. The egress PEs create a forwarding entry 1648 for the upstream assigned MPLS label, allocated by the ingress PE, in 1649 this label space. Hence, when the egress PE receives a packet over an 1650 Aggregate Tree, it first determines the tree that the packet was 1651 received over. The tree identifier determines the label space in 1652 which the upstream assigned MPLS label lookup has to be performed.
1653 The same label space may be used for all P-multicast trees rooted at 1654 the same ingress PE, or an implementation may decide to use a 1655 separate label space for every P-multicast tree. 1657 The support of aggregation for shared trees and MP2MP trees is 1658 discussed in section 6.6. 1660 The encapsulation format is either MPLS or MPLS-in-something (e.g., 1661 MPLS-in-GRE [MPLS-IP]). When MPLS is used, this label will appear 1662 immediately below the label that identifies the P-multicast tree. 1663 When MPLS-in-GRE is used, this label will be the top MPLS label that 1664 appears when the GRE header is stripped off. 1666 When IP encapsulation is used for the P-multicast Tree, whatever 1667 information that particular encapsulation format uses for identifying 1668 a particular tunnel is used to determine the label space in which the 1669 MPLS label is looked up. 1671 If the P-multicast tree uses MPLS encapsulation, the P-multicast tree 1672 is itself identified by an MPLS label. The egress PE MUST NOT 1673 advertise IMPLICIT NULL or EXPLICIT NULL for that tree. Once the 1674 label representing the tree is popped off the MPLS label stack, the 1675 next label is the demultiplexing information that allows the proper 1676 MVPN to be determined. 1678 This specification requires that, to support this sort of 1679 aggregation, there be at least one upstream-assigned label per MVPN. 1680 It does not require that there be only one. For example, an ingress 1681 PE could assign a unique label to each C-(S,G). (This could be done 1682 using the same technique that is used to assign a particular C-(S,G) 1683 to an S-PMSI; see section 7.3.)
In order to do this, the 1690 egress PE needs to determine the tunnel that the packet was received 1691 on. The PE can then determine the MVPN that the packet belongs to and, 1692 if needed, perform any further lookups required to forward the 1693 packet. 1695 6.4.1. Unicast Tunnels 1697 When ingress replication is used, the MVPN to which the received C- 1698 multicast data packet belongs can be determined by the MPLS label 1699 that was allocated by the egress PE. This label is distributed by the 1700 egress PE. 1702 6.4.2. Non-Aggregated P-Multicast Trees 1704 If a P-multicast tree is associated with only one MVPN, determining 1705 the P-multicast tree on which a packet was received is sufficient to 1706 determine the packet's MVPN. All that the egress PE needs to know is 1707 the MVPN the P-multicast tree is associated with. 1709 There are different ways in which the egress PE can learn this 1710 association: 1712 a) Configuration. The P-multicast tree that a particular MVPN 1713 belongs to is configured on each PE. 1715 b) BGP based advertisement of the P-multicast tree - MVPN mapping 1716 after the root of the tree discovers the leaves of the tree. 1717 The root of the tree sets up the tree after discovering each of 1718 the PEs that belong to the MVPN. It then advertises the P- 1719 multicast tree - MVPN mapping to each of the leaves. This 1720 mechanism can be used with both source initiated trees (e.g., 1721 RSVP-TE P2MP LSPs) and receiver initiated trees (e.g., PIM 1722 trees). 1724 c) BGP based advertisement of the P-multicast tree - MVPN mapping 1725 as part of the MVPN membership discovery. The root of the tree 1726 advertises, to each of the other PEs that belong to the MVPN, 1727 the P-multicast tree that the MVPN is associated with. This 1728 implies that the root does not need to know the leaves of the 1729 tree beforehand. This is possible only for receiver initiated 1730 trees, e.g., PIM based trees.
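The lookup chain of section 6.3.4, where the tree a packet arrives on selects a label space and the upstream-assigned MPLS label is then looked up in that space, can be modeled non-normatively as follows; the table layout and names are hypothetical:

```python
# Non-normative model of the egress PE lookup of section 6.3.4: one label
# space per tree (a per-ingress-PE space would work the same way), populated
# from the label bindings the ingress PEs advertise via BGP.

label_spaces: dict = {}   # tree identifier -> {upstream-assigned label -> MVPN}

def bind(tree_id: str, upstream_label: int, mvpn: str) -> None:
    # Install a binding learned from the ingress PE's BGP advertisement.
    label_spaces.setdefault(tree_id, {})[upstream_label] = mvpn

def demux(tree_id: str, upstream_label: int) -> str:
    # Egress PE: first determine the tree the packet was received over,
    # then look the upstream-assigned label up in that tree's label space.
    return label_spaces[tree_id][upstream_label]
```

Because each tree selects its own label space, the same label value can safely identify different MVPNs on different trees, which is why upstream assignment does not require network-wide label agreement.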
1732 Both (b) and (c) above require the BGP based advertisement to contain the 1733 P-multicast tree identifier. This identifier is encoded as a BGP 1734 attribute and contains the following elements: 1736 - Tunnel Type. 1738 - Tunnel identifier. The semantics of the identifier is determined 1739 by the tunnel type. 1741 6.4.3. Aggregate P-Multicast Trees 1743 Once a PE sets up an Aggregate Tree, it needs to announce the C- 1744 multicast groups being mapped to this tree to other PEs in the 1745 network. This procedure is referred to as Aggregate Tree discovery. 1746 For an Aggregate Tree with an inclusive mapping, this discovery 1747 implies announcing: 1749 - The mapping of all MVPNs mapped to the Tree. 1751 - For each MVPN mapped onto the tree, the inner label allocated for 1752 it by the ingress PE. The use of this label is explained in the 1753 demultiplexing procedures of section 6.3.4. 1755 - The P-multicast tree Identifier. 1757 The egress PE creates a logical interface corresponding to the tree 1758 identifier. This interface is the RPF interface for all the <C-S, C-G> entries mapped to that tree. 1761 When PIM is used to set up P-multicast trees, the egress PE also Joins 1762 the P-Group Address corresponding to the tree. This results in the setup 1763 of the PIM P-multicast tree. 1765 6.5. I-PMSI Instantiation Using Ingress Replication 1767 As described in section 3, a PMSI can be instantiated using Unicast 1768 Tunnels between the PEs that are participating in the MVPN. In this 1769 mechanism, the ingress PE replicates a C-multicast data packet 1770 belonging to a particular MVPN and sends a copy to all or a subset of 1771 the PEs that belong to the MVPN. A copy of the packet is tunneled to 1772 a remote PE over a Unicast Tunnel to the remote PE. IP/GRE Tunnels 1773 or MPLS LSPs are examples of unicast tunnels that may be used. Note 1774 that the same Unicast Tunnel can be used to transport packets 1775 belonging to different MVPNs.
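The replication step can be sketched minimally as follows, assuming each remote PE has advertised a downstream-assigned label for the MVPN as in section 6.1.1; the data shapes and names are illustrative, not part of the specification:

```python
# Illustrative sketch of ingress replication (section 6.5): the ingress PE
# emits one copy of the C-multicast packet per remote PE, tagged with the
# downstream-assigned MPLS label that PE advertised for the MVPN, for
# transmission over a unicast tunnel to that PE.

from typing import Dict, List, Tuple

def ingress_replicate(packet: bytes,
                      egress_labels: Dict[str, int]) -> List[Tuple[str, int, bytes]]:
    # egress_labels maps each remote PE address to the label it advertised;
    # each returned triple is one copy to be sent over a unicast tunnel.
    return [(pe, label, packet) for pe, label in egress_labels.items()]
```

The bandwidth cost is visible directly: the ingress PE transmits one copy per remote PE, which is why moving to S-PMSIs (section 7.1) is preferred when only a few PEs have receivers.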
1777 Ingress replication can be used to instantiate a UI-PMSI. The PE sets 1778 up unicast tunnels to each of the remote PEs that support ingress 1779 replication. For a given MVPN, all C-multicast data packets are sent 1780 to each of the remote PEs in the MVPN that support ingress 1781 replication. Hence, a remote PE may receive C-multicast data packets 1782 for a group even if it does not have any receivers in that group. 1784 Ingress replication can also be used to instantiate an MI-PMSI. In 1785 this case, each PE has a mesh of unicast tunnels to every other PE in 1786 that MVPN. 1788 However, when ingress replication is used, it is recommended that only 1789 S-PMSIs be used. Instantiation of S-PMSIs with ingress replication is 1790 described in section 7.1. Note that this requires the use of 1791 explicit tracking, i.e., a PE must know which of the other PEs have 1792 receivers for each C-multicast tree. 1794 6.6. Establishing P-Multicast Trees 1796 It is believed that the architecture outlined in this document places 1797 no limitations on the protocols used to instantiate P-multicast 1798 trees. However, the only protocols being explicitly considered are 1799 PIM-SM, PIM-SSM, BIDIR-PIM, RSVP-TE, and mLDP. 1801 A P-multicast tree can be either a source tree or a shared tree. A 1802 source tree is used to carry traffic only for the multicast VRFs that 1803 exist locally on the root of the tree, i.e., for which the root has 1804 local CEs. The root is a PE router. Source P-multicast trees can be 1805 instantiated using PIM-SM, PIM-SSM, RSVP-TE P2MP LSPs, and mLDP P2MP 1806 LSPs. 1808 A shared tree, on the other hand, can be used to carry traffic 1809 belonging to VRFs that exist on other PEs as well. The root of a 1810 shared tree is not necessarily one of the PEs in the MVPN.
All PEs 1811 that use the shared tree will send MVPN data packets to the root of 1812 the shared tree; if PIM is being used as the control protocol, PIM 1813 control packets also get sent to the root of the shared tree. This 1814 may require a unicast tunnel between each of these PEs and the root. 1815 The root will then send them on the shared tree, and all the PEs that 1816 are leaves of the shared tree will receive the packets. For example, an 1817 RP based PIM-SM tree would be a shared tree. Shared trees can be 1818 instantiated using PIM-SM, PIM-SSM, BIDIR-PIM, RSVP-TE P2MP LSPs, 1819 mLDP P2MP LSPs, and mLDP MP2MP LSPs. Aggregation support for 1820 bidirectional P-trees (i.e., BIDIR-PIM trees or mLDP MP2MP trees) is 1821 for further study. Shared trees require all the PEs to discover the 1822 root of the shared tree for an MVPN. To achieve this, the root of a 1823 shared tree advertises as part of the BGP based MVPN membership 1824 discovery: 1826 - The capability to set up a shared tree for a specified MVPN. 1828 - A downstream assigned label that is to be used by each PE to 1829 encapsulate an MVPN data packet when it sends this packet to the 1830 root of the shared tree. 1832 - A downstream assigned label that is to be used by each PE to 1833 encapsulate an MVPN control packet when it sends this packet to 1834 the root of the shared tree. 1836 Both a source tree and a shared tree can be used to instantiate an I- 1837 PMSI. If a source tree is used to instantiate a UI-PMSI for an MVPN, 1838 all the other PEs that belong to the MVPN must be leaves of the 1839 source tree. If a shared tree is used to instantiate a UI-PMSI for an 1840 MVPN, all the PEs that are members of the MVPN must be leaves of the 1841 shared tree.
Procedures in [RSVP- 1848 P2MP] are used to signal the LSP. The LSP is signaled after the root 1849 of the LSP discovers the leaves. The egress PEs are discovered using 1850 the MVPN membership procedures described in section 4. RSVP-TE P2MP 1851 LSPs can optionally support aggregation. 1853 6.7.1. P2MP TE LSP Tunnel - MVPN Mapping 1855 P2MP TE LSP Tunnel to MVPN mapping can be learned at the egress PEs 1856 using either option (a) or option (b) described in section 6.4.2. 1857 Option (b), i.e., BGP based advertisement of the P2MP TE LSP Tunnel - 1858 MVPN mapping, requires that the root of the tree include the P2MP TE 1859 LSP Tunnel identifier as the tunnel identifier in the BGP 1860 advertisements. This identifier contains the following information 1861 elements: 1863 - The type of the tunnel is set to RSVP-TE P2MP Tunnel. 1865 - RSVP-TE P2MP Tunnel's SESSION Object. 1867 - Optionally, the RSVP-TE P2MP LSP's SENDER_TEMPLATE Object. This object 1868 is included when it is desired to identify a particular P2MP TE 1869 LSP. 1871 6.7.2. Demultiplexing C-Multicast Data Packets 1873 Demultiplexing the C-multicast data packets at the egress PE follows 1874 the procedures described in section 6.3.4. The RSVP-TE P2MP LSP Tunnel 1875 must be signaled with penultimate-hop-popping (PHP) off. Signaling 1876 the P2MP TE LSP Tunnel with PHP off requires an extension to RSVP-TE 1877 which will be described later. 1879 7. Optimizing Multicast Distribution via S-PMSIs 1881 Whenever a particular multicast stream is being sent on an I-PMSI, it 1882 is likely that the data of that stream is being sent to PEs that do 1883 not require it. If a particular stream has a significant amount of 1884 traffic, it may be beneficial to move it to an S-PMSI which includes 1885 only those PEs that are transmitters and/or receivers (or at least 1886 includes fewer PEs that are neither). 1888 If explicit tracking is being done, S-PMSI creation can also be 1889 triggered on other criteria.
For instance, there could be a "pseudo 1890 wasted bandwidth" criterion: switching to an S-PMSI would be done if 1891 the bandwidth multiplied by the number of uninterested PEs (PEs that 1892 are receiving the stream but have no receivers) is above a specified 1893 threshold. The motivation is that (a) the total bandwidth wasted by 1894 many sparsely subscribed low-bandwidth groups may be large, and (b) 1895 there's no point in moving a high-bandwidth group to an S-PMSI if all 1896 the PEs have receivers for it. 1898 Switching a (C-S, C-G) stream to an S-PMSI may require the root of 1899 the S-PMSI to determine the egress PEs that need to receive the (C-S, 1900 C-G) traffic. This is true in the following cases: 1902 - If the tunnel is a source initiated tree, such as an RSVP-TE P2MP 1903 Tunnel, the PE needs to know the leaves of the tree before it can 1904 instantiate the S-PMSI. 1906 - If a PE instantiates multiple S-PMSIs, belonging to different 1907 MVPNs, using one P-multicast tree, such a tree is termed an 1908 Aggregate Tree with a selective mapping. The setting up of such 1909 an Aggregate Tree requires the ingress PE to know all the other 1910 PEs that have receivers for multicast groups that are mapped onto 1911 the tree. 1913 The above two cases require that explicit tracking be done for the 1914 (C-S, C-G) stream. The root of the S-PMSI MAY decide to do explicit 1915 tracking of this stream only after it has determined to move the 1916 stream to an S-PMSI, or it MAY have been doing explicit tracking all 1917 along. 1919 If the S-PMSI is instantiated by a P-multicast tree, the PE at the 1920 root of the tree must signal the leaves of the tree that the (C-S, C- 1921 G) stream is now bound to the S-PMSI. Note that the PE could 1922 create the identity of the P-multicast tree prior to the actual 1923 instantiation of the tunnel.
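The "pseudo wasted bandwidth" criterion described above can be sketched as follows. This is an illustrative model only; the function name, the PE-set representation, and the threshold parameter are not part of this specification:

```python
def should_switch_to_spmsi(bandwidth_bps, receiving_pes, interested_pes,
                           wasted_threshold_bps):
    """Hypothetical 'pseudo wasted bandwidth' test: switch a (C-S, C-G)
    stream to an S-PMSI when the stream bandwidth multiplied by the
    number of uninterested PEs (PEs receiving the stream but having no
    receivers for it) exceeds a configured threshold."""
    uninterested = receiving_pes - interested_pes
    # If every PE has receivers, the product is zero and no switch
    # occurs, however large the stream: an S-PMSI would save nothing.
    return bandwidth_bps * len(uninterested) > wasted_threshold_bps
```

This captures both motivations from the text: many sparsely subscribed low-bandwidth groups can each exceed the threshold, while a high-bandwidth group with receivers behind every PE never does.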
1925 If the S-PMSI is instantiated by a source-initiated P-multicast tree 1926 (e.g., an RSVP-TE P2MP tunnel), the PE at the root of the tree must 1927 establish the source-initiated P-multicast tree to the leaves. This 1928 tree MAY have been established before the leaves receive the S-PMSI 1929 binding, or MAY be established after the leaves receive the binding. 1930 The leaves MUST NOT switch to the S-PMSI until they receive both the 1931 binding and the tree signaling message. 1933 7.1. S-PMSI Instantiation Using Ingress Replication 1935 As described in section 6.1.1, ingress replication can be used to 1936 instantiate a UI-PMSI. However, this can result in a PE receiving 1937 packets for a multicast group for which it doesn't have any 1938 receivers. This can be avoided if the ingress PE tracks the remote 1939 PEs which have receivers in a particular C-multicast group. In order 1940 to do this, it needs to receive C-Joins from each of the remote PEs. 1941 It then replicates the C-multicast data packet and sends it to only 1942 those egress PEs which are on the path to a receiver of that C-group. 1943 It is possible that each PE that is using ingress replication 1944 instantiates only S-PMSIs. It is also possible that some PEs 1945 instantiate UI-PMSIs while others instantiate only S-PMSIs. In both 1946 these cases, the PE MUST either unicast MVPN routing information using 1947 PIM or use BGP for exchanging the MVPN routing information. This is 1948 because there may be no MI-PMSI available for it to exchange MVPN 1949 routing information. 1951 Note that the use of ingress replication doesn't require any extra 1952 procedures for signaling the binding of the S-PMSI from the ingress 1953 PE to the egress PEs. The procedures described for I-PMSIs are 1954 sufficient. 1956 7.2. Protocol for Switching to S-PMSIs 1958 We describe two protocols for switching to S-PMSIs.
These protocols 1959 can be used when the tunnel that instantiates the S-PMSI is a P- 1960 multicast tree. 1962 7.2.1. A UDP-based Protocol for Switching to S-PMSIs 1964 This procedure can be used for any MVPN which has an MI-PMSI. 1965 Traffic from all multicast streams in a given MVPN is sent, by 1966 default, on the MI-PMSI. Consider a single multicast stream within a 1967 given MVPN, and consider a PE which is attached to a source of 1968 multicast traffic for that stream. The PE can be configured to move 1969 the stream from the MI-PMSI to an S-PMSI if certain configurable 1970 conditions are met. To do this, it needs to inform all the PEs which 1971 attach to receivers for the stream. These PEs need to start listening 1972 for traffic on the S-PMSI, and the transmitting PE may start sending 1973 traffic on the S-PMSI when it is reasonably certain that all 1974 receiving PEs are listening on the S-PMSI. 1976 7.2.1.1. Binding a Stream to an S-PMSI 1978 When a PE which attaches to a transmitter for a particular multicast 1979 stream notices that the conditions for moving the stream to an S-PMSI 1980 are met, it begins to periodically send an "S-PMSI Join Message" on 1981 the MI-PMSI. The S-PMSI Join is a UDP-encapsulated message whose 1982 destination address is ALL-PIM-ROUTERS (224.0.0.13), and whose 1983 destination port is 3232. 1985 The S-PMSI Join Message contains the following information: 1987 - An identifier for the particular multicast stream which is to be 1988 bound to the S-PMSI. This can be represented as an (S,G) pair. 1990 - An identifier for the particular S-PMSI to which the stream is to 1991 be bound. This identifier is a structured field which includes 1992 the following information: 1994 * The type of tunnel used to instantiate the S-PMSI 1996 * An identifier for the tunnel. The form of the identifier 1997 will depend upon the tunnel type.
The combination of tunnel 1998 identifier and tunnel type should contain enough information 1999 to enable all the PEs to "join" the tunnel and receive 2000 messages from it. 2002 * Any demultiplexing information needed by the tunnel 2003 encapsulation protocol to identify the particular S-PMSI. 2004 This allows a single tunnel to aggregate multiple S-PMSIs. 2005 If a particular tunnel is not aggregating multiple S-PMSIs, 2006 then no demultiplexing information is needed. 2008 A PE router which is not connected to a receiver will still receive 2009 the S-PMSI Joins, and MAY cache the information contained therein. 2010 Then if the PE later finds that it is attached to a receiver, it can 2011 immediately start listening to the S-PMSI. 2013 Upon receiving the S-PMSI Join, PE routers connected to receivers for 2014 the specified stream will take whatever action is necessary to start 2015 receiving multicast data packets on the S-PMSI. The precise action 2016 taken will depend upon the tunnel type. 2018 After a configurable delay, the PE router which is sending the S-PMSI 2019 Joins will start transmitting the stream's data packets on the S- 2020 PMSI. 2022 When the pre-configured conditions are no longer met for a particular 2023 stream, e.g. the traffic stops, the PE router connected to the source 2024 stops announcing S-PMSI Joins for that stream. Any PE that does not 2025 receive, over a configurable interval, an S-PMSI Join for a 2026 particular stream will stop listening to the S-PMSI. 2028 7.2.1.2. Packet Formats and Constants 2030 The S-PMSI Join message is encapsulated within UDP, and has the 2031 following type/length/value (TLV) encoding: 2033 0 1 2 3 2034 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2035 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2036 | Type | Length | Value | 2037 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2038 | . | 2039 | . 
| 2040 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2042 Type (8 bits) 2044 Length (16 bits): the total number of octets in the Type, Length, and 2045 Value fields combined 2047 Value (variable length) 2049 Currently only one type of S-PMSI Join is defined. A type 1 S-PMSI 2050 Join is used when the S-PMSI tunnel is a PIM tunnel which is used to 2051 carry a single multicast stream, where the packets of that stream 2052 have IPv4 source and destination IP addresses. 2054 0 1 2 3 2055 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2056 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2057 | Type | Length | Reserved | 2058 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2059 | C-source | 2060 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2061 | C-group | 2062 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2063 | P-group | 2064 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2066 Type (8 bits): 1 2068 Length (16 bits): 16 2070 Reserved (8 bits): This field SHOULD be zero when transmitted, and 2071 MUST be ignored when received. 2073 C-Source (32 bits): the IPv4 address of the traffic source in the 2074 VPN. 2076 C-Group (32 bits): the IPv4 address of the multicast traffic 2077 destination address in the VPN. 2079 P-Group (32 bits): the IPv4 group address that the PE router is going 2080 to use to encapsulate the flow (C-Source, C-Group). 2082 The P-group identifies the S-PMSI tunnel, and the (C-S, C-G) 2083 identifies the multicast flow that is carried in the tunnel. 2085 The protocol uses the following constants. 2087 [S-PMSI_DELAY]: 2089 the PE router which is to transmit onto the S-PMSI will delay 2090 this amount of time before it begins using the S-PMSI. The 2091 default value is 3 seconds. 
2093 [S-PMSI_TIMEOUT]: 2095 if a PE (other than the transmitter) does not receive any packets 2096 over the S-PMSI tunnel for this amount of time, the PE will prune 2097 itself from the S-PMSI tunnel, and will expect (C-S, C-G) packets 2098 to arrive on an I-PMSI. The default value is 3 minutes. This 2099 value must be consistent among PE routers. 2101 [S-PMSI_HOLDOWN]: 2103 if the PE that transmits onto the S-PMSI does not see any (C-S, 2104 C-G) packets for this amount of time, it will resume sending (C- 2105 S, C-G) packets on an I-PMSI. 2107 This is used to avoid oscillation when traffic is bursty. The 2108 default value is 1 minute. 2110 [S-PMSI_INTERVAL] 2111 the interval the transmitting PE router uses to periodically send 2112 the S-PMSI Join message. The default value is 60 seconds. 2114 7.2.2. A BGP-based Protocol for Switching to S-PMSIs 2116 This procedure can be used for an MVPN that is using either a UI-PMSI 2117 or an MI-PMSI. Consider a single multicast stream for a C-(S, G) 2118 within a given MVPN, and consider a PE which is attached to a source 2119 of multicast traffic for that stream. The PE can be configured to 2120 move the stream from the MI-PMSI or UI-PMSI to an S-PMSI if certain 2121 configurable conditions are met. Once a PE decides to move the C-(S, 2122 G) for a given MVPN to an S-PMSI, it needs to instantiate the S-PMSI 2123 using a tunnel and announce the binding of the S-PMSI to the C-(S, G) 2124 to all the egress PEs that are on the path to receivers of 2125 the C-(S, G). The announcement is done using BGP. Depending on the 2126 tunneling technology used, this announcement may be done before or 2127 after setting up the tunnel. The source and egress PEs have to switch 2128 to using the S-PMSI for the C-(S, G). 2130 7.2.2.1. Advertising C-(S, G) Binding to an S-PMSI using BGP 2132 The ingress PE informs all the PEs that are on the path to receivers 2133 of the C-(S, G) of the binding of the S-PMSI to the C-(S, G).
The BGP 2134 announcement is done by sending an update for the MCAST-VPN address 2135 family. An A-D route is used, containing the following information: 2137 a) The IP address of the originating PE 2139 b) The RD configured locally for the MVPN. This is required to 2140 uniquely identify the C-(S, G), as the addresses 2141 could overlap between different MVPNs. This is the same RD 2142 value used in the auto-discovery process. 2144 c) The C-Source address. 2146 d) The C-Group address. 2148 e) A PE MAY aggregate two or more S-PMSIs originated by the PE 2149 onto the same P-Multicast tree. If the PE already advertises S- 2150 PMSI auto-discovery routes for these S-PMSIs, then aggregation 2151 requires the PE to re-advertise these routes. The re-advertised 2152 routes MUST be the same as the original ones, except for the 2153 PMSI tunnel attribute. If the PE has not previously advertised 2154 S-PMSI auto-discovery routes for these S-PMSIs, then the 2155 aggregation requires the PE to advertise (new) S-PMSI auto- 2156 discovery routes for these S-PMSIs. The PMSI Tunnel attribute 2157 in the newly advertised/re-advertised routes MUST carry the 2158 identity of the P-Multicast tree that aggregates the S-PMSIs. 2159 If at least some of the S-PMSIs aggregated onto the same P- 2160 Multicast tree belong to different MVPNs, then all these routes 2161 MUST carry an MPLS upstream assigned label [MPLS-UPSTREAM- 2162 LABEL, section 6.3.4]. If all these aggregated S-PMSIs belong 2163 to the same MVPN, then the routes MAY carry an MPLS upstream 2164 assigned label [MPLS-UPSTREAM-LABEL]. The labels MUST be 2165 distinct on a per MVPN basis, and MAY be distinct on a per 2166 route basis. 2168 When a PE distributes this information via BGP, it must include the 2169 following: 2171 1. An identifier for the particular S-PMSI to which the stream is 2172 to be bound.
This identifier is a structured field which 2173 includes the following information: 2175 * The type of tunnel used to instantiate the S-PMSI 2177 * An identifier for the tunnel. The form of the identifier 2178 will depend upon the tunnel type. The combination of 2179 tunnel identifier and tunnel type should contain enough 2180 information to enable all the PEs to "join" the tunnel and 2181 receive messages from it. 2183 2. Route Target Extended Communities attribute. This is used as 2184 described in section 4. 2186 7.2.2.2. Explicit Tracking 2188 If the PE wants to enable explicit tracking for the specified flow, 2189 it also indicates this in the A-D route it uses to bind the flow to a 2190 particular S-PMSI. Then any PE which receives the A-D route will 2191 respond with a "Leaf A-D Route" in which it identifies itself as a 2192 receiver of the specified flow. The Leaf A-D route will be withdrawn 2193 when the PE is no longer a receiver for the flow. 2195 If the PE needs to enable explicit tracking for a flow before binding 2196 the flow to an S-PMSI, it can do so by sending an A-D route 2197 identifying the flow but not specifying an S-PMSI. This will elicit 2198 the Leaf A-D Routes. This is useful when the PE needs to know the 2199 receivers before selecting an S-PMSI. 2201 7.2.2.3. Switching to S-PMSI 2203 After the egress PEs receive the announcement, they set up their 2204 forwarding path to receive traffic on the S-PMSI if they have one or 2205 more receivers interested in the C-(S, G) bound to the S-PMSI. This 2206 involves changing the RPF interface for the relevant C-(S, G) 2207 entries to the interface that is used to instantiate the S-PMSI. If 2208 an Aggregate Tree is used to instantiate an S-PMSI, this also implies 2209 setting up the demultiplexing forwarding entries based on the inner 2210 label as described in section 6.3.4.
The egress PEs may perform the 2211 switch to the S-PMSI once the advertisement from the ingress PE is 2212 received or wait for a preconfigured timer to do so. 2214 A source PE may use one of two approaches to decide when to start 2215 transmitting data on the S-PMSI. In the first approach, once the 2216 source PE instantiates the S-PMSI, it starts sending multicast 2217 packets for the C-(S, G) entries mapped to the S-PMSI on both that S-PMSI as 2218 well as on the I-PMSI, which is currently used to send traffic for 2219 the C-(S, G). After some preconfigured timer, the PE stops sending 2220 multicast packets for the C-(S, G) on the I-PMSI. In the second 2221 approach, after a certain pre-configured delay after advertising the 2222 C-(S, G) entry bound to an S-PMSI, the source PE begins to send 2223 traffic on the S-PMSI. At this point it stops sending traffic for the 2224 C-(S, G) on the I-PMSI. This traffic is instead transmitted on the 2225 S-PMSI. 2227 7.3. Aggregation 2229 S-PMSIs can be aggregated on a P-multicast tree. The S-PMSI to C-(S, 2230 G) binding advertisement supports aggregation. Furthermore, the 2231 aggregation procedures of section 6.3 apply. It is also possible to 2232 aggregate both S-PMSIs and I-PMSIs on the same P-multicast tree. 2234 7.4. Instantiating the S-PMSI with a PIM Tree 2236 The procedures of section 7.2 tell a PE when it must start listening 2237 and stop listening to a particular S-PMSI. Those procedures do not 2238 specify the method for instantiating the S-PMSI. In this section, we 2239 provide the procedures to be used when the S-PMSI is instantiated as 2240 a PIM tree. The PIM tree is created by the PIM P-instance. 2242 If a single PIM tree is being used to aggregate multiple S-PMSIs, 2243 then the PIM tree to which a given stream is bound may have already 2244 been joined by a given receiving PE. If the tree does not already 2245 exist, then the appropriate PIM procedures to create it must be 2246 executed in the P-instance.
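Returning to the UDP-based protocol, the fixed 16-octet Type 1 S-PMSI Join of section 7.2.1.2 can be encoded and decoded as in the following sketch. The function names are illustrative; actually sending the message over UDP to ALL-PIM-ROUTERS (224.0.0.13) port 3232 is not shown:

```python
import socket
import struct

SPMSI_JOIN_TYPE_1 = 1

def pack_spmsi_join_v4(c_source, c_group, p_group):
    """Encode a Type 1 S-PMSI Join: Type (8 bits) = 1, Length (16 bits)
    = 16 (total octets), Reserved (8 bits, zero on transmit), then the
    C-Source, C-Group, and P-Group IPv4 addresses."""
    return struct.pack("!BHB4s4s4s",
                       SPMSI_JOIN_TYPE_1, 16, 0,
                       socket.inet_aton(c_source),
                       socket.inet_aton(c_group),
                       socket.inet_aton(p_group))

def unpack_spmsi_join_v4(data):
    """Decode a Type 1 S-PMSI Join; the Reserved field is ignored on
    receipt, as the format description requires."""
    msg_type, length, _reserved, c_s, c_g, p_g = struct.unpack("!BHB4s4s4s", data)
    if msg_type != SPMSI_JOIN_TYPE_1 or length != 16:
        raise ValueError("not a Type 1 S-PMSI Join")
    return tuple(socket.inet_ntoa(a) for a in (c_s, c_g, p_g))
```

The P-Group field identifies the S-PMSI tunnel, while (C-Source, C-Group) identifies the customer flow carried in it, matching the field descriptions above.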
2248 If the S-PMSI for a particular multicast stream is instantiated as a 2249 PIM-SM or BIDIR-PIM tree, the S-PMSI identifier will specify the RP 2250 and the group P-address, and the PE routers which have receivers for 2251 that stream must build a shared tree toward the RP. 2253 If the S-PMSI is instantiated as a PIM-SSM tree, the PE routers build 2254 a source tree toward the PE router that is advertising the S-PMSI 2255 Join. The IP address of the root of the tree is the same as the source IP 2256 address which appears in the S-PMSI Join. In this case, the tunnel 2257 identifier in the S-PMSI Join will only need to specify a group P- 2258 address. 2260 The above procedures assume that each PE router has a set of group P- 2261 addresses that it can use for setting up the PIM-trees. Each PE must 2262 be configured with this set of P-addresses. If PIM-SSM is used to 2263 set up the tunnels, then the PEs may be configured with overlapping sets of 2264 group P-addresses. If PIM-SSM is not used, then each PE must be 2265 configured with a unique set of group P-addresses (i.e., having no 2266 overlap with the set configured at any other PE router). The 2267 management of this set of addresses is thus greatly simplified when 2268 PIM-SSM is used, so the use of PIM-SSM is strongly recommended 2269 whenever PIM trees are used to instantiate S-PMSIs. 2271 If it is known that all the PEs which need to receive data traffic on 2272 a given S-PMSI can support aggregation of multiple S-PMSIs on a 2273 single PIM tree, then the transmitting PE may, at its discretion, 2274 decide to bind the S-PMSI to a PIM tree which is already bound to one 2275 or more other S-PMSIs, from the same or from different MVPNs. In 2276 this case, appropriate demultiplexing information must be signaled. 2278 7.5. Instantiating S-PMSIs using RSVP-TE P2MP Tunnels 2280 RSVP-TE P2MP Tunnels can be used for instantiating S-PMSIs. 2281 Procedures described in the context of I-PMSIs in section 6.7 apply. 2283 8.
Inter-AS Procedures 2285 If an MVPN has sites in more than one AS, it requires one or more 2286 PMSIs to be instantiated by inter-AS tunnels. This document 2287 describes two different types of inter-AS tunnel: 2289 1. "Segmented Inter-AS tunnels" 2291 A segmented inter-AS tunnel consists of a number of independent 2292 segments which are stitched together at the ASBRs. There are 2293 two types of segment: inter-AS segments and intra-AS segments. 2294 The segmented inter-AS tunnel consists of alternating intra-AS 2295 and inter-AS segments. 2297 Inter-AS segments connect adjacent ASBRs of different ASes; 2298 these "one-hop" segments are instantiated as unicast tunnels. 2300 Intra-AS segments connect ASBRs and PEs which are in the same 2301 AS. An intra-AS segment may be of whatever technology is 2302 desired by the SP that administers that AS. Different 2303 intra-AS segments may be of different technologies. 2305 Note that the intra-AS segments of inter-AS tunnels form a 2306 category of tunnels that is distinct from simple intra-AS 2307 tunnels; we will rely on this distinction later (see Section 2308 9). 2310 A segmented inter-AS tunnel can be thought of as a tree which 2311 is rooted at a particular AS, and which has as its leaves the 2312 other ASes which need to receive multicast data from the root 2313 AS. 2315 2. "Non-segmented Inter-AS tunnels" 2317 A non-segmented inter-AS tunnel is a single tunnel which spans 2318 AS boundaries. The tunnel technology cannot change from one 2319 point in the tunnel to the next, so all ASes through which the 2320 tunnel passes must support that technology. In essence, AS 2321 boundaries are of no significance to a non-segmented inter-AS 2322 tunnel. 2324 Section 10 of [RFC4364] describes three different options for 2325 supporting unicast Inter-AS BGP/MPLS IP VPNs, known as options A, B, 2326 and C.
We describe below how both segmented and non-segmented inter- 2327 AS trees can be supported when option B or option C is used. (Option 2328 A does not pass any routing information through an ASBR at all, so no 2329 special inter-AS procedures are needed.) 2331 8.1. Non-Segmented Inter-AS Tunnels 2333 In this model, the previously described discovery and tunnel setup 2334 mechanisms are used, even though the PEs belonging to a given MVPN 2335 may be in different ASes. 2337 8.1.1. Inter-AS MVPN Auto-Discovery 2339 The previously described BGP-based auto-discovery mechanisms work "as 2340 is" when an MVPN contains PEs that are in different Autonomous 2341 Systems. However, please note that, if non-segmented Inter-AS 2342 Tunnels are to be used, then the "Intra-AS" A-D routes MUST be 2343 distributed across AS boundaries! 2345 8.1.2. Inter-AS MVPN Routing Information Exchange 2347 When non-segmented inter-AS tunnels are used, MVPN C-multicast 2348 routing information may be exchanged by means of PIM peering across 2349 an MI-PMSI, or by means of BGP carrying C-multicast routes. 2351 When PIM peering is used to distribute the C-multicast routing 2352 information, a PE that sends C-PIM Join/Prune messages for a 2353 particular C-(S,G) must be able to identify the PE which is its PIM 2354 adjacency on the path to S. This is the "selected upstream PE" 2355 described in section 5.1. 2357 If BGP (rather than PIM) is used to distribute the C-multicast 2358 routing information, and if option b of section 10 of [RFC4364] is in 2359 use, then the C-multicast routes will be installed in the ASBRs along 2360 the path from each multicast source in the MVPN to each multicast 2361 receiver in the MVPN. If option b is not in use, the C-multicast 2362 routes are not installed in the ASBRs. The handling of the C- 2363 multicast routes in either case is thus exactly analogous to the 2364 handling of unicast VPN-IP routes in the corresponding case. 2366 8.1.3. 
Inter-AS P-Tunnels 2368 The procedures described earlier in this document can be used to 2369 instantiate either an I-PMSI or an S-PMSI with inter-AS P-tunnels. 2370 Specific tunneling techniques require some explanation. 2372 If ingress replication is used, the inter-AS PE-PE tunnels will use 2373 the inter-AS tunneling procedures for the tunneling technology used. 2375 Procedures in [RSVP-P2MP] are used for inter-AS RSVP-TE P2MP P- 2376 Tunnels. 2378 Procedures for using PIM to set up the P-tunnels are discussed in 2379 the next section. 2381 8.1.3.1. PIM-Based Inter-AS P-Multicast Trees 2383 When PIM is used to set up an inter-AS P-multicast tree, the PIM 2384 Join/Prune messages used to join the tree contain the IP address of 2385 the upstream PE. However, there are two special considerations that 2386 must be taken into account: 2388 - It is possible that the P routers within one or more of the ASes 2389 will not have routes to the upstream PE. For example, if an AS 2390 has a "BGP-free core", the P routers in an AS will not have 2391 routes to addresses outside the AS. 2393 - If the PIM Join/Prune message must travel through several ASes, 2394 it is possible that the ASBRs will not have routes to the PE 2395 routers. For example, in an inter-AS VPN constructed according 2396 to "option b" of section 10 of [RFC4364], the ASBRs do not 2397 necessarily have routes to the PE routers. 2399 If either of these two conditions obtains, then "ordinary" PIM 2400 Join/Prune messages cannot be routed to the upstream PE. Thus the 2401 following information needs to be added to the PIM Join/Prune 2402 messages: a "Proxy Address", which contains the address of the next 2403 ASBR on the path to the upstream PE. When the PIM Join/Prune arrives 2404 at the ASBR which is identified by the "proxy address", that ASBR 2405 must change the proxy address to identify the next hop ASBR.
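The proxy-address rewrite just described can be modeled as in the following sketch. This is a hypothetical illustration: it assumes the ordered chain of ASBRs toward the upstream PE is already known (in the protocol, the next hop is determined from routing or A-D state rather than a precomputed list), and all names are illustrative:

```python
def forward_join_toward_upstream_pe(local_asbr, proxy_address, asbr_path):
    """Model of hop-by-hop proxy-address handling for a PIM Join/Prune.
    `asbr_path` is an assumed ordered list of ASBRs ending at the
    upstream PE.  A transit node that is not the named proxy forwards
    the message unchanged; the named ASBR rewrites the proxy address to
    identify the next-hop ASBR (or, at the last ASBR, the PE itself)."""
    if proxy_address != local_asbr:
        return proxy_address            # not addressed to this node
    idx = asbr_path.index(local_asbr)
    return asbr_path[idx + 1]           # rewrite to the next hop
```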
2407 This information allows the PIM Join/Prune to be routed through an AS 2408 even if the P routers of that AS do not have routes to the upstream 2409 PE. However, this information is not sufficient to enable the ASBRs 2410 to route the Join/Prune if the ASBRs themselves do not have routes to 2411 the upstream PE. 2413 However, even if the ASBRs do not have routes to the upstream PE, the 2414 procedures of this draft ensure that they will have A-D routes that 2415 lead to the upstream PE. If non-segmented inter-AS MVPNs are being 2416 used, the ASBRs (and PEs) will have Intra-AS A-D routes which have 2417 been distributed inter-AS. 2419 So rather than having the PIM Join/Prune messages routed by the ASBRs 2420 along a route to the upstream PE, the PIM Join/Prune messages MUST 2421 be routed along the path determined by the intra-AS A-D routes. 2423 If the only intra-AS A-D route for a given MVPN is the "Intra-AS I- 2424 PMSI Route", the PIM Join/Prunes will be routed along that. However, 2425 if the PIM Join/Prune message is for a particular P-group address, 2426 and there is an "Intra-AS S-PMSI Route" specifying that particular P- 2427 group address as the P-tunnel for a particular S-PMSI, then the PIM 2428 Join/Prunes MUST be routed along the path determined by those intra- 2429 AS A-D routes. 2431 The next revision of this document will provide the following 2432 details: 2434 - encoding of the proxy address in the PIM message (the PIM Join 2435 Attribute [PIM-ATTRIB] will be used) 2437 - encoding of any other information which may be needed in order to 2438 enable the correct intra-AS route to be chosen. 2440 Support for non-segmented inter-AS trees using BIDIR-PIM is for 2441 further study. 2443 8.2. Segmented Inter-AS Tunnels 2445 8.2.1. Inter-AS MVPN Auto-Discovery Routes 2447 The BGP based MVPN membership discovery procedures of section 4 are 2448 used to auto-discover the intra-AS MVPN membership. 
This section 2449 describes the additional procedures for inter-AS MVPN membership 2450 discovery. It also describes the procedures for constructing 2451 segmented inter-AS tunnels. 2453 In this case, for a given MVPN in an AS, the objective is to form a 2454 spanning tree of MVPN membership, rooted at the AS. The nodes of this 2455 tree are ASes. The leaves of this tree are only those ASes that have 2456 at least one PE with a member in the MVPN. The inter-AS tunnel used 2457 to instantiate an inter-AS PMSI must traverse this spanning tree. A 2458 given AS needs to announce to another AS only the fact that it has 2459 membership in a given MVPN. It doesn't need to announce the 2460 membership of each PE in the AS to other ASes. 2462 This section defines an inter-AS auto-discovery route as a route that 2463 carries information about an AS that has one or more PEs (directly) 2464 connected to the site(s) of that MVPN. Further, it defines an inter-AS 2465 leaf auto-discovery route in the following way: 2466 - Consider a node which is the root of an intra-AS segment of an 2467 inter-AS tunnel. An inter-AS leaf auto-discovery route is used to 2468 inform such a node of a leaf of that intra-AS segment. 2470 8.2.1.1. Originating Inter-AS MVPN A-D Information 2472 A PE in a given AS advertises its MVPN membership to all its IBGP 2473 peers. An IBGP peer may be a route reflector, which in turn 2474 advertises this information only to its IBGP peers. In this manner, 2475 all the PEs and ASBRs in the AS learn this membership information. 2477 An Autonomous System Border Router (ASBR) may be configured to 2478 support a particular MVPN. If an ASBR is configured to support a 2479 particular MVPN, the ASBR MUST participate in the intra-AS MVPN auto- 2480 discovery/binding procedures for that MVPN within the AS that the 2481 ASBR belongs to, as defined in this document. 2483 Each ASBR then advertises the "AS MVPN membership" to its neighbor 2484 ASBRs using EBGP.
This inter-AS auto-discovery route must not be 2485 advertised to the PEs/ASBRs in the same AS as this ASBR. The 2486 advertisement carries the following information elements: 2488 a. A Route Distinguisher for the MVPN. For a given MVPN, each ASBR 2489 in the AS must use the same RD when advertising this 2490 information to other ASBRs. To accomplish this, all the ASBRs 2491 within that AS, that are configured to support the MVPN, MUST 2492 be configured with the same RD for that MVPN. This RD MUST be 2493 of Type 0 and MUST embed the autonomous system number of the AS. 2495 b. The announcing ASBR's local address as the next-hop for the 2496 above information elements. 2498 c. By default, the BGP Update message MUST carry the export Route 2499 Targets used by the unicast routing of that VPN. The default 2500 could be modified via configuration by using a set of Route 2501 Targets for the inter-AS auto-discovery routes that is 2502 distinct from the ones used by the unicast routing of that VPN. 2504 8.2.1.2. Propagating Inter-AS MVPN A-D Information 2506 As an inter-AS auto-discovery route originated by an ASBR within a 2507 given AS is propagated via BGP to other ASes, this results in 2508 creation of a data plane tunnel that spans multiple ASes. This tunnel 2509 is used to carry (multicast) traffic from the MVPN sites connected to 2510 the PEs of the AS to the MVPN sites connected to the PEs that are in 2511 the other ASes. Such a tunnel consists of multiple intra-AS segments 2512 (one per AS) stitched together at the ASBRs by single-hop 2513 LSP segments. 2515 An ASBR originates creation of an intra-AS segment when the ASBR 2516 receives an inter-AS auto-discovery route from an EBGP neighbor. 2517 Creation of the segment is completed as a result of distributing via 2518 IBGP this route within the ASBR's own AS. 2520 For a given inter-AS tunnel, each of its intra-AS segments could be 2521 constructed by its own independent mechanism.
Moreover, by using 2522 upstream labels within a given AS, multiple intra-AS segments of 2523 different inter-AS tunnels of either the same or different MVPNs may 2524 share the same P-Multicast Tree. 2526 Since (aggregated) inter-AS auto-discovery routes have the granularity of 2527 an <MVPN, AS> pair, an MVPN that is present in N ASes would have a total of N 2528 inter-AS tunnels. Thus, for a given MVPN, the number of inter-AS 2529 tunnels is independent of the number of PEs that have this MVPN. 2531 The following sections specify procedures for propagation of 2532 (aggregated) inter-AS auto-discovery routes across ASes. 2534 8.2.1.2.1. Inter-AS Auto-Discovery Route received via EBGP 2536 When an ASBR receives from one of its EBGP neighbors a BGP Update 2537 message that carries the inter-AS auto-discovery route, if (a) at 2538 least one of the Route Targets carried in the message matches one of 2539 the import Route Targets configured on the ASBR, and (b) the ASBR 2540 determines that the received route is the best route to the 2541 destination carried in the NLRI of the route, the ASBR: 2543 a) Re-advertises this inter-AS auto-discovery route within its own 2544 AS. 2546 If the ASBR uses ingress replication to instantiate the intra- 2547 AS segment of the inter-AS tunnel, the re-advertised route 2548 SHOULD carry a Tunnel attribute with the Tunnel Identifier set 2549 to Ingress Replication, but no MPLS labels. 2551 If a P-Multicast Tree is used to instantiate the intra-AS 2552 segment of the inter-AS tunnel, and in order to advertise the 2553 P-Multicast tree identifier the ASBR doesn't need to know the 2554 leaves of the tree beforehand, then the advertising ASBR SHOULD 2555 advertise the P-Multicast tree identifier in the Tunnel 2556 Identifier of the Tunnel attribute. This, in effect, creates a 2557 binding between the inter-AS auto-discovery route and the P- 2558 Multicast Tree.
      If a P-Multicast Tree is used to instantiate the intra-AS
      segment of the inter-AS tunnel, and the advertising ASBR needs
      to know the leaves of the tree beforehand in order to advertise
      the P-Multicast tree identifier, then the ASBR first discovers
      the leaves using the auto-discovery procedures, as specified
      further down. It then advertises the binding of the tree to the
      inter-AS auto-discovery route by re-sending the original auto-
      discovery route with the addition of a Tunnel attribute that
      carries the type and the identity of the tree (encoded in the
      Tunnel Identifier of the attribute).

   b) Re-advertises the received inter-AS auto-discovery route to its
      EBGP peers, other than the EBGP neighbor from which the best
      inter-AS auto-discovery route was received.

   c) Advertises to its neighbor ASBR, from which it received the
      best inter-AS auto-discovery route to the destination carried
      in the NLRI of the route, a leaf auto-discovery route that
      carries an ASBR-ASBR tunnel binding with the tunnel identifier
      set to ingress replication. As described in section 6, this
      binding can be used by the neighbor ASBR to send traffic to
      this ASBR.

8.2.1.2.2. Leaf Auto-Discovery Route received via EBGP

When an ASBR receives a leaf auto-discovery route via EBGP, the ASBR
finds the inter-AS auto-discovery route that has the same RD as the
leaf auto-discovery route. The MPLS label carried in the leaf auto-
discovery route is used to stitch a one-hop ASBR-ASBR LSP to the tail
of the intra-AS tunnel segment associated with the inter-AS auto-
discovery route.

8.2.1.2.3.
Inter-AS Auto-Discovery Route received via IBGP

If a given inter-AS auto-discovery route is advertised within an AS
by multiple ASBRs of that AS, the BGP best route selection performed
by the other PE/ASBR routers within the AS does not require all these
PE/ASBR routers to select the route advertised by the same ASBR; on
the contrary, different PE/ASBR routers may select routes advertised
by different ASBRs.

Further, when a PE/ASBR receives from one of its IBGP neighbors a BGP
Update message that carries an inter-AS auto-discovery route, if (a)
the route was originated outside of the router's own AS, (b) at least
one of the Route Targets carried in the message matches one of the
import Route Targets configured on the PE/ASBR, and (c) the PE/ASBR
determines that the received route is the best route to the
destination carried in the NLRI of the route, then if the router is
an ASBR, the ASBR propagates the route to its EBGP neighbors. In
addition, the PE/ASBR performs the following.

If the received inter-AS auto-discovery route carries the Tunnel
attribute with the Tunnel Identifier set to LDP P2MP LSP, PIM-SSM
tree, or PIM-SM tree, the PE/ASBR SHOULD join the P-Multicast tree
whose identity is carried in the Tunnel Identifier.

If the received inter-AS auto-discovery route carries the Tunnel
attribute with the Tunnel Identifier set to RSVP-TE P2MP LSP, then
the ASBR that originated the route MUST signal the local PE/ASBR as
one of the leaf LSRs of the RSVP-TE P2MP LSP. This signaling MAY have
been completed before the local PE/ASBR receives the BGP Update
message.

If the NLRI of the route does not carry a label, then this tree is an
intra-AS tunnel segment that is part of the inter-AS tunnel for the
MVPN advertised by the inter-AS auto-discovery route.
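The per-tunnel-type handling just described can be sketched as a
simple dispatch. The tunnel-type strings and returned action labels
are illustrative, not wire encodings.

```python
def pe_action_for_tunnel_attribute(tunnel_type):
    """Non-normative sketch of section 8.2.1.2.3: what a PE/ASBR does
    with the best, RT-matching inter-AS A-D route received via IBGP,
    depending on the route's Tunnel attribute."""
    receiver_initiated = {"ldp-p2mp", "pim-ssm", "pim-sm"}
    if tunnel_type in receiver_initiated:
        # The PE/ASBR joins the advertised P-Multicast tree itself.
        return "join-p-multicast-tree"
    if tunnel_type == "rsvp-te-p2mp":
        # The originating ASBR signals this router as a leaf LSR.
        return "wait-for-rsvp-te-leaf-signaling"
    # No Tunnel attribute, or ingress replication: answer with a
    # leaf auto-discovery route so the ASBR learns of this leaf.
    return "originate-leaf-ad-route"
```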
If the NLRI carries an (upstream) label, then the combination of this
tree and the label identifies the intra-AS segment.

If the router is an ASBR, this intra-AS segment may further be
stitched to the ASBR-ASBR inter-AS segment of the inter-AS tunnel. If
the PE/ASBR has local receivers in the MVPN, packets received over
the intra-AS segment must be forwarded to the local receivers using
the local VRF.

If the received inter-AS auto-discovery route either does not carry
the Tunnel attribute, or carries the Tunnel attribute with the Tunnel
Identifier set to ingress replication, then the PE/ASBR originates a
new auto-discovery route to allow the ASBR from which the auto-
discovery route was received to learn of this PE/ASBR as a leaf of
the intra-AS tree.

Thus, the AS MVPN membership information propagates across multiple
ASes along a spanning tree. The BGP AS-path-based loop prevention
mechanism prevents loops from forming as this information propagates.

8.2.2. Inter-AS MVPN Routing Information Exchange

All of the MVPN routing information exchange methods specified in
section 5 can be supported across ASes.

The objective in this case is to propagate the MVPN routing
information to the remote PE that originates the unicast route to C-
S/C-RP, in the reverse direction of the AS MVPN membership
information announced by the remote PE's origin AS. This information
is processed by each ASBR along this reverse path.

To achieve this, the PE that is generating the MVPN routing
advertisement first determines the source AS of the unicast route to
C-S/C-RP. It then determines, from the received AS MVPN membership
information for the source AS, the ASBR that is the next-hop for the
best path of the source AS MVPN membership. The BGP MVPN routing
update is sent to this ASBR, and the ASBR then further propagates the
BGP advertisement.
BGP filtering mechanisms ensure that the BGP MVPN routing information
updates flow only to the upstream router on the reverse path of the
inter-AS MVPN membership tree. Details of this filtering mechanism
and the relevant encoding will be specified in a separate document.

8.2.3. Inter-AS I-PMSI

All PEs in a given AS use the same inter-AS heterogeneous tunnel,
rooted at the AS, to instantiate an I-PMSI for an inter-AS MVPN
service. As explained earlier, the intra-AS tunnel segments that
comprise this tunnel can be built using different tunneling
technologies. To instantiate an MI-PMSI service for an MVPN, there
must be an inter-AS tunnel rooted at each AS that has at least one PE
that is a member of the MVPN.

A C-multicast data packet is sent on an intra-AS tunnel segment by
the PE that first receives this packet from the MVPN customer site.
An ASBR forwards this packet to any locally connected MVPN receivers
for the multicast stream. If this ASBR has received a tunnel binding
for the AS MVPN membership that it advertised to a neighboring ASBR,
it also forwards this packet to the neighboring ASBR. In this case
the packet is encapsulated in the downstream MPLS label received from
the neighboring ASBR. The neighboring ASBR delivers this packet to
any locally connected MVPN receivers for that multicast stream. It
also transports this packet on an intra-AS tunnel segment of the
inter-AS MVPN tunnel, and the other PEs and ASBRs in the AS then
receive this packet. The other ASBRs then repeat the procedure
followed by the ASBR in the origin AS, and the packet traverses the
overlay inter-AS tunnel along a spanning tree.

8.2.3.1. Support for Unicast VPN Inter-AS Methods

The above procedures for setting up an inter-AS I-PMSI can be
supported for each of the unicast VPN inter-AS models described in
[RFC4364].
These procedures do not depend on the method used to exchange unicast
VPN routes. For Option B and Option C they do require MPLS
encapsulation between the ASBRs.

8.2.4. Inter-AS S-PMSI

An inter-AS tunnel for an S-PMSI is constructed similarly to an
inter-AS tunnel for an I-PMSI. Namely, such a tunnel is constructed
as a concatenation of tunnel segments. There are two types of tunnel
segments: an intra-AS tunnel segment (a segment that spans ASBRs and
PEs within the same AS), and an inter-AS tunnel segment (a segment
that spans adjacent ASBRs in adjacent ASes). The ASes that are
spanned by a tunnel are not required to use the same tunneling
mechanism to construct the tunnel; each AS may pick a tunneling
mechanism of its own to construct the intra-AS tunnel segment of the
tunnel.

The PE that decides to set up an S-PMSI advertises the S-PMSI tunnel
binding, using the procedures in section 7.3.2, to the routers in its
own AS. The (C-S, C-G) membership for which the S-PMSI is
instantiated is propagated along an inter-AS spanning tree. This
spanning tree traverses the same ASBRs as the AS MVPN membership
spanning tree. In addition to the information elements described in
section 7.3.2 (Origin AS, RD, next-hop), the C-S and C-G are also
advertised.

An ASBR that receives the (C-S, C-G) information from its upstream
ASBR using EBGP sends back a tunnel binding for the (C-S, C-G)
information if (a) at least one of the Route Targets carried in the
message matches one of the import Route Targets configured on the
ASBR, and (b) the ASBR determines that the received route is the best
route to the destination carried in the NLRI of the route. If the
ASBR instantiates an S-PMSI for the (C-S, C-G), it sends back a
downstream label that is used to forward the packet along its intra-
AS S-PMSI for the (C-S, C-G).
However, the ASBR may decide to use an AS MVPN membership I-PMSI
instead, in which case it sends back the same label that it
advertised for the AS MVPN membership I-PMSI. If the downstream ASBR
instantiates an S-PMSI, it further propagates the (C-S, C-G)
membership to its downstream ASes; otherwise, it does not.

An AS can instantiate an intra-AS S-PMSI for the inter-AS S-PMSI
tunnel only if the upstream AS instantiates an S-PMSI. The procedures
allow each AS to determine whether it wishes to set up an S-PMSI or
not; an AS is not forced to set up an S-PMSI just because the
upstream AS decides to do so.

The leaves of an intra-AS S-PMSI tunnel will be the PEs that have
local receivers interested in the (C-S, C-G) and the ASBRs that have
received MVPN routing information for the (C-S, C-G). Note that an AS
can determine these ASBRs, as the MVPN routing information is
propagated and processed by each ASBR on the AS MVPN membership
spanning tree.

The C-multicast data traffic is sent on the S-PMSI by the originating
PE. When it reaches an ASBR that is on the spanning tree, it is
delivered to local receivers, if any, and is also forwarded to the
neighbor ASBR after being encapsulated in the label advertised by the
neighbor. The neighbor ASBR either transports this packet on the S-
PMSI for the multicast stream or on an I-PMSI, delivering it to the
ASBRs in its own AS. These ASBRs in turn repeat the procedures of the
origin AS's ASBRs, and the multicast packet traverses the spanning
tree.

9. Duplicate Packet Detection and Single Forwarder PE

Consider the case of an egress PE that receives packets of a customer
multicast stream (C-S, C-G) over a non-aggregated S-PMSI. The
procedures described so far will never cause the PE to receive
duplicate copies of any packet in that stream.
It is possible that the (C-S, C-G) stream is carried in more than one
S-PMSI; this may happen when the site that contains C-S is multihomed
to more than one PE. However, a PE that needs to receive (C-S, C-G)
packets only joins one of these S-PMSIs, and so only receives one
copy of each packet.

However, if the data packets of stream (C-S, C-G) are carried in
either an I-PMSI or in an aggregated S-PMSI, then the procedures
specified so far make it possible for an egress PE to receive more
than one copy of each data packet. In this section, we define
additional procedures to ensure that an MVPN customer sees no
multicast data packet duplication.

This section covers the situation where the customer multicast tree
is unidirectional, i.e., where the C-G is either a "Sparse Mode" or a
"Single Source Mode" group. The case where the customer multicast
tree is bidirectional (the C-G is a BIDIR-PIM group) is considered
separately in section 12.

The first case in which an egress PE may receive duplicate multicast
data packets is the case where both (a) an MVPN site that contains
C-S or C-RP is multihomed to more than one PE, and (b) either an
I-PMSI or an aggregated S-PMSI is used for carrying the packets
originated by C-S. In this case, an egress PE may receive one copy of
the packet from each PE to which the site is homed.

The second case in which an egress PE may receive duplicate multicast
data packets is when all of the following are true: (a) the IP
destination address of the customer packet is a C-G that is operating
in ASM mode, and whose C-multicast tree is set up using PIM-SM, (b)
an MI-PMSI is used for carrying the packets, and (c) a router or a CE
in a site connected to the egress PE switches from the C-RP tree to
the C-S tree.
In this case, it is possible to get one copy of a given packet from
the ingress PE attached to the C-RP's site, and one from the ingress
PE attached to the C-S's site.

9.1. Multihomed C-S or C-RP

In the first case, for a given (C-S, C-G), an egress PE, say PE1,
expects to receive C-data packets from the upstream PE, say PE2,
which PE1 identified as the upstream multicast hop in the C-Multicast
Routing Update that PE1 sent in order to join the (C-S, C-G). If PE1
can determine that a data packet for the (C-S, C-G) was received from
the expected upstream PE, PE2, PE1 will accept and forward the
packet. Otherwise, PE1 will drop the packet; this means that the PE
will see a duplicate, but the duplicate will not get forwarded. (But
see section 10 for an exception case where PE1 will accept a packet
even if it is from an unexpected upstream PE.)

The method used by an egress PE to determine the ingress PE for a
particular packet, received over a particular PMSI, depends on the
P-tunnel technology that is used to instantiate the PMSI. If the
P-tunnel is a P2MP LSP, a PIM-SM or PIM-SSM tree, or a unicast
tunnel, then the tunnel encapsulation contains information which can
be used (possibly along with other state information in the PE) to
determine the ingress PE, as long as the P-tunnel is instantiating an
intra-AS PMSI, or an inter-AS PMSI which is supported by a non-
segmented inter-AS tunnel.

Even when inter-AS segmented tunnels are used, if an aggregated
S-PMSI is used for carrying the packets, the P-tunnel encapsulation
must have some information which can be used to identify the PMSI,
and that in turn implicitly identifies the ingress PE.
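The accept-or-drop rule at the start of section 9.1 can be sketched
as a per-(C-S, C-G) check. Class and method names are illustrative;
the section 10 exception is deliberately not modelled.

```python
class UpstreamCheck:
    """Per (C-S, C-G), an egress PE records the upstream PE it
    selected in its C-Multicast Routing Update, and forwards only
    packets whose determined ingress PE matches; anything else is a
    duplicate and is dropped."""
    def __init__(self):
        self.expected_upstream = {}          # (C-S, C-G) -> PE id

    def join(self, c_s, c_g, upstream_pe):
        self.expected_upstream[(c_s, c_g)] = upstream_pe

    def forward(self, c_s, c_g, ingress_pe):
        # True: accept and forward; False: drop as a duplicate.
        return self.expected_upstream.get((c_s, c_g)) == ingress_pe
```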
If an I-PMSI is used for carrying the packets, the I-PMSI spans
multiple ASes, and the I-PMSI is realized via segmented inter-AS
tunnels, then if C-S or C-RP is multihomed to different PEs, as long
as each such PE is in a different AS, the egress PE can detect
duplicate traffic, as such duplicate traffic will arrive on a
different (inter-AS) tunnel. Specifically, if the PE was expecting
the traffic on a particular inter-AS tunnel, duplicate traffic will
arrive either on an intra-AS tunnel (this is not an intra-AS tunnel
segment of an inter-AS tunnel), or on some other inter-AS tunnel.
Therefore, to detect duplicates the PE has to keep track of which
(inter-AS) auto-discovery route the PE uses for sending MVPN
multicast routing information towards C-S/C-RP. Then the PE should
receive (multicast) traffic originated by C-S/C-RP only from the
(inter-AS) tunnel that was carried in the best inter-AS auto-
discovery route for the MVPN and was originated by the AS that
contains C-S/C-RP (where "the best" is determined by the PE). The PE
should discard, as duplicates, all other multicast traffic originated
by C-S/C-RP but received on any other tunnel.

9.1.1. Single forwarder PE selection

When, for a given MVPN, (a) an MI-PMSI is used for carrying multicast
data packets, (b) C-S or C-RP is multihomed to different PEs, and (c)
at least two of those PEs are in the same AS, then, depending on the
tunneling technology used by the MI-PMSI, it may not always be
possible for the egress PE to determine the upstream PE. Therefore,
when this determination is not possible, procedures are needed to
ensure that packets are received on an MI-PMSI at an egress PE from
only a single upstream PE.
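The single-forwarder rule that addresses this can be sketched as
follows. The default-upstream-PE selection itself is specified in
section 5.1; purely for illustration, it is modelled here under the
assumption that the numerically highest PE address wins, which is not
a claim about the normative procedure.

```python
import socket

def should_forward_to_backbone(local_pe, candidate_pes):
    """Sketch: an ingress PE forwards a (C-S, C-G) packet to the
    backbone only if it is the default upstream PE selection.  The
    "highest PE address" tie-break below is an illustrative stand-in
    for the section 5.1 procedure."""
    default_upstream = max(candidate_pes, key=socket.inet_aton)
    return local_pe == default_upstream
```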
Furthermore, even if the determination is possible, it may be
preferable to send only one copy of each packet to each egress PE,
rather than sending multiple copies and having the egress PE discard
all but one.

Section 5.1 specifies a procedure for choosing a "default upstream PE
selection", such that (except during routing transients) all PEs will
choose the same default upstream PE. To ensure that duplicate packets
are not sent through the backbone (except during routing transients),
an ingress PE does not forward to the backbone any (C-S, C-G)
multicast data packet it receives from a CE, unless the PE is the
default upstream PE selection.

This procedure is optional whenever the P-tunnel technology that is
being used to carry the multicast stream in question allows the
egress PEs to determine the identity of the ingress PE. This
procedure is mandatory if the P-tunnel technology does not make this
determination possible.

The above procedure ensures that if C-S or C-RP is multihomed to PEs
within a single AS, a PE will not receive duplicate traffic as long
as all the PEs are on either the C-S tree or the C-RP tree. If some
PEs are on the C-S tree and some on the C-RP tree, however, packet
duplication is still possible. This is discussed in the next section.

9.2. Switching from the C-RP tree to C-S tree

If some PEs are on the C-S tree and some on the C-RP tree, then a PE
may also receive duplicate traffic during a shared tree to source
tree switch. The issue and the solution are described next.
When, for a given MVPN, (a) an MI-PMSI is used for carrying multicast
data packets, (b) C-S and C-RP are connected to PEs within the same
AS, and (c) the MI-PMSI tunneling technology in use does not allow
the egress PEs to identify the ingress PE, then having all the PEs
select the same PE to be the upstream multicast hop for C-S or C-RP
is not sufficient to prevent packet duplication.

The reason is that a single tunnel used by the MI-PMSI may be
carrying traffic on both the (C-*, C-G) tree and the (C-S, C-G) tree.
If some of the egress PEs have joined the source tree, but others
expect to receive (C-S, C-G) packets from the shared tree, then two
copies of each data packet will travel on the tunnel, and since, due
to the choice of the tunneling technology, the egress PEs have no way
to identify the ingress PE, the egress PEs will have no way to
determine that only one copy should be accepted.

To avoid this, it is necessary to ensure that once any PE joins the
(C-S, C-G) tree, any other PE that has joined the (C-*, C-G) tree
also switches to the (C-S, C-G) tree (selecting, of course, the same
upstream multicast hop, as specified above).

Whenever a PE creates a (C-S, C-G) state as a result of receiving a
C-multicast route for the (C-S, C-G) from some other PE, and the C-G
group is a Sparse Mode group, the PE that creates the state MUST
originate a Source Active auto-discovery route (see [MVPN-BGP]
section 4.5) as specified below. The route is advertised using the
same procedures as the MVPN auto-discovery/binding (both intra-AS and
inter-AS) specified in this document, with the following
modifications:

   1. The Multicast Source field MUST be set to C-S. The Multicast
      Source Length field is set appropriately to reflect this.

   2. The Multicast Group field MUST be set to C-G. The Multicast
      Group Length field is set appropriately to reflect this.
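The two modified fields can be illustrated as below. This is only a
sketch assuming IPv4 C-S and C-G (so both length fields are 32 bits);
the actual Source Active route format is defined in [MVPN-BGP], and
the byte layout here is not a claim about that encoding.

```python
import socket
import struct

def source_active_fields(c_s, c_g):
    """Hypothetical packing of the Multicast Source/Group fields and
    their lengths for an IPv4 (C-S, C-G): length (bits), address,
    length (bits), address."""
    src = socket.inet_aton(c_s)   # C-S as 4 bytes
    grp = socket.inet_aton(c_g)   # C-G as 4 bytes
    return struct.pack("!B4sB4s", 32, src, 32, grp)
```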
The route goes to all the PEs of the MVPN. When, as a result of
receiving a new Source Active auto-discovery route, a PE updates its
VRF with the route, the PE MUST check whether the newly received
route matches any (C-*, C-G) entries. If (a) there is a matching
entry, (b) the PE does not have (C-S, C-G) state in its MVPN-TIB for
the (C-S, C-G) carried in the route, and (c) the received route is
selected as the best (using the BGP route selection procedures), then
the PE sets up its forwarding path to receive (C-S, C-G) traffic from
the tunnel that the originator of the selected Source Active auto-
discovery route uses for sending (C-S, C-G). This procedure forces
all the PEs (in all ASes) to switch from the C-RP tree to the C-S
tree for the (C-S, C-G).

(Additional uses of the Source Active A-D route are discussed in
section 10.)

Note that when a PE thus joins the (C-S, C-G) tree, it may need to
send a PIM (S,G,RPT-bit) prune to one of its CE PIM neighbors, as
determined by ordinary PIM procedures. (This will be the case if the
incoming interface for the (C-*, C-G) tree is one of the VRF
interfaces.) However, before doing this, it SHOULD run a timer to
help ensure that the source is not pruned from the shared tree until
all PEs have had time to receive the Source Active route.

Whenever the PE deletes the (C-S, C-G) state that was previously
created as a result of receiving a C-multicast route for the
(C-S, C-G) from some other PE, the PE that deletes the state also
withdraws the Source Active auto-discovery route that was advertised
when the state was created.

N.B.: SINCE ALL PEs WITH RECEIVERS FOR GROUP C-G WILL JOIN THE C-S
SOURCE TREE IF ANY OF THEM DO, IT IS NEVER NECESSARY TO DISTRIBUTE A
BGP C-MULTICAST ROUTE FOR THE PURPOSE OF PRUNING SOURCES FROM THE
SHARED TREE.
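The three conditions (a)-(c) above can be sketched as a single
predicate; argument names are illustrative.

```python
def switch_to_source_tree(matching_star_g_entry, has_s_g_state,
                          is_best_route):
    """Non-normative sketch of section 9.2: a PE that receives a
    Source Active auto-discovery route moves its (C-S, C-G)
    forwarding path to the originator's tunnel only if (a) a
    (C-*, C-G) entry matches, (b) it has no (C-S, C-G) state yet, and
    (c) the route is selected as best."""
    return matching_star_g_entry and not has_s_g_state and is_best_route
```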
It is worth noting that if a PE joins a source tree as a result of
this procedure, the UMH is not necessarily the same as it would be if
the PE had joined the source tree as a result of receiving a PIM Join
for the same source tree from a directly attached CE.

10. Eliminating PE-PE Distribution of (C-*,C-G) State

In sparse mode PIM, a node that wants to become a receiver for a
particular multicast group G first joins a shared tree, rooted at a
rendezvous point. When the receiver detects traffic from a particular
source, it has the option of joining a source tree, rooted at that
source. If it does so, it has to prune that source from the shared
tree, to ensure that it receives packets from that source on only one
tree.

Maintaining the shared tree can require considerable state, as it is
necessary not only to know who the upstream and downstream nodes are,
but also to know which sources have been pruned off which branches of
the shared tree.

The BGP-based signaling procedures defined in this document and in
[MVPN-BGP] eliminate the need for PEs to distribute to each other any
state having to do with which sources have been pruned off a shared
C-tree. Those procedures do still allow multicast data traffic to
travel on a shared C-tree, but they do not allow a situation in which
some CEs receive (S,G) traffic on a shared tree and some on a source
tree. This results in a considerable simplification of the PE-PE
procedures, with minimal change to the multicast service seen within
the VPN. However, shared C-trees are still supported across the VPN
backbone. That is, (C-*, C-G) state is distributed PE-PE, but
(C-*, C-G, RPT-bit) state is not.

In this section, we specify a number of optional procedures which go
further and completely eliminate the support for shared C-trees
across the VPN backbone.
In these procedures, the PEs keep track of the active sources for
each C-G. As soon as a CE tries to join the (*,G) tree, the PEs
instead join the (S,G) trees for all the active sources. Thus, all
distribution of (C-*,C-G) state is eliminated. These procedures are
optional because they require some additional support on the part of
the VPN customer, and because they are not always appropriate. (E.g.,
a VPN customer may have his own policy of always using shared trees
for certain multicast groups.) There are several different options,
described in the following sub-sections.

10.1. Co-locating C-RPs on a PE

[MVPN-REQ] describes C-RP engineering as an issue when PIM-SM (or
BIDIR-PIM) is used in "Any Source Multicast (ASM) mode" [RFC4607] on
the VPN customer site. To quote from [MVPN-REQ]:

"In some cases this engineering problem is not trivial: for instance,
if sources and receivers are located in VPN sites that are different
than that of the RP, then traffic may flow twice through the SP
network and the CE-PE link of the RP (from source to RP, and then
from RP to receivers); this is obviously not ideal. A multicast VPN
solution SHOULD propose a way to help on solving this RP engineering
issue."

One of the C-RP deployment models is for the customer to outsource
the RP to the provider. In this case the provider may co-locate the
RP on the PE that is connected to the customer site [MVPN-REQ]. This
section describes how anycast-RP can be used to achieve this.

10.1.1. Initial Configuration

For a particular MVPN, one or more of the PEs that have sites in that
MVPN act as an RP for the sites of that MVPN connected to those PEs.
Within each MVPN, all these RPs use the same (anycast) address, and
all these RPs use the Anycast RP technique.

10.1.2.
Anycast RP Based on Propagating Active Sources

This mechanism is based on propagating active sources between RPs.

10.1.2.1. Receiver(s) Within a Site

The PE which receives a C-Join for (*,G) or (S,G) does not send the
information that it has receiver(s) for G until it receives
information about active sources for G from an upstream PE.

On receiving this information (described in the next section), the
downstream PE will respond with a Join for C-(S,G). Sending this
information could be done using any of the procedures described in
section 5. If BGP is used, the ingress address is set to the address
of the upstream PE that triggered the source active information; only
that upstream PE will process the information. If unicast PIM is
used, then a unicast PIM message will have to be sent to the upstream
PE that triggered the source active information. If an MI-PMSI is
used, then further clarification is needed on the upstream neighbor
address of the PIM message, and will be provided in a future
revision.

10.1.2.2. Source Within a Site

When a PE receives a PIM-Register from a site that belongs to a given
VPN, the PE follows the normal PIM anycast RP procedures. It then
advertises the source and group of the multicast data packet carried
in the PIM-Register message to the other PEs in BGP, using the
following information elements:

   - Active source address

   - Active group address

   - Route target of the MVPN.

This advertisement goes to all the PEs that belong to that MVPN. When
a PE receives this advertisement, it checks whether there are any
receivers in the sites attached to the PE for the group carried in
the source active advertisement. If so, it generates an advertisement
for C-(S,G) as specified in the previous section.

Note that the mechanism described in section 7.3.2.
can be leveraged to advertise an S-PMSI binding along with the source
active messages.

10.1.2.3. Receiver Switching from Shared to Source Tree

No additional procedures are required when multicast receivers in a
customer's site switch from the shared tree to the source tree.

10.2. Using MSDP between a PE and a Local C-RP

Section 10.1 describes the case where each PE is a C-RP. This
enables the PEs to know the active multicast sources for each MVPN,
and they can then use BGP to distribute this information to each
other. As a result, the PEs do not have to join any shared C-trees,
and this results in a simplification of the PE operation.

In another deployment scenario, the PEs are not themselves C-RPs, but
use MSDP to talk to the C-RPs. In particular, a PE which attaches to
a site that contains a C-RP becomes an MSDP peer of that C-RP. That
PE then uses BGP to distribute the information about the active
sources to the other PEs. When the PE determines, via MSDP, that a
particular source is no longer active, it withdraws the corresponding
BGP update. The PEs then do not have to join any shared C-trees, and
they do not have to be C-RPs either.

MSDP provides the capability for a Source Active message to carry an
encapsulated data packet. This capability can be used to allow an
MSDP speaker to receive the first (or first several) packet(s) of an
(S,G) flow, even though the MSDP speaker hasn't yet joined the (S,G)
tree. (Presumably it will join that tree as a result of receiving
the SA message which carries the encapsulated data packet.) If this
capability is not used, the first several data packets of an (S,G)
stream may be lost.

A PE which is talking MSDP to an RP may receive such an encapsulated
data packet from the RP. The data packet should be decapsulated and
transmitted to the other PEs in the MVPN.
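The choice of where to transmit such a decapsulated packet can be
sketched as a simple preference order; names are illustrative, and
the normative conditions are those stated in section 10.2.

```python
def pmsi_for_decapsulated_packet(bound_spmsi, ipmsi):
    """Sketch: a data packet decapsulated from an MSDP Source Active
    message goes on an S-PMSI to which the (S,G) is already bound if
    the PE transmits on one; otherwise on an I-PMSI for the MVPN
    (typically the MI-PMSI, if one exists); otherwise it is not
    transmitted to the other PEs (None here)."""
    if bound_spmsi is not None:
        return bound_spmsi
    return ipmsi          # may be None: packet not sent to other PEs
```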
If the packet belongs to a particular (S,G) flow, and if the PE is a
transmitter for some S-PMSI to which (S,G) has already been bound,
the decapsulated data packet should be transmitted on that S-PMSI.
Otherwise, if an I-PMSI exists for that MVPN, the decapsulated data
packet should be transmitted on it. (If an MI-PMSI exists, this would
typically be used.) If neither of these conditions holds, the
decapsulated data packet is not transmitted to the other PEs in the
MVPN. The decision as to whether and how to transmit the
decapsulated data packet does not affect the processing of the SA
control message itself.

Suppose that PE1 transmits a multicast data packet on a PMSI, where
that data packet is part of an (S,G) flow, and PE2 receives that
packet from that PMSI. According to section 9, if PE1 is not the PE
that PE2 expects to be transmitting (S,G) packets, then PE2 must
discard the packet. If an MSDP-encapsulated data packet is
transmitted on a PMSI as specified above, this rule from section 9
would likely result in the packet's getting discarded. Therefore, if
MSDP-encapsulated data packets are being decapsulated and transmitted
on a PMSI, we need to modify the rules of section 9 as follows:

   1. If the receiving PE, PE2, has already joined the (S,G) tree,
      and has chosen PE1 as the upstream PE for the (S,G) tree, but
      this packet does not come from PE1, PE2 must discard the
      packet.

   2. If the receiving PE, PE2, has not already joined the (S,G)
      tree, but has a PIM adjacency to a CE which is downstream on
      the (*,G) tree, the packet should be forwarded to the CE.

11. Encapsulations

The BGP-based auto-discovery procedures will ensure that the PEs in a
single MVPN only use tunnels that they can all support, and, for a
given kind of tunnel, that they only use encapsulations that they can
all support.

11.1.
Encapsulations for Single PMSI per Tunnel 3142 11.1.1. Encapsulation in GRE 3144 GRE encapsulation can be used for any PMSI that is instantiated by a 3145 mesh of unicast tunnels, as well as for any PMSI that is instantiated 3146 by one or more PIM tunnels of any sort. 3148 Packets received Packets in transit Packets forwarded 3149 at ingress PE in the service by egress PEs 3150 provider network 3152 +---------------+ 3153 | P-IP Header | 3154 +---------------+ 3155 | GRE | 3156 ++=============++ ++=============++ ++=============++ 3157 || C-IP Header || || C-IP Header || || C-IP Header || 3158 ++=============++ >>>>> ++=============++ >>>>> ++=============++ 3159 || C-Payload || || C-Payload || || C-Payload || 3160 ++=============++ ++=============++ ++=============++ 3162 The IP Protocol Number field in the P-IP Header must be set to 47. 3163 The Protocol Type field of the GRE Header must be set to 0x800. 3165 When an encapsulated packet is transmitted by a particular PE, the 3166 source IP address in the P-IP header must be the same address that 3167 the PE uses to identify itself in the VRF Route Import Extended 3168 Communities that it attaches to any of the VPN-IP routes eligible for UMH 3169 determination that it advertises via BGP (see section 5.1). 3171 If the PMSI is instantiated by a PIM tree, the destination IP address 3172 in the P-IP header is the group P-address associated with that tree. 3173 The GRE Key field is omitted. 3175 If the PMSI is instantiated by unicast tunnels, the destination IP 3176 address is the address of the destination PE, and the optional GRE 3177 Key field is used to identify a particular MVPN. In this case, each 3178 PE would have to advertise a Key field value for each MVPN; each PE 3179 would assign the Key field value that it expects to receive. 3181 [RFC2784] specifies an optional GRE checksum, and [RFC2890] specifies 3182 an optional GRE sequence number field.
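The GRE header construction described above (no checksum or sequence number; a Key field present only for unicast P-tunnels, where it identifies the MVPN) can be sketched as follows. This is an illustrative sketch based on the [RFC2784]/[RFC2890] header layout, not code from this specification; the function name is invented for illustration.

```python
import struct

GRE_PROTO_IPV4 = 0x0800  # Protocol Type for IP-in-GRE (section 11.1.1)
IP_PROTO_GRE = 47        # IP Protocol Number carried in the P-IP header

def gre_header(key=None):
    """Build a minimal GRE header per [RFC2784]/[RFC2890].

    Checksum and sequence number are left unset, as this document
    recommends.  'key' is supplied only for unicast P-tunnels, where
    the Key field identifies a particular MVPN.
    """
    flags = 0x2000 if key is not None else 0x0000  # K bit ([RFC2890])
    hdr = struct.pack("!HH", flags, GRE_PROTO_IPV4)
    if key is not None:
        hdr += struct.pack("!I", key)  # 32-bit Key field
    return hdr
```

For a PIM-tree P-tunnel the Key field is omitted, so the header is the 4-byte base header; for a unicast P-tunnel the advertised per-MVPN key is appended.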
3184 The GRE sequence number field is not needed because the transport 3185 layer services for the original application will be provided by the 3186 C-IP Header. 3188 The use of the GRE checksum field must follow [RFC2784]. 3190 To facilitate high speed implementation, this document recommends 3191 that the ingress PE routers encapsulate VPN packets without setting 3192 the checksum or sequence number fields. 3194 11.1.2. Encapsulation in IP 3196 IP-in-IP [RFC1853] is also a viable option. When it is used, the 3197 IPv4 Protocol Number field is set to 4. The following diagram shows 3198 the progression of the packet as it enters and leaves the service 3199 provider network. 3201 Packets received Packets in transit Packets forwarded 3202 at ingress PE in the service by egress PEs 3203 provider network 3205 +---------------+ 3206 | P-IP Header | 3207 ++=============++ ++=============++ ++=============++ 3208 || C-IP Header || || C-IP Header || || C-IP Header || 3209 ++=============++ >>>>> ++=============++ >>>>> ++=============++ 3210 || C-Payload || || C-Payload || || C-Payload || 3211 ++=============++ ++=============++ ++=============++ 3213 When an encapsulated packet is transmitted by a particular PE, the 3214 source IP address in the P-IP header must be the same address that 3215 the PE uses to identify itself in the VRF Route Import Extended 3216 Communities that it attaches to any of the VPN-IP routes eligible for UMH 3217 determination that it advertises via BGP (see section 5.1). 3219 11.1.3. Encapsulation in MPLS 3221 If the PMSI is instantiated as a P2MP MPLS LSP or an MP2MP LSP, MPLS 3222 encapsulation is used. Penultimate-hop-popping must be disabled for 3223 the P2MP MPLS LSP. If the PMSI is instantiated as an RSVP-TE P2MP 3224 LSP, additional MPLS encapsulation procedures are used, as specified 3225 in [RSVP-P2MP].
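The P-MPLS header used in this encapsulation is an ordinary MPLS label stack entry as defined in [MPLS-HDR]; since penultimate-hop-popping is disabled, the egress PE receives the packet with the P2MP LSP's label intact and can use it to identify the PMSI. As a hedged illustration (not code from this specification), the 32-bit label stack entry can be packed as:

```python
import struct

def mpls_shim(label, exp=0, s=1, ttl=64):
    """Pack one 32-bit MPLS label stack entry per [MPLS-HDR] (RFC 3032).

    label: 20-bit label value; exp: 3-bit EXP field; s: bottom-of-stack
    bit; ttl: 8-bit time-to-live.  Default values here are illustrative.
    """
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)
```

When multiple labels are carried (e.g., the demultiplexing or PE labels of sections 11.2 and 11.3), only the last entry has the S bit set.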
3227 If other methods of assigning MPLS labels to multicast distribution 3228 trees are in use, these multicast distribution trees may be used as 3229 appropriate to instantiate PMSIs, and appropriate additional MPLS 3230 encapsulation procedures may be used. 3232 Packets received Packets in transit Packets forwarded 3233 at ingress PE in the service by egress PEs 3234 provider network 3236 +---------------+ 3237 | P-MPLS Header | 3238 ++=============++ ++=============++ ++=============++ 3239 || C-IP Header || || C-IP Header || || C-IP Header || 3240 ++=============++ >>>>> ++=============++ >>>>> ++=============++ 3241 || C-Payload || || C-Payload || || C-Payload || 3242 ++=============++ ++=============++ ++=============++ 3244 11.2. Encapsulations for Multiple PMSIs per Tunnel 3246 The encapsulations for transmitting multicast data messages when 3247 there are multiple PMSIs per tunnel are based on the encapsulation 3248 for a single PMSI per tunnel, but with an MPLS label used for 3249 demultiplexing. 3251 The label is upstream-assigned and distributed via BGP as specified 3252 in section 4. The label must enable the receiver to select the 3253 proper VRF, and may enable the receiver to select a particular 3254 multicast routing entry within that VRF. 3256 11.2.1. Encapsulation in GRE 3258 Rather than the IP-in-GRE encapsulation discussed in section 11.1.1, 3259 we use the MPLS-in-GRE encapsulation. This is specified in [MPLS- 3260 IP]. The GRE protocol type MUST be set to 0x8847. [The reason for 3261 using the unicast rather than the multicast value is specified in 3262 [MPLS-MCAST-ENCAPS].] 3264 11.2.2. Encapsulation in IP 3266 Rather than the IP-in-IP encapsulation discussed in section 11.1.2, 3267 we use the MPLS-in-IP encapsulation. This is specified in [MPLS-IP]. 3268 The IP protocol number MUST be set to the value identifying the 3269 payload as an MPLS unicast packet. [There is no "MPLS multicast 3270 packet" protocol number.] 3272 11.3.
Encapsulations Identifying a Distinguished PE 3274 11.3.1. For MP2MP LSP P-tunnels 3276 As discussed in section 9, if a multicast data packet belongs to a 3277 Sparse Mode or Single Source Mode multicast group, it is highly 3278 desirable for the PE that receives the packet from a PMSI to be able 3279 to determine the identity of the PE that transmitted the data packet 3280 onto the PMSI. The encapsulations of the previous sections all 3281 provide this information, except in one case. If a PMSI is being 3282 instantiated by an MP2MP LSP, then the encapsulations discussed so far 3283 do not allow one to determine the identity of the PE that transmitted 3284 the packet onto the PMSI. 3286 Therefore, when a packet that belongs to a Sparse Mode or Single 3287 Source Mode multicast group is traveling on an MP2MP LSP P-tunnel, it 3288 MUST carry, as its second label, a label which has been bound to the 3289 packet's ingress PE. This label is an upstream-assigned label that 3290 the LSP's root node has bound to the ingress PE and has distributed 3291 via an A-D Route (see section 4; precise details of this distribution 3292 procedure will be included in the next revision of this document). 3293 This label will appear immediately beneath the labels that are 3294 discussed in sections 11.1.3 and 11.2. 3296 11.3.2. For Support of PIM-BIDIR C-Groups 3298 As will be discussed in section 12, when a packet belongs to a PIM- 3299 BIDIR multicast group, the set of PEs of that packet's VPN can be 3300 partitioned into a number of subsets, where exactly one PE in each 3301 partition is the upstream PE for that partition. When such packets 3302 are transmitted on a PMSI, then unless the procedures of section 3303 12.2.3 are being used, it is necessary for the packet to carry 3304 information identifying a particular partition. This is done by 3305 having the packet carry the PE label corresponding to the upstream PE 3306 of one partition.
For a particular P-tunnel, this label will have 3307 been advertised by the node which is the root of that P-tunnel. 3308 (Details of the procedure by which the PE labels are advertised will 3309 be included in the next revision of this document.) 3311 This label needs to be used whenever a packet belongs to a PIM-BIDIR 3312 C-group, no matter what encapsulation is used by the P-tunnel. Hence 3313 the encapsulations of section 11.2 MUST be used. If the tunnel 3314 contains only one PMSI, the PE label replaces the label discussed in 3315 section 11.2. If the tunnel contains multiple PMSIs, the PE label 3316 follows the label discussed in section 11.2. 3318 11.4. Encapsulations for Unicasting PIM Control Messages 3320 When PIM control messages are unicast, rather than being sent on an 3321 MI-PMSI, the receiving PE needs to determine the particular MVPN 3322 whose multicast routing information is being carried in the PIM 3323 message. One method is to use a downstream-assigned MPLS label which 3324 the receiving PE has allocated for this specific purpose. The label 3325 would be distributed via BGP. This can be used with an MPLS, MPLS- 3326 in-GRE, or MPLS-in-IP encapsulation. 3328 A possible alternative is to modify the PIM messages themselves so that 3329 they carry information which can be used to identify a particular 3330 MVPN, such as an RT. 3332 This area is still under consideration. 3334 11.5. General Considerations for IP and GRE Encaps 3336 These apply also to the MPLS-in-IP and MPLS-in-GRE encapsulations. 3338 11.5.1. MTU 3340 It is the responsibility of the originator of a C-packet to ensure 3341 that the packet is small enough to reach all of its destinations, 3342 even when it is encapsulated within IP or GRE. 3344 When a packet is encapsulated in IP or GRE, the router that does the 3345 encapsulation MUST set the DF bit in the outer header.
This ensures 3346 that the decapsulating router will not need to reassemble the 3347 encapsulating packets before performing decapsulation. 3349 In some cases the encapsulating router may know that a particular C- 3350 packet is too large to reach its destinations. Procedures by which 3351 it may know this are outside the scope of the current document. 3352 However, if this is known, then: 3354 - If the DF bit is set in the IP header of a C-packet which is 3355 known to be too large, the router will discard the C-packet as 3356 being "too large", and follow normal IP procedures (which may 3357 require the return of an ICMP message to the source). 3359 - If the DF bit is not set in the IP header of a C-packet which is 3360 known to be too large, the router MAY fragment the packet before 3361 encapsulating it, and then encapsulate each fragment separately. 3362 Alternatively, the router MAY discard the packet. 3364 If the router discards a packet as too large, it should maintain OAM 3365 information related to this behavior, allowing the operator to 3366 properly troubleshoot the issue. 3368 Note that if the entire path of the tunnel does not support an MTU 3369 which is large enough to carry a particular encapsulated C- 3370 packet, and if the encapsulating router does not do fragmentation, 3371 then the customer will not receive the expected connectivity. 3373 11.5.2. TTL 3375 The ingress PE should not copy the TTL field from the payload IP 3376 header received from a CE router to the delivery IP or MPLS header. 3377 The setting of the TTL of the delivery header is determined by the 3378 local policy of the ingress PE router. 3380 11.5.3.
Avoiding Conflict with Internet Multicast 3382 If the SP is providing Internet multicast, distinct from its VPN 3383 multicast services, and using PIM based P-multicast trees, it must 3384 ensure that the group P-addresses which it uses in support of MVPN 3385 services are distinct from any of the group addresses of the Internet 3386 multicasts it supports. This is best done by using administratively 3387 scoped addresses [ADMIN-ADDR]. 3389 The group C-addresses need not be distinct from either the group P- 3390 addresses or the Internet multicast addresses. 3392 11.6. Differentiated Services 3394 The setting of the DS field in the delivery IP header should follow 3395 the guidelines outlined in [RFC2983]. Setting the EXP field in the 3396 delivery MPLS header should follow the guidelines in [RFC3270]. An SP 3397 may also choose to deploy any additional Differentiated Services 3398 mechanisms that the PE routers support for the encapsulation in use. 3399 Note that the type of encapsulation determines the set of 3400 Differentiated Services mechanisms that may be deployed. 3402 12. Support for PIM-BIDIR C-Groups 3404 In BIDIR-PIM, each multicast group is associated with an RPA 3405 (Rendezvous Point Address). The Rendezvous Point Link (RPL) is the 3406 link that attaches to the RPA. Usually it is a LAN where the RPA is 3407 in the IP subnet assigned to the LAN. The root node of a BIDIR-PIM 3408 tree is a node which has an interface on the RPL. 3410 On any LAN (other than the RPL) which is a link in a PIM-bidir tree, 3411 there must be a single node that has been chosen to be the DF. (More 3412 precisely, for each RPA there is a single node which is the DF for 3413 that RPA.) A node which receives traffic from an upstream interface 3414 may forward it on a particular downstream interface only if the node 3415 is the DF for that downstream interface.
A node which receives 3416 traffic from a downstream interface may forward it on an upstream 3417 interface only if that node is the DF for the downstream interface. 3419 If, for any period of time, there is a link on which each of two 3420 different nodes believes itself to be the DF, data forwarding loops 3421 can form. Loops in a bidirectional multicast tree can be very 3422 harmful. However, any election procedure will have a convergence 3423 period. The BIDIR-PIM DF election procedure is very complicated, 3424 because it goes to great pains to ensure that if convergence is not 3425 extremely fast, then there is no forwarding at all until convergence 3426 has taken place. 3428 Other variants of PIM also have a DF election procedure for LANs. 3429 However, as long as the multicast tree is unidirectional, 3430 disagreement about who the DF is can result only in duplication of 3431 packets, not in loops. Therefore the time taken to converge on a 3432 single DF is of much less concern for unidirectional trees than it 3433 is for bidirectional trees. 3435 In the MVPN environment, if PIM signaling is used among the PEs, the 3436 standard LAN-based DF election procedure can be used. 3437 However, election procedures that are optimized for a LAN may not 3438 work as well in the MVPN environment. So an alternative to DF 3439 election would be desirable. 3441 If BGP signaling is used among the PEs, an alternative to DF election 3442 is necessary. One might think that the "single forwarder 3443 selection" procedures described in sections 5 and 9 could be used to 3444 choose a single PE "DF" for the backbone (for a given RPA in a given 3445 MVPN). However, that is still likely to leave a convergence period 3446 of at least several seconds during which loops could form, and there 3447 could be a much longer convergence period if there is anything 3448 disrupting the smooth flow of BGP updates. So a simple procedure 3449 like that is not sufficient.
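The two BIDIR-PIM forwarding rules quoted above both reduce to the same check: whichever direction traffic crosses a downstream interface, the forwarding node must be the elected DF on that interface. A minimal sketch of just those two rules (interface names and the `df_interfaces` set are illustrative, not from the specification):

```python
def may_forward(df_interfaces, upstream_iface, in_iface, out_iface):
    """Return True if traffic arriving on in_iface may be forwarded on
    out_iface, per the two BIDIR-PIM DF rules quoted above.

    df_interfaces: the set of interfaces on which this node won the DF
    election for the relevant RPA; upstream_iface: the interface toward
    the RPA.
    """
    if in_iface == upstream_iface:
        # Rule 1: upstream -> downstream requires being DF on the
        # outgoing (downstream) interface.
        return out_iface in df_interfaces
    if out_iface == upstream_iface:
        # Rule 2: downstream -> upstream requires being DF on the
        # incoming (downstream) interface.
        return in_iface in df_interfaces
    # Other interface combinations are outside the two rules sketched
    # here.
    return False
```

The sketch makes concrete why two nodes both believing themselves DF on the same link is dangerous: both would return True for the same traffic, and in a bidirectional tree that creates a forwarding loop rather than mere duplication.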
3451 The remainder of this section describes two different methods that 3452 can be used to support BIDIR-PIM while eliminating the DF election. 3454 12.1. The VPN Backbone Becomes the RPL 3456 On a per MVPN basis, this method treats the whole service provider 3457 infrastructure as a single RPL (RP Link). We refer to such an RPL as 3458 an "MVPN-RPL". This eliminates the need for the PEs to engage in any 3459 "DF election" procedure, because PIM-bidir does not have a DF on the 3460 RPL. 3462 However, this method can only be used if the customer is 3463 "outsourcing" the RPL/RPA functionality to the SP. 3465 An MVPN-RPL could be realized either via an I-PMSI (this I-PMSI is on 3466 a per MVPN basis and spans all the PEs that have sites of a given 3467 MVPN), or via a collection of S-PMSIs, or even via a combination of 3468 an I-PMSI and one or more S-PMSIs. 3470 12.1.1. Control Plane 3472 Associated with each MVPN-RPL is an address prefix that is 3473 unambiguous within the context of the MVPN associated with the MVPN- 3474 RPL. 3476 For a given MVPN, each VRF connected to an MVPN-RPL of that MVPN is 3477 configured to advertise to all of its connected CEs the address 3478 prefix of the MVPN-RPL. 3480 Since in PIM Bidir there is no Designated Forwarder on an RPL, in the 3481 context of MVPN-RPL there is no need to perform the Designated 3482 Forwarder election among the PEs (note that it is still necessary to 3483 perform the Designated Forwarder election between a PE and its 3484 directly attached CEs, but that is done using plain PIM Bidir 3485 procedures). 3487 For a given MVPN, a PE connected to an MVPN-RPL of that MVPN should 3488 send multicast data (C-S,C-G) on the MVPN-RPL only if at least one 3489 other PE connected to the MVPN-RPL has a downstream multicast state 3490 for C-G.
In the context of MVPN this is accomplished by requiring a PE 3491 that has a downstream state for a particular C-G of a particular VRF 3492 present on the PE to originate a C-multicast route for (*, C-G). The 3493 RD of this route should be the same as the RD associated with the 3494 VRF. The RT(s) carried by the route should be the same as the one(s) 3495 used for VPN-IPv4 routes. This route will be distributed to all the 3496 PEs of the MVPN. 3498 12.1.2. Data Plane 3500 A PE that receives (C-S,C-G) multicast data from a CE should forward 3501 this data on the MVPN-RPL of the MVPN the CE belongs to only if the 3502 PE has received at least one C-multicast route for (*, C-G). Otherwise, 3503 the PE should not forward the data on the RPL/I-PMSI. 3505 When a PE receives a multicast packet with (C-S,C-G) on an MVPN-RPL 3506 associated with a given MVPN, the PE forwards this packet to every 3507 directly connected CE of that MVPN that has sent a Join 3508 (*,C-G) to the PE (i.e., for which the PE has downstream (*,C-G) 3509 state). The PE does not forward this packet back on the MVPN-RPL. If 3510 a PE has no downstream (*,C-G) state, the PE does not forward the 3511 packet. 3513 12.2. Partitioned Sets of PEs 3515 This method does not require the use of the MVPN-RPL, and does not 3516 require the customer to outsource the RPA/RPL functionality to the 3517 SP. 3519 12.2.1. Partitions 3521 Consider a particular C-RPA, call it C-R, in a particular MVPN. 3522 Consider the set of PEs that attach to sites that have senders or 3523 receivers for a BIDIR-PIM group C-G, where C-R is the RPA for C-G. 3524 (As always, we use the "C-" prefix to indicate that we are referring 3525 to an address in the VPN's address space rather than in the 3526 provider's address space.) 3528 Following the procedures of section 5.1, each PE in the set 3529 independently chooses some other PE in the set to be its "upstream 3530 PE" for those BIDIR-PIM groups with RPA C-R.
Optionally, they can 3531 all choose the "default selection" (described in section 5.1), to 3532 ensure that each PE chooses the same upstream PE. Note that if a 3533 PE has a route to C-R via a VRF interface, then the PE may choose 3534 itself as the upstream PE. 3536 The set of PEs can now be partitioned into a number of subsets. 3537 We'll say that PE1 and PE2 are in the same partition if and only if 3538 there is some PE3 such that PE1 and PE2 have each chosen PE3 as the 3539 upstream PE for C-R. Note that each partition has exactly one 3540 upstream PE. So it is possible to identify the partition by 3541 identifying its upstream PE. 3543 Consider a packet P, and let PE1 be its ingress PE. PE1 will send the 3544 packet on a PMSI so that it reaches the other PEs that need to 3545 receive it. This is done by encapsulating the packet and sending it 3546 on a P-tunnel. If the original packet is part of a PIM-BIDIR group 3547 (its ingress PE determines this from the packet's destination address 3548 C-G), and if the VPN backbone is not the RPL, then the encapsulation 3549 MUST carry information that can be used to identify the partition to 3550 which the ingress PE belongs. 3552 When PE2 receives a packet from the PMSI, PE2 must determine, by 3553 examining the encapsulation, whether the packet's ingress PE belongs 3554 to the same partition (relative to the C-RPA of the packet's C-G) 3555 that PE2 itself belongs to. If not, PE2 discards the packet. 3556 Otherwise PE2 performs the normal BIDIR-PIM data packet processing. 3557 With this rule in place, harmful loops cannot be introduced by the 3558 PEs into the customer's bidirectional tree. 3560 Note that if there is more than one partition, the VPN backbone will 3561 not carry a packet from one partition to another. The only way for a 3562 packet to get from one partition to another is for it to go up 3563 towards the RPA and then to go down another path to the backbone.
If 3564 this is not considered desirable, then all PEs should choose the same 3565 upstream PE for a given C-RPA. Then multiple partitions will only 3566 exist during routing transients. 3568 12.2.2. Using PE Labels 3570 If a given P-tunnel is to be used to carry packets belonging to a 3571 bidirectional C-group, then, EXCEPT for the case described in section 3572 12.2.3, the packets that travel on that P-tunnel MUST carry a PE label 3573 (defined in section 4), using the encapsulation discussed in section 3574 11.3. 3576 When a given PE transmits a given packet of a bidirectional C-group 3577 to the P-tunnel, the packet will carry the PE label corresponding to 3578 the partition, for the C-group's C-RPA, that contains the 3579 transmitting PE. This is the PE label that has been bound to the 3580 upstream PE of that partition; it is not necessarily the label that 3581 has been bound to the transmitting PE. 3583 Recall that the PE labels are upstream-assigned labels that are 3584 assigned and advertised by the node which is at the root of the P- 3585 tunnel. (Procedures for PE label assignment when the P-tunnel is not 3586 a multicast tree will be given in later revisions of this document.) 3588 When a PE receives a packet with a PE label that does not identify 3589 the partition of the receiving PE, the receiving PE discards the 3590 packet. 3592 Note that this procedure does not require the root of a P-tunnel to 3593 assign a PE label for every PE that belongs to the tunnel, but only 3594 for those PEs that might become the upstream PEs of some partition. 3596 12.2.3. Mesh of MP2MP P-Tunnels 3598 There is one case in which support for BIDIR-PIM C-groups does not 3599 require the use of a PE label. For a given C-RPA, suppose that each 3600 partition is served by a distinct MP2MP LSP P-tunnel. Then for a 3601 given packet, a PE receiving the packet from a P-tunnel can infer 3602 the partition from the tunnel.
So PE labels are not needed in this 3603 case. 3605 13. Security Considerations 3607 This document describes an extension to the procedures of [RFC4364], 3608 and hence shares the security considerations described in [RFC4364] 3609 and [RFC4365]. 3611 When GRE encapsulation is used, the security considerations of [MPLS- 3612 IP] are also relevant. The security considerations of [RFC4797] are 3613 also relevant, as it discusses implications on packet spoofing in the 3614 context of 2547 VPNs. 3616 The security considerations of [MPLS-HDR] apply when MPLS 3617 encapsulation is used. 3619 This document makes use of a number of control protocols: PIM [PIM- 3620 SM], BGP [MVPN-BGP], mLDP [MLDP], and RSVP-TE [RSVP-P2MP]. Security 3621 considerations relevant to each protocol are discussed in the 3622 respective protocol specifications. 3624 If one uses the UDP-based protocol for switching to S-PMSI (as 3625 specified in Section 7.2.1), then by default each PE router MUST 3626 install packet filters that would result in discarding all UDP 3627 packets with the destination port 3232 that the PE router receives 3628 from the CE routers connected to the PE router. 3630 The various procedures for P-tunnel construction have security issues 3631 that are specific to the way in which the P-tunnels are used in this 3632 document. When P-tunnels are constructed via such techniques as 3633 PIM, mLDP, or RSVP-TE, it is important for each P or PE router 3634 receiving a control message to be sure that the control message comes 3635 from another P or PE router, not from a CE router. This should not 3636 be a problem, because mLDP or PIM or RSVP-TE control messages from CE 3637 routers will never be interpreted as referring to P-tunnels. 3639 An ASBR may receive, from one SP's domain, an mLDP, PIM, or RSVP-TE 3640 control message that attempts to extend a multicast distribution tree 3641 from one SP's domain into another SP's domain.
The ASBR should not 3642 allow this unless explicitly configured to do so. 3644 14. IANA Considerations 3646 Section 7.2.1.1 defines the "S-PMSI Join Message", which is carried 3647 in a UDP datagram whose port number is 3232. This port number is 3648 already assigned by IANA to "MDT port". IANA should now have that 3649 assignment reference this document. 3651 IANA should create a registry for the "S-PMSI Join Message Type 3652 Field". The value 1 should be registered with a reference to this 3653 document. The description should read "PIM IPv4 S-PMSI 3654 (unaggregated)". 3656 15. Other Authors 3658 Sarveshwar Bandi, Yiqun Cai, Thomas Morin, Yakov Rekhter, IJsbrand 3659 Wijnands, Seisho Yasukawa 3661 16. Other Contributors 3663 Significant contributions were made by Arjen Boers, Toerless Eckert, 3664 Adrian Farrel, Luyuan Fang, Dino Farinacci, Lenny Guiliano, Shankar 3665 Karuna, Anil Lohiya, Tom Pusateri, Ted Qian, Robert Raszuk, Tony 3666 Speakman, and Dan Tappan. 3668 17. Authors' Addresses 3670 Rahul Aggarwal (Editor) 3671 Juniper Networks 3672 1194 North Mathilda Ave. 3673 Sunnyvale, CA 94089 3674 Email: rahul@juniper.net 3675 Sarveshwar Bandi 3676 Motorola 3677 Vanenburg IT park, Madhapur, 3678 Hyderabad, India 3679 Email: sarvesh@motorola.com 3681 Yiqun Cai 3682 Cisco Systems, Inc. 3683 170 Tasman Drive 3684 San Jose, CA, 95134 3685 E-mail: ycai@cisco.com 3687 Thomas Morin 3688 France Telecom R & D 3689 2, avenue Pierre-Marzin 3690 22307 Lannion Cedex 3691 France 3692 Email: thomas.morin@francetelecom.com 3694 Yakov Rekhter 3695 Juniper Networks 3696 1194 North Mathilda Ave. 3697 Sunnyvale, CA 94089 3698 Email: yakov@juniper.net 3700 Eric C. Rosen (Editor) 3701 Cisco Systems, Inc. 3702 1414 Massachusetts Avenue 3703 Boxborough, MA, 01719 3704 E-mail: erosen@cisco.com 3706 IJsbrand Wijnands 3707 Cisco Systems, Inc.
3708 170 Tasman Drive 3709 San Jose, CA, 95134 3710 E-mail: ice@cisco.com 3711 Seisho Yasukawa 3712 NTT Corporation 3713 9-11, Midori-Cho 3-Chome 3714 Musashino-Shi, Tokyo 180-8585, 3715 Japan 3716 Phone: +81 422 59 4769 3717 Email: yasukawa.seisho@lab.ntt.co.jp 3719 18. Normative References 3721 [MLDP] I. Minei, K. Kompella, I. Wijnands, B. Thomas, "Label 3722 Distribution Protocol Extensions for Point-to-Multipoint and 3723 Multipoint-to-Multipoint Label Switched Paths", draft-ietf-mpls-ldp- 3724 p2mp-03, July 2007 3726 [MPLS-HDR] E. Rosen, et al., "MPLS Label Stack Encoding", RFC 3032, 3727 January 2001 3729 [MPLS-IP] T. Worster, Y. Rekhter, E. Rosen, "Encapsulating MPLS in IP 3730 or Generic Routing Encapsulation (GRE)", RFC 4023, March 2005 3732 [MPLS-MCAST-ENCAPS] T. Eckert, E. Rosen, R. Aggarwal, Y. Rekhter, 3733 "MPLS Multicast Encapsulations", draft-ietf-mpls-multicast- 3734 encaps-06.txt, July 2007 3736 [MPLS-UPSTREAM-LABEL] R. Aggarwal, Y. Rekhter, E. Rosen, "MPLS 3737 Upstream Label Assignment and Context Specific Label Space", draft- 3738 ietf-mpls-upstream-label-02.txt, March 2007 3740 [MVPN-BGP] R. Aggarwal, E. Rosen, T. Morin, Y. Rekhter, C. 3741 Kodeboniya, "BGP Encodings for Multicast in MPLS/BGP IP VPNs", draft- 3742 ietf-l3vpn-2547bis-mcast-bgp-04.txt, November 2007 3744 [PIM-ATTRIB] A. Boers, IJ. Wijnands, E. Rosen, "Format for Using 3745 TLVs in PIM Messages", draft-ietf-pim-join-attributes-03, May 2007 3747 [PIM-SM] "Protocol Independent Multicast - Sparse Mode (PIM-SM)", 3748 Fenner, Handley, Holbrook, Kouvelas, August 2006, RFC 4601 3750 [RFC2119] "Key words for use in RFCs to Indicate Requirement 3751 Levels.", Bradner, March 1997 3753 [RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, et al., February 2006 3755 [RSVP-P2MP] R. Aggarwal, D. Papadimitriou, S. Yasukawa, et al., 3756 "Extensions to RSVP-TE for Point-to-Multipoint TE LSPs", RFC 4875, 3757 May 2007 3759 19. Informative References 3761 [ADMIN-ADDR] D.
Meyer, "Administratively Scoped IP Multicast", RFC 3762 2365, July 1998 3764 [MVPN-REQ] T. Morin, Ed., "Requirements for Multicast in L3 Provider- 3765 Provisioned VPNs", RFC 4834, April 2007 3767 [RFC1853] W. Simpson, "IP in IP Tunneling", October 1995 3769 [RFC2784] D. Farinacci, et al., "Generic Routing Encapsulation", 3770 March 2000 3772 [RFC2890] G. Dommety, "Key and Sequence Number Extensions to GRE", 3773 September 2000 3775 [RFC2983] D. Black, "Differentiated Services and Tunnels", October 3776 2000 3778 [RFC3270] F. Le Faucheur, et al., "MPLS Support of Differentiated 3779 Services", May 2002 3781 [RFC4365] E. Rosen, "Applicability Statement for BGP/MPLS IP 3782 Virtual Private Networks (VPNs)", February 2006 3784 [RFC4607] H. Holbrook, B. Cain, "Source-Specific Multicast for IP", 3785 August 2006 3787 [RFC4797] Y. Rekhter, R. Bonica, E. Rosen, "Use of Provider Edge to 3788 Provider Edge (PE-PE) Generic Routing Encapsulation (GRE) or IP in 3789 BGP/MPLS IP Virtual Private Networks", January 2007 3791 20. Full Copyright Statement 3793 Copyright (C) The IETF Trust (2008). 3795 This document is subject to the rights, licenses and restrictions 3796 contained in BCP 78, and except as set forth therein, the authors 3797 retain all their rights. 3799 This document and the information contained herein are provided on an 3800 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 3801 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 3802 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 3803 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 3804 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 3805 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 3807 21.
Intellectual Property 3809 The IETF takes no position regarding the validity or scope of any 3810 Intellectual Property Rights or other rights that might be claimed to 3811 pertain to the implementation or use of the technology described in 3812 this document or the extent to which any license under such rights 3813 might or might not be available; nor does it represent that it has 3814 made any independent effort to identify any such rights. Information 3815 on the procedures with respect to rights in RFC documents can be 3816 found in BCP 78 and BCP 79. 3818 Copies of IPR disclosures made to the IETF Secretariat and any 3819 assurances of licenses to be made available, or the result of an 3820 attempt made to obtain a general license or permission for the use of 3821 such proprietary rights by implementers or users of this 3822 specification can be obtained from the IETF on-line IPR repository at 3823 http://www.ietf.org/ipr. 3825 The IETF invites any interested party to bring to its attention any 3826 copyrights, patents or patent applications, or other proprietary 3827 rights that may cover technology that may be required to implement 3828 this standard. Please address the information to the IETF at 3829 ietf-ipr@ietf.org.