Network Working Group                             Eric C. Rosen (Editor)
Internet Draft                                       Cisco Systems, Inc.
Expiration Date: October 2007
                                                 Rahul Aggarwal (Editor)
                                                        Juniper Networks

                                                              April 2007


                     Multicast in MPLS/BGP IP VPNs

                 draft-ietf-l3vpn-2547bis-mcast-04.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   In order for IP multicast traffic within a BGP/MPLS IP VPN (Virtual
   Private Network) to travel from one VPN site to another, special
   protocols and procedures must be implemented by the VPN Service
   Provider. These protocols and procedures are specified in this
   document.

Table of Contents

    1          Specification of requirements
    2          Introduction
    2.1        Optimality vs Scalability
    2.1.1      Multicast Distribution Trees
    2.1.2      Ingress Replication through Unicast Tunnels
    2.2        Overview
    2.2.1      Multicast Routing Adjacencies
    2.2.2      MVPN Definition
    2.2.3      Auto-Discovery
    2.2.4      PE-PE Multicast Routing Information
    2.2.5      PE-PE Multicast Data Transmission
    2.2.6      Inter-AS MVPNs
    2.2.7      Optional Deployment Models
    3          Concepts and Framework
    3.1        PE-CE Multicast Routing
    3.2        P-Multicast Service Interfaces (PMSIs)
    3.2.1      Inclusive and Selective PMSIs
    3.2.2      Tunnels Instantiating PMSIs
    3.3        Use of PMSIs for Carrying Multicast Data
    3.3.1      MVPNs with Default MI-PMSIs
    3.3.2      When MI-PMSIs are Required
    3.3.3      MVPNs That Do Not Use MI-PMSIs
    4          BGP-Based Autodiscovery of MVPN Membership
    5          PE-PE Transmission of C-Multicast Routing
    5.1        RPF Information for Unicast VPN-IP Routes
    5.2        PIM Peering
    5.2.1      Full Per-MVPN PIM Peering Across a MI-PMSI
    5.2.2      Lightweight PIM Peering Across a MI-PMSI
    5.2.3      Unicasting of PIM C-Join/Prune Messages
    5.2.4      Details of Per-MVPN PIM Peering over MI-PMSI
    5.2.4.1    PIM C-Instance Control Packets
    5.2.4.2    PIM C-instance RPF Determination
    5.3        Use of BGP for Carrying C-Multicast Routing
    5.3.1      Sending BGP Updates
    5.3.2      Explicit Tracking
    5.3.3      Withdrawing BGP Updates
    6          I-PMSI Instantiation
    6.1        MVPN Membership and Egress PE Auto-Discovery
    6.1.1      Auto-Discovery for Ingress Replication
    6.1.2      Auto-Discovery for P-Multicast Trees
    6.2        C-Multicast Routing Information Exchange
    6.3        Aggregation
    6.3.1      Aggregate Tree Leaf Discovery
    6.3.2      Aggregation Methodology
    6.3.3      Encapsulation of the Aggregate Tree
    6.3.4      Demultiplexing C-multicast traffic
    6.4        Mapping Received Packets to MVPNs
    6.4.1      Unicast Tunnels
    6.4.2      Non-Aggregated P-Multicast Trees
    6.4.3      Aggregate P-Multicast Trees
    6.5        I-PMSI Instantiation Using Ingress Replication
    6.6        Establishing P-Multicast Trees
    6.7        RSVP-TE P2MP LSPs
    6.7.1      P2MP TE LSP Tunnel - MVPN Mapping
    6.7.2      Demultiplexing C-Multicast Data Packets
    7          Optimizing Multicast Distribution via S-PMSIs
    7.1        S-PMSI Instantiation Using Ingress Replication
    7.2        Protocol for Switching to S-PMSIs
    7.2.1      A UDP-based Protocol for Switching to S-PMSIs
    7.2.1.1    Binding a Stream to an S-PMSI
    7.2.1.2    Packet Formats and Constants
    7.2.2      A BGP-based Protocol for Switching to S-PMSIs
    7.2.2.1    Advertising C-(S, G) Binding to a S-PMSI using BGP
    7.2.2.2    Explicit Tracking
    7.2.2.3    Switching to S-PMSI
    7.3        Aggregation
    7.4        Instantiating the S-PMSI with a PIM Tree
    7.5        Instantiating S-PMSIs using RSVP-TE P2MP Tunnels
    8          Inter-AS Procedures
    8.1        Non-Segmented Inter-AS Tunnels
    8.1.1      Inter-AS MVPN Auto-Discovery
    8.1.2      Inter-AS MVPN Routing Information Exchange
    8.1.3      Inter-AS I-PMSI
    8.1.4      Inter-AS S-PMSI
    8.2        Segmented Inter-AS Tunnels
    8.2.1      Inter-AS MVPN Auto-Discovery Routes
    8.2.1.1    Originating Inter-AS MVPN A-D Information
    8.2.1.2    Propagating Inter-AS MVPN A-D Information
    8.2.1.2.1  Inter-AS Auto-Discovery Route received via EBGP
    8.2.1.2.2  Leaf Auto-Discovery Route received via EBGP
    8.2.1.2.3  Inter-AS Auto-Discovery Route received via IBGP
    8.2.2      Inter-AS MVPN Routing Information Exchange
    8.2.3      Inter-AS I-PMSI
    8.2.3.1    Support for Unicast VPN Inter-AS Methods
    8.2.4      Inter-AS S-PMSI
    9          Duplicate Packet Detection and Single Forwarder PE
    10         Deployment Models
    10.1       Co-locating C-RPs on a PE
    10.1.1     Initial Configuration
    10.1.2     Anycast RP Based on Propagating Active Sources
    10.1.2.1   Receiver(s) Within a Site
    10.1.2.2   Source Within a Site
    10.1.2.3   Receiver Switching from Shared to Source Tree
    10.2       Using MSDP between a PE and a Local C-RP
    11         Encapsulations
    11.1       Encapsulations for Single PMSI per Tunnel
    11.1.1     Encapsulation in GRE
    11.1.2     Encapsulation in IP
    11.1.3     Encapsulation in MPLS
    11.2       Encapsulations for Multiple PMSIs per Tunnel
    11.2.1     Encapsulation in GRE
    11.2.2     Encapsulation in IP
    11.3       Encapsulations for Unicasting PIM Control Messages
    11.4       General Considerations for IP and GRE Encaps
    11.4.1     MTU
    11.4.2     TTL
    11.4.3     Differentiated Services
    11.4.4     Avoiding Conflict with Internet Multicast
    12         Security Considerations
    13         IANA Considerations
    14         Other Authors
    15         Other Contributors
    16         Authors' Addresses
    17         Normative References
    18         Informative References
    19         Full Copyright Statement
    20         Intellectual Property

1. Specification of requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Introduction

   [RFC4364] specifies the set of procedures which a Service Provider
   (SP) must implement in order to provide a particular kind of VPN
   service ("BGP/MPLS IP VPN") for its customers. The service described
   therein allows IP unicast packets to travel from one customer site to
   another, but it does not provide a way for IP multicast traffic to
   travel from one customer site to another.

   This document extends the service defined in [RFC4364] so that it
   also includes the capability of handling IP multicast traffic.
   This requires a number of different protocols to work together. The
   document provides a framework describing how the various protocols
   fit together, and also provides detailed specification of some of the
   protocols. The detailed specification of some of the other
   protocols is found in pre-existing documents or in companion
   documents.

2.1. Optimality vs Scalability

   In a "BGP/MPLS IP VPN" [RFC4364], unicast routing of VPN packets is
   achieved without the need to keep any per-VPN state in the core of
   the SP's network (the "P routers"). Routing information from a
   particular VPN is maintained only by the Provider Edge routers (the
   "PE routers", or "PEs") that attach directly to sites of that VPN.
   Customer data travels through the P routers in tunnels from one PE to
   another (usually MPLS Label Switched Paths, LSPs), so to support the
   VPN service the P routers only need to have routes to the PE routers.
   The PE-to-PE routing is optimal, but the amount of associated state
   in the P routers depends only on the number of PEs, not on the number
   of VPNs.

   However, in order to provide optimal multicast routing for a
   particular multicast flow, the P routers through which that flow
   travels have to hold state which is specific to that flow.
   Scalability would be poor if the amount of state in the P routers
   were proportional to the number of multicast flows in the VPNs.
   Therefore, when supporting multicast service for a BGP/MPLS IP VPN,
   the optimality of the multicast routing must be traded off against
   the scalability of the P routers. We explain this below in more
   detail.

   If a particular VPN is transmitting "native" multicast traffic over
   the backbone, we refer to it as an "MVPN".
   By "native" multicast traffic, we mean packets that a CE sends to a
   PE, such that the IP destination address of the packets is a
   multicast group address, or the packets are multicast control
   packets addressed to the PE router itself, or the packets are IP
   multicast data packets encapsulated in MPLS.

   We say that the backbone multicast routing for a particular multicast
   group in a particular VPN is "optimal" if and only if all of the
   following conditions hold:

     - When a PE router receives a multicast data packet of that group
       from a CE router, it transmits the packet in such a way that the
       packet is received by every other PE router which is on the path
       to a receiver of that group;

     - The packet is not received by any other PEs;

     - While in the backbone, no more than one copy of the packet ever
       traverses any link;

     - While in the backbone, if bandwidth usage is to be optimized, the
       packet traverses minimum cost trees rather than shortest path
       trees.

   Optimal routing for a particular multicast group requires that the
   backbone maintain one or more source-trees which are specific to that
   group. Each such tree requires that state be maintained in all the P
   routers that are in the tree.

   This would potentially require an unbounded amount of state in the P
   routers, since the SP has no control over the number of multicast
   groups in the VPNs that it supports. Nor does the SP have any control
   over the number of transmitters in each group, nor over the
   distribution of the receivers.

   The procedures defined in this document allow an SP to provide
   multicast VPN service without requiring the amount of state
   maintained by the P routers to be proportional to the number of
   multicast data flows in the VPNs. The amount of state is traded off
   against the optimality of the multicast routing.
   Enough flexibility is provided so that a given SP can make its own
   tradeoffs between scalability and optimality. An SP can even allow
   some multicast groups in some VPNs to receive optimal routing, while
   others do not. Of course, the cost of this flexibility is an increase
   in the number of options provided by the protocols.

   The basic technique for providing scalability is to aggregate a
   number of customer multicast flows onto a single multicast
   distribution tree through the P routers. A number of aggregation
   methods are supported.

   The procedures defined in this document also accommodate the SP that
   does not want to build multicast distribution trees in its backbone
   at all; the ingress PE can replicate each multicast data packet and
   then unicast each replica through a tunnel to each egress PE that
   needs to receive the data.

2.1.1. Multicast Distribution Trees

   This document supports the use of a single multicast distribution
   tree in the backbone to carry all the multicast traffic from a
   specified set of one or more MVPNs. Such a tree is referred to as an
   "Inclusive Tree". An Inclusive Tree which carries the traffic of more
   than one MVPN is an "Aggregate Inclusive Tree". An Inclusive Tree
   contains, as its members, all the PEs that attach to any of the MVPNs
   using the tree.

   With this option, even if each tree supports only one MVPN, the upper
   bound on the amount of state maintained by the P routers is
   proportional to the number of VPNs supported, rather than to the
   number of multicast flows in those VPNs. If the trees are
   unidirectional, it would be more accurate to say that the state is
   proportional to the product of the number of VPNs and the average
   number of PEs per VPN. The amount of state maintained by the P
   routers can be further reduced by aggregating more MVPNs onto a
   single tree.
   If each such tree supports a set of MVPNs (call it an "MVPN
   aggregation set"), the state maintained by the P routers is
   proportional to the product of the number of MVPN aggregation sets
   and the average number of PEs per MVPN. Thus the state does not grow
   linearly with the number of MVPNs.

   However, as data from many multicast groups is aggregated together
   onto a single "Inclusive Tree", it is likely that some PEs will
   receive multicast data for which they have no need, i.e., some degree
   of optimality has been sacrificed.

   This document also provides procedures which enable a single
   multicast distribution tree in the backbone to be used to carry
   traffic belonging only to a specified set of one or more multicast
   groups, from one or more MVPNs. Such a tree is referred to as a
   "Selective Tree" and more specifically as an "Aggregate Selective
   Tree" when the multicast groups belong to different MVPNs. By
   default, traffic from most multicast groups could be carried by an
   Inclusive Tree, while traffic from, e.g., high bandwidth groups could
   be carried in one of the "Selective Trees". When setting up the
   Selective Trees, one should include only those PEs which need to
   receive multicast data from one or more of the groups assigned to the
   tree. This provides more optimal routing than can be obtained by
   using only Inclusive Trees, though it requires additional state in
   the P routers.

2.1.2. Ingress Replication through Unicast Tunnels

   This document also provides procedures for carrying MVPN data traffic
   through unicast tunnels from the ingress PE to each of the egress
   PEs. The ingress PE replicates the multicast data packet received
   from a CE and sends it to each of the egress PEs using the unicast
   tunnels.
   This requires no multicast routing state in the P routers at all, but
   it puts the entire replication load on the ingress PE router, and
   makes no attempt to optimize the multicast routing.

2.2. Overview

2.2.1. Multicast Routing Adjacencies

   In BGP/MPLS IP VPNs [RFC4364], each CE ("Customer Edge") router is a
   unicast routing adjacency of a PE router, but CE routers at different
   sites do not become unicast routing adjacencies of each other. This
   important characteristic is retained for multicast routing -- a CE
   router becomes a multicast routing adjacency of a PE router, but CE
   routers at different sites do not become multicast routing
   adjacencies of each other.

   The multicast routing protocol on the PE-CE link is presumed to be
   PIM. The Sparse, Dense, Source-Specific, and Bidirectional Modes of
   PIM are supported. A CE router exchanges "ordinary" PIM control
   messages with the PE router to which it is attached.

   The PEs attaching to a particular MVPN then have to exchange the
   multicast routing information with each other. Two basic methods for
   doing this are defined: (1) PE-PE PIM, and (2) BGP. In the former
   case, the PEs need to be multicast routing adjacencies of each other.
   In the latter case, they do not. For example, each PE may be a BGP
   adjacency of a Route Reflector (RR), and not of any other PEs.

   To support the "Carrier's Carrier" model of [RFC4364], mLDP or BGP
   can be used on the PE-CE interface. This will be described in
   subsequent versions of this document.

2.2.2. MVPN Definition

   An MVPN is defined by two sets of sites, the Sender Sites set and the
   Receiver Sites set, with the following properties:

     - Hosts within the Sender Sites set could originate multicast
       traffic for receivers in the Receiver Sites set.

     - Receivers not in the Receiver Sites set should not be able to
       receive this traffic.

     - Hosts within the Receiver Sites set could receive multicast
       traffic originated by any host in the Sender Sites set.

     - Hosts within the Receiver Sites set should not be able to
       receive multicast traffic originated by any host that is not in
       the Sender Sites set.

   A site could be in both the Sender Sites set and the Receiver Sites
   set, which implies that hosts within such a site could both originate
   and receive multicast traffic. An extreme case is when the Sender
   Sites set is the same as the Receiver Sites set, in which case all
   sites could originate and receive multicast traffic from each other.

   Sites within a given MVPN may be within the same organization or in
   different organizations, which implies that an MVPN can be either an
   Intranet or an Extranet.

   A given site may be in more than one MVPN, which implies that MVPNs
   may overlap.

   Not all sites of a given MVPN have to be connected to the same
   service provider, which implies that an MVPN can span multiple
   service providers.

   Another way to look at an MVPN is to say that it is defined by a
   set of administrative policies. Such policies determine both the
   Sender Sites set and the Receiver Sites set. Such policies are
   established by MVPN customers, but implemented/realized by MVPN
   Service Providers using the existing BGP/MPLS VPN mechanisms, such as
   Route Targets, with extensions, as necessary.

2.2.3. Auto-Discovery

   In order for the PE routers attaching to a given MVPN to exchange
   MVPN control information with each other, each one needs to discover
   all the other PEs that attach to the same MVPN. (Strictly speaking,
   a PE in the Receiver Sites set need only discover the other PEs in
   the Sender Sites set, and a PE in the Sender Sites set need only
   discover the other PEs in the Receiver Sites set.) This is referred
   to as "MVPN Auto-Discovery".
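
   As an informal illustration of the MVPN definition and of the
   "strictly speaking" discovery rule above, the following Python sketch
   models an MVPN as a pair of site sets. It is not part of any
   protocol. The names Mvpn, may_deliver, pes_to_discover and
   site_to_pe are hypothetical, and the sketch assumes, for simplicity,
   that each site attaches to exactly one PE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mvpn:
    """An MVPN modeled as a Sender Sites set and a Receiver Sites set."""
    sender_sites: frozenset
    receiver_sites: frozenset

    def may_deliver(self, src_site, dst_site):
        # Traffic may flow only from a Sender Site to a Receiver Site.
        return src_site in self.sender_sites and dst_site in self.receiver_sites


def pes_to_discover(mvpn, site_to_pe, local_pe):
    """Return the PEs that local_pe strictly needs to discover.

    site_to_pe is a hypothetical mapping from a site name to the single
    PE that the site attaches to (an assumption of this sketch).
    """
    sender_pes = {site_to_pe[s] for s in mvpn.sender_sites}
    receiver_pes = {site_to_pe[s] for s in mvpn.receiver_sites}
    needed = set()
    if local_pe in receiver_pes:
        needed |= sender_pes      # a receiver-side PE discovers sender-side PEs
    if local_pe in sender_pes:
        needed |= receiver_pes    # a sender-side PE discovers receiver-side PEs
    needed.discard(local_pe)      # a PE never needs to discover itself
    return needed


# One sender site behind PE1, two receiver sites behind PE2 and PE3.
site_to_pe = {"site-A": "PE1", "site-B": "PE2", "site-C": "PE3"}
mvpn = Mvpn(frozenset({"site-A"}), frozenset({"site-B", "site-C"}))
```

   In this example PE2 only needs to discover PE1, while PE1 needs to
   discover both PE2 and PE3. PE2 and PE3 do not need to discover each
   other, because neither attaches to a site in the Sender Sites set.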

   This document discusses two ways of providing MVPN auto-discovery:

     - BGP can be used for discovering and maintaining MVPN membership.
       The PE routers advertise their MVPN membership to other PE
       routers using BGP. A PE is considered to be a "member" of a
       particular MVPN if it contains a VRF (Virtual Routing and
       Forwarding table, see [RFC4364]) which is configured to contain
       the multicast routing information of that MVPN. This
       auto-discovery option does not make any assumptions about the
       methods used for transmitting MVPN multicast data packets through
       the backbone.

     - If it is known that the multicast data packets of a particular
       MVPN are to be transmitted (at least, by default) through a
       non-aggregated Inclusive Tree which is to be set up by PIM-SM or
       PIM-Bidir, and if the PEs attaching to that MVPN are configured
       with the group address corresponding to that tree, then the PEs
       can auto-discover each other simply by joining the tree and then
       multicasting PIM Hellos over the tree.

2.2.4. PE-PE Multicast Routing Information

   The BGP/MPLS IP VPN [RFC4364] specification requires a PE to maintain
   at most one BGP peering with every other PE in the network. This
   peering is used to exchange VPN routing information. The use of Route
   Reflectors further reduces the number of BGP adjacencies maintained
   by a PE to exchange VPN routing information with other PEs. This
   document describes various options for exchanging MVPN control
   information between PE routers based on the use of PIM or BGP. These
   options have different overheads with respect to the number of
   routing adjacencies that a PE router needs to maintain to exchange
   MVPN control information with other PE routers. Some of these options
   allow the retention of the unicast BGP/MPLS VPN model, letting a PE
   maintain at most one routing adjacency with other PE routers to
   exchange MVPN control information.
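
   The difference in adjacency overhead can be made concrete with a
   rough count. The Python sketch below is illustrative only and rests
   on stated assumptions: with PE-PE PIM, a PE is assumed to keep one
   adjacency with every other PE in each MVPN it attaches to; with BGP,
   it keeps either one session per other PE in the network or, when
   Route Reflectors are used, one session per Route Reflector. The
   function and variable names are hypothetical.

```python
def pim_adjacencies(local_pe, mvpn_members):
    """Count PIM adjacencies when local_pe peers with every other PE
    separately in each MVPN it attaches to (an assumption of this sketch).

    mvpn_members maps an MVPN name to the set of PEs attached to it.
    """
    return sum(len(pes) - 1 for pes in mvpn_members.values() if local_pe in pes)


def bgp_adjacencies(local_pe, all_pes, route_reflectors=0):
    """Count BGP sessions kept by local_pe.

    With Route Reflectors, the PE peers only with the reflectors;
    otherwise it keeps at most one session per other PE in the network.
    """
    if route_reflectors > 0:
        return route_reflectors
    return len(set(all_pes) - {local_pe})


# Two MVPNs that share PE1 and PE2.
mvpn_members = {
    "red": {"PE1", "PE2", "PE3"},
    "blue": {"PE1", "PE2", "PE4"},
}
all_pes = {"PE1", "PE2", "PE3", "PE4"}

pim_count = pim_adjacencies("PE1", mvpn_members)               # 2 + 2
bgp_mesh_count = bgp_adjacencies("PE1", all_pes)               # PE2, PE3, PE4
bgp_rr_count = bgp_adjacencies("PE1", all_pes, route_reflectors=2)
```

   Under these assumptions the PIM count grows with the number of MVPNs
   a PE attaches to, whereas the BGP count is bounded by the number of
   PEs in the network, or by the number of Route Reflectors.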

   The solution in [RFC4364] uses BGP to exchange VPN routing
   information between PE routers. This document describes various
   solutions for exchanging MVPN control information. One option is the
   use of BGP, providing reliable transport. Another option is the use
   of the currently existing, "soft state" PIM standard [PIM-SM].

2.2.5. PE-PE Multicast Data Transmission

   Like [RFC4364], this document decouples the procedures for exchanging
   routing information from the procedures for transmitting data
   traffic. Hence a variety of transport technologies may be used in the
   backbone. For inclusive trees, these transport technologies include
   unicast PE-PE tunnels (using MPLS or IP/GRE encapsulation), multicast
   distribution trees created by PIM-SSM, PIM-SM, or PIM-Bidir (using
   IP/GRE encapsulation), point-to-multipoint LSPs created by RSVP-TE or
   mLDP, and multipoint-to-multipoint LSPs created by mLDP. (However,
   techniques for aggregating the traffic of multiple MVPNs onto a
   single multipoint-to-multipoint LSP or onto a single bidirectional
   multicast distribution tree are for further study.) For selective
   trees, only unicast PE-PE tunnels (using MPLS or IP/GRE
   encapsulation) and unidirectional single-source trees are supported,
   and the supported tree creation protocols are PIM-SSM (using IP/GRE
   encapsulation), RSVP-TE, and mLDP.

   In order to aggregate traffic from multiple MVPNs onto a single
   multicast distribution tree, it is necessary to have a mechanism to
   enable the egresses of the tree to demultiplex the multicast traffic
   received over the tree and to associate each received packet with a
   particular MVPN. This document specifies a mechanism whereby
   upstream label assignment [MPLS-UPSTREAM-LABEL] is used by the root
   of the tree to assign a label to each flow. This label is used by
   the receivers to perform the demultiplexing.
   This document also describes procedures based on BGP that are used by
   the root of an Aggregate Tree to advertise the Inclusive and/or
   Selective binding and the demultiplexing information to the leaves of
   the tree.

   This document also describes the data plane encapsulations for
   supporting the various SP multicast transport options.

   This document assumes that when SP multicast trees are used, traffic
   for a particular multicast group is transmitted by a particular PE on
   only one SP multicast tree. The use of multiple SP multicast trees
   for transmitting traffic belonging to a particular multicast group is
   for further study.

2.2.6. Inter-AS MVPNs

   [RFC4364] describes different options for supporting Inter-AS
   BGP/MPLS unicast VPNs. This document describes how Inter-AS MVPNs can
   be supported for each of the unicast BGP/MPLS VPN Inter-AS options.
   This document also specifies a model where Inter-AS MVPN service can
   be offered without requiring a single SP multicast tree to span
   multiple ASes. In this model, an inter-AS multicast tree consists of
   a number of "segments", one per AS, which are stitched together at AS
   boundary points. These are known as "segmented inter-AS trees". Each
   segment of a segmented inter-AS tree may use a different multicast
   transport technology.

   It is also possible to support Inter-AS MVPNs with non-segmented
   source trees that extend across AS boundaries.

2.2.7. Optional Deployment Models

   The document also discusses an optional MVPN deployment model in
   which PEs take on all or part of the role of a PIM RP (Rendezvous
   Point). The necessary protocol extensions to support this are
   defined.

3. Concepts and Framework

3.1. PE-CE Multicast Routing

   Support of multicast in BGP/MPLS IP VPNs is modeled closely after
   support of unicast in BGP/MPLS IP VPNs.
   That is, a multicast routing protocol will be run on the PE-CE
   interfaces, such that PE and CE are multicast routing adjacencies on
   that interface. CEs at different sites do not become multicast
   routing adjacencies of each other.

   If a PE attaches to n VPNs for which multicast support is provided
   (i.e., to n "MVPNs"), the PE will run n independent instances of a
   multicast routing protocol. We will refer to these multicast routing
   instances as "VPN-specific multicast routing instances", or more
   briefly as "multicast C-instances". The notion of a "VRF" ("Virtual
   Routing and Forwarding Table"), defined in [RFC4364], is extended to
   include multicast routing entries as well as unicast routing entries.
   Each multicast routing entry is thus associated with a particular
   VRF.

   Whether a particular VRF belongs to an MVPN or not is determined by
   configuration.

   In this document, we will not attempt to provide support for every
   multicast routing protocol that could possibly run on the PE-CE
   link. Rather, we consider multicast C-instances only for the
   following multicast routing protocols:

     - PIM Sparse Mode (PIM-SM)

     - PIM Source-Specific Mode (PIM-SSM)

     - PIM Bidirectional Mode (PIM-Bidir)

     - PIM Dense Mode (PIM-DM)

   In order to support the "Carrier's Carrier" model of [RFC4364], mLDP
   or BGP will also be supported on the PE-CE interface; however, this
   is not described in this revision.

   As the document only supports PIM-based C-instances, we will
   generally use the term "PIM C-instances" to refer to the multicast
   C-instances.

   A PE router may also be running a "provider-wide" instance of PIM (a
   "PIM P-instance"), in which it has a PIM adjacency with, e.g., each
   of its IGP neighbors (i.e., with P routers), but NOT with any CE
   routers, and not with other PE routers (unless another PE router
   happens to be an IGP adjacency).
   In this case, P routers would also
   run the P-instance of PIM, but NOT a C-instance. If there is a PIM
   P-instance, it may or may not have a role to play in support of VPN
   multicast; this is discussed in later sections. However, in no case
   will the PIM P-instance contain VPN-specific multicast routing
   information.

   In order to help clarify when we are speaking of the PIM P-instance
   and when we are speaking of a PIM C-instance, we will also apply the
   prefixes "P-" and "C-" respectively to control messages, addresses,
   etc. Thus a P-Join would be a PIM Join which is processed by the PIM
   P-instance, and a C-Join would be a PIM Join which is processed by a
   C-instance. A P-group address would be a group address in the SP's
   address space, and a C-group address would be a group address in a
   VPN's address space.

3.2. P-Multicast Service Interfaces (PMSIs)

   Multicast data packets received by a PE over a PE-CE interface must
   be forwarded to one or more of the other PEs in the same MVPN for
   delivery to one or more other CEs.

   We define the notion of a "P-Multicast Service Interface" (PMSI). If
   a particular MVPN is supported by a particular set of PE routers,
   then there will be a PMSI connecting those PE routers. A PMSI is a
   conceptual "overlay" on the P network with the following property: a
   PE in a given MVPN can give a packet to the PMSI, and the packet will
   be delivered to some or all of the other PEs in the MVPN, such that
   any PE receiving such a packet will be able to tell which MVPN the
   packet belongs to.

   As we discuss below, a PMSI may be instantiated by a number of
   different transport mechanisms, depending on the particular
   requirements of the MVPN and of the SP. We will refer to these
   transport mechanisms as "tunnels".

   For each MVPN, there are one or more PMSIs that are used for
   transmitting the MVPN's multicast data from one PE to others.
   We
   will use the term "PMSI" such that a single PMSI belongs to a single
   MVPN. However, the transport mechanism which is used to instantiate
   a PMSI may allow a single "tunnel" to carry the data of multiple
   PMSIs.

   In this document we make a clear distinction between the multicast
   service (the PMSI) and its instantiation. This allows us to separate
   the discussion of different services from the discussion of different
   instantiations of each service. The term "tunnel" is used to refer
   only to the transport mechanism that instantiates a service.

   [This is a significant change from previous drafts on the topic of
   MVPN, which have used the term "Multicast Tunnel" to refer both to
   the multicast service (what we call here the PMSI) and to its
   instantiation.]

3.2.1. Inclusive and Selective PMSIs

   We will distinguish between three different kinds of PMSI:

     - "Multidirectional Inclusive" PMSI (MI-PMSI)

       A Multidirectional Inclusive PMSI is one which enables ANY PE
       attaching to a particular MVPN to transmit a message such that it
       will be received by EVERY other PE attaching to that MVPN.

       There is at most one MI-PMSI per MVPN. (Though the tunnel which
       instantiates an MI-PMSI may actually carry the data of more than
       one PMSI.)

       An MI-PMSI can be thought of as an overlay broadcast network
       connecting the set of PEs supporting a particular MVPN.

       [The "Default MDTs" of [rosen-08] provide the transport service
       of MI-PMSIs, in this terminology.]

     - "Unidirectional Inclusive" PMSI (UI-PMSI)

       A Unidirectional Inclusive PMSI is one which enables a particular
       PE, attached to a particular MVPN, to transmit a message such
       that it will be received by all the other PEs attaching to that
       MVPN. There is at most one UI-PMSI per PE per MVPN, though the
       "tunnel" which instantiates a UI-PMSI may in fact carry the data
       of more than one PMSI.

     - "Selective" PMSI (S-PMSI)

       A Selective PMSI is one which provides a mechanism wherein a
       particular PE in an MVPN can multicast messages so that they will
       be received by a subset of the other PEs of that MVPN. There may
       be an arbitrary number of S-PMSIs per PE per MVPN. Again, the
       "tunnel" which instantiates a given S-PMSI may carry data from
       multiple S-PMSIs.

       [The "Data MDTs" of earlier drafts provide the transport service
       of "Selective PMSIs" in the terminology of this draft.]

   We will see in later sections the role played by these different
   kinds of PMSI. We will use the term "I-PMSI" when we are not
   distinguishing between "MI-PMSIs" and "UI-PMSIs".

3.2.2. Tunnels Instantiating PMSIs

   A number of different tunnel setup techniques can be used to create
   the tunnels that instantiate the PMSIs. Among these are:

     - PIM

       A PMSI can be instantiated as (a set of) Multicast Distribution
       Trees created by the PIM P-instance ("P-trees").

       PIM-SSM, PIM-Bidir, or PIM-SM can be used to create P-trees.
       (PIM-DM is not supported for this purpose.)

       A single MI-PMSI can be instantiated by a single shared P-tree,
       or by a number of source P-trees (one for each PE of the
       MI-PMSI). P-trees may be shared by multiple MVPNs (i.e., a given
       P-tree may be the instantiation of multiple PMSIs), as long as
       the encapsulation provides some means of demultiplexing the data
       traffic by MVPN.

       Selective PMSIs are most naturally instantiated by source
       P-trees, and are most naturally created by PIM-SSM, since by
       definition only one PE is the source of the multicast data on a
       Selective PMSI.

       [The "Default MDTs" of [rosen-08] are MI-PMSIs instantiated as
       PIM trees. The "data MDTs" of [rosen-08] are S-PMSIs
       instantiated as PIM trees.]

     - mLDP

       A PMSI may be instantiated as one or more mLDP
       Point-to-Multipoint (P2MP) LSPs, or as an mLDP
       Multipoint-to-Multipoint (MP2MP) LSP.
       A Selective PMSI or a Unidirectional Inclusive PMSI would
       be instantiated as a single mLDP P2MP LSP, whereas a
       Multidirectional Inclusive PMSI could be instantiated either as a
       set of such LSPs (one for each PE in the MVPN) or as a single
       MP2MP LSP.

       mLDP P2MP LSPs can be shared across multiple MVPNs.

     - RSVP-TE

       A PMSI may be instantiated as one or more RSVP-TE
       Point-to-Multipoint (P2MP) LSPs. A Selective PMSI or a
       Unidirectional Inclusive PMSI would be instantiated as a single
       RSVP-TE P2MP LSP, whereas a Multidirectional Inclusive PMSI would
       be instantiated as a set of such LSPs, one for each PE in the
       MVPN. RSVP-TE P2MP LSPs can be shared across multiple MVPNs.

     - A Mesh of Unicast Tunnels

       If a PMSI is implemented as a mesh of unicast tunnels, a PE
       wishing to transmit a packet through the PMSI would replicate the
       packet, and send a copy to each of the other PEs.

       An MI-PMSI for a given MVPN can be instantiated as a full mesh of
       unicast tunnels among that MVPN's PEs. A UI-PMSI or an S-PMSI
       can be instantiated as a partial mesh.

     - Unicast Tunnels to the Root of a P-Tree

       Any type of PMSI can be instantiated through a method in which
       there is a single P-tree (created, for example, via PIM-SSM or
       via RSVP-TE), and a PE transmits a packet to the PMSI by sending
       it in a unicast tunnel to the root of that P-tree. All PEs in
       the given MVPN would need to be leaves of the tree.

       When this instantiation method is used, the transmitter of the
       multicast data may receive its own data back. Methods for
       avoiding this are for further study.

   It can be seen that each method of implementing PMSIs has its own
   area of applicability. This specification therefore allows for the
   use of any of these methods. At first glance, this may seem like an
   overabundance of options.
   However, the history of multicast
   development and deployment should make it clear that there is no one
   option which is always acceptable. The use of segmented inter-AS
   trees does allow each SP to select the option which it finds most
   applicable in its own environment, without causing any other SP to
   choose that same option.

   Specifying the conditions under which a particular tree building
   method is applicable is outside the scope of this document.

   The choice of the tunnel technique belongs to the sender router and
   is a local policy decision of the router. The procedures defined
   throughout this document do not mandate that the same tunnel
   technique be used for all PMSI tunnels going through the same
   provider backbone. It is, however, expected that any tunnel
   technique that may be used by a PE for a particular MVPN is also
   supported by the other PEs having VRFs for that MVPN. Moreover, the
   use of ingress replication by any PE for an MVPN implies that all
   other PEs MUST use ingress replication for this MVPN.

3.3. Use of PMSIs for Carrying Multicast Data

   Each PE supporting a particular MVPN must have a way of discovering:

     - The set of other PEs in its AS that are attached to sites of that
       MVPN, and the set of other ASes that have PEs attached to sites
       of that MVPN. However, if segmented inter-AS trees are not used
       (see section 8.2), then each PE needs to know the entire set of
       PEs attached to sites of that MVPN.

     - If segmented inter-AS trees are to be used, the set of border
       routers in its AS that support inter-AS connectivity for that
       MVPN.

     - If the MVPN is configured to use a default MI-PMSI, the
       information needed to set up and to use the tunnels instantiating
       the default MI-PMSI.

     - For each other PE, whether the PE supports Aggregate Trees for
       the MVPN, and if so, the demultiplexing information which must be
       provided so that the other PE can determine whether a packet
       which it received on an aggregate tree belongs to this MVPN.

   In some cases this information is provided by means of the BGP-based
   auto-discovery procedures detailed in section 4. In other cases,
   this information is provided after discovery is complete, by means of
   procedures defined in section 6.1.2. In either case, the information
   which is provided must be sufficient to enable the PMSI to be bound
   to the identified tunnel, to enable the tunnel to be created if it
   does not already exist, and to enable the different PMSIs which may
   travel on the same tunnel to be properly demultiplexed.

3.3.1. MVPNs with Default MI-PMSIs

   If an MVPN uses an MI-PMSI, then the MI-PMSI for that MVPN will be
   created as soon as the necessary information has been obtained.
   Creating a PMSI means creating the tunnel which carries it (unless
   that tunnel already exists), as well as binding the PMSI to the
   tunnel. The MI-PMSI for that MVPN is then used as the default method
   of transmitting multicast data packets for that MVPN. In effect, all
   the multicast streams for the MVPN are, by default, aggregated onto
   the MI-PMSI.

   If a particular multicast stream from a particular source PE has
   certain characteristics, it can be desirable to migrate it from the
   MI-PMSI to an S-PMSI. Procedures for migrating a stream from an
   MI-PMSI to an S-PMSI are discussed in section 7.

3.3.2. When MI-PMSIs are Required

   MI-PMSIs are required under the following conditions:

     - The MVPN is using PIM-DM, or some other protocol (such as BSR)
       which relies upon flooding. Only with an MI-PMSI can the C-data
       (or C-control-packets) received from any CE be flooded to all
       PEs.

     - The procedure for carrying C-multicast routes from PE to PE
       involves the multicasting of P-PIM control messages among the PEs
       (see sections 5.2.1, 5.2.2, and 5.2.4).

3.3.3. MVPNs That Do Not Use MI-PMSIs

   If a particular MVPN does not use a default MI-PMSI, then its
   multicast data may be sent by default on a UI-PMSI.

   It is also possible to send all the multicast data on an S-PMSI,
   omitting any usage of I-PMSIs. This prevents PEs from receiving data
   which they don't need, at the cost of requiring additional tunnels.
   However, cost-effective instantiation of S-PMSIs is likely to require
   Aggregate P-trees, which in turn makes it necessary for the
   transmitting PE to know which PEs need to receive which multicast
   streams. This is known as "explicit tracking", and the procedures to
   enable explicit tracking may themselves impose a cost. This is
   further discussed in section 7.2.2.2.

4. BGP-Based Autodiscovery of MVPN Membership

   BGP-based autodiscovery is done by means of a new address family, the
   MCAST-VPN address family. (This address family also has other uses,
   as will be seen later.) Any PE which attaches to an MVPN must issue
   a BGP update message containing an NLRI in this address family, along
   with a specific set of attributes. In this document, we specify the
   information which must be contained in these BGP updates in order to
   provide auto-discovery. The encoding details, along with the
   complete set of detailed procedures, are specified in a separate
   document [MVPN-BGP].

   This section specifies the intra-AS BGP-based autodiscovery
   procedures.
   When segmented inter-AS trees are used, additional
   procedures are needed, as specified in section 8. Further detail may
   be found in [MVPN-BGP]. (When segmented inter-AS trees are not used,
   the inter-AS procedures are almost identical to the intra-AS
   procedures.)

   BGP-based autodiscovery uses a particular kind of MCAST-VPN route
   known as an "auto-discovery route", or "A-D route".

   An "intra-AS A-D route" is a particular kind of A-D route that is
   never distributed outside its AS of origin. Intra-AS A-D routes for
   an MVPN are originated by the PEs that are (directly) connected to
   the site(s) of that MVPN.

   For the purpose of auto-discovery, each PE attached to a site in a
   given MVPN must originate an intra-AS auto-discovery route. The NLRI
   of that route must contain the following information:

     - The route type (i.e., intra-AS A-D route)

     - IP address of the originating PE

     - An RD configured locally for the MVPN. This is an RD which can
       be prepended to that IP address to form a globally unique VPN-IP
       address of the PE.

   The A-D route must also carry the following attributes:

     - One or more Route Target attributes. If any other PE has one of
       these Route Targets configured for import into a VRF, it treats
       the advertising PE as a member in the MVPN to which the VRF
       belongs. This allows each PE to discover the PEs that belong to a
       given MVPN. More specifically, it allows a PE in the receiver
       sites set to discover the PEs in the sender sites set of the MVPN
       and the PEs in the sender sites set of the MVPN to discover the
       PEs in the receiver sites set of the MVPN. The PEs in the
       receiver sites set would be configured to import the Route
       Targets advertised in the BGP Auto-Discovery routes by PEs in the
       sender sites set. The PEs in the sender sites set would be
       configured to import the Route Targets advertised in the BGP
       Auto-Discovery routes by PEs in the receiver sites set.

     - PMSI tunnel attribute. This attribute is present if and only if
       a default MI-PMSI is to be used for the MVPN. It contains the
       following information:

         * Whether the MI-PMSI is instantiated by:

             + a PIM-Bidir tree

             + a set of PIM-SSM trees

             + a set of PIM-SM trees

             + a set of RSVP-TE point-to-multipoint LSPs

             + a set of mLDP point-to-multipoint LSPs

             + an mLDP multipoint-to-multipoint LSP

             + a set of unicast tunnels

             + a set of unicast tunnels to the root of a shared tree (in
               this case the root must be identified)

         * If the PE wishes to set up a default tunnel to instantiate
           the I-PMSI, a unique identifier for the tunnel used to
           instantiate the I-PMSI.

           All the PEs attaching to a given MVPN (within a given AS)
           must have been configured with the same PMSI tunnel attribute
           for that MVPN. They are also expected to know the
           encapsulation to use.

           Note that a default tunnel can be identified at discovery
           time only if the tunnel already exists (e.g., it was
           constructed by means of configuration), or if it can be
           constructed without each PE knowing the identities of all
           the others (e.g., it is constructed by a receiver-initiated
           join technique such as PIM or mLDP).

           In other cases, a default tunnel cannot be identified until
           the PE has discovered one or more of the other PEs. This
           will be the case, for example, if the tunnel is an RSVP-TE
           P2MP LSP, which must be set up from the head end. In these
           cases, a PE will first send an A-D route without a tunnel
           identifier, and then will send another one with a tunnel
           identifier after discovering one or more of the other PEs.

         * Whether the tunnel used to instantiate the I-PMSI for this
           MVPN is aggregating I-PMSIs from multiple MVPNs. This will
           affect the encapsulation used.
           If aggregation is to be used,
           a demultiplexor value to be carried by packets for this
           particular MVPN must also be specified. The demultiplexing
           mechanism and signaling procedures are described in section
           6.

       Further details of the use of this information are provided in
       subsequent sections.

5. PE-PE Transmission of C-Multicast Routing

   As a PE attached to a given MVPN receives C-Join/Prune messages from
   its CEs in that MVPN, it must convey the information contained in
   those messages to other PEs that are attached to the same MVPN.

   There are several different methods for doing this. As these methods
   are not interoperable, the method to be used for a particular MVPN
   must either be configured, or discovered as part of the BGP-based
   auto-discovery process.

5.1. RPF Information for Unicast VPN-IP Routes

   When a PE receives a C-Join/Prune message from a CE, the message
   identifies a particular multicast flow as belonging either to a
   source tree (S,G) or to a shared tree (*,G). We use the term
   C-source to refer to S, in the case of a source tree, or to the
   Rendezvous Point (RP) for G, in the case of (*,G). The PE needs to
   find the "upstream multicast hop" for the (S,G) or (*,G) flow, and it
   does this by looking up the C-source in the unicast VRF associated
   with the PE-CE interfaces over which the C-Join/Prune was received.
   To facilitate this, all unicast VPN-IP routes from an MVPN will carry
   RPF information, which identifies the PE that originated the route,
   as well as identifying the Autonomous System containing that PE.
   This information is consulted when a PE does an "RPF lookup" of the
   C-source as part of processing the C-Join/Prune messages.
   This RPF
   information contains the following:

     - Source AS Extended Community

       To support MVPN, a PE that originates a (unicast) route to
       VPN-IPv4 addresses MUST include in the BGP Update message that
       carries this route the Source AS extended community, except if it
       is known a priori that none of these addresses will act as
       multicast sources and/or RP, in which case the (unicast) route
       need not carry the Source AS extended community. The Global
       Administrator field of this community MUST be set to the
       autonomous system number of the PE. The Local Administrator field
       of this community SHOULD be set to 0. This community is described
       further in [MVPN-BGP].

     - Route Import Extended Community

       To support MVPN, in addition to the import/export Route Target(s)
       used by the unicast routing, each VRF on a PE MUST have an import
       Route Target that is unique to this VRF, except if it is known a
       priori that none of the (local) MVPN sites associated with the
       VRF contain multicast source(s) and/or RP, in which case the VRF
       need not have this import Route Target. This Route Target MUST be
       IP address specific, and is constructed as follows:

         + The Global Administrator field of the Route Target MUST be
           set to an IP address of the PE. This address MUST be a
           routable IP address. This address MAY be common for all the
           VRFs on the PE (e.g., this address may be the PE's loopback
           address).

         + The Local Administrator field of the Route Target associated
           with a given VRF contains a 2-octet number that uniquely
           identifies that VRF within the PE that contains the VRF
           (procedures for assigning such numbers are purely local to
           the PE, and outside the scope of this document).
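
       As an illustration only, the two fields of the per-VRF import
       Route Target described above could be assembled as in the
       following Python sketch. The function name, the use of an IPv4
       address, and the sample values are assumptions made for this
       example; the extended community type and sub-type octets are
       deliberately left out, since they are defined in [MVPN-BGP] and
       not in this document.

```python
import ipaddress
import struct
from typing import Tuple


def route_import_rt_fields(pe_address: str, vrf_number: int) -> Tuple[bytes, bytes]:
    """Build the (Global Administrator, Local Administrator) fields.

    Hypothetical helper: pe_address is a routable IPv4 address of the PE
    (for example its loopback), and vrf_number is the locally assigned
    2-octet number that identifies the VRF within that PE.
    """
    # Global Administrator: the PE's IP address, 4 octets in network order.
    global_admin = ipaddress.IPv4Address(pe_address).packed

    # Local Administrator: must fit in 2 octets.
    if not 0 <= vrf_number <= 0xFFFF:
        raise ValueError("VRF number must fit in 2 octets")
    local_admin = struct.pack("!H", vrf_number)

    return global_admin, local_admin


if __name__ == "__main__":
    ga, la = route_import_rt_fields("192.0.2.1", 7)
    print(ga.hex(), la.hex())
```

       Because the Global Administrator may be shared by every VRF on
       the PE, uniqueness of the resulting Route Target rests on the
       Local Administrator value alone.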

       A PE that originates a (unicast) route to VPN-IPv4 addresses MUST
       include in the BGP Update message that carries this route the
       Route Import extended community that has the value of this Route
       Target, except if it is known a priori that none of these
       addresses will act as multicast sources and/or RP, in which case
       the (unicast) route need not carry the Route Import extended
       community.

       The Route Import Extended Community is described further in
       [MVPN-BGP].

5.2. PIM Peering

5.2.1. Full Per-MVPN PIM Peering Across an MI-PMSI

   If the set of PEs attached to a given MVPN are connected via an
   MI-PMSI, the PEs can form "normal" PIM adjacencies with each other.
   Since the MI-PMSI functions as a broadcast network, the standard PIM
   procedures for forming and maintaining adjacencies over a LAN can be
   applied.

   As a result, the C-Join/Prune messages which a PE receives from a CE
   can be multicast to all the other PEs of the MVPN. PIM "join
   suppression" can be enabled and the PEs can send Asserts as needed.

   [This is the procedure specified in [rosen-08].]

5.2.2. Lightweight PIM Peering Across an MI-PMSI

   The procedure of the previous section has the following
   disadvantages:

     - Periodic Hello messages must be sent by all PEs.

       Standard PIM procedures require that each PE in a particular MVPN
       periodically multicast a Hello to all the other PEs in that MVPN.
       If the number of MVPNs becomes very large, sending and receiving
       these Hellos can become a substantial overhead for the PE
       routers.

     - Periodic retransmission of C-Join/Prune messages.

       PIM is a "soft-state" protocol, in which reliability is assured
       through frequent retransmissions (refresh) of control messages.
       This too can begin to impose a large overhead on the PE routers
       as the number of MVPNs grows.

   The first of these disadvantages is easily remedied.
   The reason for
   the periodic PIM Hellos is to ensure that each PIM speaker on a LAN
   knows who all the other PIM speakers on the LAN are. However, in the
   context of MVPN, PEs in a given MVPN can learn the identities of all
   the other PEs in the MVPN by means of the BGP-based auto-discovery
   procedure of section 4. In that case, the periodic Hellos would
   serve no function, and could simply be eliminated. (Of course, this
   does imply a change to the standard PIM procedures.)

   When Hellos are suppressed, we may speak of "lightweight PIM
   peering".

   The periodic refresh of the C-Join/Prunes is not as simple to
   eliminate. The L3VPN WG has asked the PIM WG to specify "refresh
   reduction" procedures for PIM, so as to eliminate the need for the
   periodic refreshes. If and when such procedures have been specified,
   it will be very useful to incorporate them, so as to make the
   lightweight PIM peering procedures even more lightweight.

5.2.3. Unicasting of PIM C-Join/Prune Messages

   PIM does not require that the C-Join/Prune messages which a PE
   receives from a CE be multicast to all the other PEs; it allows
   them to be unicast to a single PE, the one which is upstream on the
   path to the root of the multicast tree mentioned in the Join/Prune
   message. Note that when the C-Join/Prune messages are unicast, there
   is no such thing as "join suppression". Therefore PIM Refresh
   Reduction may be considered to be a prerequisite for the procedure
   of unicasting the C-Join/Prune messages.

   When the C-Join/Prunes are unicast, they are not transmitted on a
   PMSI at all. Note that the procedure of unicasting the C-Join/Prunes
   is different from the procedure of transmitting the C-Join/Prunes on
   an MI-PMSI which is instantiated as a mesh of unicast tunnels.

   If there are multiple PEs that can be used to reach a given C-source,
   procedures described in section 9 MUST be used to ensure that, at
   least within a single AS, all PEs choose the same PE to reach the
   C-source.

5.2.4. Details of Per-MVPN PIM Peering over MI-PMSI

   In this section, we assume that inter-AS MVPNs will be supported by
   means of non-segmented inter-AS trees. Support for segmented
   inter-AS trees with PIM peering is for further study.

   When an MVPN uses an MI-PMSI, the C-instances of that MVPN can treat
   the MI-PMSI as a LAN interface, and form either full PIM adjacencies
   or lightweight PIM adjacencies with each other over that "LAN
   interface".

   To form a full PIM adjacency, the PEs execute the PIM LAN procedures,
   including the generation and processing of PIM Hello, Join/Prune,
   Assert, DF election and other PIM control packets. These are
   executed independently for each C-instance. PIM "join suppression"
   SHOULD be enabled.

   If it is known that all C-instances of a particular MVPN can support
   lightweight adjacencies, then lightweight adjacencies MUST be used.
   If it is not known that all such C-instances support lightweight
   adjacencies, then full adjacencies MUST be used. Whether all the
   C-instances support lightweight adjacencies is known by virtue of the
   BGP-based auto-discovery procedures (combined with configuration).
   This knowledge might change over time, so the PEs must be able to
   switch in real time between the use of full adjacencies and
   lightweight adjacencies.

   The difference between a lightweight adjacency and a full adjacency
   is that no PIM Hellos are sent or received on a lightweight
   adjacency. The function which Hellos usually provide in PIM can be
   provided in MVPN by the BGP-based auto-discovery procedures, so the
   Hellos become superfluous.
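
   The adjacency selection rule above can be summarized in a short
   Python sketch. This is a minimal illustration, not a normative
   procedure: the names AdjacencyMode, select_adjacency_mode and
   hellos_required are hypothetical, and the capability map is assumed
   to have been populated from BGP-based auto-discovery and
   configuration, with None standing for "not known".

```python
from enum import Enum
from typing import Dict, Optional


class AdjacencyMode(Enum):
    FULL = "full"
    LIGHTWEIGHT = "lightweight"


def select_adjacency_mode(lightweight_capable: Dict[str, Optional[bool]]) -> AdjacencyMode:
    """Pick the adjacency mode for one MVPN.

    lightweight_capable maps each PE (C-instance) of the MVPN to True,
    False, or None when its capability is not known.
    """
    # Lightweight adjacencies are used only when every C-instance is
    # positively known to support them; otherwise fall back to full.
    if all(capable is True for capable in lightweight_capable.values()):
        return AdjacencyMode.LIGHTWEIGHT
    return AdjacencyMode.FULL


def hellos_required(mode: AdjacencyMode) -> bool:
    """PIM Hellos are exchanged only on full adjacencies."""
    return mode is AdjacencyMode.FULL


if __name__ == "__main__":
    print(select_adjacency_mode({"PE1": True, "PE2": True}))
    print(select_adjacency_mode({"PE1": True, "PE2": None}))
```

   Since the capability information may change over time, a PE would
   re-evaluate this choice whenever its view of the MVPN's membership
   or capabilities changes.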

   Whether or not Hellos are sent, if PIM Refresh Reduction procedures
   are available, and all the PEs supporting the MVPN are known to
   support these procedures, then the refresh reduction procedures MUST
   be used.

5.2.4.1. PIM C-Instance Control Packets

   All PIM C-Instance control packets of a particular MVPN are addressed
   to the ALL-PIM-ROUTERS (224.0.0.13) IP destination address, and
   transmitted over the MI-PMSI of that MVPN. While in transit in the
   P-network, the packets are encapsulated as required for the
   particular kind of tunnel that is being used to instantiate the
   MI-PMSI. Thus the C-instance control packets are not processed by
   the P routers, and MVPN-specific PIM routes can be extended from site
   to site without appearing in the P routers.

5.2.4.2. PIM C-instance RPF Determination

   Although the MI-PMSI is treated by PIM as a LAN interface, unicast
   routing is NOT run over it, and there are no unicast routing
   adjacencies over it. It is therefore necessary to specify special
   procedures for determining when the MI-PMSI is to be regarded as the
   "RPF Interface" for a particular C-address.

   When a PE needs to determine the RPF interface of a particular
   C-address, it looks up the C-address in the VRF. If the route
   matching it (call this the "RPF route") is not a VPN-IP route learned
   from MP-BGP as described in [RFC4364], or if that route's outgoing
   interface is one of the interfaces associated with the VRF, then
   ordinary PIM procedures for determining the RPF interface apply.

   However, if the RPF route is a VPN-IP route whose outgoing interface
   is not one of the interfaces associated with the VRF, then PIM will
   consider the outgoing interface to be the MI-PMSI associated with the
   VPN-specific PIM instance.
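
   The RPF interface rule above amounts to a two-way decision, which
   the following Python sketch illustrates. The RpfRoute type, the
   MI_PMSI constant and the interface names are hypothetical, and
   "ordinary PIM procedures" are simplified here to returning the
   route's own outgoing interface.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical token standing for the MI-PMSI of the VPN-specific PIM instance.
MI_PMSI = "MI-PMSI"


@dataclass(frozen=True)
class RpfRoute:
    is_vpn_ip: bool          # True if learned from MP-BGP as a VPN-IP route
    outgoing_interface: str  # interface the VRF lookup resolved to


def rpf_interface(route: RpfRoute, vrf_interfaces: FrozenSet[str]) -> str:
    """Return the RPF interface for the C-address matched by `route`."""
    # Not a VPN-IP route, or it points at one of the VRF's own interfaces:
    # ordinary PIM procedures apply (simplified to the route's interface).
    if not route.is_vpn_ip or route.outgoing_interface in vrf_interfaces:
        return route.outgoing_interface
    # VPN-IP route leaving through a non-VRF interface: treat the MI-PMSI
    # as the RPF interface.
    return MI_PMSI


if __name__ == "__main__":
    vrf_ifs = frozenset({"ce-facing-0"})
    print(rpf_interface(RpfRoute(True, "core-facing-0"), vrf_ifs))
```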

   Once PIM has determined that the RPF interface for a particular
   C-address is the MI-PMSI, it is necessary for PIM to determine the
   RPF neighbor for that C-address. This will be one of the other PEs
   that is a PIM adjacency over the MI-PMSI.

   When a PE distributes a given VPN-IP route via BGP, the PE must
   determine whether that route might possibly be regarded, by another
   PE, as an RPF route. (If a given VRF is part of an MVPN, it may be
   simplest to regard every route exported from that VRF to be a
   potential RPF route.) If the given VPN-IP route is a potential RPF
   route, then when the VPN-IP route is distributed by BGP, it SHOULD be
   accompanied by a VRF Route Import Extended Community (see
   [MVPN-BGP]).

   The VRF Route Import Extended Community contains an embedded IP
   address. If a PE advertises a route with a VRF Route Import Extended
   Community, then the PE MUST use the IP address embedded therein
   as its Source IP address in any PIM control messages which it
   transmits to other PEs in the same MVPN. If a VRF Route Import
   Extended Community is not present, then the source IP address in any
   PIM control messages which it transmits to other PEs in the same MVPN
   MUST be the same as the address carried in the BGP Next Hop of the
   route.

   When a PE has determined that the RPF interface for a particular
   C-address is the MI-PMSI, it must look up the RPF information that
   was distributed along with the VPN-IP address corresponding to that
   C-address. The IP address in this RPF information will be considered
   to be the IP address of the RPF adjacency for the C-address.

   If the RPF information is not present, but the "BGP Next Hop" for the
   C-address is one of the PEs that is a PIM adjacency over the MI-PMSI,
   then that PE should be treated as the RPF adjacency for that
   C-address.
   However, if the MVPN spans multiple Autonomous Systems, the
   BGP Next Hop might not be a PIM adjacency, and if that is the case
   the RPF check will not succeed unless the RPF information is used.

5.3. Use of BGP for Carrying C-Multicast Routing

   It is possible to use BGP to carry C-multicast routing information
   from PE to PE, dispensing entirely with the transmission of
   C-Join/Prune messages from PE to PE. This section describes the
   procedures for carrying intra-AS multicast routing information.
   Inter-AS procedures are described in section 8.

5.3.1. Sending BGP Updates

   The MCAST-VPN address family is used for this purpose. MCAST-VPN
   routes used for the purpose of carrying C-multicast routing
   information are distinguished from those used for the purpose of
   carrying auto-discovery information by means of a "route type" field
   which is encoded into the NLRI. The following information is
   required in BGP to advertise the MVPN routing information. The NLRI
   contains:

     - The type of C-multicast route.

       There are two types:

         * source tree join

         * shared tree join

     - The RD configured, for the MVPN, on the PE that is advertising
       the information. This is required to uniquely identify the
       C-source and C-group, as the addresses could overlap between
       different MVPNs.

     - The C-Source address. (Omitted if the route type is "shared tree
       join")

     - The C-Group address.

     - The RD from the VPN-IP route to the C-source.

       That is, the route to the C-source is looked up in the local
       unicast VRF associated with the CE-PE interface over which the
       C-multicast control packet arrived. The corresponding VPN-IP
       route is then examined, and the RD from that route is placed into
       the C-multicast route.

       Note that this RD is NOT necessarily one which is configured on
       the local PE.
Rather it is one which is configured on the remote 1239 PE that is on the path to the C-source. 1241 The following attribute must also be included: 1243 - The upstream multicast hop. 1245 If a PE receives a C-Join (*, G) from a CE, the C-source is 1246 considered to be the C-RP for the particular C-G. When the C- 1247 multicast route represents a "shared tree join", it is presumed 1248 that the root of the tree (e.g., the RP) is determined by some 1249 means outside the scope of this specification. 1251 When the PE processes a C-PIM Join/Prune message, the route to 1252 the C-source is looked up in the local unicast VRF associated 1253 with the CE-PE interface over which the C-multicast control 1254 packet arrived. The corresponding VPN-IP route is then examined. 1255 If the AS specified therein is the local AS, or if no AS is 1256 specified therein, then the PE specified therein becomes the 1257 upstream multicast hop. If the AS specified therein is a remote 1258 AS, the BGP next hop on the route to the MVPN Auto-Discovery 1259 route advertised by the remote AS becomes the upstream multicast 1260 hop. 1262 N.B.: It is possible that there is more than one unicast VPN-IP 1263 route to the C-source. In this case, the route that was 1264 installed in the VRF is not necessarily the route that must be 1265 chosen by the PE. In order to choose the proper route, the 1266 procedures specified in section 9 MUST be followed. 1268 The upstream multicast hop is identified in an Extended Communities 1269 attribute to facilitate the optional use of filters which can prevent 1270 the distribution of the update to BGP speakers other than the 1271 upstream multicast hop. 1273 When a PE distributes this information via BGP, it must include a 1274 Route Import Extended Communities attribute learned from the RPF 1275 information. 1277 Note that for these procedures to work the VPN-IP route MUST contain 1278 the RPF information.
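The rule given above for choosing the upstream multicast hop can be sketched as follows. This is only an illustration under stated assumptions: the RpfInfo record, the upstream_multicast_hop function, and the ad_next_hop_by_as table (mapping a remote AS number to the BGP next hop learned with the MVPN Auto-Discovery route advertised by that AS) are hypothetical names that do not appear in this specification, and the selection among multiple VPN-IP routes (section 9) is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RpfInfo:
    """Hypothetical view of the RPF information carried with a VPN-IP route."""
    pe_address: str                  # PE named in the RPF information
    source_as: Optional[int] = None  # AS named in the RPF information, if any


def upstream_multicast_hop(rpf: RpfInfo,
                           local_as: int,
                           ad_next_hop_by_as: Dict[int, str]) -> str:
    """Return the upstream multicast hop for a C-multicast route.

    ad_next_hop_by_as is an assumed table: remote AS number -> BGP next hop
    associated with the MVPN Auto-Discovery route advertised by that AS.
    """
    # Local AS, or no AS specified: the PE named in the RPF information
    # is the upstream multicast hop.
    if rpf.source_as is None or rpf.source_as == local_as:
        return rpf.pe_address
    # Remote AS: use the next hop learned with that AS's A-D route.
    # A missing entry raises KeyError, since no upstream hop is known.
    return ad_next_hop_by_as[rpf.source_as]
```

With local AS 65000, a route whose RPF information names AS 65001 resolves to whatever next hop the table holds for 65001, while a route with no AS resolves directly to the PE address in the RPF information.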
1280 Note that there is no C-multicast route corresponding to the PIM 1281 function of pruning a source off the shared tree when a PE switches 1282 from a (C-*, C-G) shared tree to a (C-S, C-G) source tree. Section 9 of this 1283 document specifies a mandatory procedure that ensures that if any PE 1284 joins a (C-S, C-G) source tree, all other PEs that have joined or 1285 will join the (C-*, C-G) shared tree will also join the (C-S, C-G) 1286 source tree. This eliminates the need for a C-multicast route that 1287 prunes C-S off the shared tree when switching from the (C-*, C-G) tree to the (C-S, C-G) tree. 1290 5.3.2. Explicit Tracking 1292 Note that the upstream multicast hop is NOT part of the NLRI in the 1293 C-multicast BGP routes. This means that if several PEs join the same 1294 C-tree, the BGP routes they distribute to do so are regarded by BGP 1295 as comparable routes, and only one will be installed. If a route 1296 reflector is being used, this further means that the PE which is used 1297 to reach the C-source will know only that one or more of the other 1298 PEs have joined the tree, but it won't know which ones. That is, this 1299 BGP update mechanism does not provide "explicit tracking". Explicit 1300 tracking is not provided by default because it increases the amount 1301 of state needed and thus decreases scalability. Also, as 1302 constructing the C-PIM messages to send "upstream" for a given tree 1303 does not depend on knowing all the PEs that are downstream on that 1304 tree, there is no reason for the C-multicast route type updates to 1305 provide explicit tracking. 1307 There are some cases in which explicit tracking is necessary in order 1308 for the PEs to set up certain kinds of P-trees. There are other 1309 cases in which explicit tracking is desirable in order to determine 1310 how to optimally aggregate multicast flows onto a given aggregate 1311 tree.
As these functions have to do with the setting up of 1312 infrastructure in the P-network, rather than with the dissemination 1313 of C-multicast routing information, any explicit tracking that is 1314 necessary is handled by sending the "source active" A-D routes, that 1315 are described in sections 9 and 10. Detailed procedures for turning 1316 on explicit tracking can be found in [MVPN-BGP]. 1318 5.3.3. Withdrawing BGP Updates 1320 A PE removes itself from a C-multicast tree (shared or source) by 1321 withdrawing the corresponding BGP update. 1323 If a PE has pruned a C-source from a shared C-multicast tree, and it 1324 needs to "unprune" that source from that tree, it does so by 1325 withdrawing the route that pruned the source from the tree. 1327 6. I-PMSI Instantiation 1329 This section describes how tunnels in the SP network can be used to 1330 instantiate an I-PMSI for an MVPN on a PE. When C-multicast data is 1331 delivered on an I-PMSI, the data will go to all PEs that are on the 1332 path to receivers for that C-group, but may also go to PEs that are 1333 not on the path to receivers for that C-group. 1335 The tunnels which instantiate I-PMSIs can be either PE-PE unicast 1336 tunnels or P-multicast trees. When PE-PE unicast tunnels are used the 1337 PMSI is said to be instantiated using ingress replication. The 1338 instantiation of a tunnel for an I-PMSI is a matter of local policy 1339 decision and is not mandatory. Even for a site attached to multicast 1340 sources, transport of customer multicast traffic can be accommodated 1341 with S-PMSI-bound tunnels only 1343 [Editor's Note: MD trees described in [ROSEN-8, MVPN-BASE] are an 1344 example of P-multicast trees. Also Aggregate Trees described in 1345 [RAGGARWA-MCAST] are an example of P-multicast trees.] 1347 6.1. 
MVPN Membership and Egress PE Auto-Discovery 1349 As described in section 4 a PE discovers the MVPN membership 1350 information of other PEs using BGP auto-discovery mechanisms or using 1351 a mechanism that instantiates a MI-PMSI interface. When a PE supports 1352 only a UI-PMSI service for an MVPN, it MUST rely on the BGP auto- 1353 discovery mechanisms for discovering this information. This 1354 information also results in a PE in the sender sites set discovering 1355 the leaves of the P-multicast tree, which are the egress PEs that 1356 have sites in the receiver sites set in one or more MVPNs mapped onto 1357 the tree. 1359 6.1.1. Auto-Discovery for Ingress Replication 1361 In order for a PE to use Unicast Tunnels to send a C-multicast data 1362 packet for a particular MVPN to a set of remote PEs, the remote PEs 1363 must be able to correctly decapsulate such packets and to assign each 1364 one to the proper MVPN. This requires that the encapsulation used for 1365 sending packets through the tunnel have demultiplexing information 1366 which the receiver can associate with a particular MVPN. 1368 If ingress replication is being used for an MVPN, the PEs announce 1369 this as part of the BGP based MVPN membership auto-discovery process, 1370 described in section 4. The PMSI tunnel attribute specifies ingress 1371 replication. The demultiplexor value is a downstream-assigned MPLS 1372 label (i.e., assigned by the PE that originated the A-D route, to be 1373 used by other PEs when they send multicast packets on a unicast 1374 tunnel to that PE). 1376 Other demultiplexing procedures for unicast are under consideration. 1378 6.1.2. Auto-Discovery for P-Multicast Trees 1380 A PE announces the P-multicast technology it supports for a specified 1381 MVPN, as part of the BGP MVPN membership discovery. This allows other 1382 PEs to determine the P-multicast technology they can use for building 1383 P-multicast trees to instantiate an I-PMSI. 
If a PE has a default 1384 tree instantiation of an I-PMSI, it also announces the tree 1385 identifier as part of the auto-discovery, as well as announcing its 1386 aggregation capability. 1388 The announcement of a tree identifier at discovery time is only 1389 possible if the tree already exists (e.g., a preconfigured "traffic 1390 engineered" tunnel), or if the tree can be constructed dynamically 1391 without any PE having to know in advance all the other PEs on the 1392 tree (e.g., the tree is created by receiver-initiated joins). 1394 6.2. C-Multicast Routing Information Exchange 1396 When a PE doesn't support the use of a MI-PMSI for a given MVPN, it 1397 MUST either unicast MVPN routing information using PIM or else use 1398 BGP for exchanging the MVPN routing information. 1400 6.3. Aggregation 1402 A P-multicast tree can be used to instantiate a PMSI service for only 1403 one MVPN or for more than one MVPN. When a P-multicast tree is shared 1404 across multiple MVPNs it is termed an Aggregate Tree [RAGGARWA- 1405 MCAST]. The procedures described in this document allow a single SP 1406 multicast tree to be shared across multiple MVPNs. The procedures 1407 that are specific to aggregation are optional and are explicitly 1408 pointed out. Unless otherwise specified a P-multicast tree technology 1409 supports aggregation. 1411 Aggregate Trees allow a single P-multicast tree to be used across 1412 multiple MVPNs and hence state in the SP core grows per-set-of-MVPNs 1413 and not per MVPN. Depending on the congruence of the aggregated 1414 MVPNs, this may result in trading off optimality of multicast 1415 routing. 1417 An Aggregate Tree can be used by a PE to provide an UI-PMSI or MI- 1418 PMSI service for more than one MVPN. When this is the case the 1419 Aggregate Tree is said to have an inclusive mapping. 1421 6.3.1. 
Aggregate Tree Leaf Discovery 1423 BGP MVPN membership discovery allows a PE to determine the different 1424 Aggregate Trees that it should create and the MVPNs that should be 1425 mapped onto each such tree. The leaves of an Aggregate Tree are 1426 determined by the PEs, supporting aggregation, that belong to all the 1427 MVPNs that are mapped onto the tree. 1429 If an Aggregate Tree is used to instantiate one or more S-PMSIs, then 1430 it may be desirable for the PE at the root of the tree to know which 1431 PEs (in its MVPN) are receivers on that tree. This enables the PE to 1432 decide when to aggregate two S-PMSIs, based on congruence (as 1433 discussed in the next section). Thus explicit tracking may be 1434 required. Since the procedures for disseminating C-multicast routes 1435 do not provide explicit tracking, a type of A-D route known as a 1436 "Leaf A-D Route" is used. The PE which wants to assign a particular 1437 C-multicast flow to a particular Aggregate Tree can send an A-D route 1438 which elicits Leaf A-D routes from the PEs that need to receive that 1439 C-multicast flow. This provides the explicit tracking information 1440 needed to support the aggregation methodology discussed in the next 1441 section. 1443 6.3.2. Aggregation Methodology 1445 This document does not specify the mandatory implementation of any 1446 particular set of rules for determining whether or not the PMSIs of 1447 two particular MVPNs are to be instantiated by the same Aggregate 1448 Tree. This determination can be made by implementation-specific 1449 heuristics, by configuration, or even perhaps by the use of offline 1450 tools. 1452 It is the intention of this document that the control procedures will 1453 always result in all the PEs of an MVPN agreeing on the PMSIs which 1454 are to be used and on the tunnels used to instantiate those PMSIs. 1456 This section discusses potential methodologies with respect to 1457 aggregation.
1459 The "congruence" of aggregation is defined by the amount of overlap 1460 in the leaves of the customer trees that are aggregated on an SP tree. 1461 For Aggregate Trees with an inclusive mapping the congruence depends 1462 on the overlap in the membership of the MVPNs that are aggregated on 1463 the tree. If there is complete overlap, i.e., all MVPNs have exactly 1464 the same sites, aggregation is perfectly congruent. As the overlap 1465 between the MVPNs that are aggregated reduces, i.e., the number of 1466 sites that are common across all the MVPNs reduces, the congruence 1467 reduces. 1469 If aggregation is done such that it is not perfectly congruent a PE 1470 may receive traffic for MVPNs to which it doesn't belong. As the 1471 amount of multicast traffic in these unwanted MVPNs increases 1472 aggregation becomes less optimal with respect to delivered traffic. 1473 Hence there is a tradeoff between reducing state and delivering 1474 unwanted traffic. 1476 An implementation should provide knobs to control the congruence of 1477 aggregation. These knobs are implementation dependent. Configuring 1478 the percentage of sites that MVPNs must have in common to be 1479 aggregated is an example of such a knob. This will allow an SP to 1480 deploy aggregation depending on the MVPN membership and traffic 1481 profiles in its network. If different PEs or servers are setting up 1482 Aggregate Trees this will also allow a service provider to engineer 1483 the maximum number of unwanted MVPNs that a particular PE may receive 1484 traffic for. 1486 6.3.3. Encapsulation of the Aggregate Tree 1488 An Aggregate Tree may use an IP/GRE encapsulation or an MPLS 1489 encapsulation. The protocol type in the IP/GRE header in the former 1490 case and the protocol type in the data link header in the latter need 1491 further explanation. This will be specified in a separate document. 1493 6.3.4.
Demultiplexing C-multicast traffic 1495 When multiple MVPNs are aggregated onto one P-Multicast tree, 1496 determining the tree over which the packet is received is not 1497 sufficient to determine the MVPN to which the packet belongs. The 1498 packet must also carry some demultiplexing information to allow the 1499 egress PEs to determine the MVPN to which the packet belongs. Since 1500 the packet has been multicast through the P network, any given 1501 demultiplexing value must have the same meaning to all the egress 1502 PEs. The demultiplexing value is an MPLS label that corresponds to 1503 the multicast VRF to which the packet belongs. This label is placed 1504 by the ingress PE immediately beneath the P-Multicast tree header. 1505 Each of the egress PEs must be able to associate this MPLS label with 1506 the same MVPN. If downstream label assignment were used this would 1507 require all the egress PEs in the MVPN to agree on a common label for 1508 the MVPN. Instead the MPLS label is upstream assigned [MPLS- 1509 UPSTREAM-LABEL]. The label bindings are advertised via BGP updates 1510 originated by the ingress PEs. 1512 This procedure requires each egress PE to support a separate label 1513 space for every other PE. The egress PEs create a forwarding entry 1514 for the upstream assigned MPLS label, allocated by the ingress PE, in 1515 this label space. Hence when the egress PE receives a packet over an 1516 Aggregate Tree, it first determines the tree that the packet was 1517 received over. The tree identifier determines the label space in 1518 which the upstream assigned MPLS label lookup has to be performed. 1519 The same label space may be used for all P-multicast trees rooted at 1520 the same ingress PE, or an implementation may decide to use a 1521 separate label space for every P-multicast tree. 1523 The encapsulation format is either MPLS or MPLS-in-something (e.g. 1524 MPLS-in-GRE [MPLS-IP]).
When MPLS is used, this label will appear 1525 immediately below the label that identifies the P-multicast tree. 1526 When MPLS-in-GRE is used, this label will be the top MPLS label that 1527 appears when the GRE header is stripped off. 1529 When IP encapsulation is used for the P-multicast Tree, whatever 1530 information that particular encapsulation format uses for identifying 1531 a particular tunnel is used to determine the label space in which the 1532 MPLS label is looked up. 1534 If the P-multicast tree uses MPLS encapsulation, the P-multicast tree 1535 is itself identified by an MPLS label. The egress PE MUST NOT 1536 advertise IMPLICIT NULL or EXPLICIT NULL for that tree. Once the 1537 label representing the tree is popped off the MPLS label stack, the 1538 next label is the demultiplexing information that allows the proper 1539 MVPN to be determined. 1541 This specification requires that, to support this sort of 1542 aggregation, there be at least one upstream-assigned label per MVPN. 1543 It does not require that there be only one. For example, an ingress 1544 PE could assign a unique label to each C-(S,G). (This could be done 1545 using the same technique that is used to assign a particular C-(S,G) 1546 to an S-PMSI, see section 7.3.) 1548 6.4. Mapping Received Packets to MVPNs 1550 When an egress PE receives a C-multicast data packet over a P- 1551 multicast tree, it needs to forward the packet to the CEs that have 1552 receivers in the packet's C-multicast group. It also needs to 1553 determine the RPF interface for the C-multicast data packet. In order 1554 to do this the egress PE needs to determine the tunnel that the 1555 packet was received on. The PE can then determine the MVPN that the 1556 packet belongs to and, if needed, do any further lookups that are 1557 needed to forward the packet. 1559 6.4.1.
Unicast Tunnels 1561 When ingress replication is used, the MVPN to which the received C- 1562 multicast data packet belongs can be determined by the MPLS label 1563 that was allocated by the egress. This label is distributed by the 1564 egress. This also determines the RPF interface for the C-multicast 1565 data packet. 1567 6.4.2. Non-Aggregated P-Multicast Trees 1569 If a P-multicast tree is associated with only one MVPN, determining 1570 the P-multicast tree on which a packet was received is sufficient to 1571 determine the packet's MVPN. All that the egress PE needs to know is 1572 the MVPN the P-multicast tree is associated with. 1574 There are different ways in which the egress PE can learn this 1575 association: 1577 a) Configuration. The P-multicast tree that a particular MVPN 1578 belongs to is configured on each PE. 1580 [Editor's Note: PIM-SM Default MD trees in [ROSEN-8] and 1581 [MVPN-BASE] are examples of configuring the P-multicast tree 1582 and MVPN association] 1584 b) BGP based advertisement of the P-multicast tree - MVPN mapping 1585 after the root of the tree discovers the leaves of the tree. 1586 The root of the tree sets up the tree after discovering each of 1587 the PEs that belong to the MVPN. It then advertises the P- 1588 multicast tree - MVPN mapping to each of the leaves. This 1589 mechanism can be used with both source initiated trees [e.g. 1590 RSVP-TE P2MP LSPs] and receiver initiated trees [e.g. PIM 1591 trees]. 1593 [Editor's Note: Aggregate tree advertisements in [RAGGARWA- 1594 MCAST] are examples of this.] 1596 c) BGP based advertisement of the P-multicast tree - MVPN mapping 1597 as part of the MVPN membership discovery. The root of the tree 1598 advertises, to each of the other PEs that belong to the MVPN, 1599 the P-multicast tree that the MVPN is associated with. This 1600 implies that the root doesn't need to know the leaves of the 1601 tree beforehand. This is possible only for receiver initiated 1602 trees e.g.
PIM based trees. 1604 [Editor's Note: PIM-SSM discovery in [ROSEN-8] is an example of 1605 the above] 1607 Both of the above require the BGP based advertisement to contain the 1608 P-multicast tree identifier. This identifier is encoded as a BGP 1609 attribute and contains the following elements: 1611 - Tunnel Type. 1613 - Tunnel identifier. The semantics of the identifier are determined 1614 by the tunnel type. 1616 6.4.3. Aggregate P-Multicast Trees 1618 Once a PE sets up an Aggregate Tree it needs to announce the C- 1619 multicast groups being mapped to this tree to other PEs in the 1620 network. This procedure is referred to as Aggregate Tree discovery. 1621 For an Aggregate Tree with an inclusive mapping this discovery 1622 implies announcing: 1624 - The mapping of all MVPNs mapped to the Tree. 1626 - For each MVPN mapped onto the tree the inner label allocated for 1627 it by the ingress PE. The use of this label is explained in the 1628 demultiplexing procedures of section 6.3.4. 1630 - The P-multicast tree Identifier 1632 The egress PE creates a logical interface corresponding to the tree 1633 identifier. This interface is the RPF interface for all the (C-Source, C-Group) entries mapped to that tree. 1636 When PIM is used to set up P-multicast trees, the egress PE also Joins 1637 the P-Group Address corresponding to the tree. This results in setup 1638 of the PIM P-multicast tree. 1640 6.5. I-PMSI Instantiation Using Ingress Replication 1642 As described in section 3 a PMSI can be instantiated using Unicast 1643 Tunnels between the PEs that are participating in the MVPN. In this 1644 mechanism the ingress PE replicates a C-multicast data packet 1645 belonging to a particular MVPN and sends a copy to all or a subset of 1646 the PEs that belong to the MVPN. A copy of the packet is tunneled to 1647 a remote PE over a Unicast Tunnel to the remote PE. IP/GRE Tunnels 1648 or MPLS LSPs are examples of unicast tunnels that may be used.
Note 1649 that the same Unicast Tunnel can be used to transport packets 1650 belonging to different MVPNs. 1652 Ingress replication can be used to instantiate a UI-PMSI. The PE sets 1653 up unicast tunnels to each of the remote PEs that support ingress 1654 replication. For a given MVPN all C-multicast data packets are sent 1655 to each of the remote PEs in the MVPN that support ingress 1656 replication. Hence a remote PE may receive C-multicast data packets 1657 for a group even if it doesn't have any receivers in that group. 1659 Ingress replication can also be used to instantiate a MI-PMSI. In 1660 this case each PE has a mesh of unicast tunnels to every other PE in 1661 that MVPN. 1663 However when ingress replication is used it is recommended that only 1664 S-PMSIs be used. Instantiation of S-PMSIs with ingress replication is 1665 described in section 7.2. Note that this requires the use of 1666 explicit tracking, i.e., a PE must know which of the other PEs have 1667 receivers for each C-multicast tree. 1669 6.6. Establishing P-Multicast Trees 1671 It is believed that the architecture outlined in this document places 1672 no limitations on the protocols used to instantiate P-multicast 1673 trees. However, the only protocols being explicitly considered are 1674 PIM-SM, PIM-SSM, PIM-Bidir, RSVP-TE, and mLDP. 1676 A P-multicast tree can be either a source tree or a shared tree. A 1677 source tree is used to carry traffic only for the multicast VRFs that 1678 exist locally on the root of the tree i.e. for which the root has 1679 local CEs. The root is a PE router. Source P-multicast trees can be 1680 instantiated using PIM-SM, PIM-SSM, RSVP-TE P2MP LSPs, and mLDP P2MP 1681 LSPs. 1683 A shared tree on the other hand can be used to carry traffic 1684 belonging to VRFs that exist on other PEs as well. The root of a 1685 shared tree is not necessarily one of the PEs in the MVPN. 
All PEs 1686 that use the shared tree will send MVPN data packets to the root of 1687 the shared tree; if PIM is being used as the control protocol, PIM 1688 control packets also get sent to the root of the shared tree. This 1689 may require a unicast tunnel between each of these PEs and the root. 1690 The root will then send them on the shared tree and all the PEs that 1691 are leaves of the shared tree will receive the packets. For example an 1692 RP based PIM-SM tree would be a shared tree. Shared trees can be 1693 instantiated using PIM-SM, PIM-SSM, PIM-Bidir, RSVP-TE P2MP LSPs, 1694 mLDP P2MP LSPs, and mLDP MP2MP LSPs. Aggregation support for 1695 bidirectional P-trees (i.e., PIM-Bidir trees or mLDP MP2MP trees) is 1696 for further study. Shared trees require all the PEs to discover the 1697 root of the shared tree for an MVPN. To achieve this the root of a 1698 shared tree advertises as part of the BGP based MVPN membership 1699 discovery: 1701 - The capability to set up a shared tree for a specified MVPN. 1703 - A downstream assigned label that is to be used by each PE to 1704 encapsulate an MVPN data packet, when it sends this packet to the 1705 root of the shared tree. 1707 - A downstream assigned label that is to be used by each PE to 1708 encapsulate an MVPN control packet, when it sends this packet to 1709 the root of the shared tree. 1711 Both a source tree and a shared tree can be used to instantiate an 1712 I-PMSI. If a source tree is used to instantiate a UI-PMSI for an 1713 MVPN, all the other PEs that belong to the MVPN must be leaves of 1714 the source tree. If a shared tree is used to instantiate a UI-PMSI 1715 for an MVPN, all the PEs that are members of the MVPN must be leaves 1716 of the shared tree. 1718 6.7. RSVP-TE P2MP LSPs 1720 This section describes procedures that are specific to the usage of 1721 RSVP-TE P2MP LSPs for instantiating a UI-PMSI. The RSVP-TE P2MP LSP 1722 can be either a source tree or a shared tree.
Procedures in [RSVP- 1723 P2MP] are used to signal the LSP. The LSP is signaled after the root 1724 of the LSP discovers the leaves. The egress PEs are discovered using 1725 the MVPN membership procedures described in section 4. RSVP-TE P2MP 1726 LSPs can optionally support aggregation. 1728 6.7.1. P2MP TE LSP Tunnel - MVPN Mapping 1730 P2MP TE LSP Tunnel to MVPN mapping can be learned at the egress PEs 1731 using either option (a) or option (b) described in section 6.4.2. 1732 Option (b), i.e., BGP based advertisement of the P2MP TE LSP Tunnel - 1733 MVPN mapping, requires that the root of the tree include the P2MP TE 1734 LSP Tunnel identifier as the tunnel identifier in the BGP 1735 advertisements. This identifier contains the following information 1736 elements: 1738 - The type of the tunnel is set to RSVP-TE P2MP Tunnel 1740 - RSVP-TE P2MP Tunnel's SESSION Object 1742 - Optionally RSVP-TE P2MP LSP's SENDER_TEMPLATE Object. This object 1743 is included when it is desired to identify a particular P2MP TE 1744 LSP. 1746 6.7.2. Demultiplexing C-Multicast Data Packets 1748 Demultiplexing the C-multicast data packets at the egress PE follows 1749 the procedures described in section 6.3.4. The RSVP-TE P2MP LSP Tunnel 1750 must be signaled with penultimate-hop-popping (PHP) off. Signaling 1751 the P2MP TE LSP Tunnel with PHP off requires an extension to RSVP-TE 1752 which will be described later. 1754 7. Optimizing Multicast Distribution via S-PMSIs 1756 Whenever a particular multicast stream is being sent on an I-PMSI, it 1757 is likely that the data of that stream is being sent to PEs that do 1758 not require it. If a particular stream has a significant amount of 1759 traffic, it may be beneficial to move it to an S-PMSI which includes 1760 only those PEs that are transmitters and/or receivers (or at least 1761 includes fewer PEs that are neither). 1763 If explicit tracking is being done, S-PMSI creation can also be 1764 triggered on other criteria.
For instance there could be a "pseudo 1765 wasted bandwidth" criterion: switching to an S-PMSI would be done if 1766 the bandwidth multiplied by the number of uninterested PEs (PEs that 1767 are receiving the stream but have no receivers) is above a specified 1768 threshold. The motivation is that (a) the total bandwidth wasted by 1769 many sparsely subscribed low-bandwidth groups may be large, and (b) 1770 there's no point in moving a high-bandwidth group to an S-PMSI if all 1771 the PEs have receivers for it. 1773 Switching a (C-S, C-G) stream to an S-PMSI may require the root of 1774 the S-PMSI to determine the egress PEs that need to receive the (C-S, 1775 C-G) traffic. This is true in the following cases: 1777 - If the tunnel is a source initiated tree, such as an RSVP-TE P2MP 1778 Tunnel, the PE needs to know the leaves of the tree before it can 1779 instantiate the S-PMSI. 1781 - If a PE instantiates multiple S-PMSIs, belonging to different 1782 MVPNs, using one P-multicast tree, such a tree is termed an 1783 Aggregate Tree with a selective mapping. The setting up of such 1784 an Aggregate Tree requires the ingress PE to know all the other 1785 PEs that have receivers for multicast groups that are mapped onto 1786 the tree. 1788 The above two cases require that explicit tracking be done for the 1789 (C-S, C-G) stream. The root of the S-PMSI MAY decide to do explicit 1790 tracking of this stream only after it has determined to move the 1791 stream to an S-PMSI, or it MAY have been doing explicit tracking all 1792 along. 1794 If the S-PMSI is instantiated by a P-multicast tree, the PE at the 1795 root of the tree must signal the leaves of the tree that the (C-S, 1796 C-G) stream is now bound to the S-PMSI. Note that the PE could 1797 create the identity of the P-multicast tree prior to the actual 1798 instantiation of the tunnel.
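The "pseudo wasted bandwidth" trigger mentioned earlier in this section can be illustrated with a short sketch. It assumes explicit tracking has already supplied the set of PEs that have receivers for the stream. The function names, the units (bits per second), and the strict "greater than" comparison are assumptions made for illustration and are not mandated by this document.

```python
from typing import Iterable, Set


def uninterested_pes(receiving_pes: Iterable[str],
                     pes_with_receivers: Iterable[str]) -> Set[str]:
    """PEs that receive the stream on the I-PMSI but have no receivers for it."""
    return set(receiving_pes) - set(pes_with_receivers)


def should_switch_to_spmsi(stream_bw_bps: float,
                           receiving_pes: Iterable[str],
                           pes_with_receivers: Iterable[str],
                           threshold_bps: float) -> bool:
    """Apply the criterion: bandwidth times uninterested PEs above a threshold."""
    wasted = stream_bw_bps * len(uninterested_pes(receiving_pes, pes_with_receivers))
    return wasted > threshold_bps
```

A 1 Mbit/s stream delivered to ten PEs of which only two have receivers wastes 8 Mbit/s in aggregate and would exceed a 5 Mbit/s threshold, whereas a 100 Mbit/s stream for which every PE has receivers wastes nothing and is never moved by this criterion.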
1800 If the S-PMSI is instantiated by a source-initiated P-multicast tree 1801 (e.g., an RSVP-TE P2MP tunnel), the PE at the root of the tree must 1802 establish the source-initiated P-multicast tree to the leaves. This 1803 tree MAY have been established before the leaves receive the S-PMSI 1804 binding, or MAY be established after the leaves receive the binding. 1805 The leaves MUST NOT switch to the S-PMSI until they receive both the 1806 binding and the tree signaling message. 1808 7.1. S-PMSI Instantiation Using Ingress Replication 1810 As described in section 6.1.1, ingress replication can be used to 1811 instantiate a UI-PMSI. However this can result in a PE receiving 1812 packets for a multicast group for which it doesn't have any 1813 receivers. This can be avoided if the ingress PE tracks the remote 1814 PEs which have receivers in a particular C-multicast group. In order 1815 to do this it needs to receive C-Joins from each of the remote PEs. 1816 It then replicates the C-multicast data packet and sends it to only 1817 those egress PEs which are on the path to a receiver of that C-group. 1818 It is possible that each PE that is using ingress replication 1819 instantiates only S-PMSIs. It is also possible that some PEs 1820 instantiate UI-PMSIs while others instantiate only S-PMSIs. In both 1821 these cases the PE MUST either unicast MVPN routing information using 1822 PIM or use BGP for exchanging the MVPN routing information. This is 1823 because there may be no MI-PMSI available for it to exchange MVPN 1824 routing information. 1826 Note that the use of ingress replication doesn't require any extra 1827 procedures for signaling the binding of the S-PMSI from the ingress 1828 PE to the egress PEs. The procedures described for I-PMSIs are 1829 sufficient. 1831 7.2. Protocol for Switching to S-PMSIs 1833 We describe two protocols for switching to S-PMSIs.
These protocols 1834 can be used when the tunnel that instantiates the S-PMSI is a P- 1835 multicast tree. 1837 7.2.1. A UDP-based Protocol for Switching to S-PMSIs 1839 This procedure can be used for any MVPN which has an MI-PMSI. 1840 Traffic from all multicast streams in a given MVPN is sent, by 1841 default, on the MI-PMSI. Consider a single multicast stream within a 1842 given MVPN, and consider a PE which is attached to a source of 1843 multicast traffic for that stream. The PE can be configured to move 1844 the stream from the MI-PMSI to an S-PMSI if certain configurable 1845 conditions are met. To do this, it needs to inform all the PEs which 1846 attach to receivers for the stream. These PEs need to start listening 1847 for traffic on the S-PMSI, and the transmitting PE may start sending 1848 traffic on the S-PMSI when it is reasonably certain that all 1849 receiving PEs are listening on the S-PMSI. 1851 7.2.1.1. Binding a Stream to an S-PMSI 1853 When a PE which attaches to a transmitter for a particular multicast 1854 stream notices that the conditions for moving the stream to an S-PMSI 1855 are met, it begins to periodically send an "S-PMSI Join Message" on 1856 the MI-PMSI. The S-PMSI Join is a UDP-encapsulated message whose 1857 destination address is ALL-PIM-ROUTERS (224.0.0.13), and whose 1858 destination port is 3232. 1860 The S-PMSI Join Message contains the following information: 1862 - An identifier for the particular multicast stream which is to be 1863 bound to the S-PMSI. This can be represented as an (S,G) pair. 1865 - An identifier for the particular S-PMSI to which the stream is to 1866 be bound. This identifier is a structured field which includes 1867 the following information: 1869 * The type of tunnel used to instantiate the S-PMSI 1870 * An identifier for the tunnel. The form of the identifier 1871 will depend upon the tunnel type.
The combination of tunnel 1872 identifier and tunnel type should contain enough information 1873 to enable all the PEs to "join" the tunnel and receive 1874 messages from it. 1876 * Any demultiplexing information needed by the tunnel 1877 encapsulation protocol to identify the particular S-PMSI. 1878 This allows a single tunnel to aggregate multiple S-PMSIs. 1879 If a particular tunnel is not aggregating multiple S-PMSIs, 1880 then no demultiplexing information is needed. 1882 A PE router which is not connected to a receiver will still receive 1883 the S-PMSI Joins, and MAY cache the information contained therein. 1884 Then if the PE later finds that it is attached to a receiver, it can 1885 immediately start listening to the S-PMSI. 1887 Upon receiving the S-PMSI Join, PE routers connected to receivers for 1888 the specified stream will take whatever action is necessary to start 1889 receiving multicast data packets on the S-PMSI. The precise action 1890 taken will depend upon the tunnel type. 1892 After a configurable delay, the PE router which is sending the S-PMSI 1893 Joins will start transmitting the stream's data packets on the S- 1894 PMSI. 1896 When the pre-configured conditions are no longer met for a particular 1897 stream, e.g. the traffic stops, the PE router connected to the source 1898 stops announcing S-PMSI Joins for that stream. Any PE that does not 1899 receive, over a configurable interval, an S-PMSI Join for a 1900 particular stream will stop listening to the S-PMSI. 1902 7.2.1.2. Packet Formats and Constants 1904 The S-PMSI Join message is encapsulated within UDP, and has the 1905 following type/length/value (TLV) encoding: 1907 0 1 2 3 1908 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1909 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1910 | Type | Length | Value | 1911 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1912 | . | 1913 | . 
| 1914 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1916 Type (8 bits) 1918 Length (16 bits): the total number of octets in the Type, Length, and 1919 Value fields combined 1921 Value (variable length) 1923 Currently only one type of S-PMSI Join is defined. A type 1 S-PMSI 1924 Join is used when the S-PMSI tunnel is a PIM tunnel which is used to 1925 carry a single multicast stream, where the packets of that stream 1926 have IPv4 source and destination IP addresses. 1928 0 1 2 3 1929 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1930 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1931 | Type | Length | Reserved | 1932 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1933 | C-source | 1934 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1935 | C-group | 1936 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1937 | P-group | 1938 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1940 Type (8 bits): 1 1942 Length (16 bits): 16 1944 Reserved (8 bits): This field SHOULD be zero when transmitted, and 1945 MUST be ignored when received. 1947 C-Source (32 bits): the IPv4 address of the traffic source in the 1948 VPN. 1950 C-Group (32 bits): the IPv4 address of the multicast traffic 1951 destination address in the VPN. 1953 P-Group (32 bits): the IPv4 group address that the PE router is going 1954 to use to encapsulate the flow (C-Source, C-Group). 1956 The P-group identifies the S-PMSI tunnel, and the (C-S, C-G) 1957 identifies the multicast flow that is carried in the tunnel. 1959 The protocol uses the following constants. 1961 [S-PMSI_DELAY]: 1963 the PE router which is to transmit onto the S-PMSI will delay 1964 this amount of time before it begins using the S-PMSI. The 1965 default value is 3 seconds. 
1967 [S-PMSI_TIMEOUT]: 1969 if a PE (other than the transmitter) does not receive any packets 1970 over the S-PMSI tunnel for this amount of time, the PE will prune 1971 itself from the S-PMSI tunnel, and will expect (C-S, C-G) packets 1972 to arrive on an I-PMSI. The default value is 3 minutes. This 1973 value must be consistent among PE routers. 1975 [S-PMSI_HOLDOWN]: 1977 if the PE that transmits onto the S-PMSI does not see any (C-S, 1978 C-G) packets for this amount of time, it will resume sending (C- 1979 S, C-G) packets on an I-PMSI. 1981 This is used to avoid oscillation when traffic is bursty. The 1982 default value is 1 minute. 1984 [S-PMSI_INTERVAL] 1985 the interval the transmitting PE router uses to periodically send 1986 the S-PMSI Join message. The default value is 60 seconds. 1988 7.2.2. A BGP-based Protocol for Switching to S-PMSIs 1990 This procedure can be used for a MVPN that is using either a UI-PMSI 1991 or a MI-PMSI. Consider a single multicast stream for a C-(S, G) 1992 within a given MVPN, and consider a PE which is attached to a source 1993 of multicast traffic for that stream. The PE can be configured to 1994 move the stream from the MI-PMSI or UI-PMSI to an S-PMSI if certain 1995 configurable conditions are met. Once a PE decides to move the C-(S, 1996 G) for a given MVPN to a S-PMSI, it needs to instantiate the S-PMSI 1997 using a tunnel and announce to all the egress PEs, that are on the 1998 path to receivers of the C-(S, G), of the binding of the S-PMSI to 1999 the C-(S, G). The announcement is done using BGP. Depending on the 2000 tunneling technology used, this announcement may be done before or 2001 after setting up the tunnel. The source and egress PEs have to switch 2002 to using the S-PMSI for the C-(S, G). 2004 7.2.2.1. Advertising C-(S, G) Binding to a S-PMSI using BGP 2006 The ingress PE informs all the PEs that are on the path to receivers 2007 of the C-(S, G) of the binding of the S-PMSI to the C-(S, G). 
The BGP 2008 announcement is done by sending an update for the MCAST-VPN address 2009 family. An A-D route is used, containing the following information: 2011 a) IP address of the originating PE 2013 b) The RD configured locally for the MVPN. This is required to 2014 uniquely identify the (C-Source, C-Group) pair, as the addresses 2015 could overlap between different MVPNs. This is the same RD 2016 value used in the auto-discovery process. 2018 c) The C-Source address. This address can be a prefix in order to 2019 allow a range of C-Source addresses to be mapped to an 2020 Aggregate Tree. 2022 d) The C-Group address. This address can be a range in order to 2023 allow a range of C-Group addresses to be mapped to an Aggregate 2024 Tree. 2026 e) A PE MAY aggregate two or more S-PMSIs originated by the PE 2027 onto the same P-Multicast tree. If the PE already advertises 2028 S-PMSI auto-discovery routes for these S-PMSIs, then 2029 aggregation requires the PE to re-advertise these routes. The 2030 re-advertised routes MUST be the same as the original ones, 2031 except for the PMSI Tunnel attribute. If the PE has not 2032 previously advertised S-PMSI auto-discovery routes for these 2033 S-PMSIs, then the aggregation requires the PE to advertise 2034 (new) S-PMSI auto-discovery routes for these S-PMSIs. The PMSI 2035 Tunnel attribute in the newly advertised/re-advertised routes 2036 MUST carry the identity of the P-Multicast tree that 2037 aggregates the S-PMSIs. If at least some of the S-PMSIs 2038 aggregated onto the same P-Multicast tree belong to different 2039 MVPNs, then all these routes MUST carry an MPLS upstream 2040 assigned label [MPLS-UPSTREAM-LABEL, section 6.3.4]. If all 2041 these aggregated S-PMSIs belong to the same MVPN, then the 2042 routes MAY carry an MPLS upstream assigned label 2043 [MPLS-UPSTREAM-LABEL]. The labels MUST be distinct on a per MVPN 2044 basis, and MAY be distinct on a per route basis.
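The label rules in item e) can be illustrated with a small consistency check. The Python sketch below is not part of the protocol. The `SPmsiRoute` record, its field names, and `check_aggregation_labels` are hypothetical names chosen for illustration, and the RD is assumed to stand in for the identity of the MVPN. The sketch checks that every route on a P-Multicast tree shared by several MVPNs carries an upstream-assigned label, and that no label value is used by more than one MVPN on the same tree.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SPmsiRoute:
    """Hypothetical, simplified view of an S-PMSI A-D route."""
    rd: str                # RD of the MVPN (assumed to identify the MVPN)
    c_source: str
    c_group: str
    tree_id: str           # P-Multicast tree named in the PMSI Tunnel attribute
    label: Optional[int]   # MPLS upstream-assigned label, if carried


def check_aggregation_labels(routes: List[SPmsiRoute]) -> List[str]:
    """Return a list of rule violations; an empty list means the rules hold."""
    by_tree = defaultdict(list)
    for route in routes:
        by_tree[route.tree_id].append(route)

    problems = []
    for tree_id, members in by_tree.items():
        mvpns = {r.rd for r in members}
        if len(mvpns) > 1:
            # Several MVPNs share the tree: every route MUST carry a label.
            for r in members:
                if r.label is None:
                    problems.append(
                        f"{tree_id}: route {r.rd} ({r.c_source}, {r.c_group}) "
                        "lacks a label")
        # Labels MUST be distinct on a per-MVPN basis.
        owner = {}
        for r in members:
            if r.label is None:
                continue
            if owner.setdefault(r.label, r.rd) != r.rd:
                problems.append(
                    f"{tree_id}: label {r.label} used by more than one MVPN")
    return problems
```

Routes of a single MVPN that share a tree without labels pass the check, which matches the MAY in item e).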
2046 When a PE distributes this information via BGP, it must include the 2047 following: 2049 1. An identifier for the particular S-PMSI to which the stream is 2050 to be bound. This identifier is a structured field which 2051 includes the following information: 2053 * The type of tunnel used to instantiate the S-PMSI 2055 * An identifier for the tunnel. The form of the identifier 2056 will depend upon the tunnel type. The combination of 2057 tunnel identifier and tunnel type should contain enough 2058 information to enable all the PEs to "join" the tunnel and 2059 receive messages from it. 2061 2. Route Target Extended Communities attribute. This is used as 2062 described in section 4. 2064 7.2.2.2. Explicit Tracking 2066 If the PE wants to enable explicit tracking for the specified flow, 2067 it also indicates this in the A-D route it uses to bind the flow to a 2068 particular S-PMSI. Then any PE which receives the A-D route will 2069 respond with a "Leaf A-D Route" in which it identifies itself as a 2070 receiver of the specified flow. The Leaf A-D route will be withdrawn 2071 when the PE is no longer a receiver for the flow. 2073 If the PE needs to enable explicit tracking for a flow before binding 2074 the flow to an S-PMSI, it can do so by sending an A-D route 2075 identifying the flow but not specifying an S-PMSI. This will elicit 2076 the Leaf A-D Routes. This is useful when the PE needs to know the 2077 receivers before selecting an S-PMSI. 2079 7.2.2.3. Switching to S-PMSI 2081 After the egress PEs receive the announcement they set up their 2082 forwarding path to receive traffic on the S-PMSI if they have one or 2083 more receivers interested in the (C-S, C-G) bound to the S-PMSI. This 2084 involves changing the RPF interface for the relevant (C-S, C-G) 2085 entries to the interface that is used to instantiate the S-PMSI.
If 2086 an Aggregate Tree is used to instantiate an S-PMSI this also implies 2087 setting up the demultiplexing forwarding entries based on the inner 2088 label as described in section 6.3.4. The egress PEs may perform the 2089 switch to the S-PMSI once the advertisement from the ingress PE is 2090 received or wait for a preconfigured timer to do so. 2092 A source PE may use one of two approaches to decide when to start 2093 transmitting data on the S-PMSI. In the first approach, once the 2094 source PE instantiates the S-PMSI, it starts sending multicast 2095 packets for the (C-S, C-G) entries mapped to the S-PMSI on both the S-PMSI 2096 and the I-PMSI, which is currently used to send traffic for 2097 the (C-S, C-G). After some preconfigured timer the PE stops sending 2098 multicast packets for the (C-S, C-G) on the I-PMSI. In the second 2099 approach, after a certain preconfigured delay after advertising the 2100 (C-S, C-G) entry bound to an S-PMSI, the source PE begins to send 2101 traffic on the S-PMSI. At this point it stops sending traffic for the 2102 (C-S, C-G) on the I-PMSI. This traffic is instead transmitted on the 2103 S-PMSI. 2105 7.3. Aggregation 2107 S-PMSIs can be aggregated on a P-multicast tree. The S-PMSI to C-(S, 2108 G) binding advertisement supports aggregation. Furthermore the 2109 aggregation procedures of section 6.3 apply. It is also possible to 2110 aggregate both S-PMSIs and I-PMSIs on the same P-multicast tree. 2112 7.4. Instantiating the S-PMSI with a PIM Tree 2114 The procedures of section 7.3 tell a PE when it must start listening 2115 and stop listening to a particular S-PMSI. Those procedures also 2116 specify the method for instantiating the S-PMSI. In this section, we 2117 provide the procedures to be used when the S-PMSI is instantiated as 2118 a PIM tree. The PIM tree is created by the PIM P-instance. 2120 If a single PIM tree is being used to aggregate multiple S-PMSIs, 2121 then the PIM tree to which a given stream is bound may have already 2122 been joined by a given receiving PE.
If the tree does not already 2123 exist, then the appropriate PIM procedures to create it must be 2124 executed in the P-instance. 2126 If the S-PMSI for a particular multicast stream is instantiated as a 2127 PIM-SM or PIM-Bidir tree, the S-PMSI identifier will specify the RP 2128 and the group P-address, and the PE routers which have receivers for 2129 that stream must build a shared tree toward the RP. 2131 If the S-PMSI is instantiated as a PIM-SSM tree, the PE routers build 2132 a source tree toward the PE router that is advertising the S-PMSI 2133 Join. The IP address of the root of the tree is the same as the source IP 2134 address which appears in the S-PMSI Join. In this case, the tunnel 2135 identifier in the S-PMSI Join will only need to specify a group 2136 P-address. 2138 The above procedures assume that each PE router has a set of group 2139 P-addresses that it can use for setting up the PIM-trees. Each PE 2140 must be configured with this set of P-addresses. If PIM-SSM is used 2141 to set up the tunnels, then the PEs may be configured with overlapping sets of 2142 group P-addresses. If PIM-SSM is not used, then each PE must be 2143 configured with a unique set of group P-addresses (i.e., having no 2144 overlap with the set configured at any other PE router). The 2145 management of this set of addresses is thus greatly simplified when 2146 PIM-SSM is used, so the use of PIM-SSM is strongly recommended 2147 whenever PIM trees are used to instantiate S-PMSIs. 2149 If it is known that all the PEs which need to receive data traffic on 2150 a given S-PMSI can support aggregation of multiple S-PMSIs on a 2151 single PIM tree, then the transmitting PE may, at its discretion, 2152 decide to bind the S-PMSI to a PIM tree which is already bound to 2153 one or more other S-PMSIs, from the same or from different MVPNs. In 2154 this case, appropriate demultiplexing information must be signaled. 2156 7.5.
Instantiating S-PMSIs using RSVP-TE P2MP Tunnels 2158 RSVP-TE P2MP Tunnels can be used for instantiating S-PMSIs. 2159 Procedures described in the context of I-PMSIs in section 6.7 apply. 2161 8. Inter-AS Procedures 2163 If an MVPN has sites in more than one AS, it requires one or more 2164 PMSIs to be instantiated by inter-AS tunnels. This document 2165 describes two different types of inter-AS tunnel: 2167 1. "Segmented Inter-AS tunnels" 2169 A segmented inter-AS tunnel consists of a number of independent 2170 segments which are stitched together at the ASBRs. There are 2171 two types of segment, inter-AS segments and intra-AS segments. 2172 The segmented inter-AS tunnel consists of alternating intra-AS 2173 and inter-AS segments. 2175 Inter-AS segments connect adjacent ASBRs of different ASes; 2176 these "one-hop" segments are instantiated as unicast tunnels. 2178 Intra-AS segments connect ASBRs and PEs which are in the same 2179 AS. An intra-AS segment may be of whatever technology is 2180 desired by the SP that administers that AS. Different 2181 intra-AS segments may be of different technologies. 2183 Note that an intra-AS segment of an inter-AS tunnel is distinct 2184 from any intra-AS tunnel in the AS. 2186 A segmented inter-AS tunnel can be thought of as a tree which 2187 is rooted at a particular AS, and which has as its leaves the 2188 other ASes which need to receive multicast data from the root 2189 AS. 2191 2. "Non-segmented Inter-AS tunnels" 2193 A non-segmented inter-AS tunnel is a single tunnel which spans 2194 AS boundaries. The tunnel technology cannot change from one 2195 point in the tunnel to the next, so all ASes through which the 2196 tunnel passes must support that technology. In essence, AS 2197 boundaries are of no significance to a non-segmented inter-AS 2198 tunnel. 2200 [Editor's Note: This is the model in [ROSEN-8] and 2201 [MVPN-BASE].]
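The alternation of intra-AS and inter-AS segments in a segmented tunnel can be made concrete with a short sketch. The Python function below is illustrative only: `segments_along_path` is a hypothetical helper, and it assumes that one root-to-leaf branch of the tunnel can be described by an ordered list of AS numbers.

```python
from typing import List, Tuple


def segments_along_path(as_path: List[int]) -> List[Tuple]:
    """List the segments of one branch of a segmented inter-AS tunnel.

    as_path is the ordered list of ASes from the root AS to a leaf AS
    (an assumption made for this sketch). Each AS contributes one
    intra-AS segment, and each pair of adjacent ASes contributes one
    "one-hop" inter-AS segment between their ASBRs.
    """
    segments = []
    for i, asn in enumerate(as_path):
        segments.append(("intra-AS", asn))
        if i + 1 < len(as_path):
            segments.append(("inter-AS", asn, as_path[i + 1]))
    return segments
```

A non-segmented tunnel has no counterpart to this list, since it is a single tunnel of one technology across all ASes it traverses.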
2203 Section 10 of [RFC4364] describes three different options for 2204 supporting unicast Inter-AS BGP/MPLS IP VPNs, known as options A, B, 2205 and C. We describe below how both segmented and non-segmented 2206 inter-AS trees can be supported when option B or option C is used. 2207 (Option A does not pass any routing information through an ASBR at 2208 all, so no special inter-AS procedures are needed.) 2210 8.1. Non-Segmented Inter-AS Tunnels 2212 In this model, the previously described discovery and tunnel setup 2213 mechanisms are used, even though the PEs belonging to a given MVPN 2214 may be in different ASes. The ASBRs play no special role, but 2215 function merely as P routers. 2217 8.1.1. Inter-AS MVPN Auto-Discovery 2219 The previously described BGP-based auto-discovery mechanisms work "as 2220 is" when an MVPN contains PEs that are in different Autonomous 2221 Systems. 2223 8.1.2. Inter-AS MVPN Routing Information Exchange 2225 MVPN routing information exchange can be done by PIM peering (either 2226 lightweight or full) across an MI-PMSI, or by unicasting PIM 2227 messages. The method of using BGP to send MVPN routing information 2228 can also be used. 2230 If any form of PIM peering is used, a PE that sends C-PIM Join/Prune 2231 messages for a particular C-(S,G) must be able to identify the PE 2232 which is its PIM adjacency on the path to S. The identity of the PIM 2233 adjacency is determined from the RPF information associated with the 2234 VPN-IP route to S. 2236 If no RPF information is present, then the identity of the PIM 2237 adjacency is taken from the BGP Next Hop attribute of the VPN-IP 2238 route to S. Note that this will not give the correct result if 2239 option b of section 10 of [RFC4364] is used. To avoid this 2240 possibility of error, the RPF information SHOULD always be present if 2241 MVPN routing information is to be distributed by PIM. 
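The selection of the PIM adjacency described above can be sketched as a simple fallback rule. The following Python fragment is a minimal illustration, not a definitive implementation. The `VpnIpRoute` record and its field names are hypothetical, and the RPF information is assumed to be available as the address of the upstream PE.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VpnIpRoute:
    """Hypothetical, simplified view of the VPN-IP route to S."""
    prefix: str
    bgp_next_hop: str
    rpf_info: Optional[str] = None  # upstream PE address, if carried


def upstream_pim_neighbor(route: VpnIpRoute) -> str:
    """Pick the PE to which C-PIM Join/Prune messages are sent."""
    if route.rpf_info is not None:
        # Preferred: the RPF information names the upstream PE directly.
        return route.rpf_info
    # Fallback: use the BGP Next Hop. As noted above, this is not
    # correct when option b of [RFC4364] section 10 is in use.
    return route.bgp_next_hop
```

The fallback branch is the case the text warns about, which is why the RPF information SHOULD be present whenever PIM distributes the MVPN routing information.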
2243 If BGP (rather than PIM) is used to distribute the MVPN routing 2244 information, and if option b of section 10 of [RFC4364] is in use, 2245 then the MVPN routes will be installed in the ASBRs along the path 2246 from each multicast source in the MVPN to each multicast receiver in 2247 the MVPN. If option b is not in use, the MVPN routes are not 2248 installed in the ASBRs. The handling of MVPN routes in either case 2249 is thus exactly analogous to the handling of unicast VPN-IP routes in 2250 the corresponding case. 2252 8.1.3. Inter-AS I-PMSI 2254 The procedures described earlier in this document can be used to 2255 instantiate an I-PMSI with inter-AS tunnels. Specific tunneling 2256 techniques require some explanation: 2258 1. If ingress replication is used, the inter-AS PE-PE tunnels will 2259 use the inter-AS tunneling procedures for the tunneling 2260 technology used. 2262 2. Inter-AS PIM-SM or PIM-SSM based trees rely on a PE joining a 2263 (P-S, P-G) tuple where P-S is the address of a PE in another 2264 AS. This (P-S, P-G) tuple is learned using the MVPN membership 2265 and BGP MVPN-tunnel binding procedures described earlier. 2266 However, if the source of the tree is in a different AS than a 2267 particular P router, it is possible that the P router will not 2268 have a route to the source. For example, the remote AS may be 2269 using BGP to distribute a route to the source, but a particular 2270 P router may be part of a "BGP-free core", in which the P 2271 routers are not aware of BGP-distributed routes. 2273 In such a case it is necessary for a PE to tell PIM to 2274 construct the tree through a particular BGP speaker, the "BGP 2275 next hop" for the tree source. This can be accomplished with a 2276 PIM extension, in which the P-PIM Join/Prune messages carry a 2277 new "proxy" field which contains the address of that BGP next 2278 hop.
As the P-multicast tree is constructed, it is built 2279 towards the proxy (the BGP next hop) rather than towards P-S, 2280 so the P routers will not need to have a route to P-S. 2282 Support for inter-AS trees using PIM-Bidir is for further 2283 study. 2285 When the BGP-based discovery procedures for MVPN are in place, 2286 one can distinguish two different inter-AS routes to a 2287 particular P-S: 2289 - BGP will install a unicast route to P-S along a particular 2290 path, using the IP AFI/SAFI; 2292 - A PE's MVPN auto-discovery information is advertised by 2293 sending a BGP update whose NLRI is in a special address 2294 family (AFI/SAFI) used for this purpose. The NLRI of the 2295 address family contains the IP address of the PE, as well 2296 as an RD. If the NLRI contains the IP address of P-S, this 2297 in effect creates a second route to P-S. This route might 2298 follow a different path than the route in the unicast IP 2299 family. 2301 When building a PIM tree towards P-S, it may be desirable to 2302 build it along the route on which the MVPN auto-discovery 2303 AFI/SAFI is installed, rather than along the route on which the 2304 IP AFI/SAFI is installed. This enables the inter-AS portion of 2305 the tree to follow a path which is specifically chosen for 2306 multicast (i.e., it allows the inter-AS multicast topology to 2307 be "non-congruent" to the inter-AS unicast topology). 2309 In order for P routers to send P-Join/Prune messages along this 2310 path, they need to make use of the "proxy" field extension 2311 discussed above. The PIM message must also contain the full 2312 NLRI in the MVPN auto-discovery family, so that the BGP 2313 speakers can look up that NLRI to find the BGP next hop. 2315 3. Procedures in [RSVP-P2MP] are used for inter-AS RSVP-TE P2MP 2316 Tunnels. 2318 8.1.4. Inter-AS S-PMSI 2320 The leaves of the tunnel are discovered using the MVPN routing 2321 information.
Procedures for setting up the tunnel are similar to the 2322 ones described in section 8.2.3 for an inter-AS I-PMSI. 2324 8.2. Segmented Inter-AS Tunnels 2326 8.2.1. Inter-AS MVPN Auto-Discovery Routes 2328 The BGP based MVPN membership discovery procedures of section 4 are 2329 used to auto-discover the intra-AS MVPN membership. This section 2330 describes the additional procedures for inter-AS MVPN membership 2331 discovery. It also describes the procedures for constructing 2332 segmented inter-AS tunnels. 2334 In this case, for a given MVPN in an AS, the objective is to form a 2335 spanning tree of MVPN membership, rooted at the AS. The nodes of this 2336 tree are ASes. The leaves of this tree are only those ASes that have 2337 at least one PE with a member in the MVPN. The inter-AS tunnel used 2338 to instantiate an inter-AS PMSI must traverse this spanning tree. A 2339 given AS needs to announce to another AS only the fact that it has 2340 membership in a given MVPN. It doesn't need to announce the 2341 membership of each PE in the AS to other ASes. 2343 This section defines an inter-AS auto-discovery route as a route that 2344 carries information about an AS that has one or more PEs (directly) 2345 connected to the site(s) of that MVPN. Further it defines an inter-AS 2346 leaf auto-discovery route (leaf auto-discovery route) as a route used 2347 to inform the root of an intra-AS segment, of an inter-AS tunnel, of 2348 a leaf of that intra-AS segment. 2350 8.2.1.1. Originating Inter-AS MVPN A-D Information 2352 A PE in a given AS advertises its MVPN membership to all its IBGP 2353 peers. This IBGP peer may be a route reflector which in turn 2354 advertises this information to only its IBGP peers. In this manner 2355 all the PEs and ASBRs in the AS learn this membership information. 2357 An Autonomous System Border Router (ASBR) may be configured to 2358 support a particular MVPN. 
If an ASBR is configured to support a 2359 particular MVPN, the ASBR MUST participate in the intra-AS MVPN 2360 auto-discovery/binding procedures for that MVPN within the AS that 2361 the ASBR belongs to, as defined in this document. 2363 Each ASBR then advertises the "AS MVPN membership" to its neighbor 2364 ASBRs using EBGP. This inter-AS auto-discovery route must not be 2365 advertised to the PEs/ASBRs in the same AS as this ASBR. The 2366 advertisement carries the following information elements: 2368 a. A Route Distinguisher for the MVPN. For a given MVPN each ASBR 2369 in the AS must use the same RD when advertising this 2370 information to other ASBRs. To accomplish this, all the ASBRs 2371 within that AS that are configured to support the MVPN MUST 2372 be configured with the same RD for that MVPN. This RD MUST be 2373 of Type 0 and MUST embed the autonomous system number of the AS. 2375 b. The announcing ASBR's local address as the next-hop for the 2376 above information elements. 2378 c. By default the BGP Update message MUST carry export Route 2379 Targets used by the unicast routing of that VPN. The default 2380 could be modified via configuration by having a set of Route 2381 Targets used for the inter-AS auto-discovery routes being 2382 distinct from the ones used by the unicast routing of that VPN. 2384 8.2.1.2. Propagating Inter-AS MVPN A-D Information 2386 As an inter-AS auto-discovery route originated by an ASBR within a 2387 given AS is propagated via BGP to other ASes, this results in 2388 creation of a data plane tunnel that spans multiple ASes. This tunnel 2389 is used to carry (multicast) traffic from the MVPN sites connected to 2390 the PEs of the AS to the MVPN sites connected to the PEs that are in 2391 the other ASes. Such a tunnel consists of multiple intra-AS segments 2392 (one per AS) stitched at ASBRs' boundaries by single hop 2393 LSP segments.
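Item a of section 8.2.1.1 requires a Type 0 RD that embeds the AS number. As an illustration, the Python sketch below encodes such an RD using the layout given in section 4.2 of [RFC4364]: a 2-octet Type field, a 2-octet Administrator subfield holding the AS number, and a 4-octet Assigned Number subfield. The function names `encode_rd_type0` and `rd_embeds_asn` are hypothetical.

```python
import struct


def encode_rd_type0(asn: int, assigned_number: int) -> bytes:
    """Encode a Type 0 Route Distinguisher (8 octets, network byte order)."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("a Type 0 RD carries a 2-octet AS number")
    if not 0 <= assigned_number <= 0xFFFFFFFF:
        raise ValueError("assigned number must fit in 4 octets")
    # Type (2 octets) = 0, Administrator = ASN, Assigned Number.
    return struct.pack("!HHI", 0, asn, assigned_number)


def rd_embeds_asn(rd: bytes, asn: int) -> bool:
    """Check that an RD is of Type 0 and embeds the given AS number."""
    rd_type, administrator, _assigned = struct.unpack("!HHI", rd)
    return rd_type == 0 and administrator == asn
```

An ASBR implementation could apply a check like `rd_embeds_asn` to its configured RD before originating the inter-AS auto-discovery route, though the document does not mandate where such validation takes place.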
2395 An ASBR originates creation of an intra-AS segment when the ASBR 2396 receives an inter-AS auto-discovery route from an EBGP neighbor. 2397 Creation of the segment is completed as a result of distributing via 2398 IBGP this route within the ASBR's own AS. 2400 For a given inter-AS tunnel each of its intra-AS segments could be 2401 constructed by its own independent mechanism. Moreover, by using 2402 upstream labels within a given AS multiple intra-AS segments of 2403 different inter-AS tunnels of either the same or different MVPNs may 2404 share the same P-Multicast Tree. 2406 Since (aggregated) inter-AS auto-discovery routes have granularity of 2407 (AS, MVPN), an MVPN that is present in N ASes would have a total of N 2408 inter-AS tunnels. Thus for a given MVPN the number of inter-AS 2409 tunnels is independent of the number of PEs that have this MVPN. 2411 The following sections specify procedures for propagation of 2412 (aggregated) inter-AS auto-discovery routes across ASes. 2414 8.2.1.2.1. Inter-AS Auto-Discovery Route received via EBGP 2416 When an ASBR receives from one of its EBGP neighbors a BGP Update 2417 message that carries the inter-AS auto-discovery route, if (a) at 2418 least one of the Route Targets carried in the message matches one of 2419 the import Route Targets configured on the ASBR, and (b) the ASBR 2420 determines that the received route is the best route to the 2421 destination carried in the NLRI of the route, the ASBR: 2423 a) Re-advertises this inter-AS auto-discovery route within its own 2424 AS. 2426 If the ASBR uses ingress replication to instantiate the 2427 intra-AS segment of the inter-AS tunnel, the re-advertised route 2428 SHOULD carry a Tunnel attribute with the Tunnel Identifier set 2429 to Ingress Replication, but no MPLS labels.
2431 If a P-Multicast Tree is used to instantiate the intra-AS 2432 segment of the inter-AS tunnel, and in order to advertise the 2433 P-Multicast tree identifier the ASBR doesn't need to know the 2434 leaves of the tree beforehand, then the advertising ASBR SHOULD 2435 advertise the P-Multicast tree identifier in the Tunnel 2436 Identifier of the Tunnel attribute. This, in effect, creates a 2437 binding between the inter-AS auto-discovery route and the 2438 P-Multicast Tree. 2440 If a P-Multicast Tree is used to instantiate the intra-AS 2441 segment of the inter-AS tunnel, and in order to advertise the 2442 P-Multicast tree identifier the advertising ASBR needs to know 2443 the leaves of the tree beforehand, the ASBR first discovers the 2444 leaves using the Auto-Discovery procedures, as specified 2445 further down. It then advertises the binding of the tree to the 2446 inter-AS auto-discovery route using the original 2447 auto-discovery route with the addition of carrying in the route the 2448 Tunnel attribute that contains the type and the identity of the 2449 tree (encoded in the Tunnel Identifier of the attribute). 2451 b) Re-advertises the received inter-AS auto-discovery route to its 2452 EBGP peers, other than the EBGP neighbor from which the best 2453 inter-AS auto-discovery route was received. 2455 c) Advertises to its neighbor ASBR, from which it received the 2456 best inter-AS auto-discovery route to the destination carried in 2457 the NLRI of the route, a leaf auto-discovery route that carries 2458 an ASBR-ASBR tunnel binding with the tunnel identifier set to 2459 ingress replication. This binding, as described in section 6, can 2460 be used by the neighbor ASBR to send traffic to this ASBR. 2462 8.2.1.2.2. Leaf Auto-Discovery Route received via EBGP 2464 When an ASBR receives via EBGP a leaf auto-discovery route, the ASBR 2465 finds an inter-AS auto-discovery route that has the same RD as the 2466 leaf auto-discovery route.
The MPLS label carried in the leaf auto-discovery 2467 route is used to stitch a one hop ASBR-ASBR LSP to the tail 2468 of the intra-AS tunnel segment associated with the inter-AS 2469 auto-discovery route. 2471 8.2.1.2.3. Inter-AS Auto-Discovery Route received via IBGP 2473 If a given inter-AS auto-discovery route is advertised within an AS 2474 by multiple ASBRs of that AS, the BGP best route selection performed 2475 by other PE/ASBR routers within the AS does not require all these 2476 PE/ASBR routers to select the route advertised by the same ASBR; on 2477 the contrary, different PE/ASBR routers may select routes advertised 2478 by different ASBRs. 2480 Further, when a PE/ASBR receives from one of its IBGP neighbors a BGP 2481 Update message that carries an AS MVPN membership tree, if (a) the 2482 route was originated outside of the router's own AS, (b) at least one 2483 of the Route Targets carried in the message matches one of the import 2484 Route Targets configured on the PE/ASBR, and (c) the PE/ASBR 2485 determines that the received route is the best route to the 2486 destination carried in the NLRI of the route, if the router is an 2487 ASBR then the ASBR propagates the route to its EBGP neighbors. In 2488 addition the PE/ASBR performs the following. 2490 If the received inter-AS auto-discovery route carries the Tunnel 2491 attribute with the Tunnel Identifier set to LDP P2MP LSP, or PIM-SSM 2492 tree, or PIM-SM tree, the PE/ASBR SHOULD join the P-Multicast tree 2493 whose identity is carried in the Tunnel Identifier. 2495 If the received source auto-discovery route carries the Tunnel 2496 attribute with the Tunnel Identifier set to RSVP-TE P2MP LSP, then 2497 the ASBR that originated the route MUST signal the local PE/ASBR as 2498 one of the leaf LSRs of the RSVP-TE P2MP LSP. This signaling MAY have 2499 been completed before the local PE/ASBR receives the BGP Update 2500 message.
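The acceptance conditions (a) through (c) and the two tunnel cases described so far in this section can be summarized in a short sketch. The Python code below is an illustration under stated assumptions rather than a definitive implementation. The `InterAsAdRoute` record, the tunnel type strings, and both function names are hypothetical, and the result of BGP best route selection is assumed to be supplied by the caller.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set

# Tunnel types for which the receiving PE/ASBR SHOULD join the tree itself.
JOINABLE_TUNNEL_TYPES = {"LDP-P2MP", "PIM-SSM", "PIM-SM"}


@dataclass(frozen=True)
class InterAsAdRoute:
    """Hypothetical, simplified inter-AS auto-discovery route."""
    origin_as: int
    route_targets: FrozenSet[str]
    tunnel_type: Optional[str] = None  # None: no Tunnel attribute


def should_process(route: InterAsAdRoute, local_as: int,
                   import_rts: Set[str], is_best_route: bool) -> bool:
    """Apply conditions (a), (b), and (c) to a route received via IBGP."""
    originated_outside = route.origin_as != local_as           # (a)
    rt_match = bool(route.route_targets & set(import_rts))     # (b)
    return originated_outside and rt_match and is_best_route   # (c)


def tunnel_action(route: InterAsAdRoute) -> str:
    """Map the Tunnel Identifier to the action described in the text."""
    if route.tunnel_type in JOINABLE_TUNNEL_TYPES:
        return "join-tree"
    if route.tunnel_type == "RSVP-TE-P2MP":
        # The originating ASBR signals this router as a leaf LSR.
        return "await-rsvp-te-signaling"
    # Remaining cases are outside the scope of this sketch.
    return "not-covered-here"
```

An ASBR for which `should_process` returns true would also propagate the route to its EBGP neighbors, a step this sketch leaves out.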
2502 If the NLRI of the route does not carry a label, then this tree is an 2503 intra-AS LSP segment that is part of the inter-AS Tunnel for the MVPN 2504 advertised by the inter-AS auto-discovery route. If the NLRI carries 2505 a (upstream) label, then a combination of this tree and the label 2506 identifies the intra-AS segment. 2508 If this is an ASBR, this intra-AS segment may further be stitched to 2509 ASBR-ASBR inter-AS segment of the inter-AS tunnel. If the PE/ASBR has 2510 local receivers in the MVPN, packets received over the intra-AS 2511 segment must be forwarded to the local receivers using the local VRF. 2513 If the received inter-AS auto-discovery route either does not carry 2514 the Tunnel attribute, or carries the Tunnel attribute with the Tunnel 2515 Identifier set to ingress replication, then the PE/ASBR originates a 2516 new auto-discovery route to allow the ASBR from which the auto- 2517 discovery route was received, to learn of this ASBR as a leaf of the 2518 intra-AS tree. 2520 Thus the AS MVPN membership information propagates across multiple 2521 ASes along a spanning tree. BGP AS-Path based loop prevention 2522 mechanism prevents loops from forming as this information propagates. 2524 8.2.2. Inter-AS MVPN Routing Information Exchange 2526 All of the MVPN routing information exchange methods specified in 2527 section 5 can be supported across ASes. 2529 The objective in this case is to propagate the MVPN routing 2530 information to the remote PE that originates the unicast route to C- 2531 S/C-RP, in the reverse direction of the AS MVPN membership 2532 information announced by the remote PE's origin AS. This information 2533 is processed by each ASBR along this reverse path. 2535 To achieve this the PE that is generating the MVPN routing 2536 advertisement, first determines the source AS of the unicast route to 2537 C-S/C-RP. 
   It then determines from the received AS MVPN membership
   information, for the source AS, the ASBR that is the next-hop for the
   best path of the source AS MVPN membership. The BGP MVPN routing
   update is sent to this ASBR and the ASBR then further propagates the
   BGP advertisement. BGP filtering mechanisms ensure that the BGP MVPN
   routing information updates flow only to the upstream router on the
   reverse path of the inter-AS MVPN membership tree. Details of this
   filtering mechanism and the relevant encoding will be specified in a
   separate document.

8.2.3. Inter-AS I-PMSI

   All PEs in a given AS use the same inter-AS heterogeneous tunnel,
   rooted at the AS, to instantiate an I-PMSI for an inter-AS MVPN
   service. As explained earlier, the intra-AS tunnel segments that
   comprise this tunnel can be built using different tunneling
   technologies. To instantiate an MI-PMSI service for an MVPN there
   must be an inter-AS tunnel rooted at each AS that has at least one PE
   that is a member of the MVPN.

   A C-multicast data packet is sent using an intra-AS tunnel segment by
   the PE that first receives this packet from the MVPN customer site.
   An ASBR forwards this packet to any locally connected MVPN receivers
   for the multicast stream. If this ASBR has received a tunnel binding
   for the AS MVPN membership that it advertised to a neighboring ASBR,
   it also forwards this packet to the neighboring ASBR. In this case
   the packet is encapsulated in the downstream MPLS label received from
   the neighboring ASBR. The neighboring ASBR delivers this packet to
   any locally connected MVPN receivers for that multicast stream. It
   also transports this packet on an intra-AS tunnel segment, for the
   inter-AS MVPN tunnel, and the other PEs and ASBRs in the AS then
   receive this packet.
   The other ASBRs then repeat the procedure followed by the ASBR in the
   origin AS and the packet traverses the overlay inter-AS tunnel along
   a spanning tree.

8.2.3.1. Support for Unicast VPN Inter-AS Methods

   The above procedures for setting up an inter-AS I-PMSI can be
   supported for each of the unicast VPN inter-AS models described in
   [RFC4364]. These procedures do not depend on the method used to
   exchange unicast VPN routes. For Option B and Option C they do
   require MPLS encapsulation between the ASBRs.

8.2.4. Inter-AS S-PMSI

   An inter-AS tunnel for an S-PMSI is constructed similarly to an
   inter-AS tunnel for an I-PMSI. Namely, such a tunnel is constructed
   as a concatenation of tunnel segments. There are two types of tunnel
   segments: an intra-AS tunnel segment (a segment that spans ASBRs
   within the same AS), and an inter-AS tunnel segment (a segment that
   spans adjacent ASBRs in adjacent ASes). ASes that are spanned by a
   tunnel are not required to use the same tunneling mechanism to
   construct the tunnel. Each AS may pick a tunneling mechanism to
   construct the intra-AS tunnel segment of the tunnel on its own.

   The PE that decides to set up an S-PMSI advertises the S-PMSI tunnel
   binding using procedures in section 7.3.2 to the routers in its own
   AS. The (C-S, C-G) membership for which the S-PMSI is instantiated
   is propagated along an inter-AS spanning tree. This spanning tree
   traverses the same ASBRs as the AS MVPN membership spanning tree. In
   addition to the information elements described in section 7.3.2
   (Origin AS, RD, next-hop), the C-S and C-G are also advertised.

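   As a rough illustration of the information elements just listed, the
   advertisement can be modeled as a record that extends the three
   elements attributed to section 7.3.2 with C-S and C-G. The class and
   field names below are hypothetical, the RD is kept as an opaque
   string, IPv4 addressing is assumed, and the multicast check on C-G is
   a sanity check added for the sketch rather than a procedure from this
   document. The actual encoding is outside the scope of this sketch.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class MembershipAdvertisement:
    """Elements attributed to section 7.3.2: Origin AS, RD, next-hop."""
    origin_as: int
    rd: str                           # Route Distinguisher, kept opaque here
    next_hop: ipaddress.IPv4Address


@dataclass(frozen=True)
class SPmsiAdvertisement(MembershipAdvertisement):
    """The same elements plus the customer source and group."""
    c_source: ipaddress.IPv4Address   # C-S
    c_group: ipaddress.IPv4Address    # C-G

    def __post_init__(self) -> None:
        # Illustrative sanity check: C-G is a multicast group address.
        if not self.c_group.is_multicast:
            raise ValueError("C-G must be a multicast group address")


adv = SPmsiAdvertisement(
    origin_as=65001,
    rd="65001:10",
    next_hop=ipaddress.IPv4Address("192.0.2.1"),
    c_source=ipaddress.IPv4Address("198.51.100.7"),
    c_group=ipaddress.IPv4Address("232.1.1.1"),
)
```
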
   An ASBR that receives the AS (C-S, C-G) information from its upstream
   ASBR using EBGP sends back a tunnel binding for the AS (C-S, C-G)
   information if (a) at least one of the Route Targets carried in the
   message matches one of the import Route Targets configured on the
   ASBR, and (b) the ASBR determines that the received route is the best
   route to the destination carried in the NLRI of the route. If the
   ASBR instantiates an S-PMSI for the AS (C-S, C-G), it sends back a
   downstream label that is used to forward the packet along its
   intra-AS S-PMSI for the (C-S, C-G). However, the ASBR may decide to
   use an AS MVPN membership I-PMSI instead, in which case it sends back
   the same label that it advertised for the AS MVPN membership I-PMSI.
   If the downstream ASBR instantiates an S-PMSI, it further propagates
   the (C-S, C-G) membership to its downstream ASes, else it does not.

   An AS can instantiate an intra-AS S-PMSI for the inter-AS S-PMSI
   tunnel only if the upstream AS instantiates an S-PMSI. The procedures
   allow each AS to determine whether it wishes to set up an S-PMSI or
   not, and the AS is not forced to set up an S-PMSI just because the
   upstream AS decides to do so.

   The leaves of an intra-AS S-PMSI tunnel will be the PEs that have
   local receivers that are interested in (C-S, C-G) and the ASBRs that
   have received MVPN routing information for (C-S, C-G). Note that an
   AS can determine these ASBRs as the MVPN routing information is
   propagated and processed by each ASBR on the AS MVPN membership
   spanning tree.

   The C-multicast data traffic is sent on the S-PMSI by the originating
   PE. When it reaches an ASBR that is on the spanning tree, it is
   delivered to local receivers, if any, and is also forwarded to the
   neighbor ASBR after being encapsulated in the label advertised by the
   neighbor.
   The neighbor ASBR either transports this packet on the S-PMSI for
   the multicast stream or an I-PMSI, delivering it to the ASBRs in its
   own AS. These ASBRs in turn repeat the procedures of the origin AS
   ASBRs and the multicast packet traverses the spanning tree.

9. Duplicate Packet Detection and Single Forwarder PE

   An egress PE may receive duplicate multicast data packets, from more
   than one ingress PE, for an MVPN when a site that contains C-S or
   C-RP is multihomed to more than one PE. An egress PE may also receive
   duplicate data packets for an MVPN, from two different ingress PEs,
   when the CE-PE routing protocol is PIM-SM and a router or a CE in a
   site switches from the C-RP tree to the C-S tree.

   For a given (C-S, C-G), a PE, say PE1, expects to receive C-data
   packets from the upstream PE, say PE2, which PE1 identified as the
   upstream multicast hop in the C-Multicast Routing Update that PE1
   sent in order to join (C-S, C-G). If PE1 can determine that a data
   packet for (C-S, C-G) was received from the expected upstream PE,
   PE2, PE1 will accept the packet. Otherwise, PE1 will drop the
   packet. (But see section 10 for an exception case where PE1 will
   accept a packet even if it is from an unexpected upstream PE.) This
   determination can be performed only if the PMSI on which the packets
   are being received and the tunneling technology used to instantiate
   the PMSI allow the PE to determine the source PE that sent the
   packet. However, this determination may not always be possible.

   Therefore, procedures are needed to ensure that packets are received
   at a PE only from a single upstream PE. This is called single
   forwarder PE selection.

   Single forwarder PE selection is achieved by the following set of
   procedures:

      a. If there is more than one PE within the same AS through which
         C-S or C-RP of a given MVPN could be reached, and in the case
         of C-S not every such PE advertises an S-PMSI for (C-S, C-G),
         all PEs that have this MVPN MUST send the MVPN routing
         information update for (C-S, C-G) or (C-*, C-G) to the same
         upstream PE. This is achieved using the following procedure:

         Using the procedure for "RPF determination" specified in
         section 5.1, find (a) the upstream multicast hop for the C-S or
         C-RP, and (b) the route used to reach the upstream multicast
         hop. Call this route the "installed RPF route" for C-S or
         C-RP.

         If the next-hop interface of the installed RPF route for C-S or
         C-RP is a VRF interface of the PE, then the PE uses that route
         to reach the C-S or C-RP.

         Otherwise, consider the set of all VPN-IP routes that (a) are
         eligible to be imported into the VRF (as determined by their
         Route Targets), (b) are eligible to be used for RPF
         determination (i.e., if RPF determination is done via a
         non-congruent multicast topology, this would include only the
         routes that are part of that topology), and (c) have exactly
         the same IP prefix as the installed RPF route.

         For each route in this set, determine the corresponding
         upstream PE. If a route has a VRF Route Import Extended
         Community, the route's upstream PE is determined from it. If a
         route does not have a VRF Route Import Extended Community, the
         route's upstream PE is determined from the route's BGP next hop
         attribute.

         This results in a set of pairs of (route, upstream PE). The PE
         will select the route whose corresponding upstream PE address
         is numerically highest, where a 32-bit IP address is treated as
         a 32-bit unsigned integer. Call this the "selected RPF route".
         The PE will use the selected RPF route to reach the C-S or
         C-RP.

      b. The above procedure ensures that if C-S or C-RP is multi-homed
         to PEs within a single AS, a PE will not receive duplicate
         traffic as long as all the PEs in that AS are on either the C-S
         or C-RP tree.

         However, the PE may receive duplicate traffic if C-S or C-RP is
         multi-homed to different ASes. In this case the PE can detect
         duplicate traffic, as such duplicate traffic will arrive on a
         different tunnel: if the PE was expecting the traffic on an
         inter-AS tunnel, duplicate traffic will arrive on an intra-AS
         tunnel [this is not an intra-AS tunnel segment of an inter-AS
         tunnel] and vice-versa.

         To achieve the above, the PE has to keep track of which
         (inter-AS) auto-discovery route the PE uses for sending MVPN
         multicast routing information towards C-S/C-RP. Then the PE
         should receive (multicast) traffic originated by C-S/C-RP only
         from the (inter-AS) tunnel that was carried in the best source
         auto-discovery route for the MVPN and was originated by the AS
         that contains C-S/C-RP (where "the best" is determined by the
         PE). All other multicast traffic originated by C-S/C-RP, but
         received on any other tunnel, should be discarded as
         duplicated.

         The PE may also receive duplicate traffic during a (C-*, C-G)
         to (C-S, C-G) switch. The issue and the solution are described
         next.

      c. If the tunneling technology in use for a particular MVPN does
         not allow the egress PEs to identify the ingress PE, then
         having all the PEs select the same PE to be the upstream
         multicast hop is not sufficient to prevent packet duplication.
         The reason is that a single tunnel may be carrying traffic on
         both the (C-*, C-G) tree and the (C-S, C-G) tree.
         If some of the egress PEs have joined the source tree, but
         others expect to receive (S,G) packets from the shared tree,
         then two copies of a data packet will travel on the tunnel, and
         the egress PEs will have no way to determine that only one copy
         should be accepted.

         To avoid this, it is necessary to ensure that once any PE joins
         the (C-S, C-G) tree, any other PE that has joined the
         (C-*, C-G) tree also switches to the (C-S, C-G) tree
         (selecting, of course, the same upstream multicast hop, as
         specified above).

         Whenever a PE creates a (C-S, C-G) state as a result of
         receiving a C-multicast route for (C-S, C-G) from some other
         PE, and the C-G group is a Sparse Mode group, the PE that
         creates the state MUST originate an auto-discovery route as
         specified below. The route is advertised using the same
         procedures as the MVPN auto-discovery/binding (both intra-AS
         and inter-AS) specified in this document with the following
         modifications:

            1. The Multicast Source field MUST be set to C-S. The
               Multicast Source Length field is set appropriately to
               reflect this.

            2. The Multicast Group field MUST be set to C-G. The
               Multicast Group Length field is set appropriately to
               reflect this.

         The route goes to all the PEs of the MVPN. When a PE receives
         this route, it checks whether there are any receivers in the
         MVPN sites attached to the PE for the group carried in the
         route. If yes, then it generates a C-multicast route indicating
         Join for (C-S, C-G). This forces all the PEs (in all ASes) to
         switch to the C-S tree for (C-S, C-G) from the C-RP tree.

         This is the same type of A-D route used to report active
         sources in the scenarios described in section 10.

         Note that when a PE thus joins the (C-S, C-G) tree, it may need
         to send a PIM (S,G,RPT-bit) prune to one of its CE PIM
         neighbors, as determined by ordinary PIM procedures.

         Whenever the PE deletes the (C-S, C-G) state that was
         previously created as a result of receiving a C-multicast route
         for (C-S, C-G) from some other PE, the PE that deletes the
         state also withdraws the auto-discovery route that was
         advertised when the state was created.

         N.B.: SINCE ALL PES WITH RECEIVERS FOR GROUP C-G WILL JOIN THE
         C-S SOURCE TREE IF ANY OF THEM DO, IT IS NEVER NECESSARY TO
         DISTRIBUTE A BGP C-MULTICAST ROUTE FOR THE PURPOSE OF PRUNING
         SOURCES FROM THE SHARED TREE.

   In summary, when the CE-PE routing protocol for all PEs that belong
   to an MVPN is not PIM-SM, selection of a consistent upstream PE to
   reach C-S is sufficient to eliminate duplicates when C-S is
   multi-homed to a single AS. When C-S is multi-homed to multiple ASes,
   duplicate packet detection can be performed, as the receiver PE can
   always determine whether packets arrived on the wrong tunnel. When
   the CE-PE routing protocol is PIM-SM, additional procedures as
   described above are required to force all PEs within all ASes to
   switch to the C-S tree from the C-RP tree when any PE switches to the
   C-S tree.

10. Deployment Models

   This section describes some optional deployment models and specific
   procedures for those deployment models.

10.1. Co-locating C-RPs on a PE

   [MVPN-REQ] describes C-RP engineering as an issue when PIM-SM (or
   bidir-PIM) is used in ASM mode on the VPN customer site. To quote
   from [MVPN-REQ]:

   "In some cases this engineering problem is not trivial: for instance,
   if sources and receivers are located in VPN sites that are different
   than that of the RP, then traffic may flow twice through the SP
   network and the CE-PE link of the RP (from source to RP, and then
   from RP to receivers) ; this is obviously not ideal. A multicast VPN
   solution SHOULD propose a way to help on solving this RP engineering
   issue."

   One of the C-RP deployment models is for the customer to outsource
   the RP to the provider. In this case the provider may co-locate the
   RP on the PE that is connected to the customer site [MVPN-REQ]. This
   model is introduced in [RP-MVPN]. This section describes how
   anycast-RP can be used for achieving this by advertising active
   sources.

10.1.1. Initial Configuration

   For a particular MVPN, one or more PEs that have sites in that MVPN
   act as an RP for the sites of that MVPN connected to these PEs.
   Within each MVPN all these RPs use the same (anycast) address. All
   these RPs use the Anycast RP technique.

10.1.2. Anycast RP Based on Propagating Active Sources

   This mechanism is based on propagating active sources between RPs.

   [Editor's Note: This is derived from the model in [RP-MVPN].]

10.1.2.1. Receiver(s) Within a Site

   The PE which receives a C-Join for (*,G) or (S,G) does not send the
   information that it has receiver(s) for G until it receives
   information about active sources for G from an upstream PE.

   On receiving this (described in the next section), the downstream PE
   will respond with a Join for C-(S,G). Sending this information could
   be done using any of the procedures described in section 5. If BGP is
   used, the ingress address is set to the address of the upstream PE
   that triggered the source active information. Only the upstream PE
   will process this information. If unicast PIM is used, then a unicast
   PIM message will have to be sent to the upstream PE that triggered
   the source active information. If an MI-PMSI is used, then further
   clarification is needed on the upstream neighbor address of the PIM
   message and will be provided in a future revision.

10.1.2.2. Source Within a Site

   When a PE receives a PIM-Register from a site that belongs to a given
   VPN, the PE follows the normal PIM anycast RP procedures. It then
   advertises the source and group of the multicast data packet carried
   in the PIM-Register message to other PEs in BGP using the following
   information elements:

     - Active source address

     - Active group address

     - Route target of the MVPN.

   This advertisement goes to all the PEs that belong to that MVPN. When
   a PE receives this advertisement, it checks whether there are any
   receivers in the sites attached to the PE for the group carried in
   the source active advertisement. If yes, then it generates an
   advertisement for C-(S,G) as specified in the previous section.

   Note that the mechanism described in section 7.3.2 can be leveraged
   to advertise an S-PMSI binding along with the source active messages.

10.1.2.3. Receiver Switching from Shared to Source Tree

   No additional procedures are required when multicast receivers in a
   customer's site shift from the shared tree to the source tree.

10.2. Using MSDP between a PE and a Local C-RP

   Section 10.1 describes the case where each PE is a C-RP. This
   enables the PEs to know the active multicast sources for each MVPN,
   and they can then use BGP to distribute this information to each
   other. As a result, the PEs do not have to join any shared C-trees,
   and this results in a simplification of the PE operation.

   In another deployment scenario, the PEs are not themselves C-RPs, but
   use MSDP to talk to the C-RPs. In particular, a PE which attaches to
   a site that contains a C-RP becomes an MSDP peer of that C-RP. That
   PE then uses BGP to distribute the information about the active
   sources to the other PEs. When the PE determines, by MSDP, that a
   particular source is no longer active, then it withdraws the
   corresponding BGP update.
   Then the PEs do not have to join any shared C-trees, but they do not
   have to be C-RPs either.

   MSDP provides the capability for a Source Active message to carry an
   encapsulated data packet. This capability can be used to allow an
   MSDP speaker to receive the first (or first several) packet(s) of an
   (S,G) flow, even though the MSDP speaker hasn't yet joined the (S,G)
   tree. (Presumably it will join that tree as a result of receiving
   the SA message which carries the encapsulated data packet.) If this
   capability is not used, the first several data packets of an (S,G)
   stream may be lost.

   A PE which is talking MSDP to an RP may receive such an encapsulated
   data packet from the RP. The data packet should be decapsulated and
   transmitted to the other PEs in the MVPN. If the packet belongs to a
   particular (S,G) flow, and if the PE is a transmitter for some S-PMSI
   to which (S,G) has already been bound, the decapsulated data packet
   should be transmitted on that S-PMSI. Otherwise, if an I-PMSI exists
   for that MVPN, the decapsulated data packet should be transmitted on
   it. (If a default MI-PMSI exists, this would typically be used.) If
   neither of these conditions holds, the decapsulated data packet is
   not transmitted to the other PEs in the MVPN. The decision as to
   whether and how to transmit the decapsulated data packet does not
   affect the processing of the SA control message itself.

   Suppose that PE1 transmits a multicast data packet on a PMSI, where
   that data packet is part of an (S,G) flow, and PE2 receives that
   packet from that PMSI. According to section 9, if PE1 is not the PE
   that PE2 expects to be transmitting (S,G) packets, then PE2 must
   discard the packet. If an MSDP-encapsulated data packet is
   transmitted on a PMSI as specified above, this rule from section 9
   would likely result in the packet's getting discarded.
   Therefore, if MSDP-encapsulated data packets are being decapsulated
   and transmitted on a PMSI, we need to modify the rules of section 9
   as follows:

      1. If the receiving PE, PE1, has already joined the (S,G) tree,
         and has chosen PE2 as the upstream PE for the (S,G) tree, but
         this packet does not come from PE2, PE1 must discard the
         packet.

      2. If the receiving PE, PE1, has not already joined the (S,G)
         tree, but has a PIM adjacency to a CE which is downstream on
         the (*,G) tree, the packet should be forwarded to the CE.

11. Encapsulations

   The BGP-based auto-discovery procedures will ensure that the PEs in a
   single MVPN only use tunnels that they can all support, and for a
   given kind of tunnel, that they only use encapsulations that they can
   all support.

11.1. Encapsulations for Single PMSI per Tunnel

11.1.1. Encapsulation in GRE

   GRE encapsulation can be used for any PMSI that is instantiated by a
   mesh of unicast tunnels, as well as for any PMSI that is instantiated
   by one or more PIM tunnels of any sort.

   Packets received        Packets in transit      Packets forwarded
   at ingress PE           in the service          by egress PEs
                           provider network

                           +---------------+
                           |  P-IP Header  |
                           +---------------+
                           |      GRE      |
   ++=============++       ++=============++       ++=============++
   || C-IP Header ||       || C-IP Header ||       || C-IP Header ||
   ++=============++ >>>>> ++=============++ >>>>> ++=============++
   || C-Payload   ||       || C-Payload   ||       || C-Payload   ||
   ++=============++       ++=============++       ++=============++

   The IP Protocol Number field in the P-IP Header must be set to 47.
   The Protocol Type field of the GRE Header must be set to 0x0800.

   When an encapsulated packet is transmitted by a particular PE, the
   source IP address in the P-IP header must be the same address as is
   advertised by that PE in the RPF information.

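   The GRE header values above can be illustrated with a short sketch
   that builds the GRE header placed between the P-IP header and the
   C-IP packet. The function and constant names are hypothetical. The
   header layout (16 bits of flags and version, a 16-bit Protocol Type,
   then an optional 32-bit Key signaled by the K bit) follows [RFC2784]
   and [RFC2890]. The checksum and sequence number fields are left
   unset, and construction of the P-IP header itself, which would carry
   IP protocol number 47, is omitted.

```python
import struct
from typing import Optional

IPPROTO_GRE = 47          # IP Protocol Number carried in the P-IP header
GRE_PROTO_IPV4 = 0x0800   # GRE Protocol Type for an IPv4 C-packet
GRE_FLAG_KEY = 0x2000     # K bit: a 4-octet Key field follows [RFC2890]


def build_gre_header(key: Optional[int] = None) -> bytes:
    """Build a GRE header with the C and S bits clear and an optional Key."""
    flags = 0
    key_field = b""
    if key is not None:
        if not 0 <= key <= 0xFFFFFFFF:
            raise ValueError("GRE key must fit in 32 bits")
        flags |= GRE_FLAG_KEY
        key_field = struct.pack("!I", key)
    # Network byte order: flags/version, then Protocol Type.
    return struct.pack("!HH", flags, GRE_PROTO_IPV4) + key_field


def gre_encapsulate(c_packet: bytes, key: Optional[int] = None) -> bytes:
    """Prepend the GRE header to a C-IP packet (the P-IP header is not built)."""
    return build_gre_header(key) + c_packet
```

   Without a key the header is four octets long; with a key it is eight,
   with the key in the last four octets.
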
   If the PMSI is instantiated by a PIM tree, the destination IP address
   in the P-IP header is the group P-address associated with that tree.
   The GRE key field value is omitted.

   If the PMSI is instantiated by unicast tunnels, the destination IP
   address is the address of the destination PE, and the optional GRE
   Key field is used to identify a particular MVPN. In this case, each
   PE would have to advertise a key field value for each MVPN; each PE
   would assign the key field value that it expects to receive.

   [RFC2784] specifies an optional GRE checksum field, and [RFC2890]
   specifies an optional GRE sequence number field.

   The GRE sequence number field is not needed because the transport
   layer services for the original application will be provided by the
   C-IP Header.

   The use of the GRE checksum field must follow [RFC2784].

   To facilitate high speed implementation, this document recommends
   that the ingress PE routers encapsulate VPN packets without setting
   the checksum or sequence fields.

11.1.2. Encapsulation in IP

   IP-in-IP [RFC1853] is also a viable option. When it is used, the
   IPv4 Protocol Number field is set to 4. The following diagram shows
   the progression of the packet as it enters and leaves the service
   provider network.

   Packets received        Packets in transit      Packets forwarded
   at ingress PE           in the service          by egress PEs
                           provider network

                           +---------------+
                           |  P-IP Header  |
   ++=============++       ++=============++       ++=============++
   || C-IP Header ||       || C-IP Header ||       || C-IP Header ||
   ++=============++ >>>>> ++=============++ >>>>> ++=============++
   || C-Payload   ||       || C-Payload   ||       || C-Payload   ||
   ++=============++       ++=============++       ++=============++

11.1.3. Encapsulation in MPLS

   If the PMSI is instantiated as a P2MP MPLS LSP, MPLS encapsulation is
   used. Penultimate-hop-popping must be disabled for the P2MP MPLS LSP.
   If the PMSI is instantiated as an RSVP-TE P2MP LSP, additional MPLS
   encapsulation procedures are used, as specified in [RSVP-P2MP].

   If other methods of assigning MPLS labels to multicast distribution
   trees are in use, these multicast distribution trees may be used as
   appropriate to instantiate PMSIs, and any additional MPLS
   encapsulation procedures may be used.

   Packets received        Packets in transit      Packets forwarded
   at ingress PE           in the service          by egress PEs
                           provider network

                           +---------------+
                           | P-MPLS Header |
   ++=============++       ++=============++       ++=============++
   || C-IP Header ||       || C-IP Header ||       || C-IP Header ||
   ++=============++ >>>>> ++=============++ >>>>> ++=============++
   || C-Payload   ||       || C-Payload   ||       || C-Payload   ||
   ++=============++       ++=============++       ++=============++

11.2. Encapsulations for Multiple PMSIs per Tunnel

   The encapsulations for transmitting multicast data messages when
   there are multiple PMSIs per tunnel are based on the encapsulation
   for a single PMSI per tunnel, but with an MPLS label used for
   demultiplexing.

   The label is upstream-assigned and distributed via BGP as specified
   in section 4. The label must enable the receiver to select the
   proper VRF, and may enable the receiver to select a particular
   multicast routing entry within that VRF.

11.2.1. Encapsulation in GRE

   Rather than the IP-in-GRE encapsulation discussed in section 11.1.1,
   we use the MPLS-in-GRE encapsulation. This is specified in
   [MPLS-IP]. The GRE protocol type MUST be set to 0x8847. [The reason
   for using the unicast rather than the multicast value is specified in
   [MPLS-MCAST-ENCAPS].]

11.2.2. Encapsulation in IP

   Rather than the IP-in-IP encapsulation discussed in section 11.1.2,
   we use the MPLS-in-IP encapsulation. This is specified in [MPLS-IP].
   The IP protocol number MUST be set to the value identifying the
   payload as an MPLS unicast packet. [There is no "MPLS multicast
   packet" protocol number.]

11.3. Encapsulations for Unicasting PIM Control Messages

   When PIM control messages are unicast, rather than being sent on an
   MI-PMSI, the receiving PE needs to determine the particular MVPN
   whose multicast routing information is being carried in the PIM
   message. One method is to use a downstream-assigned MPLS label which
   the receiving PE has allocated for this specific purpose. The label
   would be distributed via BGP. This can be used with an MPLS,
   MPLS-in-GRE, or MPLS-in-IP encapsulation.

   A possible alternative is to modify the PIM messages themselves so
   that they carry information which can be used to identify a
   particular MVPN, such as an RT.

   This area is still under consideration.

11.4. General Considerations for IP and GRE Encaps

   These apply also to the MPLS-in-IP and MPLS-in-GRE encapsulations.

11.4.1. MTU

   It is the responsibility of the originator of a C-packet to ensure
   that the packet is small enough to reach all of its destinations,
   even when it is encapsulated within IP or GRE.

   When a packet is encapsulated in IP or GRE, the router that does the
   encapsulation MUST set the DF bit in the outer header. This ensures
   that the decapsulating router will not need to reassemble the
   encapsulating packets before performing decapsulation.

   In some cases the encapsulating router may know that a particular
   C-packet is too large to reach its destinations. Procedures by which
   it may know this are outside the scope of the current document.
   However, if this is known, then:

     - If the DF bit is set in the IP header of a C-packet which is
       known to be too large, the router will discard the C-packet as
       being "too large", and follow normal IP procedures (which may
       require the return of an ICMP message to the source).

     - If the DF bit is not set in the IP header of a C-packet which is
       known to be too large, the router MAY fragment the packet before
       encapsulating it, and then encapsulate each fragment separately.
       Alternatively, the router MAY discard the packet.

   If the router discards a packet as too large, it should maintain OAM
   information related to this behavior, allowing the operator to
   properly troubleshoot the issue.

   Note that if the entire path of the tunnel does not support an MTU
   which is large enough to carry a particular encapsulated C-packet,
   and if the encapsulating router does not do fragmentation, then the
   customer will not receive the expected connectivity.

11.4.2. TTL

   The ingress PE should not copy the TTL field from the payload IP
   header received from a CE router to the delivery IP or MPLS header.
   The setting of the TTL of the delivery header is determined by the
   local policy of the ingress PE router.

11.4.3. Differentiated Services

   The setting of the DS field in the delivery IP header should follow
   the guidelines outlined in [RFC2983]. Setting the EXP field in the
   delivery MPLS header should follow the guidelines in [RFC3270]. An SP
   may also choose to deploy any of the additional mechanisms the PE
   routers support.

11.4.4. Avoiding Conflict with Internet Multicast

   If the SP is providing Internet multicast, distinct from its VPN
   multicast services, and using PIM based P-multicast trees, it must
   ensure that the group P-addresses which it uses in support of MVPN
   services are distinct from any of the group addresses of the Internet
   multicasts it supports. This is best done by using administratively
   scoped addresses [ADMIN-ADDR].

   The group C-addresses need not be distinct from either the group
   P-addresses or the Internet multicast addresses.

12. Security Considerations

   To be supplied.

13. IANA Considerations

   To be supplied.

14. Other Authors

   Sarveshwar Bandi, Yiqun Cai, Thomas Morin, Yakov Rekhter, IJsbrand
   Wijnands, Seisho Yasukawa

15. Other Contributors

   Significant contributions were made by Arjen Boers, Toerless Eckert,
   Adrian Farrel, Luyuan Fang, Dino Farinacci, Lenny Guiliano, Shankar
   Karuna, Anil Lohiya, Tom Pusateri, Ted Qian, Robert Raszuk, Tony
   Speakman, Dan Tappan.

16. Authors' Addresses

   Rahul Aggarwal (Editor)
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: rahul@juniper.net

   Sarveshwar Bandi
   Motorola
   Vanenburg IT park, Madhapur,
   Hyderabad, India
   Email: sarvesh@motorola.com

   Yiqun Cai
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: ycai@cisco.com

   Thomas Morin
   France Telecom R & D
   2, avenue Pierre-Marzin
   22307 Lannion Cedex
   France
   Email: thomas.morin@francetelecom.com

   Yakov Rekhter
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: yakov@juniper.net

   Eric C. Rosen (Editor)
   Cisco Systems, Inc.
   1414 Massachusetts Avenue
   Boxborough, MA, 01719
   E-mail: erosen@cisco.com

   IJsbrand Wijnands
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   E-mail: ice@cisco.com

   Seisho Yasukawa
   NTT Corporation
   9-11, Midori-Cho 3-Chome
   Musashino-Shi, Tokyo 180-8585,
   Japan
   Phone: +81 422 59 4769
   Email: yasukawa.seisho@lab.ntt.co.jp

17. Normative References

   [MVPN-BGP] R. Aggarwal, E. Rosen, T. Morin, Y. Rekhter, C.
   Kodeboniya, "BGP Encodings for Multicast in MPLS/BGP IP VPNs",
   draft-ietf-l3vpn-2547bis-mcast-bgp-02.txt, March 2007

   [MPLS-IP] T. Worster, Y. Rekhter, E. Rosen, "Encapsulating MPLS in IP
   or Generic Routing Encapsulation (GRE)", RFC 4023, March 2005

   [MPLS-MCAST-ENCAPS] T. Eckert, E. Rosen, R. Aggarwal, Y. Rekhter,
   "MPLS Multicast Encapsulations",
   draft-ietf-mpls-multicast-encaps-04.txt, April 2007

   [MPLS-UPSTREAM-LABEL] R. Aggarwal, Y. Rekhter, E. Rosen, "MPLS
   Upstream Label Assignment and Context Specific Label Space",
   draft-ietf-mpls-upstream-label-02.txt, March 2007

   [PIM-SM] "Protocol Independent Multicast - Sparse Mode (PIM-SM)",
   Fenner, Handley, Holbrook, Kouvelas, August 2006, RFC 4601

   [RFC2119] "Key words for use in RFCs to Indicate Requirement
   Levels.", Bradner, March 1997

   [RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, et al., February 2006

   [RSVP-P2MP] R. Aggarwal, et al., "Extensions to RSVP-TE for Point to
   Multipoint TE LSPs", draft-ietf-mpls-rsvp-te-p2mp-07.txt, January
   2007

18. Informative References

   [ADMIN-ADDR] D. Meyer, "Administratively Scoped IP Multicast", RFC
   2365, July 1998

   [MVPN-REQ] T. Morin, Ed., "Requirements for Multicast in L3
   Provider-Provisioned VPNs", RFC 4834, April 2007

   [MVPN-BASE] R. Aggarwal, A. Lohiya, T. Pusateri, Y. Rekhter, "Base
   Specification for Multicast in MPLS/BGP VPNs",
   draft-raggarwa-l3vpn-2547-mvpn-00.txt

   [RAGGARWA-MCAST] R. Aggarwal, et al., "Multicast in BGP MPLS VPNs
   and VPLS", draft-raggarwa-l3vpn-mvpn-vpls-mcast-01.txt

   [ROSEN-8] E. Rosen, Y. Cai, I. Wijnands, "Multicast in MPLS/BGP IP
   VPNs", draft-rosen-vpn-mcast-08.txt

   [RP-MVPN] S. Yasukawa, et al., "BGP/MPLS IP Multicast VPNs",
   draft-yasukawa-l3vpn-p2mp-mcast-01.txt

   [RFC1853] W. Simpson, "IP in IP Tunneling", RFC 1853, October 1995

   [RFC2784] D. Farinacci, et al., "Generic Routing Encapsulation",
   RFC 2784, March 2000

   [RFC2890] G. Dommety, "Key and Sequence Number Extensions to GRE",
   RFC 2890, September 2000

   [RFC2983] D. Black, "Differentiated Services and Tunnels", RFC 2983,
   October 2000

   [RFC3270] F. Le Faucheur, et al., "MPLS Support of Differentiated
   Services", RFC 3270, May 2002

19. Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

20. Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.
   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.