idnits 2.17.1 

draft-ietf-l3vpn-2547bis-mcast-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 19.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 3304.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 3315.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 3322.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 3328.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     If the S-PMSI is instantiated by a source-initiated P-multicast
     tree (e.g., an RSVP-TE P2MP tunnel), the PE at the root of the tree must
     establish the source-initiated P-multicast tree to the leaves.  This tree
     MAY have been established before the leaves receive the S-PMSI binding,
     or MAY be established after the leaves receives the binding. The leaves
     MUST not switch to the S-PMSI until they receive both the binding and the
     tree signaling message.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (April 2007) is 6221 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-08) exists of
     draft-ietf-l3vpn-2547bis-mcast-bgp-02

  == Outdated reference: A later version (-10) exists of
     draft-ietf-mpls-multicast-encaps-04

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mpls-upstream-label-02

  ** Obsolete normative reference: RFC 4601 (ref. 'PIM-SM') (Obsoleted by RFC
     7761)

  == Outdated reference: A later version (-15) exists of
     draft-rosen-vpn-mcast-08


     Summary: 2 errors (**), 0 flaws (~~), 7 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                             Eric C. Rosen (Editor)
3	Internet Draft                                       Cisco Systems, Inc.
4	Expiration Date: October 2007
5	                                                 Rahul Aggarwal (Editor)
6	                                                        Juniper Networks

8	                                                              April 2007

10	                     Multicast in MPLS/BGP IP VPNs

12	                 draft-ietf-l3vpn-2547bis-mcast-04.txt

14	Status of this Memo

16	   By submitting this Internet-Draft, each author represents that any
17	   applicable patent or other IPR claims of which he or she is aware
18	   have been or will be disclosed, and any of which he or she becomes
19	   aware will be disclosed, in accordance with Section 6 of BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF), its areas, and its working groups.  Note that
23	   other groups may also distribute working documents as Internet-
24	   Drafts.

26	   Internet-Drafts are draft documents valid for a maximum of six months
27	   and may be updated, replaced, or obsoleted by other documents at any
28	   time.  It is inappropriate to use Internet-Drafts as reference
29	   material or to cite them other than as "work in progress."

31	   The list of current Internet-Drafts can be accessed at
32	   http://www.ietf.org/ietf/1id-abstracts.txt.

34	   The list of Internet-Draft Shadow Directories can be accessed at
35	   http://www.ietf.org/shadow.html.

37	Abstract

39	   In order for IP multicast traffic within a BGP/MPLS IP VPN (Virtual
40	   Private Network) to travel from one VPN site to another, special
41	   protocols and procedures must be implemented by the VPN Service
42	   Provider.  These protocols and procedures are specified in this
43	   document.

45	Table of Contents

47	    1          Specification of requirements
48	    2          Introduction
49	    2.1        Optimality vs Scalability
50	    2.1.1      Multicast Distribution Trees
51	    2.1.2      Ingress Replication through Unicast Tunnels
52	    2.2        Overview
53	    2.2.1      Multicast Routing Adjacencies
54	    2.2.2      MVPN Definition
55	    2.2.3      Auto-Discovery
56	    2.2.4      PE-PE Multicast Routing Information
57	    2.2.5      PE-PE Multicast Data Transmission
58	    2.2.6      Inter-AS MVPNs
59	    2.2.7      Optional Deployment Models
60	    3          Concepts and Framework
61	    3.1        PE-CE Multicast Routing
62	    3.2        P-Multicast Service Interfaces (PMSIs)
63	    3.2.1      Inclusive and Selective PMSIs
64	    3.2.2      Tunnels Instantiating PMSIs
65	    3.3        Use of PMSIs for Carrying Multicast Data
66	    3.3.1      MVPNs with Default MI-PMSIs
67	    3.3.2      When MI-PMSIs are Required
68	    3.3.3      MVPNs That Do Not Use MI-PMSIs
69	    4          BGP-Based Autodiscovery of MVPN Membership
70	    5          PE-PE Transmission of C-Multicast Routing
71	    5.1        RPF Information for Unicast VPN-IP Routes
72	    5.2        PIM Peering
73	    5.2.1      Full Per-MVPN PIM Peering Across a MI-PMSI
74	    5.2.2      Lightweight PIM Peering Across a MI-PMSI
75	    5.2.3      Unicasting of PIM C-Join/Prune Messages
76	    5.2.4      Details of Per-MVPN PIM Peering over MI-PMSI
77	    5.2.4.1    PIM C-Instance Control Packets
78	    5.2.4.2    PIM C-instance RPF Determination
79	    5.3        Use of BGP for Carrying C-Multicast Routing
80	    5.3.1      Sending BGP Updates
81	    5.3.2      Explicit Tracking
82	    5.3.3      Withdrawing BGP Updates
83	    6          I-PMSI Instantiation
84	    6.1        MVPN Membership and Egress PE Auto-Discovery
85	    6.1.1      Auto-Discovery for Ingress Replication
86	    6.1.2      Auto-Discovery for P-Multicast Trees
87	    6.2        C-Multicast Routing Information Exchange
88	    6.3        Aggregation
89	    6.3.1      Aggregate Tree Leaf Discovery
90	    6.3.2      Aggregation Methodology
91	    6.3.3      Encapsulation of the Aggregate Tree
92	    6.3.4      Demultiplexing C-multicast traffic
93	    6.4        Mapping Received Packets to MVPNs
94	    6.4.1      Unicast Tunnels
95	    6.4.2      Non-Aggregated P-Multicast Trees
96	    6.4.3      Aggregate P-Multicast Trees
97	    6.5        I-PMSI Instantiation Using Ingress Replication
98	    6.6        Establishing P-Multicast Trees
99	    6.7        RSVP-TE P2MP LSPs
100	    6.7.1      P2MP TE LSP Tunnel - MVPN Mapping
101	    6.7.2      Demultiplexing C-Multicast Data Packets
102	    7          Optimizing Multicast Distribution via S-PMSIs
103	    7.1        S-PMSI Instantiation Using Ingress Replication
104	    7.2        Protocol for Switching to S-PMSIs
105	    7.2.1      A UDP-based Protocol for Switching to S-PMSIs
106	    7.2.1.1    Binding a Stream to an S-PMSI
107	    7.2.1.2    Packet Formats and Constants
108	    7.2.2      A BGP-based Protocol for Switching to S-PMSIs
109	    7.2.2.1    Advertising C-(S, G) Binding to a S-PMSI using BGP
110	    7.2.2.2    Explicit Tracking
111	    7.2.2.3    Switching to S-PMSI
112	    7.3        Aggregation
113	    7.4        Instantiating the S-PMSI with a PIM Tree
114	    7.5        Instantiating S-PMSIs using RSVP-TE P2MP Tunnels
115	    8          Inter-AS Procedures
116	    8.1        Non-Segmented Inter-AS Tunnels
117	    8.1.1      Inter-AS MVPN Auto-Discovery
118	    8.1.2      Inter-AS MVPN Routing Information Exchange
119	    8.1.3      Inter-AS I-PMSI
120	    8.1.4      Inter-AS S-PMSI
121	    8.2        Segmented Inter-AS Tunnels
122	    8.2.1      Inter-AS MVPN Auto-Discovery Routes
123	    8.2.1.1    Originating Inter-AS MVPN A-D Information
124	    8.2.1.2    Propagating Inter-AS MVPN A-D Information
125	    8.2.1.2.1  Inter-AS Auto-Discovery Route received via EBGP
126	    8.2.1.2.2  Leaf Auto-Discovery Route received via EBGP
127	    8.2.1.2.3  Inter-AS Auto-Discovery Route received via IBGP
128	    8.2.2      Inter-AS MVPN Routing Information Exchange
129	    8.2.3      Inter-AS I-PMSI
130	    8.2.3.1    Support for Unicast VPN Inter-AS Methods
131	    8.2.4      Inter-AS S-PMSI
132	    9          Duplicate Packet Detection and Single Forwarder PE
133	   10          Deployment Models
134	   10.1        Co-locating C-RPs on a PE
135	   10.1.1      Initial Configuration
136	   10.1.2      Anycast RP Based on Propagating Active Sources
137	   10.1.2.1    Receiver(s) Within a Site
138	   10.1.2.2    Source Within a Site
139	   10.1.2.3    Receiver Switching from Shared to Source Tree
140	   10.2        Using MSDP between a PE and a Local C-RP
141	   11          Encapsulations
142	   11.1        Encapsulations for Single PMSI per Tunnel
143	   11.1.1      Encapsulation in GRE
144	   11.1.2      Encapsulation in IP
145	   11.1.3      Encapsulation in MPLS
146	   11.2        Encapsulations for Multiple PMSIs per Tunnel
147	   11.2.1      Encapsulation in GRE
148	   11.2.2      Encapsulation in IP
149	   11.3        Encapsulations for Unicasting PIM Control Messages
150	   11.4        General Considerations for IP and GRE Encaps
151	   11.4.1      MTU
152	   11.4.2      TTL
153	   11.4.3      Differentiated Services
154	   11.4.4      Avoiding Conflict with Internet Multicast
155	   12          Security Considerations
156	   13          IANA Considerations
157	   14          Other Authors
158	   15          Other Contributors
159	   16          Authors' Addresses
160	   17          Normative References
161	   18          Informative References
162	   19          Full Copyright Statement
163	   20          Intellectual Property

165	1. Specification of requirements

167	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
168	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
169	   document are to be interpreted as described in [RFC2119].

171	2. Introduction

173	   [RFC4364] specifies the set of procedures which a Service Provider
174	   (SP) must implement in order to provide a particular kind of VPN
175	   service ("BGP/MPLS IP VPN") for its customers.  The service described
176	   therein allows IP unicast packets to travel from one customer site to
177	   another, but it does not provide a way for IP multicast traffic to
178	   travel from one customer site to another.

180	   This document extends the service defined in [RFC4364] so that it
181	   also includes the capability of handling IP multicast traffic.  This
182	   requires a number of different protocols to work together.  The
183	   document provides a framework describing how the various protocols
184	   fit together, and also provides detailed specification of some of the
185	   protocols.  The detailed specification of some of the other
186	   protocols is found in pre-existing documents or in companion
187	   documents.

189	2.1. Optimality vs Scalability

191	   In a "BGP/MPLS IP VPN" [RFC4364], unicast routing of VPN packets is
192	   achieved without the need to keep any per-VPN state in the core of
193	   the SP's network (the "P routers").  Routing information from a
194	   particular VPN is maintained only by the Provider Edge routers (the
195	   "PE routers", or "PEs") that attach directly to sites of that VPN.
196	   Customer data travels through the P routers in tunnels from one PE to
197	   another (usually MPLS Label Switched Paths, LSPs), so to support the
198	   VPN service the P routers only need to have routes to the PE routers.
199	   The PE-to-PE routing is optimal, but the amount of associated state
200	   in the P routers depends only on the number of PEs, not on the number
201	   of VPNs.

203	   However, in order to provide optimal multicast routing for a
204	   particular multicast flow, the P routers through which that flow
205	   travels have to hold state which is specific to that flow.
206	   Scalability would be poor if the amount of state in the P routers
207	   were proportional to the number of multicast flows in the VPNs.
208	   Therefore, when supporting multicast service for a BGP/MPLS IP VPN,
209	   the optimality of the multicast routing must be traded off against
210	   the scalability of the P routers.   We explain this below in more
211	   detail.

213	   If a particular VPN is transmitting "native" multicast traffic over
214	   the backbone, we refer to it as an "MVPN".  By "native" multicast
215	   traffic, we mean packets that a CE sends to a PE, such that the IP
216	   destination address of the packets is a multicast group address, or
217	   the packets are multicast control packets addressed to the PE router
218	   itself, or the packets are IP multicast data packets encapsulated in
219	   MPLS.

221	   We say that the backbone multicast routing for a particular multicast
222	   group in a particular VPN is "optimal" if and only if all of the
223	   following conditions hold:

225	     - When a PE router receives a multicast data packet of that group
226	       from a CE router, it transmits the packet in such a way that the
227	       packet is received by every other PE router which is on the path
228	       to a receiver of that group;

230	     - The packet is not received by any other PEs;

232	     - While in the backbone, no more than one copy of the packet ever
233	       traverses any link;

235	     - While in the backbone, if bandwidth usage is to be optimized, the
236	       packet traverses minimum cost trees rather than shortest path
237	       trees.
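
The first three conditions above can be checked mechanically for a
single packet.  The sketch below is illustrative only: the function
name, the representation of PEs and links as strings and tuples, and
the sample topology are assumptions.  The fourth condition (minimum
cost trees) is left out because it depends on link costs that are not
defined here.

```python
from collections import Counter


def is_optimal_delivery(interested_pes, receiving_pes, link_traversals):
    """Check the first three optimality conditions for one packet.

    interested_pes:  set of PEs on the path to a receiver of the group
    receiving_pes:   set of PEs that actually received the packet
    link_traversals: list of backbone links, one entry per copy sent on a link
    """
    copies_per_link = Counter(link_traversals)
    reaches_every_interested_pe = interested_pes <= receiving_pes
    reaches_no_other_pe = receiving_pes <= interested_pes
    one_copy_per_link = all(n <= 1 for n in copies_per_link.values())
    return reaches_every_interested_pe and reaches_no_other_pe and one_copy_per_link


# Hypothetical topology: PE1 is the ingress; PE2, PE3 and PE4 sit behind P1.
interested = {"PE2", "PE3"}

# A tree through P1 sends exactly one copy over each link.
tree = [("PE1", "P1"), ("P1", "PE2"), ("P1", "PE3")]
tree_ok = is_optimal_delivery(interested, {"PE2", "PE3"}, tree)

# Ingress replication sends two copies over the PE1-P1 link.
replicated = [("PE1", "P1"), ("P1", "PE2"), ("PE1", "P1"), ("P1", "PE3")]
replication_ok = is_optimal_delivery(interested, {"PE2", "PE3"}, replicated)

# A tree shared with other groups may also reach PE4, which has no receivers.
shared_ok = is_optimal_delivery(
    interested, {"PE2", "PE3", "PE4"}, tree + [("P1", "PE4")]
)
```

Only the first case satisfies all three checks; the other two each
violate one condition.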

239	   Optimal routing for a particular multicast group requires that the
240	   backbone maintain one or more source-trees which are specific to that
241	   group.  Each such tree requires that state be maintained in all the P
242	   routers that are in the tree.

244	   This would potentially require an unbounded amount of state in the P
245	   routers, since the SP has no control of the number of multicast
246	   groups in the VPNs that it supports. Nor does the SP have any control
247	   over the number of transmitters in each group, nor of the
248	   distribution of the receivers.

250	   The procedures defined in this document allow an SP to provide
251	   multicast VPN service without requiring the amount of state
252	   maintained by the P routers to be proportional to the number of
253	   multicast data flows in the VPNs.  The amount of state is traded off
254	   against the optimality of the multicast routing.  Enough flexibility
255	   is provided so that a given SP can make its own tradeoffs between
256	   scalability and optimality.  An SP can even allow some multicast
257	   groups in some VPNs to receive optimal routing, while others do not.
258	   Of course, the cost of this flexibility is an increase in the number
259	   of options provided by the protocols.

261	   The basic technique for providing scalability is to aggregate a
262	   number of customer multicast flows onto a single multicast
263	   distribution tree through the P routers.  A number of aggregation
264	   methods are supported.

266	   The procedures defined in this document also accommodate the SP that
267	   does not want to build multicast distribution trees in its backbone
268	   at all; the ingress PE can replicate each multicast data packet and
269	   then unicast each replica through a tunnel to each egress PE that
270	   needs to receive the data.

272	2.1.1. Multicast Distribution Trees

274	   This document supports the use of a single multicast distribution
275	   tree in the backbone to carry all the multicast traffic from a
276	   specified set of one or more MVPNs.  Such a tree is referred to as an
277	   "Inclusive Tree". An Inclusive Tree which carries the traffic of more
278	   than one MVPN is an "Aggregate Inclusive Tree".  An Inclusive Tree
279	   contains, as its members, all the PEs that attach to any of the MVPNs
280	   using the tree.

282	   With this option, even if each tree supports only one MVPN, the upper
283	   bound on the amount of state maintained by the P routers is
284	   proportional to the number of VPNs supported, rather than to the
285	   number of multicast flows in those VPNs.  If the trees are
286	   unidirectional, it would be more accurate to say that the state is
287	   proportional to the product of the number of VPNs and the average
288	   number of PEs per VPN.  The amount of state maintained by the P
289	   routers can be further reduced by aggregating more MVPNs onto a
290	   single tree.  If each such tree supports a set of MVPNs (call it an
291	   "MVPN aggregation set"), the state maintained by the P routers is
292	   proportional to the product of the number of MVPN aggregation sets
293	   and the average number of PEs per MVPN. Thus the state does not grow
294	   linearly with the number of MVPNs.
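
The scaling argument above can be made concrete with a small
calculation.  This is a rough sketch, not a sizing model: the text
states only proportionality, so the function names, the unit of
"state", and the sample numbers are all assumptions chosen for
illustration.

```python
def per_flow_state(num_flows):
    """Optimal routing: P-router state grows with the number of multicast flows."""
    return num_flows


def inclusive_tree_state(num_mvpns, avg_pes_per_mvpn):
    """Unidirectional Inclusive Trees, one MVPN per tree: VPNs x PEs per VPN."""
    return num_mvpns * avg_pes_per_mvpn


def aggregate_tree_state(num_aggregation_sets, avg_pes_per_mvpn):
    """Aggregate Inclusive Trees: aggregation sets x average PEs per MVPN."""
    return num_aggregation_sets * avg_pes_per_mvpn


# Hypothetical deployment: 1000 MVPNs with 200 flows each and 10 PEs per
# MVPN, with the MVPNs grouped into 50 aggregation sets.
flows_state = per_flow_state(1000 * 200)
inclusive_state = inclusive_tree_state(1000, 10)
aggregate_state = aggregate_tree_state(50, 10)

print(flows_state, inclusive_state, aggregate_state)
```

With these assumed numbers the three estimates are 200000, 10000, and
500, which shows why aggregation reduces P router state at the cost of
optimality.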

296	   However, as data from many multicast groups is aggregated together
297	   onto a single "Inclusive Tree", it is likely that some PEs will
298	   receive multicast data for which they have no need, i.e., some degree
299	   of optimality has been sacrificed.

301	   This document also provides procedures which enable a single
302	   multicast distribution tree in the backbone to be used to carry
303	   traffic belonging only to a specified set of one or more multicast
304	   groups, from one or more MVPNs. Such a tree is referred to as a
305	   "Selective Tree" and more specifically as an "Aggregate Selective
306	   Tree" when the multicast groups belong to different MVPNs.  By
307	   default, traffic from most multicast groups could be carried by an
308	   Inclusive Tree, while traffic from, e.g., high bandwidth groups could
309	   be carried in one of the "Selective Trees".  When setting up the
310	   Selective Trees, one should include only those PEs which need to
311	   receive multicast data from one or more of the groups assigned to the
312	   tree.  This provides more optimal routing than can be obtained by
313	   using only Inclusive Trees, though it requires additional state in
314	   the P routers.
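
The division of traffic between an Inclusive Tree and Selective Trees
can be sketched as a lookup with a default.  The function names, tree
identifiers, and group addresses below are hypothetical; how bindings
are signaled is specified elsewhere in this document and is not
modeled here.

```python
def tree_for_group(group, selective_bindings, inclusive_tree):
    """Return the group's Selective Tree if one is bound, else the Inclusive Tree."""
    return selective_bindings.get(group, inclusive_tree)


def selective_tree_members(groups_on_tree, pes_with_receivers):
    """Include only PEs that need data from at least one group on the tree."""
    members = set()
    for group in groups_on_tree:
        members |= pes_with_receivers.get(group, set())
    return members


# Hypothetical state: one high-bandwidth group is bound to a Selective Tree.
selective_bindings = {"239.1.1.1": "selective-tree-1"}
pes_with_receivers = {"239.1.1.1": {"PE2", "PE5"}, "239.9.9.9": {"PE3"}}

high_bw_tree = tree_for_group("239.1.1.1", selective_bindings, "inclusive-tree")
default_tree = tree_for_group("239.9.9.9", selective_bindings, "inclusive-tree")
members = selective_tree_members(["239.1.1.1"], pes_with_receivers)
```

Here only PE2 and PE5 join the Selective Tree, while the unbound group
stays on the Inclusive Tree.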

316	2.1.2. Ingress Replication through Unicast Tunnels

318	   This document also provides procedures for carrying MVPN data traffic
319	   through unicast tunnels from the ingress PE to each of the egress
320	   PEs. The ingress PE replicates the multicast data packet received
321	   from a CE and sends it to each of the egress PEs using the unicast
322	   tunnels.  This requires no multicast routing state in the P routers
323	   at all, but it puts the entire replication load on the ingress PE
324	   router, and makes no attempt to optimize the multicast routing.
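
A minimal sketch of ingress replication follows.  The TunnelPacket
type, the function name, and the PE names are assumptions made for
illustration; tunnel encapsulation is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelPacket:
    egress_pe: str   # endpoint of the unicast tunnel carrying this copy
    payload: bytes   # the customer's multicast packet, carried unchanged


def ingress_replicate(payload, egress_pes):
    """Return one unicast copy of the packet for each egress PE."""
    return [TunnelPacket(egress_pe=pe, payload=payload) for pe in egress_pes]


# Hypothetical MVPN with three egress PEs that need the data.
copies = ingress_replicate(b"c-multicast-data", ["PE2", "PE3", "PE4"])
```

The number of copies sent by the ingress PE equals the number of
egress PEs, which is the replication load described above.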

326	2.2. Overview

328	2.2.1. Multicast Routing Adjacencies

330	   In BGP/MPLS IP VPNs [RFC4364], each CE ("Customer Edge") router is a
331	   unicast routing adjacency of a PE router, but CE routers at different
332	   sites do not become unicast routing adjacencies of each other. This
333	   important characteristic is retained for multicast routing -- a CE
334	   router becomes a multicast routing adjacency of a PE router, but CE
335	   routers at different sites do not become multicast routing
336	   adjacencies of each other.

338	   The multicast routing protocol on the PE-CE link is presumed to be
339	   PIM.  The Sparse Mode, Dense Mode, Single Source Mode, and
340	   Bidirectional Modes are supported. A CE router exchanges "ordinary"
341	   PIM control messages with the PE router to which it is attached.

343	   The PEs attaching to a particular MVPN then have to exchange the
344	   multicast routing information with each other.  Two basic methods for
345	   doing this are defined: (1) PE-PE PIM, and (2) BGP.  In the former
346	   case, the PEs need to be multicast routing adjacencies of each other.
347	   In the latter case, they do not.  For example, each PE may be a BGP
348	   adjacency of a Route Reflector (RR), and not of any other PEs.

350	   To support the "Carrier's Carrier" model of [RFC4364], mLDP or BGP
351	   can be used on the PE-CE interface. This will be described in
352	   subsequent versions of this document.

354	2.2.2. MVPN Definition

356	   An MVPN is defined by two sets of sites, the Sender Sites set and the
357	   Receiver Sites set, with the following properties:

359	     - Hosts within the Sender Sites set could originate multicast
360	       traffic for receivers in the Receiver Sites set.

362	     - Receivers not in the Receiver Sites set should not be able to
363	       receive this traffic.

365	     - Hosts within the Receiver Sites set could receive multicast
366	       traffic originated by any host in the Sender Sites set.

368	     - Hosts within the Receiver Sites set should not be able to
369	       receive multicast traffic originated by any host that is not in
370	       the Sender Sites set.

372	   A site could be in both the Sender Sites set and the Receiver Sites
373	   set, which implies that hosts within such a site could both originate
374	   and receive multicast traffic. An extreme case is when the Sender
375	   Sites set is the same as the Receiver Sites set, in which case all
376	   sites could originate and receive multicast traffic from each other.
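
The four properties above amount to a single membership test on the
two sets.  The sketch below assumes a hypothetical Mvpn class and site
names; it models the policy only, not the mechanisms that enforce it.

```python
from dataclasses import dataclass, field


@dataclass
class Mvpn:
    sender_sites: set = field(default_factory=set)
    receiver_sites: set = field(default_factory=set)

    def may_deliver(self, source_site, receiver_site):
        """Traffic may flow only from a Sender Site to a Receiver Site."""
        return source_site in self.sender_sites and receiver_site in self.receiver_sites


# Hypothetical MVPN: site B is in both sets, so it can send and receive.
mvpn = Mvpn(sender_sites={"A", "B"}, receiver_sites={"B", "C"})

a_to_c = mvpn.may_deliver("A", "C")   # sender site to receiver site
c_to_b = mvpn.may_deliver("C", "B")   # C is not a sender site
a_to_d = mvpn.may_deliver("A", "D")   # D is not a receiver site
b_to_b = mvpn.may_deliver("B", "B")   # B is in both sets
```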

378	   Sites within a given MVPN may be within the same organization or in
379	   different organizations, which implies that an MVPN can be either an
380	   Intranet or an Extranet.

382	   A given site may be in more than one MVPN, which implies that MVPNs
383	   may overlap.

385	   Not all sites of a given MVPN have to be connected to the same
386	   service provider, which implies that an MVPN can span multiple
387	   service providers.

389	   Another way to look at an MVPN is to say that it is defined by a set
390	   of administrative policies. Such policies determine both the Sender
391	   Sites set and the Receiver Sites set. Such policies are established
392	   by MVPN customers, but implemented/realized by MVPN Service Providers
393	   using the existing BGP/MPLS VPN mechanisms, such as Route Targets,
394	   with extensions, as necessary.

396	2.2.3. Auto-Discovery

398	   In order for the PE routers attaching to a given MVPN to exchange
399	   MVPN control information with each other, each one needs to discover
400	   all the other PEs that attach to the same MVPN.  (Strictly speaking,
401	   a PE in the receiver sites set need only discover the other PEs in
402	   the sender sites set and a PE in the sender sites set need only
403	   discover the other PEs in the receiver sites set.) This is referred
404	   to as "MVPN Auto-Discovery".

406	   This document discusses two ways of providing MVPN Auto-Discovery:

408	     - BGP can be used for discovering and maintaining MVPN membership.
409	       The PE routers advertise their MVPN membership to other PE
410	       routers using BGP. A PE is considered to be a "member" of a
411	       particular MVPN if it contains a VRF (Virtual Routing and
412	       Forwarding table, see [RFC4364]) which is configured to contain
413	       the multicast routing information of that MVPN.  This auto-
414	       discovery option does not make any assumptions about the methods
415	       used for transmitting MVPN multicast data packets through the
416	       backbone.

418	     - If it is known that the multicast data packets of a particular
419	       MVPN are to be transmitted (at least, by default) through a non-
420	       aggregated Inclusive Tree which is to be set up by PIM-SM or
421	       PIM-Bidir, and if the PEs attaching to that MVPN are configured
422	       with the group address corresponding to that tree, then the PEs
423	       can auto-discover each other simply by joining the tree and then
424	       multicasting PIM Hellos over the tree.
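
The outcome of the first option can be sketched as a shared membership
table.  The MembershipDirectory class below is a hypothetical stand-in
for BGP distribution of membership information; it does not model BGP
messages, route attributes, or the Sender/Receiver distinction.

```python
from collections import defaultdict


class MembershipDirectory:
    """Hypothetical stand-in for membership information distributed by BGP."""

    def __init__(self):
        self._members = defaultdict(set)

    def advertise(self, pe, mvpn):
        """Record that a PE has a VRF configured for the given MVPN."""
        self._members[mvpn].add(pe)

    def withdraw(self, pe, mvpn):
        """Record that a PE no longer attaches to the given MVPN."""
        self._members[mvpn].discard(pe)

    def discovered_by(self, pe, mvpn):
        """Return the other PEs that attach to the same MVPN."""
        return self._members[mvpn] - {pe}


directory = MembershipDirectory()
for pe in ("PE1", "PE2", "PE3"):
    directory.advertise(pe, "mvpn-red")
directory.advertise("PE4", "mvpn-blue")

red_peers_of_pe1 = directory.discovered_by("PE1", "mvpn-red")
directory.withdraw("PE3", "mvpn-red")
red_peers_after_withdraw = directory.discovered_by("PE1", "mvpn-red")
```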

426	2.2.4. PE-PE Multicast Routing Information

428	   The BGP/MPLS IP VPN [RFC4364] specification requires a PE to maintain
429	   at most one BGP peering with every other PE in the network. This
430	   peering is used to exchange VPN routing information. The use of Route
431	   Reflectors further reduces the number of BGP adjacencies maintained
432	   by a PE to exchange VPN routing information with other PEs. This
433	   document describes various options for exchanging MVPN control
434	   information between PE routers based on the use of PIM or BGP. These
435	   options have different overheads with respect to the number of
436	   routing adjacencies that a PE router needs to maintain to exchange
437	   MVPN control information with other PE routers. Some of these options
438	   allow the retention of the unicast BGP/MPLS VPN model, letting a PE
439	   maintain at most one routing adjacency with each other PE router to
440	   exchange MVPN control information.

442	   The solution in [RFC4364] uses BGP to exchange VPN routing
443	   information between PE routers. This document describes various
444	   solutions for exchanging MVPN control information. One option is the
445	   use of BGP, providing reliable transport. Another option is the use
446	   of the currently existing, "soft state" PIM standard [PIM-SM].

448	2.2.5. PE-PE Multicast Data Transmission

450	   Like [RFC4364], this document decouples the procedures for exchanging
451	   routing information from the procedures for transmitting data
452	   traffic. Hence a variety of transport technologies may be used in the
453	   backbone. For inclusive trees, these transport technologies include
454	   unicast PE-PE tunnels (using MPLS or IP/GRE encapsulation), multicast
455	   distribution trees created by PIM-SSM, PIM-SM, or PIM-Bidir (using
456	   IP/GRE encapsulation), point-to-multipoint LSPs created by RSVP-TE or
457	   mLDP, and multipoint-to-multipoint LSPs created by mLDP.  (However,
458	   techniques for aggregating the traffic of multiple MVPNs onto a
459	   single multipoint-to-multipoint LSP or onto a single bidirectional
460	   multicast distribution tree are for further study.) For selective
461	   trees, only unicast PE-PE tunnels (using MPLS or IP/GRE
462	   encapsulation) and unidirectional single-source trees are supported,
463	   and the supported tree creation protocols are PIM-SSM (using IP/GRE
464	   encapsulation), RSVP-TE, and mLDP.

466	   In order to aggregate traffic from multiple MVPNs onto a single
467	   multicast distribution tree, it is necessary to have a mechanism to
468	   enable the egresses of the tree to demultiplex the multicast traffic
469	   received over the tree and to associate each received packet with a
470	   particular MVPN.  This document specifies a mechanism whereby
471	   upstream label assignment [MPLS-UPSTREAM-LABEL] is used by the root
472	   of the tree to assign a label to each flow.  This label is used by
473	   the receivers to perform the demultiplexing. This document also
474	   describes procedures based on BGP that are used by the root of an
475	   Aggregate Tree to advertise the Inclusive and/or Selective binding
476	   and the demultiplexing information to the leaves of the tree.
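
The demultiplexing step can be sketched as a table lookup at each
leaf.  The class names, the starting label value, and the MVPN names
are assumptions.  The leaf keys its table on the pair (root, label)
because an upstream-assigned label is meaningful only in the context
of the root that assigned it.  The advertisement of bindings by BGP is
reduced here to a direct method call.

```python
class AggregateTreeRoot:
    """Root of an Aggregate Tree that assigns one label per MVPN."""

    def __init__(self, name, first_label=1000):
        self.name = name
        self._next_label = first_label   # arbitrary starting value (assumption)
        self.bindings = {}               # MVPN name -> upstream-assigned label

    def bind(self, mvpn):
        """Assign a label to the MVPN, reusing any existing assignment."""
        if mvpn not in self.bindings:
            self.bindings[mvpn] = self._next_label
            self._next_label += 1
        return self.bindings[mvpn]


class Leaf:
    """Egress PE that maps received labels back to MVPNs."""

    def __init__(self):
        self._table = {}                 # (root name, label) -> MVPN name

    def learn(self, root_name, label, mvpn):
        """Install a binding advertised by the root of a tree."""
        self._table[(root_name, label)] = mvpn

    def demultiplex(self, root_name, label):
        """Return the MVPN for a received packet, or None if unknown."""
        return self._table.get((root_name, label))


root = AggregateTreeRoot("PE1")
leaf = Leaf()
for mvpn in ("mvpn-red", "mvpn-blue"):
    leaf.learn(root.name, root.bind(mvpn), mvpn)

red_label = root.bind("mvpn-red")
known = leaf.demultiplex("PE1", red_label)
unknown_root = leaf.demultiplex("PE2", red_label)
```

The same label value received on a tree rooted at a different PE is
not matched, since the leaf holds no binding for that root.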

478	   This document also describes the data plane encapsulations for
479	   supporting the various SP multicast transport options.

481	   This document assumes that when SP multicast trees are used, traffic
482	   for a particular multicast group is transmitted by a particular PE on
483	   only one SP multicast tree. The use of multiple SP multicast trees
484	   for transmitting traffic belonging to a particular multicast group is
485	   for further study.

487	2.2.6. Inter-AS MVPNs

489	   [RFC4364] describes different options for supporting Inter-AS
490	   BGP/MPLS unicast VPNs. This document describes how Inter-AS MVPNs can
491	   be supported for each of the unicast BGP/MPLS VPN Inter-AS options.
492	   This document also specifies a model where Inter-AS MVPN service can
493	   be offered without requiring a single SP multicast tree to span
494	   multiple ASes. In this model, an inter-AS multicast tree consists of
495	   a number of "segments", one per AS, which are stitched together at AS
496	   boundary points. These are known as "segmented inter-AS trees".  Each
497	   segment of a segmented inter-AS tree may use a different multicast
498	   transport technology.

500	   It is also possible to support Inter-AS MVPNs with non-segmented
501	   source trees that extend across AS boundaries.

503	2.2.7. Optional Deployment Models

505	   The document also discusses an optional MVPN deployment model in
506	   which PEs take on all or part of the role of a PIM RP (Rendezvous
507	   Point).  The necessary protocol extensions to support this are
508	   defined.

510	3. Concepts and Framework

512	3.1. PE-CE Multicast Routing

514	   Support of multicast in BGP/MPLS IP VPNs is modeled closely after
515	   support of unicast in BGP/MPLS IP VPNs. That is, a multicast routing
516	   protocol will be run on the PE-CE interfaces, such that PE and CE are
517	   multicast routing adjacencies on that interface.  CEs at different
518	   sites do not become multicast routing adjacencies of each other.

520	   If a PE attaches to n VPNs for which multicast support is provided
521	   (i.e., to n "MVPNs"), the PE will run n independent instances of a
522	   multicast routing protocol.  We will refer to these multicast routing
523	   instances as "VPN-specific multicast routing instances", or more
524	   briefly as "multicast C-instances". The notion of a "VRF" ("Virtual
525	   Routing and Forwarding Table"), defined in [RFC4364], is extended to
526	   include multicast routing entries as well as unicast routing entries.
527	   Each multicast routing entry is thus associated with a particular
528	   VRF.

530	   Whether a particular VRF belongs to an MVPN or not is determined by
531	   configuration.

533	   In this document, we will not attempt to provide support for every
534	   multicast routing protocol that could possibly run on the PE-CE
535	   link.  Rather, we consider multicast C-instances only for the
536	   following multicast routing protocols:

538	     - PIM Sparse Mode (PIM-SM)

540	     - PIM Source-Specific Mode (PIM-SSM)

542	     - PIM Bidirectional Mode (PIM-Bidir)

544	     - PIM Dense Mode (PIM-DM)

546	   In order to support the "Carrier's Carrier" model of [RFC4364], mLDP
547	   or BGP will also be supported on the PE-CE interface; however, this
548	   is not described in this revision.

550	   As the document only supports PIM-based C-instances, we will
551	   generally use the term "PIM C-instances" to refer to the multicast
552	   C-instances.

554	   A PE router may also be running a "provider-wide" instance of PIM (a
555	   "PIM P-instance"), in which it has a PIM adjacency with, e.g., each
556	   of its IGP neighbors (i.e., with P routers), but NOT with any CE
557	   routers, and not with other PE routers (unless another PE router
558	   happens to be an IGP adjacency).  In this case, P routers would also
559	   run the P-instance of PIM, but NOT a C-instance.  If there is a PIM
560	   P-instance, it may or may not have a role to play in support of VPN
561	   multicast; this is discussed in later sections.  However, in no case
562	   will the PIM P-instance contain VPN-specific multicast routing
563	   information.

565	   In order to help clarify when we are speaking of the PIM P-instance
566	   and when we are speaking of a PIM C-instance, we will also apply the
567	   prefixes "P-" and "C-" respectively to control messages, addresses,
568	   etc.  Thus a P-Join would be a PIM Join which is processed by the PIM
569	   P-instance, and a C-Join would be a PIM Join which is processed by a
570	   C-instance.  A P-group address would be a group address in the SP's
571	   address space, and a C-group address would be a group address in a
572	   VPN's address space.

574	3.2. P-Multicast Service Interfaces (PMSIs)

576	   Multicast data packets received by a PE over a PE-CE interface must
577	   be forwarded to one or more of the other PEs in the same MVPN for
578	   delivery to one or more other CEs.

580	   We define the notion of a "P-Multicast Service Interface" (PMSI).  If
581	   a particular MVPN is supported by a particular set of PE routers,
582	   then there will be a PMSI connecting those PE routers.  A PMSI is a
583	   conceptual "overlay" on the P network with the following property: a
584	   PE in a given MVPN can give a packet to the PMSI, and the packet will
585	   be delivered to some or all of the other PEs in the MVPN, such that
586	   any PE receiving such a packet will be able to tell which MVPN the
587	   packet belongs to.

589	   As we discuss below, a PMSI may be instantiated by a number of
590	   different transport mechanisms, depending on the particular
591	   requirements of the MVPN and of the SP.  We will refer to these
592	   transport mechanisms as "tunnels".

594	   For each MVPN, there are one or more PMSIs that are used for
595	   transmitting the MVPN's multicast data from one PE to others.  We
596	   will use the term "PMSI" such that a single PMSI belongs to a single
597	   MVPN.  However, the transport mechanism which is used to instantiate
598	   a PMSI may allow a single "tunnel" to carry the data of multiple
599	   PMSIs.

601	   In this document we make a clear distinction between the multicast
602	   service (the PMSI) and its instantiation.  This allows us to separate
603	   the discussion of different services from the discussion of different
604	   instantiations of each service.  The term "tunnel" is used to refer
605	   only to the transport mechanism that instantiates a service.

607	   [This is a significant change from previous drafts on the topic of
608	   MVPN, which have used the term "Multicast Tunnel" to refer both to
609	   the multicast service (what we call here the PMSI) and to its
610	   instantiation.]

612	3.2.1. Inclusive and Selective PMSIs

614	   We will distinguish between three different kinds of PMSI:

616	     - "Multidirectional Inclusive" PMSI (MI-PMSI)

618	       A Multidirectional Inclusive PMSI is one which enables ANY PE
619	       attaching to a particular MVPN to transmit a message such that it
620	       will be received by EVERY other PE attaching to that MVPN.

622	       There is at most one MI-PMSI per MVPN.  (Though the tunnel which
623	       instantiates an MI-PMSI may actually carry the data of more than
624	       one PMSI.)

626	       An MI-PMSI can be thought of as an overlay broadcast network
627	       connecting the set of PEs supporting a particular MVPN.

629	       [The "Default MDTs" of [rosen-08] provide the transport service of
630	       MI-PMSIs, in this terminology.]

632	     - "Unidirectional Inclusive" PMSI (UI-PMSI)

634	       A Unidirectional Inclusive PMSI is one which enables a particular
635	       PE, attached to a particular MVPN, to transmit a message such
636	       that it will be received by all the other PEs attaching to that
637	       MVPN.  There is at most one UI-PMSI per PE per MVPN, though the
638	       "tunnel" which instantiates a UI-PMSI may in fact carry the data
639	       of more than one PMSI.

641	     - "Selective" PMSI (S-PMSI).

643	       A Selective PMSI is one which provides a mechanism wherein a
644	       particular PE in an MVPN can multicast messages so that they will
645	       be received by a subset of the other PEs of that MVPN.  There may
646	       be an arbitrary number of S-PMSIs per PE per MVPN.  Again, the
647	       "tunnel" which instantiates a given S-PMSI may carry data from
648	       multiple S-PMSIs.

650	       [The "Data MDTs" of earlier drafts provide the transport service
651	       of "Selective PMSIs" in the terminology of this draft.]

653	   We will see in later sections the role played by these different
654	   kinds of PMSI.  We will use the term "I-PMSI" when we are not
655	   distinguishing between "MI-PMSIs" and "UI-PMSIs".
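
The cardinality rules stated above for the three kinds of PMSI can be summarized in a short sketch. It is only an illustration under stated assumptions: the names PmsiKind, PmsiRegistry, and root_pe are hypothetical and do not correspond to anything defined in this document.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class PmsiKind(Enum):
    MI = "MI-PMSI"  # any PE to every other PE of the MVPN
    UI = "UI-PMSI"  # one particular PE to all other PEs of the MVPN
    S = "S-PMSI"    # one particular PE to a subset of the other PEs


def is_inclusive(kind: PmsiKind) -> bool:
    """True for the kinds covered by the term "I-PMSI"."""
    return kind in (PmsiKind.MI, PmsiKind.UI)


class PmsiRegistry:
    """Hypothetical bookkeeping that enforces the per-MVPN limits."""

    def __init__(self) -> None:
        self._mi: Dict[str, str] = {}
        self._ui: Dict[Tuple[str, str], str] = {}
        self._s: Dict[Tuple[str, str], List[str]] = {}

    def add(self, kind: PmsiKind, mvpn: str, pmsi_id: str,
            root_pe: Optional[str] = None) -> None:
        if kind is PmsiKind.MI:
            if mvpn in self._mi:
                raise ValueError("at most one MI-PMSI per MVPN")
            self._mi[mvpn] = pmsi_id
            return
        if root_pe is None:
            raise ValueError("UI-PMSIs and S-PMSIs need a transmitting PE")
        key = (mvpn, root_pe)
        if kind is PmsiKind.UI:
            if key in self._ui:
                raise ValueError("at most one UI-PMSI per PE per MVPN")
            self._ui[key] = pmsi_id
        else:
            # An arbitrary number of S-PMSIs per PE per MVPN is allowed.
            self._s.setdefault(key, []).append(pmsi_id)

    def s_pmsis(self, mvpn: str, root_pe: str) -> List[str]:
        return list(self._s.get((mvpn, root_pe), []))
```

The registry tracks PMSIs only, not tunnels; a single tunnel may still carry several of the PMSIs recorded here.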

657	3.2.2. Tunnels Instantiating PMSIs

659	   A number of different tunnel setup techniques can be used to create
660	   the tunnels that instantiate the PMSIs.  Among these are:

662	     - PIM

664	       A PMSI can be instantiated as (a set of) Multicast Distribution
665	       Trees created by the PIM P-instance ("P-trees").

667	       PIM-SSM, PIM-Bidir, or PIM-SM can be used to create P-trees.
668	       (PIM-DM is not supported for this purpose.)

670	       A single MI-PMSI can be instantiated by a single shared P-tree,
671	       or by a number of source P-trees (one for each PE of the MI-
672	       PMSI).  P-trees may be shared by multiple MVPNs (i.e., a given
673	       P-tree may be the instantiation of multiple PMSIs), as long as
674	       the encapsulation provides some means of demultiplexing the data
675	       traffic by MVPN.

677	       Selective PMSIs are mostly instantiated by source P-trees, and are
678	       most naturally created by PIM-SSM, since by definition only one
679	       PE is the source of the multicast data on a Selective PMSI.

681	       [The "Default MDTs" of [rosen-08] are MI-PMSIs instantiated as
682	       PIM trees.  The "data MDTs" of [rosen-08] are S-PMSIs
683	       instantiated as PIM trees.]

685	     - mLDP

687	       A PMSI may be instantiated as one or more mLDP Point-to-
688	       Multipoint (P2MP) LSPs, or as an mLDP Multipoint-to-Multipoint
689	       (MP2MP) LSP.  A Selective PMSI or a Unidirectional Inclusive PMSI
690	       would be instantiated as a single mLDP P2MP LSP, whereas a
691	       Multidirectional Inclusive PMSI could be instantiated either as a
692	       set of such LSPs (one for each PE in the MVPN) or as a single
693	       MP2MP LSP.

695	       mLDP P2MP LSPs can be shared across multiple MVPNs.

697	     - RSVP-TE

699	       A PMSI may be instantiated as one or more RSVP-TE Point-to-
700	       Multipoint (P2MP) LSPs.  A Selective PMSI or a Unidirectional
701	       Inclusive PMSI would be instantiated as a single RSVP-TE P2MP
702	       LSP, whereas a Multidirectional Inclusive PMSI would be
703	       instantiated as a set of such LSPs, one for each PE in the MVPN.
704	       RSVP-TE P2MP LSPs can be shared across multiple MVPNs.

706	     - A Mesh of Unicast Tunnels.

708	       If a PMSI is implemented as a mesh of unicast tunnels, a PE
709	       wishing to transmit a packet through the PMSI would replicate the
710	       packet, and send a copy to each of the other PEs.

712	       An MI-PMSI for a given MVPN can be instantiated as a full mesh of
713	       unicast tunnels among that MVPN's PEs.  A UI-PMSI or an S-PMSI
714	       can be instantiated as a partial mesh.

716	     - Unicast Tunnels to the Root of a P-Tree.

718	       Any type of PMSI can be instantiated through a method in which
719	       there is a single P-tree (created, for example, via PIM-SSM or
720	       via RSVP-TE), and a PE transmits a packet to the PMSI by sending
721	       it in a unicast tunnel to the root of that P-tree.  All PEs in
722	       the given MVPN would need to be leaves of the tree.

724	       When this instantiation method is used, the transmitter of the
725	       multicast data may receive its own data back.  Methods for
726	       avoiding this are for further study.

728	   It can be seen that each method of implementing PMSIs has its own
729	   area of applicability.  This specification therefore allows for the
730	   use of any of these methods.  At first glance, this may seem like an
731	   overabundance of options.  However, the history of multicast
732	   development and deployment should make it clear that there is no one
733	   option which is always acceptable.  The use of segmented inter-AS
734	   trees does allow each SP to select the option which it finds most
735	   applicable in its own environment, without causing any other SP to
736	   choose that same option.

738	   Specifying the conditions under which a particular tree building
739	   method is applicable is outside the scope of this document.

741	   The choice of the tunnel technique belongs to the sender router and
742	   is a local policy decision of the router. The procedures defined
743	   throughout this document do not mandate that the same tunnel
744	   technique be used for all PMSI tunnels going through the same
745	   provider backbone.  It is however expected that any tunnel technique
746	   that a PE may use for a particular MVPN is also supported by the
747	   other PEs having VRFs for the MVPN.  Moreover, the use of ingress
748	   replication by any PE for an MVPN implies that all other PEs MUST
749	   use ingress replication for this MVPN.

751	3.3. Use of PMSIs for Carrying Multicast Data

753	   Each PE supporting a particular MVPN must have a way of discovering:

755	     - The set of other PEs in its AS that are attached to sites of that
756	       MVPN, and the set of other ASes that have PEs attached to sites
757	       of that MVPN.  However, if segmented inter-AS trees are not used
758	       (see section 8.2), then each PE needs to know the entire set of
759	       PEs attached to sites of that MVPN.

761	     - If segmented inter-AS trees are to be used, the set of border
762	       routers in its AS that support inter-AS connectivity for that
763	       MVPN.

765	     - If the MVPN is configured to use a default MI-PMSI, the
766	       information needed to set up and to use the tunnels instantiating
767	       the default MI-PMSI.

769	     - For each other PE, whether the PE supports Aggregate Trees for
770	       the MVPN, and if so, the demultiplexing information which must be
771	       provided so that the other PE can determine whether a packet
772	       which it received on an aggregate tree belongs to this MVPN.

774	   In some cases this information is provided by means of the BGP-based
775	   auto-discovery procedures detailed in section 4.  In other cases,
776	   this information is provided after discovery is complete, by means of
777	   procedures defined in section 6.1.2.  In either case, the information
778	   which is provided must be sufficient to enable the PMSI to be bound
779	   to the identified tunnel, to enable the tunnel to be created if it
780	   does not already exist, and to enable the different PMSIs which may
781	   travel on the same tunnel to be properly demultiplexed.

783	3.3.1. MVPNs with Default MI-PMSIs

785	   If an MVPN uses an MI-PMSI, then the MI-PMSI for that MVPN will be
786	   created as soon as the necessary information has been obtained.
787	   Creating a PMSI means creating the tunnel which carries it (unless
788	   that tunnel already exists), as well as binding the PMSI to the
789	   tunnel. The MI-PMSI for that MVPN is then used as the default method
790	   of transmitting multicast data packets for that MVPN.  In effect, all
791	   the multicast streams for the MVPN are, by default, aggregated onto
792	   the MI-PMSI.

794	   If a particular multicast stream from a particular source PE has
795	   certain characteristics, it can be desirable to migrate it from the
796	   MI-PMSI to an S-PMSI.  Procedures for migrating a stream from an MI-
797	   PMSI to an S-PMSI are discussed in section 7.

799	3.3.2. When MI-PMSIs are Required

801	   MI-PMSIs are required under the following conditions:

803	     - The MVPN is using PIM-DM, or some other protocol (such as BSR)
804	       which relies upon flooding.  Only with an MI-PMSI can the C-data
805	       (or C-control-packets) received from any CE be flooded to all
806	       PEs.

808	     - The procedure for carrying C-multicast routes from PE to PE
809	       involves the multicasting of P-PIM control messages among the PEs
810	       (see sections 5.2.1, 5.2.2, and 5.2.4).

812	3.3.3. MVPNs That Do Not Use MI-PMSIs

814	   If a particular MVPN does not use a default MI-PMSI, then its
815	   multicast data may be sent by default on a UI-PMSI.

817	   It is also possible to send all the multicast data on an S-PMSI,
818	   omitting any usage of I-PMSIs.  This prevents PEs from receiving data
819	   which they don't need, at the cost of requiring additional tunnels.
820	   However, cost-effective instantiation of S-PMSIs is likely to require
821	   Aggregate P-trees, which in turn makes it necessary for the
822	   transmitting PE to know which PEs need to receive which multicast
823	   streams. This is known as "explicit tracking", and the procedures to
824	   enable explicit tracking may themselves impose a cost.  This is
825	   further discussed in section 7.2.2.2.

827	4. BGP-Based Autodiscovery of MVPN Membership

829	   BGP-based autodiscovery is done by means of a new address family, the
830	   MCAST-VPN address family. (This address family also has other uses,
831	   as will be seen later.)  Any PE which attaches to an MVPN must issue
832	   a BGP update message containing an NLRI in this address family, along
833	   with a specific set of attributes.  In this document, we specify the
834	   information which must be contained in these BGP updates in order to
835	   provide auto-discovery.  The encoding details, along with the
836	   complete set of detailed procedures, are specified in a separate
837	   document [MVPN-BGP].

839	   This section specifies the intra-AS BGP-based autodiscovery
840	   procedures.  When segmented inter-AS trees are used, additional
841	   procedures are needed, as specified in section 8.  Further detail may
842	   be found in [MVPN-BGP].  (When segmented inter-AS trees are not used,
843	   the inter-AS procedures are almost identical to the intra-AS
844	   procedures.)

846	   BGP-based autodiscovery uses a particular kind of MCAST-VPN route
847	   known as an "auto-discovery route", or "A-D route".

849	   An "intra-AS A-D route" is a particular kind of A-D route that is
850	   never distributed outside its AS of origin.  Intra-AS A-D routes are
851	   originated by the PEs that are (directly) connected to the site(s) of
852	   an MVPN.

854	   For the purpose of auto-discovery, each PE attached to a site in a
855	   given MVPN must originate an intra-AS auto-discovery route.  The NLRI
856	   of that route must contain the following information:

858	     - The route type (i.e., intra-AS A-D route)

860	     - IP address of the originating PE

862	     - An RD configured locally for the MVPN.  This is an RD which can
863	       be prepended to that IP address to form a globally unique VPN-IP
864	       address of the PE.

866	   The A-D route must also carry the following attributes:

868	     - One or more Route Target attributes.  If any other PE has one of
869	       these Route Targets configured for import into a VRF, it treats
870	       the advertising PE as a member in the MVPN to which the VRF
871	       belongs. This allows each PE to discover the PEs that belong to a
872	       given MVPN.  More specifically it allows a PE in the receiver
873	       sites set to discover the PEs in the sender sites set of the MVPN
874	       and the PEs in the sender sites set of the MVPN to discover the
875	       PEs in the receiver sites set of the MVPN. The PEs in the
876	       receiver sites set would be configured to import the Route
877	       Targets advertised in the BGP Auto-Discovery routes by PEs in the
878	       sender sites set. The PEs in the sender sites set would be
879	       configured to import the Route Targets advertised in the BGP
880	       Auto-Discovery routes by PEs in the receiver sites set.

882	     - PMSI tunnel attribute.  This attribute is present if and only if
883	       a default MI-PMSI is to be used for the MVPN.  It contains the
884	       following information:

886	         * Whether the MI-PMSI is instantiated by

888	             + a PIM-Bidir tree,

890	             + a set of PIM-SSM trees,

892	             + a set of PIM-SM trees

894	             + a set of RSVP-TE point-to-multipoint LSPs

896	             + a set of mLDP point-to-multipoint LSPs

898	             + an mLDP multipoint-to-multipoint LSP

900	             + a set of unicast tunnels

902	             + a set of unicast tunnels to the root of a shared tree (in
903	               this case the root must be identified)

905	         * If the PE wishes to set up a default tunnel to instantiate
906	           the I-PMSI, a unique identifier for the tunnel used to
907	           instantiate the I-PMSI.

909	           All the PEs attaching to a given MVPN (within a given AS)
910	           must have been configured with the same PMSI tunnel attribute
911	           for that MVPN.  They are also expected to know the
912	           encapsulation to use.

914	           Note that a default tunnel can be identified at discovery
915	           time only if the tunnel already exists (e.g., it was
916	           constructed by means of configuration), or if it can be
917	           constructed without each PE knowing the identities of all
918	           the others (e.g., it is constructed by a receiver-initiated
919	           join technique such as PIM or mLDP).

921	           In other cases, a default tunnel cannot be identified until
922	           the PE has discovered one or more of the other PEs.   This
923	           will be the case, for example, if the tunnel is an RSVP-TE
924	           P2MP LSP, which must be set up from the head end.  In these
925	           cases, a PE will first send an A-D route without a tunnel
926	           identifier, and then will send another one with a tunnel
927	           identifier after discovering one or more of the other PEs.

929	         * Whether the tunnel used to instantiate the I-PMSI for this
930	           MVPN is aggregating I-PMSIs from multiple MVPNs.  This will
931	           affect the encapsulation used.  If aggregation is to be used,
932	           a demultiplexor value to be carried by packets for this
933	           particular MVPN must also be specified. The demultiplexing
934	           mechanism and signaling procedures are described in section
935	           6.
936	       Further details of the use of this information are provided in
937	       subsequent sections.
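
As an informal illustration of how Route Targets drive membership discovery, the sketch below models an intra-AS A-D route and the import check performed by a receiving PE. The class and function names, the string form of the RD and Route Targets, and the example addresses are all assumptions made for the example; the actual encoding is specified in [MVPN-BGP].

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class IntraAsAdRoute:
    """Hypothetical in-memory view of an intra-AS A-D route."""

    rd: str                        # RD configured locally for the MVPN
    originator: str                # IP address of the originating PE
    route_targets: FrozenSet[str]  # Route Targets carried by the route


def discovered_pes(import_rts: Set[str],
                   routes: Iterable[IntraAsAdRoute]) -> Set[str]:
    """PEs treated as MVPN members by a VRF with the given import RTs.

    A route is accepted if at least one of its Route Targets is
    configured for import into the VRF.
    """
    return {r.originator for r in routes if r.route_targets & import_rts}


routes = [
    IntraAsAdRoute("65000:10", "192.0.2.1", frozenset({"65000:100"})),
    IntraAsAdRoute("65000:11", "192.0.2.2", frozenset({"65000:200"})),
]
# A VRF importing 65000:100 discovers only the first PE.
print(discovered_pes({"65000:100"}, routes))
```

Using one Route Target for the sender sites set and another for the receiver sites set, as described above, amounts to giving the two groups of PEs different import sets in this model.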

939	5. PE-PE Transmission of C-Multicast Routing

941	   As a PE attached to a given MVPN receives C-Join/Prune messages from
942	   its CEs in that MVPN, it must convey the information contained in
943	   those messages to other PEs that are attached to the same MVPN.

945	   There are several different methods for doing this. As these methods
946	   are not interoperable, the method to be used for a particular MVPN
947	   must either be configured, or discovered as part of the BGP-based
948	   auto-discovery process.

950	5.1. RPF Information for Unicast VPN-IP Routes

952	   When a PE receives a C-Join/Prune message from a CE, the message
953	   identifies a particular multicast flow as belonging either to a source
954	   tree (S,G) or to a shared tree (*,G).  We use the term C-source to
955	   refer to S, in the case of a source tree, or to the Rendezvous Point
956	   (RP) for G, in the case of (*,G).  The PE needs to find the "upstream
957	   multicast hop" for the (S,G) or (*,G) flow, and it does this by
958	   looking up the C-source in the unicast VRF associated with the PE-CE
959	   interfaces over which the C-Join/Prune was received.  To facilitate
960	   this, all unicast VPN-IP routes from an MVPN will carry RPF
961	   information, which identifies the PE that originated the route, as
962	   well as identifying the Autonomous System containing that PE.  This
963	   information is consulted when a PE does an "RPF lookup" of the C-
964	   source as part of processing the C-Join/Prune messages.  This RPF
965	   information contains the following:

967	     - Source AS Extended Community

969	       To support MVPN, a PE that originates a (unicast) route to VPN-
970	       IPv4 addresses MUST include in the BGP Update message that
971	       carries this route the Source AS extended community, except if it
972	       is known a priori that none of these addresses will act as
973	       multicast sources and/or RP, in which case the (unicast) route
974	       need not carry the Source AS extended community.  The Global
975	       Administrator field of this community MUST be set to the
976	       autonomous system number of the PE. The Local Administrator field
977	       of this community SHOULD be set to 0. This community is described
978	       further in [MVPN-BGP].

980	     - Route Import Extended Community

982	       To support MVPN, in addition to the import/export Route Target(s)
983	       used by the unicast routing, each VRF on a PE MUST have an import
984	       Route Target that is unique to this VRF, except if it is known a
985	       priori that none of the (local) MVPN sites associated with the
986	       VRF contain multicast source(s) and/or RP, in which case the VRF
987	       need not have this import Route Target. This Route Target MUST be
988	       IP address specific, and is constructed as follows:

990	     + The Global Administrator field of the Route Target MUST be set to
991	       an IP address of the PE. This address MUST be a routable IP
992	       address.  This address MAY be common for all the VRFs on the PE
993	       (e.g., this address may be the PE's loopback address).

995	     + The Local Administrator field of the Route Target associated with
996	       a given VRF contains a 2-octet number that uniquely
997	       identifies that VRF within the PE that contains the VRF
998	       (procedures for assigning such numbers are purely local to the
999	       PE, and outside the scope of this document).

1001	   A PE that originates a (unicast) route to VPN-IPv4 addresses MUST
1002	   include in the BGP Update message that carries this route the Route
1003	   Import extended community that has the value of this Route Target,
1004	   except if it is known a priori that none of these addresses will act
1005	   as multicast sources and/or RP, in which case the (unicast) route
1006	   need not carry the Route Import extended community.

1008	   The Route Import Extended Community is described further in [MVPN-
1009	   BGP].
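
The field layout of the two communities described in this section can be sketched as follows. This is a non-normative illustration: it assumes the generic 8-octet extended community layouts (two-octet AS specific, and IPv4 address specific), and the sub-type constants are placeholders chosen for the example rather than values defined by this document. The authoritative encoding is in [MVPN-BGP].

```python
import ipaddress
import struct

TWO_OCTET_AS_SPECIFIC = 0x00   # generic extended community type
IPV4_ADDRESS_SPECIFIC = 0x01   # generic extended community type
SOURCE_AS_SUBTYPE = 0x09       # assumption, not taken from this document
ROUTE_IMPORT_SUBTYPE = 0x0B    # assumption, not taken from this document


def source_as_community(as_number: int) -> bytes:
    """Global Administrator = AS of the PE, Local Administrator = 0."""
    if not 0 <= as_number <= 0xFFFF:
        raise ValueError("this sketch handles 2-octet AS numbers only")
    return struct.pack("!BBHI", TWO_OCTET_AS_SPECIFIC, SOURCE_AS_SUBTYPE,
                       as_number, 0)


def route_import_community(pe_address: str, vrf_number: int) -> bytes:
    """Global Administrator = PE address, Local Administrator = VRF number."""
    if not 0 <= vrf_number <= 0xFFFF:
        raise ValueError("the Local Administrator field is 2 octets long")
    addr = ipaddress.IPv4Address(pe_address)
    return struct.pack("!BB4sH", IPV4_ADDRESS_SPECIFIC, ROUTE_IMPORT_SUBTYPE,
                       addr.packed, vrf_number)


print(source_as_community(65000).hex())
print(route_import_community("192.0.2.1", 7).hex())
```

Because the PE address may be shared by all VRFs on the PE, it is the 2-octet VRF number that makes the resulting Route Target unique to one VRF.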

1011	5.2. PIM Peering

1013	5.2.1. Full Per-MVPN PIM Peering Across a MI-PMSI

1015	   If the set of PEs attached to a given MVPN are connected via a MI-
1016	   PMSI, the PEs can form "normal" PIM adjacencies with each other.
1017	   Since the MI-PMSI functions as a broadcast network, the standard PIM
1018	   procedures for forming and maintaining adjacencies over a LAN can be
1019	   applied.

1021	   As a result, the C-Join/Prune messages which a PE receives from a CE
1022	   can be multicast to all the other PEs of the MVPN.  PIM "join
1023	   suppression" can be enabled and the PEs can send Asserts as needed.

1025	   [This is the procedure specified in [rosen-08].]

1027	5.2.2. Lightweight PIM Peering Across a MI-PMSI

1029	   The procedure of the previous section has the following
1030	   disadvantages:

1032	     - Periodic Hello messages must be sent by all PEs.

1034	       Standard PIM procedures require that each PE in a particular MVPN
1035	       periodically multicast a Hello to all the other PEs in that MVPN.
1036	       If the number of MVPNs becomes very large, sending and receiving
1037	       these Hellos can become a substantial overhead for the PE
1038	       routers.

1040	     - Periodic retransmission of C-Join/Prune messages.

1042	       PIM is a "soft-state" protocol, in which reliability is assured
1043	       through frequent retransmissions (refresh) of control messages.
1044	       This too can begin to impose a large overhead on the PE routers
1045	       as the number of MVPNs grows.

1047	   The first of these disadvantages is easily remedied.  The reason for
1048	   the periodic PIM Hellos is to ensure that each PIM speaker on a LAN
1049	   knows who all the other PIM speakers on the LAN are.  However, in the
1050	   context of MVPN, PEs in a given MVPN can learn the identities of all
1051	   the other PEs in the MVPN by means of the BGP-based auto-discovery
1052	   procedure of section 4.  In that case, the periodic Hellos would
1053	   serve no function, and could simply be eliminated.  (Of course, this
1054	   does imply a change to the standard PIM procedures.)

1056	   When Hellos are suppressed, we may speak of "lightweight PIM
1057	   peering".

1059	   The periodic refresh of the C-Join/Prunes is not as simple to
1060	   eliminate.  The L3VPN WG has asked the PIM WG to specify "refresh
1061	   reduction" procedures for PIM, so as to eliminate the need for the
1062	   periodic refreshes.  If and when such procedures have been specified,
1063	   it will be very useful to incorporate them, so as to make the
1064	   lightweight PIM peering procedures even more lightweight.

1066	5.2.3. Unicasting of PIM C-Join/Prune Messages

1068	   PIM does not require that the C-Join/Prune messages which a PE
1069	   receives from a CE be multicast to all the other PEs; it allows
1070	   them to be unicast to a single PE, the one which is upstream on the
1071	   path to the root of the multicast tree mentioned in the Join/Prune
1072	   message. Note that when the C-Join/Prune messages are unicast, there
1073	   is no such thing as "join suppression".  Therefore PIM Refresh
1074	   Reduction may be considered to be a pre-requisite for the procedure
1075	   of unicasting the C-Join/Prune messages.

1077	   When the C-Join/Prunes are unicast, they are not transmitted on a
1078	   PMSI at all.  Note that the procedure of unicasting the C-Join/Prunes
1079	   is different from the procedure of transmitting the C-Join/Prunes on
1080	   an MI-PMSI which is instantiated as a mesh of unicast tunnels.

1082	   If there are multiple PEs that can be used to reach a given C-source,
1083	   procedures described in section 9 MUST be used to ensure that, at
1084	   least within a single AS, all PEs choose the same PE to reach the C-
1085	   source.

1087	5.2.4. Details of Per-MVPN PIM Peering over MI-PMSI

1089	   In this section, we assume that inter-AS MVPNs will be supported by
1090	   means of non-segmented inter-AS trees.  Support for segmented inter-
1091	   AS trees with PIM peering is for further study.

1093	   When an MVPN uses an MI-PMSI, the C-instances of that MVPN can treat
1094	   the MI-PMSI as a LAN interface, and form either full PIM adjacencies
1095	   or lightweight PIM adjacencies with each other over that "LAN
1096	   interface".

1098	   To form a full PIM adjacency, the PEs execute the PIM LAN procedures,
1099	   including the generation and processing of PIM Hello, Join/Prune,
1100	   Assert, DF election and other PIM control packets.  These are
1101	   executed independently for each C-instance.  PIM "join suppression"
1102	   SHOULD be enabled.

1104	   If it is known that all C-instances of a particular MVPN can support
1105	   lightweight adjacencies, then lightweight adjacencies MUST be used.
1106	   If it is not known that all such C-instances support lightweight
1107	   adjacencies, then full adjacencies MUST be used.  Whether all the C-
1108	   instances support lightweight adjacencies is known by virtue of the
1109	   BGP-based auto-discovery procedures (combined with configuration).
1110	   This knowledge might change over time, so the PEs must be able to
1111	   switch in real time between the use of full adjacencies and
1112	   lightweight adjacencies.
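
   As a non-normative illustration, the selection rule above can be
   sketched as follows.  The function name, the AdjacencyMode type, and
   the use of True/False/None to represent what is known about each
   C-instance are assumptions made for this sketch; they are not
   defined by this document.

```python
from enum import Enum


class AdjacencyMode(Enum):
    FULL = "full"
    LIGHTWEIGHT = "lightweight"


def select_adjacency_mode(lightweight_support):
    """Pick the adjacency mode for one MVPN.

    lightweight_support maps a (hypothetical) PE identifier to True,
    False, or None, where None means "not known".
    """
    # Lightweight adjacencies are used only when support is positively
    # known for every C-instance; anything else falls back to full.
    if lightweight_support and all(
        known is True for known in lightweight_support.values()
    ):
        return AdjacencyMode.LIGHTWEIGHT
    return AdjacencyMode.FULL
```

   Because the knowledge may change over time, an implementation along
   these lines would presumably re-evaluate the function whenever the
   auto-discovery information for the MVPN changes.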

1114	   The difference between a lightweight adjacency and a full adjacency
1115	   is that no PIM Hellos are sent or received on a lightweight
1116	   adjacency.  The function which Hellos usually provide in PIM can be
1117	   provided in MVPN by the BGP-based auto-discovery procedures, so the
1118	   Hellos become superfluous.

1120	   Whether or not Hellos are sent, if PIM Refresh Reduction procedures
1121	   are available, and all the PEs supporting the MVPN are known to
1122	   support these procedures, then the refresh reduction procedures MUST
1123	   be used.

1125	5.2.4.1. PIM C-Instance Control Packets

1127	   All PIM C-Instance control packets of a particular MVPN are addressed
1128	   to the ALL-PIM-ROUTERS (224.0.0.13) IP destination address, and
1129	   transmitted over the MI-PMSI of that MVPN.  While in transit in the
1130	   P-network, the packets are encapsulated as required for the
1131	   particular kind of tunnel that is being used to instantiate the MI-
1132	   PMSI.  Thus the C-instance control packets are not processed by the P
1133	   routers, and MVPN-specific PIM routes can be extended from site to
1134	   site without appearing in the P routers.

1136	5.2.4.2. PIM C-instance RPF Determination

1138	   Although the MI-PMSI is treated by PIM as a LAN interface, unicast
1139	   routing is NOT run over it, and there are no unicast routing
1140	   adjacencies over it.  It is therefore necessary to specify special
1141	   procedures for determining when the MI-PMSI is to be regarded as the
1142	   "RPF Interface" for a particular C-address.

1144	   When a PE needs to determine the RPF interface of a particular C-
1145	   address, it looks up the C-address in the VRF. If the route matching
1146	   it (call this the "RPF route") is not a VPN-IP route learned from
1147	   MP-BGP as described in [RFC4364], or if that route's outgoing
1148	   interface is one of the interfaces associated with the VRF, then
1149	   ordinary PIM procedures for determining the RPF interface apply.

1151	   However, if the RPF route is a VPN-IP route whose outgoing interface
1152	   is not one of the interfaces associated with the VRF, then PIM will
1153	   consider the outgoing interface to be the MI-PMSI associated with the
1154	   VPN-specific PIM instance.

1156	   Once PIM has determined that the RPF interface for a particular C-
1157	   address is the MI-PMSI, it is necessary for PIM to determine the RPF
1158	   neighbor for that C-address.  This will be one of the other PEs that
1159	   is a PIM adjacency over the MI-PMSI.

1161	   When a PE distributes a given VPN-IP route via BGP, the PE must
1162	   determine whether that route might possibly be regarded, by another
1163	   PE, as an RPF route. (If a given VRF is part of an MVPN, it may be
1164	   simplest to regard every route exported from that VRF to be a
1165	   potential RPF route.)  If the given VPN-IP route is a potential RPF
1166	   route, then when the VPN-IP route is distributed by BGP, it SHOULD be
1167	   accompanied by a VRF Route Import Extended Community (see [MVPN-
1168	   BGP]).

1170	   The VRF Route Import Extended Community contains an embedded IP
1171	   address.  If a PE advertises a route with a VRF Route Import Extended
1172	   Community, then the PE MUST use the IP address embedded therein
1173	   as its Source IP address in any PIM control messages which it
1174	   transmits to other PEs in the same MVPN.  If a VRF Route Import
1175	   Extended Community is not present, then the source IP address in any
1176	   PIM control messages which it transmits to other PEs in the same MVPN
1177	   MUST be the same as the address carried in the BGP Next Hop of the
1178	   route.

1180	   When a PE has determined that the RPF interface for a particular C-
1181	   address is the MI-PMSI, it must look up the RPF information that was
1182	   distributed along with the VPN-IP address corresponding to that C-
1183	   address.  The IP address in this RPF information will be considered
1184	   to be the IP address of the RPF adjacency for the C-address.

1186	   If the RPF information is not present, but the "BGP Next Hop" for the
1187	   C-address is one of the PEs that is a PIM adjacency over the MI-PMSI,
1188	   then that PE should be treated as the RPF adjacency for that C-
1189	   address.  However, if the MVPN spans multiple Autonomous Systems, the
1190	   BGP Next Hop might not be a PIM adjacency, and if that is the case
1191	   the RPF check will not succeed unless the RPF information is used.
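
   The RPF interface and RPF neighbor rules of this section can be
   summarized by the following non-normative sketch.  The RpfRoute
   record, the MI_PMSI constant, and the function name are hypothetical
   and exist only for illustration.  A neighbor of None means either
   that ordinary PIM procedures apply (not modeled here) or that the
   RPF check cannot succeed.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

MI_PMSI = "MI-PMSI"  # hypothetical name for the MI-PMSI pseudo-interface


@dataclass
class RpfRoute:
    is_vpn_ip: bool                          # VPN-IP route learned from MP-BGP
    outgoing_interface: str
    bgp_next_hop: Optional[str] = None
    rpf_info_address: Optional[str] = None   # address distributed as RPF information


def rpf_lookup(route: RpfRoute, vrf_interfaces: Set[str],
               pim_adjacencies: Set[str]) -> Tuple[str, Optional[str]]:
    """Return (rpf_interface, rpf_neighbor) for the matching RPF route."""
    # Not a VPN-IP route, or it points at a VRF interface: ordinary PIM
    # procedures apply, so no MVPN-specific neighbor is chosen here.
    if not route.is_vpn_ip or route.outgoing_interface in vrf_interfaces:
        return route.outgoing_interface, None
    # Otherwise the RPF interface is the MI-PMSI.  Prefer the address
    # carried in the RPF information distributed with the route.
    if route.rpf_info_address is not None:
        return MI_PMSI, route.rpf_info_address
    # Without RPF information, fall back to the BGP Next Hop, but only
    # if that PE is a PIM adjacency over the MI-PMSI.
    if route.bgp_next_hop in pim_adjacencies:
        return MI_PMSI, route.bgp_next_hop
    # For example, inter-AS case without RPF information.
    return MI_PMSI, None
```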

1193	5.3. Use of BGP for Carrying C-Multicast Routing

1195	   It is possible to use BGP to carry C-multicast routing information
1196	   from PE to PE, dispensing entirely with the transmission of C-
1197	   Join/Prune messages from PE to PE. This section describes the
1198	   procedures for carrying intra-AS multicast routing information.
1199	   Inter-AS procedures are described in section 8.

1201	5.3.1. Sending BGP Updates

1203	   The MCAST-VPN address family is used for this purpose.  MCAST-VPN
1204	   routes used for the purpose of carrying C-multicast routing
1205	   information are distinguished from those used for the purpose of
1206	   carrying auto-discovery information by means of a "route type" field
1207	   which is encoded into the NLRI.  The following information is
1208	   required in BGP to advertise the MVPN routing information.  The NLRI
1209	   contains:

1211	     - The type of C-multicast route.

1213	       There are two types:

1215	         * source tree join

1217	         * shared tree join

1219	     - The RD configured, for the MVPN, on the PE that is advertising
1220	       the information.  This is required to uniquely identify the <C-
1221	       Source, C-Group> as the addresses could overlap between different
1222	       MVPNs.

1224	     - The C-Source address. (Omitted if the route type is "shared tree
1225	       join")

1227	     - The C-Group address.

1229	     - The RD from the VPN-IP route to the C-source.

1231	       That is, the route to the C-source is looked up in the local
1232	       unicast VRF associated with the CE-PE interface over which the
1233	       C-multicast control packet arrived.   The corresponding VPN-IP
1234	       route is then examined, and the RD from that route is placed into
1235	       the C-multicast route.

1237	       Note that this RD is NOT necessarily one which is configured on
1238	       the local PE.  Rather it is one which is configured on the remote
1239	       PE that is on the path to the C-source.
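
   For illustration only, the NLRI fields listed above can be modeled
   as a simple record.  The class and field names are assumptions of
   this sketch, and no wire encoding is implied; the actual encoding is
   a matter for [MVPN-BGP].

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CMcastRouteType(Enum):
    SOURCE_TREE_JOIN = "source tree join"
    SHARED_TREE_JOIN = "shared tree join"


@dataclass(frozen=True)
class CMcastNlri:
    route_type: CMcastRouteType
    local_rd: str             # RD configured for the MVPN on the advertising PE
    c_group: str
    source_route_rd: str      # RD taken from the VPN-IP route to the C-source
    c_source: Optional[str] = None   # omitted for a shared tree join

    def __post_init__(self):
        is_shared = self.route_type is CMcastRouteType.SHARED_TREE_JOIN
        if is_shared and self.c_source is not None:
            raise ValueError("shared tree join omits the C-Source")
        if not is_shared and self.c_source is None:
            raise ValueError("source tree join requires a C-Source")
```

   Note that the upstream multicast hop is deliberately absent from
   this record, since it is carried as an attribute and not in the
   NLRI.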

1241	   The following attribute must also be included:

1243	     - The upstream multicast hop.

1245	       If a PE receives a C-Join (*, G) from a CE, the C-source is
1246	       considered to be the C-RP for the particular C-G.  When the C-
1247	       multicast route represents a "shared tree join", it is presumed
1248	       that the root of the tree (e.g., the RP) is determined by some
1249	       means outside the scope of this specification.

1251	       When the PE processes a C-PIM Join/Prune message, the route to
1252	       the C-source is looked up in the local unicast VRF associated
1253	       with the CE-PE interface over which the C-multicast control
1254	       packet arrived.  The corresponding VPN-IP route is then examined.
1255	       If the AS specified therein is the local AS, or if no AS is
1256	       specified therein, then the PE specified therein becomes the
1257	       upstream multicast hop.  If the AS specified therein is a remote
1258	       AS, the BGP next hop on the route to the  MVPN Auto-Discovery
1259	       route advertised by the remote AS, becomes the upstream multicast
1260	       hop.

1262	       N.B.: It is possible that there is more than one unicast VPN-IP
1263	       route to the C-source.  In this case, the route that was
1264	       installed in the VRF is not necessarily the route that must be
1265	       chosen by the PE.  In order to choose the proper route, the
1266	       procedures described in section 9 MUST be followed.
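
       The choice of upstream multicast hop described above can be
       sketched as follows.  This is a non-normative illustration; the
       function and parameter names are hypothetical, and the selection
       among multiple VPN-IP routes (section 9) is assumed to have
       already taken place.

```python
from typing import Dict, Optional


def upstream_multicast_hop(route_as: Optional[int], route_pe: str,
                           local_as: int,
                           ad_next_hop_by_as: Dict[int, str]) -> str:
    """Choose the upstream multicast hop from the selected VPN-IP route.

    route_as / route_pe: the AS (if any) and PE specified in the route.
    ad_next_hop_by_as: BGP next hop associated with the MVPN
    Auto-Discovery route advertised by each remote AS.
    """
    # Local AS, or no AS specified: the PE named in the route is used.
    if route_as is None or route_as == local_as:
        return route_pe
    # Remote AS: use the next hop associated with that AS's A-D route.
    return ad_next_hop_by_as[route_as]
```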

1268	   The upstream multicast hop is identified in an Extended Communities
1269	   attribute to facilitate the optional use of filters which can prevent
1270	   the distribution of the update to BGP speakers other than the
1271	   upstream multicast hop.

1273	   When a PE distributes this information via BGP, it must include a
1274	   Route Import Extended Communities attribute learned from the RPF
1275	   information.

1277	   Note that for these procedures to work the VPN-IP route MUST contain
1278	   the RPF information.

1280	   Note that there is no C-multicast route corresponding to the PIM
1281	   function of pruning a source off the shared tree when a PE switches
1282	   from a <C-*, C-G> tree to a <C-S, C-G> tree.  Section 9 of this
1283	   document specifies a mandatory procedure that ensures that if any PE
1284	   joins a <C-S, C-G> source tree, all other PEs that have joined or
1285	   will join the <C-*, C-G> shared tree will also join the <C-S, C-G>
1286	   source tree.  This eliminates the need for a C-multicast route that
1287	   prunes C-S off the <C-*, C-G> shared tree when switching from <C-*,
1288	   C-G> to <C-S, C-G> tree.

1290	5.3.2. Explicit Tracking

1292	   Note that the upstream multicast hop is NOT part of the NLRI in the
1293	   C-multicast BGP routes.  This means that if several PEs join the same
1294	   C-tree, the BGP routes they distribute to do so are regarded by BGP
1295	   as comparable routes, and only one will be installed.  If a route
1296	   reflector is being used, this further means that the PE which is used
1297	   to reach the C-source will know only that one or more of the other
1298	   PEs have joined the tree, but it won't know which ones.  That is, this
1299	   BGP update mechanism does not provide "explicit tracking".  Explicit
1300	   tracking is not provided by default because it increases the amount
1301	   of state needed and thus decreases scalability.  Also, as
1302	   constructing the C-PIM messages to send "upstream" for a given tree
1303	   does not depend on knowing all the PEs that are downstream on that
1304	   tree, there is no reason for the C-multicast route type updates to
1305	   provide explicit tracking.
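
   The effect described above can be seen in a toy model in which
   routes are keyed by NLRI alone.  This is only a sketch: the function
   name is hypothetical, and keeping the first route seen is a stand-in
   for BGP route selection, which is not modeled.

```python
def install_routes(updates):
    """Keep one route per NLRI.

    updates: iterable of (advertising_pe, nlri) pairs.
    Returns a dict mapping each NLRI to the single PE whose route
    was kept.
    """
    installed = {}
    for advertising_pe, nlri in updates:
        # Routes with equal NLRI are comparable, so only one survives.
        # "First seen wins" stands in for BGP route selection.
        installed.setdefault(nlri, advertising_pe)
    return installed


# Two PEs join the same C-tree; the NLRI does not identify the joiner.
join = ("source tree join", "65000:1", "10.0.0.5", "232.1.1.1")
table = install_routes([("PE2", join), ("PE3", join)])
```

   In this model the table ends up with a single entry for the C-tree,
   so a PE that sees only the installed route learns that the tree has
   been joined but not by which PEs.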

1307	   There are some cases in which explicit tracking is necessary in order
1308	   for the PEs to set up certain kinds of P-trees.  There are other
1309	   cases in which explicit tracking is desirable in order to determine
1310	   how to optimally aggregate multicast flows onto a given aggregate
1311	   tree.  As these functions have to do with the setting up of
1312	   infrastructure in the P-network, rather than with the dissemination
1313	   of C-multicast routing information, any explicit tracking that is
1314	   necessary is handled by sending the "source active" A-D routes that
1315	   are described in sections 9 and 10.  Detailed procedures for turning
1316	   on explicit tracking can be found in [MVPN-BGP].

1318	5.3.3. Withdrawing BGP Updates

1320	   A PE removes itself from a C-multicast tree (shared or source) by
1321	   withdrawing the corresponding BGP update.

1323	   If a PE has pruned a C-source from a shared C-multicast tree, and it
1324	   needs to "unprune" that source from that tree, it does so by
1325	   withdrawing the route that pruned the source from the tree.

1327	6. I-PMSI Instantiation

1329	   This section describes how tunnels in the SP network can be used to
1330	   instantiate an I-PMSI for an MVPN on a PE.   When C-multicast data is
1331	   delivered on an I-PMSI, the data will go to all PEs that are on the
1332	   path to receivers for that C-group, but may also go to PEs that are
1333	   not on the path to receivers for that C-group.

1335	   The tunnels which instantiate I-PMSIs can be either PE-PE unicast
1336	   tunnels or P-multicast trees. When PE-PE unicast tunnels are used the
1337	   PMSI is said to be instantiated using ingress replication.  The
1338	   instantiation of a tunnel for an I-PMSI is a matter of local policy
1339	   decision and is not mandatory. Even for a site attached to multicast
1340	   sources, transport of customer multicast traffic can be accommodated
1341	   with S-PMSI-bound tunnels only.

1343	   [Editor's Note: MD trees described in [ROSEN-8, MVPN-BASE] are an
1344	   example of P-multicast trees. Also Aggregate Trees described in
1345	   [RAGGARWA-MCAST] are an example of P-multicast trees.]

1347	6.1. MVPN Membership and Egress PE Auto-Discovery

1349	   As described in section 4 a PE discovers the MVPN membership
1350	   information of other PEs using BGP auto-discovery mechanisms or using
1351	   a mechanism that instantiates a MI-PMSI interface. When a PE supports
1352	   only a UI-PMSI service for an MVPN, it MUST rely on the BGP auto-
1353	   discovery mechanisms for discovering this information. This
1354	   information also results in a PE in the sender sites set discovering
1355	   the leaves of the P-multicast tree, which are the egress PEs that
1356	   have sites in the receiver sites set in one or more MVPNs mapped onto
1357	   the tree.

1359	6.1.1. Auto-Discovery for Ingress Replication

1361	   In order for a PE to use Unicast Tunnels to send a C-multicast data
1362	   packet for a particular MVPN to a set of remote PEs, the remote PEs
1363	   must be able to correctly decapsulate such packets and to assign each
1364	   one to the proper MVPN. This requires that the encapsulation used for
1365	   sending packets through the tunnel have demultiplexing information
1366	   which the receiver can associate with a particular MVPN.

1368	   If ingress replication is being used for an MVPN, the PEs announce
1369	   this as part of the BGP based MVPN membership auto-discovery process,
1370	   described in section 4.  The PMSI tunnel attribute specifies ingress
1371	   replication.  The demultiplexor value is a downstream-assigned MPLS
1372	   label (i.e., assigned by the PE that originated the A-D route, to be
1373	   used by other PEs when they send multicast packets on a unicast
1374	   tunnel to that PE).

1376	   Other demultiplexing procedures for unicast are under consideration.

1378	6.1.2. Auto-Discovery for P-Multicast Trees

1380	   A PE announces the P-multicast technology it supports for a specified
1381	   MVPN, as part of the BGP MVPN membership discovery. This allows other
1382	   PEs to determine the P-multicast technology they can use for building
1383	   P-multicast trees to instantiate an I-PMSI. If a PE has a default
1384	   tree instantiation of an I-PMSI, it also announces the tree
1385	   identifier as part of the auto-discovery, as well as announcing its
1386	   aggregation capability.

1388	   The announcement of a tree identifier at discovery time is only
1389	   possible if the tree already exists (e.g., a preconfigured "traffic
1390	   engineered" tunnel), or if the tree can be constructed dynamically
1391	   without any PE having to know in advance all the other PEs on the
1392	   tree (e.g., the tree is created by receiver-initiated joins).

1394	6.2. C-Multicast Routing Information Exchange

1396	   When a PE doesn't support the use of an MI-PMSI for a given MVPN, it
1397	   MUST either unicast MVPN routing information using PIM or else use
1398	   BGP for exchanging the MVPN routing information.

1400	6.3. Aggregation

1402	   A P-multicast tree can be used to instantiate a PMSI service for only
1403	   one MVPN or for more than one MVPN. When a P-multicast tree is shared
1404	   across multiple MVPNs it is termed an Aggregate Tree [RAGGARWA-
1405	   MCAST]. The procedures described in this document allow a single SP
1406	   multicast tree to be shared across multiple MVPNs. The procedures
1407	   that are specific to aggregation are optional and are explicitly
1408	   pointed out. Unless otherwise specified a P-multicast tree technology
1409	   supports aggregation.

1411	   Aggregate Trees allow a single P-multicast tree to be used across
1412	   multiple MVPNs and hence state in the SP core grows per-set-of-MVPNs
1413	   and not per MVPN.  Depending on the congruence of the aggregated
1414	   MVPNs, this may result in trading off optimality of multicast
1415	   routing.

1417	   An Aggregate Tree can be used by a PE to provide a UI-PMSI or MI-
1418	   PMSI service for more than one MVPN. When this is the case the
1419	   Aggregate Tree is said to have an inclusive mapping.

1421	6.3.1. Aggregate Tree Leaf Discovery

1423	   BGP MVPN membership discovery allows a PE to determine the different
1424	   Aggregate Trees that it should create and the MVPNs that should be
1425	   mapped onto each such tree. The leaves of an Aggregate Tree are
1426	   determined by the PEs, supporting aggregation, that belong to all the
1427	   MVPNs that are mapped onto the tree.

1429	   If an Aggregate Tree is used to instantiate one or more S-PMSIs, then
1430	   it may be desirable for the PE at the root of the tree to know which
1431	   PEs (in its MVPN) are receivers on that tree.  This enables the PE to
1432	   decide when to aggregate two S-PMSIs, based on congruence (as
1433	   discussed in the next section).  Thus explicit tracking may be
1434	   required.  Since the procedures for disseminating C-multicast routes
1435	   do not provide explicit tracking, a type of A-D route known as a
1436	   "Leaf A-D Route" is used.  The PE which wants to assign a particular
1437	   C-multicast flow to a particular Aggregate Tree can send an A-D route
1438	   which elicits Leaf A-D routes from the PEs that need to receive that
1439	   C-multicast flow.  This provides the explicit tracking information
1440	   needed to support the aggregation methodology discussed in the next
1441	   section.

1443	6.3.2. Aggregation Methodology

1445	   This document does not specify the mandatory implementation of any
1446	   particular set of rules for determining whether or not the PMSIs of
1447	   two particular MVPNs are to be instantiated by the same Aggregate
1448	   Tree.  This determination can be made by implementation-specific
1449	   heuristics, by configuration, or even perhaps by the use of offline
1450	   tools.

1452	   It is the intention of this document that the control procedures will
1453	   always result in all the PEs of an MVPN agreeing on the PMSIs which
1454	   are to be used and on the tunnels used to instantiate those PMSIs.

1456	   This section discusses potential methodologies with respect to
1457	   aggregation.

1459	   The "congruence" of aggregation is defined by the amount of overlap
1460	   in the leaves of the customer trees that are aggregated on a SP tree.
1461	   For Aggregate Trees with an inclusive mapping the congruence depends
1462	   on the overlap in the membership of the MVPNs that are aggregated on
1463	   the tree. If there is complete overlap i.e. all MVPNs have exactly
1464	   the same sites, aggregation is perfectly congruent. As the overlap
1465	   between the MVPNs that are aggregated reduces, i.e. the number of
1466	   sites that are common across all the MVPNs reduces, the congruence
1467	   reduces.

1469	   If aggregation is done such that it is not perfectly congruent, a PE
1470	   may receive traffic for MVPNs to which it doesn't belong. As the
1471	   amount of multicast traffic in these unwanted MVPNs increases
1472	   aggregation becomes less optimal with respect to delivered traffic.
1473	   Hence there is a tradeoff between reducing state and delivering
1474	   unwanted traffic.

1476	   An implementation should provide knobs to control the congruence of
1477	   aggregation. These knobs are implementation dependent. Configuring
1478	   the percentage of sites that MVPNs must have in common to be
1479	   aggregated is an example of such a knob. This will allow an SP to
1480	   deploy aggregation depending on the MVPN membership and traffic
1481	   profiles in its network.  If different PEs or servers are setting up
1482	   Aggregate Trees this will also allow a service provider to engineer
1483	   the maximum number of unwanted MVPNs that a particular PE may receive
1484	   traffic for.
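
   As an example of such a knob, the sketch below computes one possible
   congruence measure and compares it against a configured threshold.
   This document does not define a formula for congruence; the measure
   used here (sites common to all MVPNs, as a percentage of all sites
   in any of them) and the function names are assumptions made purely
   for illustration.

```python
def congruence_percent(site_sets):
    """Percentage of sites that are common to all the given MVPNs.

    site_sets: one collection of site identifiers per MVPN.
    """
    sets = [set(sites) for sites in site_sets]
    all_sites = set().union(*sets)
    if not sets or not all_sites:
        raise ValueError("need at least one MVPN with at least one site")
    common = set.intersection(*sets)
    return 100.0 * len(common) / len(all_sites)


def may_aggregate(site_sets, min_percent):
    """Apply the configured threshold (the "knob") to the measure."""
    return congruence_percent(site_sets) >= min_percent
```

   With this measure, MVPNs that have exactly the same sites yield 100
   percent, which corresponds to perfectly congruent aggregation.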

1486	6.3.3. Encapsulation of the Aggregate Tree

1488	   An Aggregate Tree may use an IP/GRE encapsulation or an MPLS
1489	   encapsulation.  The protocol type in the IP/GRE header in the former
1490	   case and the protocol type in the data link header in the latter need
1491	   further explanation. This will be specified in a separate document.

1493	6.3.4. Demultiplexing C-multicast traffic

1495	   When multiple MVPNs are aggregated onto one P-Multicast tree,
1496	   determining the tree over which the packet is received is not
1497	   sufficient to determine the MVPN to which the packet belongs.  The
1498	   packet must also carry some demultiplexing information to allow the
1499	   egress PEs to determine the MVPN to which the packet belongs.  Since
1500	   the packet has been multicast through the P network, any given
1501	   demultiplexing value must have the same meaning to all the egress
1502	   PEs.  The demultiplexing value is an MPLS label that corresponds to
1503	   the multicast VRF to which the packet belongs. This label is placed
1504	   by the ingress PE immediately beneath the P-Multicast tree header.
1505	   Each of the egress PEs must be able to associate this MPLS label with
1506	   the same MVPN.  If downstream label assignment were used this would
1507	   require all the egress PEs in the MVPN to agree on a common label for
1508	   the MVPN. Instead the MPLS label is upstream assigned [MPLS-
1509	   UPSTREAM-LABEL]. The label bindings are advertised via BGP updates
1510	   originated by the ingress PEs.

1512	   This procedure requires each egress PE to support a separate label
1513	   space for every other PE. The egress PEs create a forwarding entry
1514	   for the upstream assigned MPLS label, allocated by the ingress PE, in
1515	   this label space. Hence when the egress PE receives a packet over an
1516	   Aggregate Tree, it first determines the tree that the packet was
1517	   received over. The tree identifier determines the label space in
1518	   which the upstream assigned MPLS label lookup has to be performed.
1519	   The same label space may be used for all P-multicast trees rooted at
1520	   the same ingress PE, or an implementation may decide to use a
1521	   separate label space for every P-multicast tree.
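
   The two-step lookup described above (tree, then label within the
   label space selected by the tree) can be sketched as follows.  The
   class and method names are hypothetical, and the sketch uses one
   label space per ingress PE, which is only one of the two options
   mentioned above.

```python
class EgressPe:
    """Toy model of upstream-assigned label lookup at an egress PE."""

    def __init__(self):
        self.tree_to_space = {}   # tree identifier -> label space key
        self.label_spaces = {}    # label space key -> {label: MVPN}

    def learn_binding(self, tree_id, ingress_pe, label, mvpn):
        # Binding advertised by the ingress PE.  The label space is
        # keyed by the ingress PE, so labels need only be unique per
        # ingress PE, not network-wide.
        self.tree_to_space[tree_id] = ingress_pe
        self.label_spaces.setdefault(ingress_pe, {})[label] = mvpn

    def demultiplex(self, tree_id, inner_label):
        # Step 1: the tree the packet arrived on selects the label space.
        space = self.tree_to_space[tree_id]
        # Step 2: the upstream-assigned label is looked up in that space.
        return self.label_spaces[space][inner_label]


pe = EgressPe()
pe.learn_binding("tree-A", "PE1", 100, "mvpn-blue")
pe.learn_binding("tree-B", "PE2", 100, "mvpn-red")
```

   Here two ingress PEs have independently chosen the same label value,
   and the egress PE still resolves each packet to the right MVPN
   because the lookup is scoped by the tree on which it arrived.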

1523	   The encapsulation format is either MPLS or MPLS-in-something (e.g.
1524	   MPLS-in-GRE [MPLS-IP]). When MPLS is used, this label will appear
1525	   immediately below the label that identifies the P-multicast tree.
1526	   When MPLS-in-GRE is used, this label will be the top MPLS label that
1527	   appears when the GRE header is stripped off.

1529	   When IP encapsulation is used for the P-multicast Tree, whatever
1530	   information that particular encapsulation format uses for identifying
1531	   a particular tunnel is used to determine the label space in which the
1532	   MPLS label is looked up.

1534	   If the P-multicast tree uses MPLS encapsulation, the P-multicast tree
1535	   is itself identified by an MPLS label.  The egress PE MUST NOT
1536	   advertise IMPLICIT NULL or EXPLICIT NULL for that tree.  Once the
1537	   label representing the tree is popped off the MPLS label stack, the
1538	   next label is the demultiplexing information that allows the proper
1539	   MVPN to be determined.

1541	   This specification requires that, to support this sort of
1542	   aggregation, there be at least one upstream-assigned label per MVPN.
1543	   It does not require that there be only one.  For example, an ingress
1544	   PE could assign a unique label to each C-(S,G).  (This could be done
1545	   using the same technique that is used to assign a particular C-(S,G)
1546	   to an S-PMSI, see section 7.3.)

1548	6.4. Mapping Received Packets to MVPNs

1550	   When an egress PE receives a C-multicast data packet over a P-
1551	   multicast tree, it needs to forward the packet to the CEs that have
1552	   receivers in the packet's C-multicast group. It also needs to
1553	   determine the RPF interface for the C-multicast data packet. In order
1554	   to do this the egress PE needs to determine the tunnel that the
1555	   packet was received on. The PE can then determine the MVPN that the
1556	   packet belongs to and if needed do any further lookups that are
1557	   needed to forward the packet.

1559	6.4.1. Unicast Tunnels

1561	   When ingress replication is used, the MVPN to which the received C-
1562	   multicast data packet belongs can be determined by the MPLS label
1563	   that was allocated by the egress. This label is distributed by the
1564	   egress.  This also determines the RPF interface for the C-multicast
1565	   data packet.

1567	6.4.2. Non-Aggregated P-Multicast Trees

1569	   If a P-multicast tree is associated with only one MVPN, determining
1570	   the P-multicast tree on which a packet was received is sufficient to
1571	   determine the packet's MVPN. All that the egress PE needs to know is
1572	   the MVPN the P-multicast tree is associated with.

1574	   There are different ways in which the egress PE can learn this
1575	   association:

1577	      a) Configuration. The P-multicast tree that a particular MVPN
1578	         belongs to is configured on each PE.

1580	         [Editor's Note: PIM-SM Default MD trees in [ROSEN-8] and
1581	         [MVPN-BASE] are examples of configuring the P-multicast tree
1582	         and MVPN association]

1584	      b) BGP based advertisement of the P-multicast tree - MVPN mapping
1585	         after the root of the tree discovers the leaves of the tree.
1586	         The root of the tree sets up the tree after discovering each of
1587	         the PEs that belong to the MVPN.  It then advertises the P-
1588	         multicast tree - MVPN mapping to each of the leaves.  This
1589	         mechanism can be used with both source initiated trees [e.g.
1590	         RSVP-TE P2MP LSPs] and receiver initiated trees [e.g. PIM
1591	         trees].

1593	         [Editor's Note: Aggregate tree advertisements in [RAGGARWA-
1594	         MCAST] are examples of this.]

1596	      c) BGP based advertisement of the P-multicast tree - MVPN mapping
1597	         as part of the MVPN membership discovery. The root of the tree
1598	         advertises, to each of the other PEs that belong to the MVPN,
1599	         the P-multicast tree that the MVPN is associated with. This
1600	         implies that the root doesn't need to know the leaves of the
1601	         tree beforehand. This is possible only for receiver initiated
1602	         trees e.g. PIM based trees.

1604	         [Editor's Note: PIM-SSM discovery in [ROSEN-8] is an example of
1605	         the above]

1607	   Options (b) and (c) require the BGP based advertisement to contain
1608	   the P-multicast tree identifier. This identifier is encoded as a BGP
1609	   attribute and contains the following elements:

1611	     - Tunnel Type.

1613	     - Tunnel identifier. The semantics of the identifier is determined
1614	       by the tunnel type.

1616	6.4.3. Aggregate P-Multicast Trees

1618	   Once a PE sets up an Aggregate Tree it needs to announce the C-
1619	   multicast groups being mapped to this tree to other PEs in the
1620	   network. This procedure is referred to as Aggregate Tree discovery.
1621	   For an Aggregate Tree with an inclusive mapping this discovery
1622	   implies announcing:

1624	     - The mapping of all MVPNs mapped to the Tree.

1626	     - For each MVPN mapped onto the tree the inner label allocated for
1627	       it by the ingress PE. The use of this label is explained in the
1628	       demultiplexing procedures of section 6.3.4.

1630	     - The P-multicast tree Identifier.

1632	   The egress PE creates a logical interface corresponding to the tree
1633	   identifier. This interface is the RPF interface for all the <C-
1634	   Source, C-Group> entries mapped to that tree.

1636	   When PIM is used to set up P-multicast trees, the egress PE also
1637	   joins the P-Group Address corresponding to the tree. This results in
1638	   the setup of the PIM P-multicast tree.

1640	6.5. I-PMSI Instantiation Using Ingress Replication

1642	   As described in section 3 a PMSI can be instantiated using Unicast
1643	   Tunnels between the PEs that are participating in the MVPN. In this
1644	   mechanism the ingress PE replicates a C-multicast data packet
1645	   belonging to a particular MVPN and sends a copy to all or a subset of
1646	   the PEs that belong to the MVPN. A copy of the packet is tunneled to
1647	   a remote PE over a Unicast Tunnel to the remote PE. IP/GRE Tunnels
1648	   or MPLS LSPs are examples of unicast tunnels that may be used. Note
1649	   that the same Unicast Tunnel can be used to transport packets
1650	   belonging to different MVPNs.

1652	   Ingress replication can be used to instantiate a UI-PMSI. The PE sets
1653	   up unicast tunnels to each of the remote PEs that support ingress
1654	   replication. For a given MVPN all C-multicast data packets are sent
1655	   to each of the remote PEs in the MVPN that support ingress
1656	   replication. Hence a remote PE may receive C-multicast data packets
1657	   for a group even if it doesn't have any receivers in that group.
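
   A minimal, non-normative sketch of ingress replication for a UI-PMSI
   follows.  It assumes that each remote PE has announced, via the
   auto-discovery procedures of section 6.1.1, a downstream-assigned
   label per MVPN; the function name and the dictionary layout are
   hypothetical.

```python
def ingress_replicate(payload, mvpn, labels_by_pe):
    """Produce one copy of a C-multicast packet per remote PE.

    labels_by_pe: {remote_pe: {mvpn: label}}, where each label was
    assigned by that remote PE (downstream assignment).
    Returns a list of (remote_pe, label, payload) tuples, one per
    unicast tunnel on which a copy would be sent.
    """
    copies = []
    for remote_pe in sorted(labels_by_pe):
        label = labels_by_pe[remote_pe].get(mvpn)
        if label is None:
            continue  # this PE announced no label for this MVPN
        # Every such PE gets a copy, whether or not it has receivers.
        copies.append((remote_pe, label, payload))
    return copies
```

   The number of copies grows with the number of remote PEs in the
   MVPN, which is the cost that ingress replication trades against not
   needing P-multicast state in the core.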

1659	   Ingress replication can also be used to instantiate an MI-PMSI. In
1660	   this case each PE has a mesh of unicast tunnels to every other PE in
1661	   that MVPN.

1663	   However when ingress replication is used it is recommended that only
1664	   S-PMSIs be used. Instantiation of S-PMSIs with ingress replication is
1665	   described in section 7.2.  Note that this requires the use of
1666	   explicit tracking, i.e., a PE must know which of the other PEs have
1667	   receivers for each C-multicast tree.

1669	6.6. Establishing P-Multicast Trees

1671	   It is believed that the architecture outlined in this document places
1672	   no limitations on the protocols used to instantiate P-multicast
1673	   trees. However, the only protocols being explicitly considered are
1674	   PIM-SM, PIM-SSM, PIM-Bidir, RSVP-TE, and mLDP.

1676	   A P-multicast tree can be either a source tree or a shared tree. A
1677	   source tree is used to carry traffic only for the multicast VRFs that
1678	   exist locally on the root of the tree i.e. for which the root has
1679	   local CEs. The root is a PE router. Source P-multicast trees can be
1680	   instantiated using PIM-SM, PIM-SSM, RSVP-TE P2MP LSPs, and mLDP P2MP
1681	   LSPs.

1683	   A shared tree on the other hand can be used to carry traffic
1684	   belonging to VRFs that exist on other PEs as well. The root of a
1685	   shared tree is not necessarily one of the PEs in the MVPN. All PEs
1686	   that use the shared tree will send MVPN data packets to the root of
1687	   the shared tree; if PIM is being used as the control protocol, PIM
1688	   control packets also get sent to the root of the shared tree.  This
1689	   may require a unicast tunnel between each of these PEs and the root.
1690	   The root will then send them on the shared tree and all the PEs that
1691	   are leaves of the shared tree will receive the packets. For example,
1692	   an RP-based PIM-SM tree would be a shared tree. Shared trees can be
1693	   instantiated using PIM-SM, PIM-SSM, PIM-Bidir, RSVP-TE P2MP LSPs,
1694	   mLDP P2MP LSPs, and mLDP MP2MP LSPs. Aggregation support for
1695	   bidirectional P-trees (i.e., PIM-Bidir trees or mLDP MP2MP trees) is
1696	   for further study. Shared trees require all the PEs to discover the
1697	   root of the shared tree for an MVPN. To achieve this, the root of a
1698	   shared tree advertises as part of the BGP based MVPN membership
1699	   discovery:

1701	     - The capability to set up a shared tree for a specified MVPN.

1703	     - A downstream assigned label that is to be used by each PE to
1704	       encapsulate an MVPN data packet when it sends this packet to the
1705	       root of the shared tree.

1707	     - A downstream assigned label that is to be used by each PE to
1708	       encapsulate an MVPN control packet when it sends this packet to
1709	       the root of the shared tree.

1711	   Both a source tree and a shared tree can be used to instantiate an
1712	   I-PMSI.  If a source tree is used to instantiate a UI-PMSI for an
1713	   MVPN, all the other PEs that belong to the MVPN must be leaves of
1714	   the source tree. If a shared tree is used to instantiate a UI-PMSI
1715	   for an MVPN, all the PEs that are members of the MVPN must be leaves
1716	   of the shared tree.

1718	6.7. RSVP-TE P2MP LSPs

1720	   This section describes procedures that are specific to the usage of
1721	   RSVP-TE P2MP LSPs for instantiating a UI-PMSI. The RSVP-TE P2MP LSP
1722	   can be either a source tree or a shared tree. Procedures in [RSVP-
1723	   P2MP] are used to signal the LSP. The LSP is signaled after the root
1724	   of the LSP discovers the leaves. The egress PEs are discovered using
1725	   the MVPN membership procedures described in section 4. RSVP-TE P2MP
1726	   LSPs can optionally support aggregation.

1728	6.7.1. P2MP TE LSP Tunnel - MVPN Mapping

1730	   P2MP TE LSP Tunnel to MVPN mapping can be learned at the egress PEs
1731	   using either option (a) or option (b) described in section 6.4.2.
1732	   Option (b), i.e., BGP-based advertisement of the P2MP TE LSP Tunnel -
1733	   MVPN mapping, requires that the root of the tree include the P2MP TE
1734	   LSP Tunnel identifier as the tunnel identifier in the BGP
1735	   advertisements. This identifier contains the following information
1736	   elements:

1738	     - The type of the tunnel is set to RSVP-TE P2MP Tunnel

1740	     - RSVP-TE P2MP Tunnel's SESSION Object

1742	     - Optionally RSVP-TE P2MP LSP's SENDER_TEMPLATE Object. This object
1743	       is included when it is desired to identify a particular P2MP TE
1744	       LSP.

1746	6.7.2. Demultiplexing C-Multicast Data Packets

1748	   Demultiplexing the C-multicast data packets at the egress PE follows
1749	   procedures described in section 6.3.4. The RSVP-TE P2MP LSP Tunnel
1750	   must be signaled with penultimate-hop-popping (PHP) off. Signaling
1751	   the P2MP TE LSP Tunnel with PHP off requires an extension to RSVP-TE
1752	   which will be described later.

1754	7. Optimizing Multicast Distribution via S-PMSIs

1756	   Whenever a particular multicast stream is being sent on an I-PMSI, it
1757	   is likely that the data of that stream is being sent to PEs that do
1758	   not require it.  If a particular stream has a significant amount of
1759	   traffic, it may be beneficial to move it to an S-PMSI which includes
1760	   only those PEs that are transmitters and/or receivers (or at least
1761	   includes fewer PEs that are neither).

1763	   If explicit tracking is being done, S-PMSI creation can also be
1764	   triggered on other criteria.  For instance, there could be a "pseudo
1765	   wasted bandwidth" criterion: switching to an S-PMSI would be done if
1766	   the bandwidth multiplied by the number of uninterested PEs (PEs that
1767	   are receiving the stream but have no receivers) is above a specified
1768	   threshold. The motivation is that (a) the total bandwidth wasted by
1769	   many sparsely subscribed low-bandwidth groups may be large, and (b)
1770	   there's no point to moving a high-bandwidth group to an S-PMSI if all
1771	   the PEs have receivers for it.
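
The "pseudo wasted bandwidth" criterion can be illustrated with a minimal Python sketch. The function names, the units (bits per second), and the way the sets of PEs are obtained are assumptions made for illustration; the document itself only states that the product of bandwidth and uninterested-PE count is compared against a threshold.

```python
def pseudo_wasted_bandwidth(stream_bps, receiving_pes, interested_pes):
    """Stream bandwidth multiplied by the number of uninterested PEs.

    An uninterested PE receives the stream but has no receivers for it.
    """
    uninterested = set(receiving_pes) - set(interested_pes)
    return stream_bps * len(uninterested)


def should_switch_to_spmsi(stream_bps, receiving_pes, interested_pes,
                           threshold_bps):
    """True when the pseudo wasted bandwidth is above the threshold."""
    wasted = pseudo_wasted_bandwidth(stream_bps, receiving_pes, interested_pes)
    return wasted > threshold_bps
```

Under this sketch, a low-bandwidth stream delivered to many uninterested PEs can exceed the threshold, while a high-bandwidth stream that every PE wants never does, which matches motivations (a) and (b) above.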

1773	   Switching a (C-S, C-G) stream to an S-PMSI may require the root of
1774	   the S-PMSI to determine the egress PEs that need to receive the (C-S,
1775	   C-G) traffic.  This is true in the following cases:

1777	     - If the tunnel is a source initiated tree, such as an RSVP-TE P2MP
1778	       Tunnel, the PE needs to know the leaves of the tree before it can
1779	       instantiate the S-PMSI.

1781	     - If a PE instantiates multiple S-PMSIs, belonging to different
1782	       MVPNs, using one P-multicast tree, such a tree is termed an
1783	       Aggregate Tree with a selective mapping. The setting up of such
1784	       an Aggregate Tree requires the ingress PE to know all the other
1785	       PEs that have receivers for multicast groups that are mapped onto
1786	       the tree.

1788	   The above two cases require that explicit tracking be done for the
1789	   (C-S, C-G) stream.  The root of the S-PMSI MAY decide to do explicit
1790	   tracking of this stream only after it has determined to move the
1791	   stream to an S-PMSI, or it MAY have been doing explicit tracking all
1792	   along.

1794	   If the S-PMSI is instantiated by a P-multicast tree, the PE at the
1795	   root of the tree must signal the leaves of the tree that the (C-S,
1796	   C-G) stream is now bound to the S-PMSI. Note that the PE could
1797	   create the identity of the P-multicast tree prior to the actual
1798	   instantiation of the tunnel.

1800	   If the S-PMSI is instantiated by a source-initiated P-multicast tree
1801	   (e.g., an RSVP-TE P2MP tunnel), the PE at the root of the tree must
1802	   establish the source-initiated P-multicast tree to the leaves.  This
1803	   tree MAY have been established before the leaves receive the S-PMSI
1804	   binding, or MAY be established after the leaves receive the binding.
1805	   The leaves MUST NOT switch to the S-PMSI until they receive both the
1806	   binding and the tree signaling message.
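
The rule that a leaf waits for both events, in either order, can be sketched as a small piece of per-stream state. This is an illustrative assumption about how an implementation might track readiness; the class and attribute names are hypothetical and do not come from this document.

```python
from dataclasses import dataclass


@dataclass
class LeafState:
    """Per-(C-S, C-G) readiness of a leaf PE for a source-initiated tree."""
    binding_received: bool = False   # S-PMSI binding has arrived
    tree_signaled: bool = False      # tree signaling message has arrived

    def may_switch(self) -> bool:
        # The leaf switches only once both events have occurred,
        # regardless of the order in which they arrived.
        return self.binding_received and self.tree_signaled
```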

1808	7.1. S-PMSI Instantiation Using Ingress Replication

1810	   As described in section 6.1.1, ingress replication can be used to
1811	   instantiate a UI-PMSI. However this can result in a PE receiving
1812	   packets for a multicast group for which it doesn't have any
1813	   receivers. This can be avoided if the ingress PE tracks the remote
1814	   PEs which have receivers in a particular C-multicast group.  In order
1815	   to do this it needs to receive C-Joins from each of the remote PEs.
1816	   It then replicates the C-multicast data packet and sends it to only
1817	   those egress PEs which are on the path to a receiver of that C-group.
1818	   It is possible that each PE that is using ingress replication
1819	   instantiates only S-PMSIs. It is also possible that some PEs
1820	   instantiate UI-PMSIs while others instantiate only S-PMSIs. In both
1821	   these cases the PE MUST either unicast MVPN routing information using
1822	   PIM or use BGP for exchanging the MVPN routing information. This is
1823	   because there may be no MI-PMSI available for it to exchange MVPN
1824	   routing information.
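
A minimal sketch of the tracking described above follows. It assumes the ingress PE keeps, per C-group, the set of remote PEs from which it has received C-Joins, and replicates only to those PEs. The class, its method names, and the use of strings for PE identities are hypothetical.

```python
from collections import defaultdict


class IngressReplicator:
    """Tracks which egress PEs have joined each C-group (illustrative)."""

    def __init__(self):
        self._interested = defaultdict(set)  # C-group -> set of egress PEs

    def on_c_join(self, c_group, egress_pe):
        self._interested[c_group].add(egress_pe)

    def on_c_prune(self, c_group, egress_pe):
        self._interested[c_group].discard(egress_pe)

    def replicate(self, c_group, packet):
        """Return one (egress PE, packet) pair per interested PE."""
        pes = sorted(self._interested.get(c_group, ()))
        return [(pe, packet) for pe in pes]
```

A PE that never sent a C-Join for the group does not appear in the result, so it receives no copy of the packet.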

1826	   Note that the use of ingress replication doesn't require any extra
1827	   procedures for signaling the binding of the S-PMSI from the ingress
1828	   PE to the egress PEs.  The procedures described for I-PMSIs are
1829	   sufficient.

1831	7.2. Protocol for Switching to S-PMSIs

1833	   We describe two protocols for switching to S-PMSIs.  These protocols
1834	   can be used when the tunnel that instantiates the S-PMSI is a P-
1835	   multicast tree.

1837	7.2.1. A UDP-based Protocol for Switching to S-PMSIs

1839	   This procedure can be used for any MVPN which has an MI-PMSI.
1840	   Traffic from all multicast streams in a given MVPN is sent, by
1841	   default, on the MI-PMSI.  Consider a single multicast stream within a
1842	   given MVPN, and consider a PE which is attached to a source of
1843	   multicast traffic for that stream.  The PE can be configured to move
1844	   the stream from the MI-PMSI to an S-PMSI if certain configurable
1845	   conditions are met.  To do this, it needs to inform all the PEs which
1846	   attach to receivers for the stream.  These PEs need to start listening
1847	   for traffic on the S-PMSI, and the transmitting PE may start sending
1848	   traffic on the S-PMSI when it is reasonably certain that all
1849	   receiving PEs are listening on the S-PMSI.

1851	7.2.1.1. Binding a Stream to an S-PMSI

1853	   When a PE which attaches to a transmitter for a particular multicast
1854	   stream notices that the conditions for moving the stream to an S-PMSI
1855	   are met, it begins to periodically send an "S-PMSI Join Message" on
1856	   the MI-PMSI.  The S-PMSI Join is a UDP-encapsulated message whose
1857	   destination address is ALL-PIM-ROUTERS (224.0.0.13), and whose
1858	   destination port is 3232.

1860	   The S-PMSI Join Message contains the following information:

1862	     - An identifier for the particular multicast stream which is to be
1863	       bound to the S-PMSI.   This can be represented as an (S,G) pair.

1865	     - An identifier for the particular S-PMSI to which the stream is to
1866	       be bound.  This identifier is a structured field which includes
1867	       the following information:

1869	         * The type of tunnel used to instantiate the S-PMSI
1870	         * An identifier for the tunnel.  The form of the identifier
1871	           will depend upon the tunnel type.  The combination of tunnel
1872	           identifier and tunnel type should contain enough information
1873	           to enable all the PEs to "join" the tunnel and receive
1874	           messages from it.

1876	         * Any demultiplexing information needed by the tunnel
1877	           encapsulation protocol to identify the particular S-PMSI.
1878	           This allows a single tunnel to aggregate multiple S-PMSIs.
1879	           If a particular tunnel is not aggregating multiple S-PMSIs,
1880	           then no demultiplexing information is needed.

1882	   A PE router which is not connected to a receiver will still receive
1883	   the S-PMSI Joins, and MAY cache the information contained therein.
1884	   Then if the PE later finds that it is attached to a receiver, it can
1885	   immediately start listening to the S-PMSI.

1887	   Upon receiving the S-PMSI Join, PE routers connected to receivers for
1888	   the specified stream will take whatever action is necessary to start
1889	   receiving multicast data packets on the S-PMSI.  The precise action
1890	   taken will depend upon the tunnel type.

1892	   After a configurable delay, the PE router which is sending the S-PMSI
1893	   Joins will start transmitting the stream's data packets on the S-
1894	   PMSI.

1896	   When the pre-configured conditions are no longer met for a particular
1897	   stream, e.g. the traffic stops, the PE router connected to the source
1898	   stops announcing S-PMSI Joins for that stream.  Any PE that does not
1899	   receive, over a configurable interval, an S-PMSI Join for a
1900	   particular stream will stop listening to the S-PMSI.

1902	7.2.1.2. Packet Formats and Constants

1904	   The S-PMSI Join message is encapsulated within UDP, and has the
1905	   following type/length/value (TLV) encoding:

1907	        0                   1                   2                   3
1908	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1909	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1910	       |     Type      |            Length             |     Value     |
1911	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1912	       |                               .                               |
1913	       |                               .                               |
1914	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1916	   Type (8 bits)

1918	   Length (16 bits): the total number of octets in the Type, Length, and
1919	   Value fields combined

1921	   Value (variable length)

1923	   Currently only one type of S-PMSI Join is defined.  A type 1 S-PMSI
1924	   Join is used when the S-PMSI tunnel is a PIM tunnel which is used to
1925	   carry a single multicast stream, where the packets of that stream
1926	   have IPv4 source and destination IP addresses.

1928	        0                   1                   2                   3
1929	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1930	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1931	       |     Type      |            Length             |    Reserved   |
1932	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1933	       |                           C-source                            |
1934	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1935	       |                           C-group                             |
1936	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1937	       |                           P-group                             |
1938	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1940	   Type (8 bits): 1

1942	   Length (16 bits): 16

1944	   Reserved (8 bits):  This field SHOULD be zero when transmitted, and
1945	   MUST be ignored when received.

1947	   C-Source (32 bits): the IPv4 address of the traffic source in the
1948	   VPN.

1950	   C-Group (32 bits): the IPv4 address of the multicast traffic
1951	   destination address in the VPN.

1953	   P-Group (32 bits): the IPv4 group address that the PE router is going
1954	   to use to encapsulate the flow (C-Source, C-Group).

1956	   The P-group identifies the S-PMSI tunnel, and the (C-S, C-G)
1957	   identifies the multicast flow that is carried in the tunnel.
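
The type 1 layout above can be sketched with Python's standard struct and ipaddress modules. The function and constant names are hypothetical; the field widths, the Length value of 16, and the handling of the Reserved field follow the description above. Sending the result over UDP to 224.0.0.13, port 3232, is left out of the sketch.

```python
import ipaddress
import struct

SPMSI_JOIN_TYPE = 1      # type 1: PIM tunnel carrying one IPv4 stream
SPMSI_JOIN_LENGTH = 16   # octets in Type, Length, and Value combined

# Network byte order: Type (8), Length (16), Reserved (8), then three
# 32-bit IPv4 addresses: C-source, C-group, P-group.
_LAYOUT = struct.Struct("!BHB4s4s4s")


def encode_spmsi_join(c_source, c_group, p_group):
    """Build the 16-octet type 1 S-PMSI Join payload."""
    return _LAYOUT.pack(
        SPMSI_JOIN_TYPE,
        SPMSI_JOIN_LENGTH,
        0,  # Reserved: zero when transmitted
        ipaddress.IPv4Address(c_source).packed,
        ipaddress.IPv4Address(c_group).packed,
        ipaddress.IPv4Address(p_group).packed,
    )


def decode_spmsi_join(data):
    """Return (C-source, C-group, P-group) as dotted-quad strings."""
    if len(data) != _LAYOUT.size:
        raise ValueError("type 1 S-PMSI Join must be 16 octets")
    msg_type, length, _reserved, cs, cg, pg = _LAYOUT.unpack(data)
    if msg_type != SPMSI_JOIN_TYPE or length != SPMSI_JOIN_LENGTH:
        raise ValueError("not a type 1 S-PMSI Join")
    # The Reserved field is ignored on receipt.
    return tuple(str(ipaddress.IPv4Address(raw)) for raw in (cs, cg, pg))
```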

1959	   The protocol uses the following constants.

1961	   [S-PMSI_DELAY]:

1963	       the PE router which is to transmit onto the S-PMSI will delay
1964	       this amount of time before it begins using the S-PMSI.  The
1965	       default value is 3 seconds.

1967	   [S-PMSI_TIMEOUT]:

1969	       if a PE (other than the transmitter) does not receive any packets
1970	       over the S-PMSI tunnel for this amount of time, the PE will prune
1971	       itself from the S-PMSI tunnel, and will expect (C-S, C-G) packets
1972	       to arrive on an I-PMSI.  The default value is 3 minutes.  This
1973	       value must be consistent among PE routers.

1975	   [S-PMSI_HOLDOWN]:

1977	       if the PE that transmits onto the S-PMSI does not see any (C-S,
1978	       C-G) packets for this amount of time, it will resume sending (C-
1979	       S, C-G) packets on an I-PMSI.

1981	       This is used to avoid oscillation when traffic is bursty.  The
1982	       default value is 1 minute.

1984	   [S-PMSI_INTERVAL]:
1985	       the interval the transmitting PE router uses to periodically send
1986	       the S-PMSI Join message.  The default value is 60 seconds.
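
The constants above can be written down as a small sketch of the timer checks they imply. The default values are taken from the text; the function names and the representation of time as seconds in floating point are assumptions, and a real implementation would drive these checks from its own timer machinery.

```python
# Default values from the text, expressed in seconds.
SPMSI_DELAY = 3.0
SPMSI_TIMEOUT = 180.0
SPMSI_HOLDOWN = 60.0
SPMSI_INTERVAL = 60.0


def transmitter_may_start(first_join_sent_at, now):
    """Transmitter waits S-PMSI_DELAY before it begins using the S-PMSI."""
    return now - first_join_sent_at >= SPMSI_DELAY


def receiver_should_prune(last_spmsi_packet_at, now):
    """Receiver prunes itself after S-PMSI_TIMEOUT without packets."""
    return now - last_spmsi_packet_at >= SPMSI_TIMEOUT


def transmitter_should_revert(last_stream_packet_at, now):
    """Transmitter resumes the I-PMSI after S-PMSI_HOLDOWN of silence."""
    return now - last_stream_packet_at >= SPMSI_HOLDOWN
```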

1988	7.2.2. A BGP-based Protocol for Switching to S-PMSIs

1990	   This procedure can be used for an MVPN that is using either a UI-PMSI
1991	   or an MI-PMSI. Consider a single multicast stream for a C-(S, G)
1992	   within a given MVPN, and consider a PE which is attached to a source
1993	   of multicast traffic for that stream. The PE can be configured to
1994	   move the stream from the MI-PMSI or UI-PMSI to an S-PMSI if certain
1995	   configurable conditions are met. Once a PE decides to move the C-(S,
1996	   G) for a given MVPN to an S-PMSI, it needs to instantiate the S-PMSI
1997	   using a tunnel and announce the binding of the S-PMSI to the C-(S,
1998	   G) to all the egress PEs that are on the path to receivers of the
1999	   C-(S, G). The announcement is done using BGP.  Depending on the
2000	   tunneling technology used, this announcement may be done before or
2001	   after setting up the tunnel. The source and egress PEs have to switch
2002	   to using the S-PMSI for the C-(S, G).

2004	7.2.2.1. Advertising C-(S, G) Binding to an S-PMSI using BGP

2006	   The ingress PE informs all the PEs that are on the path to receivers
2007	   of the C-(S, G) of the binding of the S-PMSI to the C-(S, G). The BGP
2008	   announcement is done by sending an update for the MCAST-VPN address
2009	   family.  An A-D route is used, containing the following information:

2011	      a) IP address of the originating PE

2013	      b) The RD configured locally for the MVPN. This is required to
2014	         uniquely identify the <C-Source, C-Group> as the addresses
2015	         could overlap between different MVPNs.  This is the same RD
2016	         value used in the auto-discovery process.

2018	      c) The C-Source address. This address can be a prefix in order to
2019	         allow a range of C-Source addresses to be mapped to an
2020	         Aggregate Tree.

2022	      d) The C-Group address. This address can be a range in order to
2023	         allow a range of C-Group addresses to be mapped to an Aggregate
2024	         Tree.

2026	      e) A PE MAY aggregate two or more S-PMSIs originated by the PE
2027	         onto the same P-Multicast tree. If the PE already advertises
2028	         S-PMSI auto-discovery routes for these S-PMSIs, then
2029	         aggregation requires the PE to re-advertise these routes. The
2030	         re-advertised routes MUST be the same as the original ones,
2031	         except for the PMSI tunnel attribute. If the PE has not
2032	         previously advertised S-PMSI auto-discovery routes for these
2033	         S-PMSIs, then the aggregation requires the PE to advertise
2034	         (new) S-PMSI auto-discovery routes for these S-PMSIs.  The PMSI
2035	         Tunnel attribute in the newly advertised/re-advertised routes
2036	         MUST carry the identity of the P-Multicast tree that
2037	         aggregates the S-PMSIs. If at least some of the S-PMSIs
2038	         aggregated onto the same P-Multicast tree belong to different
2039	         MVPNs, then all these routes MUST carry an MPLS upstream
2040	         assigned label [MPLS-UPSTREAM-LABEL, section 6.3.4].  If all
2041	         these aggregated S-PMSIs belong to the same MVPN, then the
2042	         routes MAY carry an MPLS upstream assigned label [MPLS-
2043	         UPSTREAM-LABEL].  The labels MUST be distinct on a per MVPN
2044	         basis, and MAY be distinct on a per route basis.
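
The label rules in item e) can be sketched as a consistency check over the routes aggregated onto one P-Multicast tree. This reflects one reading of the text, namely that "distinct on a per MVPN basis" means no label value is shared by two different MVPNs; the function name and the (MVPN, label) tuple representation are hypothetical.

```python
def check_aggregate_labels(routes):
    """Check (mvpn, label) pairs for S-PMSIs aggregated on one tree.

    `label` is None when a route carries no upstream-assigned label.
    Returns a list of human-readable problems (empty if consistent).
    """
    problems = []
    mvpns = {mvpn for mvpn, _label in routes}
    # S-PMSIs of different MVPNs on one tree: every route needs a label.
    if len(mvpns) > 1 and any(label is None for _mvpn, label in routes):
        problems.append("label missing on a tree shared by several MVPNs")
    # A given label value must not be used by more than one MVPN.
    owner = {}
    for mvpn, label in routes:
        if label is None:
            continue
        if owner.setdefault(label, mvpn) != mvpn:
            problems.append(f"label {label} is used by more than one MVPN")
    return problems
```

Routes of the same MVPN may reuse one label or use one label per route; both pass this check, since per-route distinctness is optional.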

2046	   When a PE distributes this information via BGP, it must include the
2047	   following:

2049	      1. An identifier for the particular S-PMSI to which the stream is
2050	         to be bound.  This identifier is a structured field which
2051	         includes the following information:

2053	           * The type of tunnel used to instantiate the S-PMSI

2055	           * An identifier for the tunnel.  The form of the identifier
2056	             will depend upon the tunnel type.  The combination of
2057	             tunnel identifier and tunnel type should contain enough
2058	             information to enable all the PEs to "join" the tunnel and
2059	             receive messages from it.

2061	      2. Route Target Extended Communities attribute. This is used as
2062	         described in section 4.

2064	7.2.2.2. Explicit Tracking

2066	   If the PE wants to enable explicit tracking for the specified flow,
2067	   it also indicates this in the A-D route it uses to bind the flow to a
2068	   particular S-PMSI.  Then any PE which receives the A-D route will
2069	   respond with a "Leaf A-D Route" in which it identifies itself as a
2070	   receiver of the specified flow.  The Leaf A-D route will be withdrawn
2071	   when the PE is no longer a receiver for the flow.

2073	   If the PE needs to enable explicit tracking for a flow before binding
2074	   the flow to an S-PMSI, it can do so by sending an A-D route
2075	   identifying the flow but not specifying an S-PMSI.  This will elicit
2076	   the Leaf A-D Routes.  This is useful when the PE needs to know the
2077	   receivers before selecting an S-PMSI.

2079	7.2.2.3. Switching to S-PMSI

2081	   After the egress PEs receive the announcement, they set up their
2082	   forwarding path to receive traffic on the S-PMSI if they have one or
2083	   more receivers interested in the <C-S, C-G> bound to the S-PMSI. This
2084	   involves changing the RPF interface for the relevant <C-S, C-G>
2085	   entries to the interface that is used to instantiate the S-PMSI. If
2086	   an Aggregate Tree is used to instantiate an S-PMSI, this also implies
2087	   setting up the demultiplexing forwarding entries based on the inner
2088	   label as described in section 6.3.4.  The egress PEs may perform the
2089	   switch to the S-PMSI once the advertisement from the ingress PE is
2090	   received or wait for a preconfigured timer to do so.

2092	   A source PE may use one of two approaches to decide when to start
2093	   transmitting data on the S-PMSI. In the first approach, once the
2094	   source PE instantiates the S-PMSI, it starts sending multicast
2095	   packets for <C-S, C-G> entries mapped to the S-PMSI on both the
2096	   S-PMSI and the I-PMSI that is currently used to send traffic for
2097	   the <C-S, C-G>. After a preconfigured timer expires, the PE stops
2098	   sending multicast packets for <C-S, C-G> on the I-PMSI. In the second
2099	   approach, after a preconfigured delay following the advertisement of
2100	   the <C-S, C-G> entry bound to an S-PMSI, the source PE begins to send
2101	   traffic on the S-PMSI. At this point it stops sending traffic for the
2102	   <C-S, C-G> on the I-PMSI. This traffic is instead transmitted on the
2103	   S-PMSI.
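
The difference between the two approaches is which PMSIs carry the stream while the timer runs. The sketch below makes that explicit; the approach labels "dual-send" and "delayed" are hypothetical names introduced here, and the single `timer` parameter stands in for the preconfigured timer or delay of either approach.

```python
def pmsis_in_use(approach, elapsed, timer):
    """PMSIs the source PE transmits <C-S, C-G> on after `elapsed` seconds.

    "dual-send": send on both PMSIs until the timer expires (approach 1).
    "delayed":   stay on the I-PMSI until the delay expires (approach 2).
    """
    if approach == "dual-send":
        if elapsed < timer:
            return {"S-PMSI", "I-PMSI"}
        return {"S-PMSI"}
    if approach == "delayed":
        if elapsed < timer:
            return {"I-PMSI"}
        return {"S-PMSI"}
    raise ValueError(f"unknown approach: {approach}")
```

In both cases the stream ends up on the S-PMSI alone. The first approach trades duplicate traffic during the transition for a lower risk of loss at egress PEs that switch late.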

2105	7.3. Aggregation

2107	   S-PMSIs can be aggregated on a P-multicast tree. The S-PMSI to C-(S,
2108	   G) binding advertisement supports aggregation. Furthermore the
2109	   aggregation procedures of section 6.3 apply. It is also possible to
2110	   aggregate both S-PMSIs and I-PMSIs on the same P-multicast tree.

2112	7.4. Instantiating the S-PMSI with a PIM Tree

2114	   The procedures of section 7.2 tell a PE when it must start listening
2115	   and stop listening to a particular S-PMSI.  Those procedures also
2116	   specify the method for instantiating the S-PMSI.  In this section, we
2117	   provide the procedures to be used when the S-PMSI is instantiated as
2118	   a PIM tree.  The PIM tree is created by the PIM P-instance.

2120	   If a single PIM tree is being used to aggregate multiple S-PMSIs,
2121	   then the PIM tree to which a given stream is bound may have already
2122	   been joined by a given receiving PE.  If the tree does not already
2123	   exist, then the appropriate PIM procedures to create it must be
2124	   executed in the P-instance.

2126	   If the S-PMSI for a particular multicast stream is instantiated as a
2127	   PIM-SM or PIM-Bidir tree, the S-PMSI identifier will specify the RP
2128	   and the group P-address, and the PE routers which have receivers for
2129	   that stream must build a shared tree toward the RP.

2131	   If the S-PMSI is instantiated as a PIM-SSM tree, the PE routers build
2132	   a source tree toward the PE router that is advertising the S-PMSI
2133	   Join.  The IP address root of the tree is the same as the source IP
2134	   address which appears in the S-PMSI Join.  In this case, the tunnel
2135	   identifier in the S-PMSI Join will only need to specify a group P-
2136	   address.

2138	   The above procedures assume that each PE router has a set of group
2139	   P-addresses that it can use for setting up the PIM-trees.  Each PE
2140	   must be configured with this set of P-addresses.  If PIM-SSM is used
2141	   to set up the tunnels, then the PEs may use overlapping sets of
2142	   group P-addresses.  If PIM-SSM is not used, then each PE must be
2143	   configured with a unique set of group P-addresses (i.e., having no
2144	   overlap with the set configured at any other PE router).  The
2145	   management of this set of addresses is thus greatly simplified when
2146	   PIM-SSM is used, so the use of PIM-SSM is strongly recommended
2147	   whenever PIM trees are used to instantiate S-PMSIs.
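
The configuration constraint above can be checked mechanically. The following sketch assumes each PE's set of group P-addresses is expressed as a list of IPv4 prefixes; that representation and the function names are assumptions for illustration only.

```python
import ipaddress
from itertools import combinations


def overlapping_pes(config):
    """Return pairs of PEs whose group P-address sets overlap.

    `config` maps a PE name to a list of prefixes, e.g. "239.1.0.0/24".
    """
    nets = {pe: [ipaddress.ip_network(p) for p in prefixes]
            for pe, prefixes in config.items()}
    clashes = []
    for a, b in combinations(sorted(nets), 2):
        if any(x.overlaps(y) for x in nets[a] for y in nets[b]):
            clashes.append((a, b))
    return clashes


def p_address_config_is_valid(config, uses_pim_ssm):
    """Overlap is acceptable with PIM-SSM; otherwise sets must be unique."""
    return uses_pim_ssm or not overlapping_pes(config)
```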

2149	   If it is known that all the PEs which need to receive data traffic on
2150	   a given S-PMSI can support aggregation of multiple S-PMSIs on a
2151	   single PIM tree, then the transmitting PE may, at its discretion,
2152	   decide to bind the S-PMSI to a PIM tree which is already bound to
2153	   one or more other S-PMSIs, from the same or from different MVPNs.  In
2154	   this case, appropriate demultiplexing information must be signaled.

2156	7.5. Instantiating S-PMSIs using RSVP-TE P2MP Tunnels

2158	   RSVP-TE P2MP Tunnels can be used for instantiating S-PMSIs.
2159	   Procedures described in the context of I-PMSIs in section 6.7 apply.

2161	8. Inter-AS Procedures

2163	   If an MVPN has sites in more than one AS, it requires one or more
2164	   PMSIs to be instantiated by inter-AS tunnels.  This document
2165	   describes two different types of inter-AS tunnel:

2167	      1. "Segmented Inter-AS tunnels"

2169	         A segmented inter-AS tunnel consists of a number of independent
2170	         segments which are stitched together at the ASBRs.  There are
2171	         two types of segment, inter-AS segments and intra-AS segments.
2172	         The segmented inter-AS tunnel consists of alternating intra-AS
2173	         and inter-AS segments.

2175	         Inter-AS segments connect adjacent ASBRs of different ASes;
2176	         these "one-hop" segments are instantiated as unicast tunnels.

2178	         Intra-AS segments connect ASBRs and PEs which are in the same
2179	         AS.  An intra-AS segment may be of whatever technology is
2180	         desired by the SP that administers that AS.  Different
2181	         intra-AS segments may be of different technologies.

2183	         Note that an intra-AS segment of an inter-AS tunnel is distinct
2184	         from any intra-AS tunnel in the AS.

2186	         A segmented inter-AS tunnel can be thought of as a tree which
2187	         is rooted at a particular AS, and which has as its leaves the
2188	         other ASes which need to receive multicast data from the root
2189	         AS.

2191	      2. "Non-segmented Inter-AS tunnels"

2193	         A non-segmented inter-AS tunnel is a single tunnel which spans
2194	         AS boundaries.  The tunnel technology cannot change from one
2195	         point in the tunnel to the next, so all ASes through which the
2196	         tunnel passes must support that technology.  In essence, AS
2197	         boundaries are of no significance to a non-segmented inter-AS
2198	         tunnel.

2200	         [Editor's Note: This is the model in [ROSEN-8] and [MVPN-
2201	         BASE].]

2203	   Section 10 of [RFC4364] describes three different options for
2204	   supporting unicast Inter-AS BGP/MPLS IP VPNs, known as options A, B,
2205	   and C.  We describe below how both segmented and non-segmented
2206	   inter-AS trees can be supported when option B or option C is used.
2207	   (Option A does not pass any routing information through an ASBR at
2208	   all, so no special inter-AS procedures are needed.)

2210	8.1. Non-Segmented Inter-AS Tunnels

2212	   In this model, the previously described discovery and tunnel setup
2213	   mechanisms are used, even though the PEs belonging to a given MVPN
2214	   may be in different ASes.  The ASBRs play no special role, but
2215	   function merely as P routers.

2217	8.1.1. Inter-AS MVPN Auto-Discovery

2219	   The previously described BGP-based auto-discovery mechanisms work "as
2220	   is" when an MVPN contains PEs that are in different Autonomous
2221	   Systems.

2223	8.1.2. Inter-AS MVPN Routing Information Exchange

2225	   MVPN routing information exchange can be done by PIM peering (either
2226	   lightweight or full) across an MI-PMSI, or by unicasting PIM
2227	   messages.  The method of using BGP to send MVPN routing information
2228	   can also be used.

2230	   If any form of PIM peering is used, a PE that sends C-PIM Join/Prune
2231	   messages for a particular C-(S,G) must be able to identify the PE
2232	   which is its PIM adjacency on the path to S.  The identity of the PIM
2233	   adjacency is determined from the RPF information associated with the
2234	   VPN-IP route to S.

2236	   If no RPF information is present, then the identity of the PIM
2237	   adjacency is taken from the BGP Next Hop attribute of the VPN-IP
2238	   route to S.  Note that this will not give the correct result if
2239	   option b of section 10 of [RFC4364] is used.  To avoid this
2240	   possibility of error, the RPF information SHOULD always be present if
2241	   MVPN routing information is to be distributed by PIM.
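
The selection rule in the two preceding paragraphs can be sketched as follows. The route record and its field names are hypothetical; in particular, modeling the RPF information as a single optional address is a simplification made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VpnIpRoute:
    """Illustrative view of the VPN-IP route to a C-source S."""
    prefix: str
    bgp_next_hop: str
    rpf_neighbor: Optional[str] = None  # from RPF information, if present


def pim_adjacency_for(route):
    """Pick the PE to which C-PIM Join/Prune messages for S are sent."""
    if route.rpf_neighbor is not None:
        return route.rpf_neighbor
    # Fallback: the BGP Next Hop. As noted above, this is not correct
    # when option b of section 10 of [RFC4364] is in use.
    return route.bgp_next_hop
```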

2243	   If BGP (rather than PIM) is used to distribute the MVPN routing
2244	   information, and if option b of section 10 of [RFC4364] is in use,
2245	   then the MVPN routes will be installed in the ASBRs along the path
2246	   from each multicast source in the MVPN to each multicast receiver in
2247	   the MVPN.  If option b is not in use, the MVPN routes are not
2248	   installed in the ASBRs.  The handling of MVPN routes in either case
2249	   is thus exactly analogous to the handling of unicast VPN-IP routes in
2250	   the corresponding case.

2252	8.1.3. Inter-AS I-PMSI

2254	   The procedures described earlier in this document can be used to
2255	   instantiate an I-PMSI with inter-AS tunnels. Specific tunneling
2256	   techniques require some explanation:

2258	      1. If ingress replication is used, the inter-AS PE-PE tunnels will
2259	         use the inter-AS tunneling procedures for the tunneling
2260	         technology used.

2262	      2. Inter-AS PIM-SM or PIM-SSM based trees rely on a PE joining a
2263	         (P-S, P-G) tuple where P-S is the address of a PE in another
2264	         AS. This (P-S, P-G) tuple is learned using the MVPN membership
2265	         and BGP MVPN-tunnel binding procedures described earlier.
2266	         However, if the source of the tree is in a different AS than a
2267	         particular P router, it is possible that the P router will not
2268	         have a route to the source.  For example, the remote AS may be
2269	         using BGP to distribute a route to the source, but a particular
2270	         P router may be part of a "BGP-free core", in which the P
2271	         routers are not aware of BGP-distributed routes.

2273	         In such a case it is necessary for a PE to tell PIM to
2274	         construct the tree through a particular BGP speaker, the "BGP
2275	         next hop" for the tree source.  This can be accomplished with a
2276	         PIM extension, in which the P-PIM Join/Prune messages carry a
2277	         new "proxy" field which contains the address of that BGP next
2278	         hop.  As the P-multicast tree is constructed, it is built
2279	         towards the proxy (the BGP next hop) rather than towards P-S,
2280	         so the P routers will not need to have a route to P-S.

2282	         Support for inter-AS trees using PIM-Bidir is for further
2283	         study.

2285	         When the BGP-based discovery procedures for MVPN are in place,
2286	         one can distinguish two different inter-AS routes to a
2287	         particular P-S:

2289	           - BGP will install a unicast route to P-S along a particular
2290	             path, using the IP AFI/SAFI;

2292	           - A PE's MVPN auto-discovery information is advertised by
2293	             sending a BGP update whose NLRI is in a special address
2294	             family (AFI/SAFI) used for this purpose.  The NLRI of the
2295	             address family contains the IP address of the PE, as well
2296	             as an RD.  If the NLRI contains the IP address of P-S, this
2297	             in effect creates a second route to P-S.  This route might
2298	             follow a different path than the route in the unicast IP
2299	             family.

2301	         When building a PIM tree towards P-S, it may be desirable to
2302	         build it along the route on which the MVPN auto-discovery
2303	         AFI/SAFI is installed, rather than along the route on which the
2304	         IP AFI/SAFI is installed.  This enables the inter-AS portion of
2305	         the tree to follow a path which is specifically chosen for
2306	         multicast (i.e., it allows the inter-AS multicast topology to
2307	         be "non-congruent" to the inter-AS unicast topology).

2309	         In order for P routers to send P-Join/Prune messages along this
2310	         path, they need to make use of the "proxy" field extension
2311	         discussed above.  The PIM message must also contain the full
2312	         NLRI in the MVPN auto-discovery family, so that the BGP
2313	         speakers can look up that NLRI to find the BGP next hop.

2315	      3. Procedures in [RSVP-P2MP] are used for inter-AS RSVP-TE P2MP
2316	         Tunnels.
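
   As an illustration of the "proxy" field behaviour described in item
   2 above, the following Python sketch shows how a P router might pick
   the address towards which a P-Join is forwarded.  The names PJoin and
   upstream_interface, and the exact-match dictionary standing in for
   the IGP route table, are hypothetical; the sketch does not reflect
   the actual PIM message encoding.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PJoin:
    """Hypothetical, simplified view of a P-PIM Join."""
    source: str                  # P-S, the root of the tree
    group: str                   # P-G
    proxy: Optional[str] = None  # BGP next hop for P-S, if carried


def upstream_interface(join: PJoin, igp_routes: Dict[str, str]) -> str:
    """Return the interface on which the Join is sent upstream.

    igp_routes is an assumed exact-match table: address -> interface.
    """
    # With a proxy present, the tree is built towards the proxy, so no
    # route to P-S itself is needed on this router.
    target = join.proxy if join.proxy is not None else join.source
    if target not in igp_routes:
        raise LookupError(f"no route to {target}")
    return igp_routes[target]
```

   In a "BGP-free core" the table would hold a route to the BGP next
   hop but not to P-S, so only the Join that carries the proxy can be
   forwarded.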

2318	8.1.4. Inter-AS S-PMSI

2320	   The leaves of the tunnel are discovered using the MVPN routing
2321	   information.  Procedures for setting up the tunnel are similar to the
2322	   ones described in section 8.2.3 for an inter-AS I-PMSI.

2324	8.2. Segmented Inter-AS Tunnels

2326	8.2.1. Inter-AS MVPN Auto-Discovery Routes

2328	   The BGP based MVPN membership discovery procedures of section 4 are
2329	   used to auto-discover the intra-AS MVPN membership. This section
2330	   describes the additional procedures for inter-AS MVPN membership
2331	   discovery. It also describes the procedures for constructing
2332	   segmented inter-AS tunnels.

2334	   In this case, for a given MVPN in an AS, the objective is to form a
2335	   spanning tree of MVPN membership, rooted at the AS. The nodes of this
2336	   tree are ASes.  The leaves of this tree are only those ASes that have
2337	   at least one PE with a member in the MVPN. The inter-AS tunnel used
2338	   to instantiate an inter-AS PMSI must traverse this spanning tree. A
2339	   given AS needs to announce to another AS only the fact that it has
2340	   membership in a given MVPN. It doesn't need to announce the
2341	   membership of each PE in the AS to other ASes.

2343	   This section defines an inter-AS auto-discovery route as a route that
2344	   carries information about an AS that has one or more PEs (directly)
2345	   connected to the site(s) of that MVPN. Further it defines an inter-AS
2346	   leaf auto-discovery route (leaf auto-discovery route) as a route used
2347	   to inform the root of an intra-AS segment of an inter-AS tunnel about
2348	   a leaf of that intra-AS segment.

2350	8.2.1.1. Originating Inter-AS MVPN A-D Information

2352	   A PE in a given AS advertises its MVPN membership to all its IBGP
2353	   peers.  Such an IBGP peer may be a route reflector, which in turn
2354	   advertises this information only to its own IBGP peers. In this manner
2355	   all the PEs and ASBRs in the AS learn this membership information.

2357	   An Autonomous System Border Router (ASBR) may be configured to
2358	   support a particular MVPN. If an ASBR is configured to support a
2359	   particular MVPN, the ASBR MUST participate in the intra-AS MVPN
2360	   auto-discovery/binding procedures for that MVPN within the AS that
2361	   the ASBR belongs to, as defined in this document.

2363	   Each ASBR then advertises the "AS MVPN membership" to its neighbor
2364	   ASBRs using EBGP. This inter-AS auto-discovery route must not be
2365	   advertised to the PEs/ASBRs in the same AS as this ASBR. The
2366	   advertisement carries the following information elements:

2368	      a. A Route Distinguisher for the MVPN. For a given MVPN each ASBR
2369	         in the AS must use the same RD when advertising this
2370	         information to other ASBRs. To accomplish this all the ASBRs
2371	         within that AS, that are configured to support the MVPN, MUST
2372	         be configured with the same RD for that MVPN. This RD MUST be
2373	         of Type 0 and MUST embed the autonomous system number of the AS.

2375	      b. The announcing ASBR's local address as the next-hop for the
2376	         above information elements.

2378	      c. By default the BGP Update message MUST carry export Route
2379	         Targets used by the unicast routing of that VPN. The default
2380	         could be modified via configuration by having a set of Route
2381	         Targets used for the inter-AS auto-discovery routes being
2382	         distinct from the ones used by the unicast routing of that VPN.
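
   The RD requirement in item (a) above can be illustrated with a short
   Python sketch.  It assumes the Type 0 RD layout of [RFC4364] (a
   2-octet type field, a 2-octet AS number, and a 4-octet assigned
   number); the function names are hypothetical.

```python
import struct


def encode_type0_rd(asn: int, assigned: int) -> bytes:
    """Encode a Type 0 RD: 2-octet type, 2-octet ASN, 4-octet number."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("a Type 0 RD carries a 2-octet AS number")
    if not 0 <= assigned <= 0xFFFFFFFF:
        raise ValueError("assigned number must fit in 4 octets")
    return struct.pack("!HHI", 0, asn, assigned)


def rd_embeds_as(rd: bytes, asn: int) -> bool:
    """Check that an 8-octet RD is of Type 0 and embeds the given ASN."""
    rd_type, rd_asn, _assigned = struct.unpack("!HHI", rd)
    return rd_type == 0 and rd_asn == asn
```

   Every ASBR of the AS that supports the MVPN would be configured with
   the same (asn, assigned) pair, so that all of them produce the same
   8-octet value.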

2384	8.2.1.2. Propagating Inter-AS MVPN A-D Information

2386	   As an inter-AS auto-discovery route originated by an ASBR within a
2387	   given AS is propagated via BGP to other ASes, this results in
2388	   creation of a data plane tunnel that spans multiple ASes. This tunnel
2389	   is used to carry (multicast) traffic from the MVPN sites connected to
2390	   the PEs of the AS to the MVPN sites connected to the PEs that are in
2391	   the other ASes. Such a tunnel consists of multiple intra-AS segments
2392	   (one per AS) stitched at ASBRs' boundaries by single hop <ASBR-ASBR>
2393	   LSP segments.

2395	   An ASBR originates creation of an intra-AS segment when the ASBR
2396	   receives an inter-AS auto-discovery route from an EBGP neighbor.
2397	   Creation of the segment is completed as a result of distributing via
2398	   IBGP this route within the ASBR's own AS.

2400	   For a given inter-AS tunnel each of its intra-AS segments could be
2401	   constructed by its own independent mechanism. Moreover, by using
2402	   upstream labels within a given AS multiple intra-AS segments of
2403	   different inter-AS tunnels of either the same or different MVPNs may
2404	   share the same P-Multicast Tree.

2406	   Since (aggregated) inter-AS auto-discovery routes have granularity of
2407	   <AS, MVPN>, an MVPN that is present in N ASes would have total of N
2408	   inter-AS tunnels. Thus for a given MVPN the number of inter-AS
2409	   tunnels is independent of the number of PEs that have this MVPN.
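
   The scaling property above can be checked with a small Python
   sketch.  The input format, a list of (AS number, PE name) pairs for
   one MVPN, is an assumption made for illustration.

```python
from typing import Iterable, Tuple


def inter_as_tunnel_count(membership: Iterable[Tuple[int, str]]) -> int:
    """Count inter-AS tunnels for one MVPN: one per AS with membership."""
    # Only the set of distinct ASes matters, not the number of PEs.
    return len({asn for asn, _pe in membership})
```

   Adding further PEs to an AS that already has members of the MVPN
   leaves the result unchanged.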

2411	   The following sections specify procedures for propagation of
2412	   (aggregated) inter-AS auto-discovery routes across ASes.

2414	8.2.1.2.1. Inter-AS Auto-Discovery Route received via EBGP

2416	   When an ASBR receives from one of its EBGP neighbors a BGP Update
2417	   message that carries an inter-AS auto-discovery route, if (a) at
2418	   least one of the Route Targets carried in the message matches one of
2419	   the import Route Targets configured on the ASBR, and (b) the ASBR
2420	   determines that the received route is the best route to the
2421	   destination carried in the NLRI of the route, the ASBR:

2423	      a) Re-advertises this inter-AS auto-discovery route within its own
2424	         AS.

2426	         If the ASBR uses ingress replication to instantiate the intra-
2427	         AS segment of the inter-AS tunnel, the re-advertised route
2428	         SHOULD carry a Tunnel attribute with the Tunnel Identifier set
2429	         to Ingress Replication, but no MPLS labels.

2431	         If a P-Multicast Tree is used to instantiate the intra-AS
2432	         segment of the inter-AS tunnel, and in order to advertise the
2433	         P-Multicast tree identifier the ASBR doesn't need to know the
2434	         leaves of the tree beforehand, then the advertising ASBR SHOULD
2435	         advertise the P-Multicast tree identifier in the Tunnel
2436	         Identifier of the Tunnel attribute. This, in effect, creates a
2437	         binding between the inter-AS auto-discovery route and the P-
2438	         Multicast Tree.

2440	         If a P-Multicast Tree is used to instantiate the intra-AS
2441	         segment of the inter-AS tunnel, and in order to advertise the
2442	         P-Multicast tree identifier the advertising ASBR needs to know
2443	         the leaves of the tree beforehand, the ASBR first discovers the
2444	         leaves using the Auto-Discovery procedures, as specified
2445	         further down. It then advertises the binding of the tree to the
2446	         inter-AS auto-discovery route using the original auto-discovery
2447	         route with the addition of carrying in the route the
2448	         Tunnel attribute that contains the type and the identity of the
2449	         tree (encoded in the Tunnel Identifier of the attribute).

2451	      b) Re-advertises the received inter-AS auto-discovery route to its
2452	         EBGP peers, other than the EBGP neighbor from which the best
2453	         inter-AS auto-discovery route was received.

2455	      c) Advertises to its neighbor ASBR, from which it received the
2456	         best inter-AS auto-discovery route to the destination carried in
2457	         the NLRI of the route, a leaf auto-discovery route that carries
2458	         an ASBR-ASBR tunnel binding with the tunnel identifier set to
2459	         ingress replication. This binding as described in section 6 can
2460	         be used by the neighbor ASBR to send traffic to this ASBR.
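
   The checks and the three actions (a) to (c) above can be sketched in
   Python as follows.  The InterAsAdRoute type, the action names, and
   the representation of Route Targets as strings are hypothetical, and
   the selection of the Tunnel attribute in step (a) is left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class InterAsAdRoute:
    rd: str
    route_targets: FrozenSet[str]
    received_from: str  # EBGP neighbor that sent the route


def process_ebgp_ad_route(route: InterAsAdRoute,
                          import_rts: Set[str],
                          is_best: bool,
                          ebgp_peers: Set[str]) -> List[Tuple[str, str]]:
    """Return the actions an ASBR takes for a route received via EBGP."""
    # Conditions (a) and (b): a Route Target must match, and the route
    # must be the best route to the destination in its NLRI.
    if not (route.route_targets & import_rts) or not is_best:
        return []
    # a) re-advertise within the ASBR's own AS
    actions = [("readvertise_ibgp", route.rd)]
    # b) re-advertise to every EBGP peer except the sender
    actions += [("readvertise_ebgp", peer)
                for peer in sorted(ebgp_peers)
                if peer != route.received_from]
    # c) send a leaf A-D route with an ingress replication binding back
    #    to the neighbor from which the best route was received
    actions.append(("leaf_ad_ingress_replication", route.received_from))
    return actions
```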

2462	8.2.1.2.2. Leaf Auto-Discovery Route received via EBGP

2464	   When an ASBR receives via EBGP a leaf auto-discovery route, the ASBR
2465	   finds an inter-AS auto-discovery route that has the same RD as the
2466	   leaf auto-discovery route. The MPLS label carried in the leaf auto-
2467	   discovery route is used to stitch a one hop ASBR-ASBR LSP to the tail
2468	   of the intra-AS tunnel segment associated with the inter-AS auto-
2469	   discovery route.
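
   A minimal Python sketch of this lookup is given below.  It assumes
   that the ASBR keeps, for each inter-AS auto-discovery route, a list
   of stitched one hop ASBR-ASBR LSPs keyed by the RD of that route;
   the data structure and the function name are hypothetical.

```python
from typing import Dict, List, Tuple


def stitch_leaf_route(leaf_rd: str,
                      label: int,
                      neighbor: str,
                      segments_by_rd: Dict[str, List[Tuple[str, int]]]) -> bool:
    """Stitch an ASBR-ASBR LSP to the intra-AS segment with the same RD.

    Returns False if no inter-AS auto-discovery route with that RD is
    known, in which case nothing is stitched.
    """
    lsps = segments_by_rd.get(leaf_rd)
    if lsps is None:
        return False
    # The label from the leaf route identifies the one hop LSP towards
    # the neighbor ASBR that originated the leaf route.
    lsps.append((neighbor, label))
    return True
```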

2471	8.2.1.2.3. Inter-AS Auto-Discovery Route received via IBGP

2473	   If a given inter-AS auto-discovery route is advertised within an AS
2474	   by multiple ASBRs of that AS, the BGP best route selection performed
2475	   by other PE/ASBR routers within the AS does not require all these
2476	   PE/ASBR routers to select the route advertised by the same ASBR - to
2477	   the contrary different PE/ASBR routers may select routes advertised
2478	   by different ASBRs.

2480	   Further when a PE/ASBR receives from one of its IBGP neighbors a BGP
2481	   Update message that carries an AS MVPN membership tree, if (a) the
2482	   route was originated outside of the router's own AS, (b) at least one
2483	   of the Route Targets carried in the message matches one of the import
2484	   Route Targets configured on the PE/ASBR, and (c) the PE/ASBR
2485	   determines that the received route is the best route to the
2486	   destination carried in the NLRI of the route, if the router is an
2487	   ASBR then the ASBR propagates the route to its EBGP neighbors. In
2488	   addition the PE/ASBR performs the following.

2490	   If the received inter-AS auto-discovery route carries the Tunnel
2491	   attribute with the Tunnel Identifier set to LDP P2MP LSP, or PIM-SSM
2492	   tree, or PIM-SM tree, the PE/ASBR SHOULD join the P-Multicast tree
2493	   whose identity is carried in the Tunnel Identifier.

2495	   If the received source auto-discovery route carries the Tunnel
2496	   attribute with the Tunnel Identifier set to RSVP-TE P2MP LSP, then
2497	   the ASBR that originated the route MUST signal the local PE/ASBR as
2498	   one of the leaf LSRs of the RSVP-TE P2MP LSP. This signaling MAY have
2499	   been completed before the local PE/ASBR receives the BGP Update
2500	   message.

2502	   If the NLRI of the route does not carry a label, then this tree is an
2503	   intra-AS LSP segment that is part of the inter-AS Tunnel for the MVPN
2504	   advertised by the inter-AS auto-discovery route. If the NLRI carries
2505	   an (upstream) label, then a combination of this tree and the label
2506	   identifies the intra-AS segment.

2508	   If this is an ASBR, this intra-AS segment may further be stitched to
2509	   the ASBR-ASBR inter-AS segment of the inter-AS tunnel. If the PE/ASBR
2510	   has local receivers in the MVPN, packets received over the intra-AS
2511	   segment must be forwarded to the local receivers using the local VRF.

2513	   If the received inter-AS auto-discovery route either does not carry
2514	   the Tunnel attribute, or carries the Tunnel attribute with the Tunnel
2515	   Identifier set to ingress replication, then the PE/ASBR originates a
2516	   new auto-discovery route to allow the ASBR from which the auto-
2517	   discovery route was received, to learn of this ASBR as a leaf of the
2518	   intra-AS tree.
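
   The tunnel-type cases above can be summarized in a short Python
   sketch.  The string constants used for tunnel types and the returned
   action names are hypothetical labels, not encodings defined by this
   document.

```python
from typing import Optional

# Tunnel types for which the receiving PE/ASBR joins the tree itself.
JOINABLE = {"LDP_P2MP_LSP", "PIM_SSM", "PIM_SM"}


def action_for_tunnel(tunnel_type: Optional[str]) -> str:
    """Map the Tunnel attribute of a received route to a local action.

    tunnel_type is None when the route carries no Tunnel attribute.
    """
    if tunnel_type in JOINABLE:
        return "join_tree"
    if tunnel_type == "RSVP_TE_P2MP_LSP":
        # The originating ASBR signals this router as a leaf; nothing
        # is initiated locally.
        return "await_root_signaling"
    if tunnel_type is None or tunnel_type == "INGRESS_REPLICATION":
        # Let the advertising ASBR learn of this router as a leaf.
        return "originate_leaf_ad_route"
    raise ValueError(f"unknown tunnel type: {tunnel_type}")
```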

2520	   Thus the AS MVPN membership information propagates across multiple
2521	   ASes along a spanning tree. The BGP AS-Path based loop prevention
2522	   mechanism prevents loops from forming as this information propagates.

2524	8.2.2. Inter-AS MVPN Routing Information Exchange

2526	   All of the MVPN routing information exchange methods specified in
2527	   section 5 can be supported across ASes.

2529	   The objective in this case is to propagate the MVPN routing
2530	   information to the remote PE that originates the unicast route to C-
2531	   S/C-RP, in the reverse direction of the AS MVPN membership
2532	   information announced by the remote PE's origin AS. This information
2533	   is processed by each ASBR along this reverse path.

2535	   To achieve this, the PE that is generating the MVPN routing
2536	   advertisement first determines the source AS of the unicast route to
2537	   C-S/C-RP. It then determines from the received AS MVPN membership
2538	   information, for the source AS, the ASBR that is the next-hop for the
2539	   best path of the source AS MVPN membership. The BGP MVPN routing
2540	   update is sent to this ASBR and the ASBR then further propagates the
2541	   BGP advertisement. BGP filtering mechanisms ensure that the BGP MVPN
2542	   routing information updates flow only to the upstream router on the
2543	   reverse path of the inter-AS MVPN membership tree. Details of this
2544	   filtering mechanism and the relevant encoding will be specified in a
2545	   separate document.

2547	8.2.3. Inter-AS I-PMSI

2549	   All PEs in a given AS use the same inter-AS heterogeneous tunnel,
2550	   rooted at the AS, to instantiate an I-PMSI for an inter-AS MVPN
2551	   service. As explained earlier the intra-AS tunnel segments that
2552	   comprise this tunnel can be built using different tunneling
2553	   technologies. To instantiate an MI-PMSI service for a MVPN there must
2554	   be an inter-AS tunnel rooted at each AS that has at least one PE that
2555	   is a member of the MVPN.

2557	   A C-multicast data packet is sent using an intra-AS tunnel segment by
2558	   the PE that first receives this packet from the MVPN customer site.
2559	   An ASBR forwards this packet to any locally connected MVPN receivers
2560	   for the multicast stream. If this ASBR has received a tunnel binding
2561	   for the AS MVPN membership that it advertised to a neighboring ASBR,
2562	   it also forwards this packet to the neighboring ASBR. In this case
2563	   the packet is encapsulated in the downstream MPLS label received from
2564	   the neighboring ASBR. The neighboring ASBR delivers this packet to
2565	   any locally connected MVPN receivers for that multicast stream. It
2566	   also transports this packet on an intra-AS tunnel segment, for the
2567	   inter-AS MVPN tunnel, and the other PEs and ASBRs in the AS then
2568	   receive this packet.  The other ASBRs then repeat the procedure
2569	   followed by the ASBR in the origin AS and the packet traverses the
2570	   overlay inter-AS tunnel along a spanning tree.

2572	8.2.3.1. Support for Unicast VPN Inter-AS Methods

2574	   The above procedures for setting up an inter-AS I-PMSI can be
2575	   supported for each of the unicast VPN inter-AS models described in
2576	   [RFC4364]. These procedures do not depend on the method used to
2577	   exchange unicast VPN routes. For Option B and Option C they do
2578	   require MPLS encapsulation between the ASBRs.

2580	8.2.4. Inter-AS S-PMSI

2582	   An inter-AS tunnel for an S-PMSI is constructed similarly to an
2583	   inter-AS tunnel for an I-PMSI. Namely, such a tunnel is constructed
2584	   as a concatenation of tunnel segments. There are two types of tunnel
2585	   segments: an intra-AS tunnel segment (a segment that spans ASBRs
2586	   within the same AS), and an inter-AS tunnel segment (a segment that
2587	   spans adjacent ASBRs in adjacent ASes). ASes that are spanned by a
2588	   tunnel are not required to use the same tunneling mechanism to
2589	   construct the tunnel - each AS may pick up a tunneling mechanism to
2590	   construct the intra-AS tunnel segment of the tunnel on its own.

2592	   The PE that decides to set up an S-PMSI advertises the S-PMSI tunnel
2593	   binding using procedures in section 7.3.2 to the routers in its own
2594	   AS. The <C-S, C-G> membership for which the S-PMSI is instantiated
2595	   is propagated along an inter-AS spanning tree. This spanning tree
2596	   traverses the same ASBRs as the AS MVPN membership spanning tree. In
2597	   addition to the information elements described in section 7.3.2
2598	   (Origin AS, RD, next-hop) the C-S and C-G are also advertised.

2600	   An ASBR that receives the AS <C-S, C-G> information from its upstream
2601	   ASBR using EBGP sends back a tunnel binding for AS <C-S, C-G>
2602	   information if (a) at least one of the Route Targets carried in the
2603	   message matches one of the import Route Targets configured on the
2604	   ASBR, and (b) the ASBR determines that the received route is the best
2605	   route to the destination carried in the NLRI of the route. If the
2606	   ASBR instantiates a S-PMSI for the AS <C-S, C-G> it sends back a
2607	   downstream label that is used to forward the packet along its intra-
2608	   AS S-PMSI for the <C-S, C-G>. However the ASBR may decide to use an
2609	   AS MVPN membership I-PMSI instead, in which case it sends back the
2610	   same label that it advertised for the AS MVPN membership I-PMSI. If
2611	   the downstream ASBR instantiates a S-PMSI, it further propagates the
2612	   <C-S, C-G> membership to its downstream ASes, else it does not.
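
   The label choice described above can be sketched in Python as
   follows.  The sketch assumes that the Route Target and best-route
   checks have already passed; the function name and the dictionary
   keys are hypothetical.

```python
from typing import Dict, Union


def spmsi_binding_response(instantiates_spmsi: bool,
                           spmsi_label: int,
                           ipmsi_label: int) -> Dict[str, Union[int, bool]]:
    """Choose the label sent back upstream for AS <C-S, C-G>."""
    if instantiates_spmsi:
        # A dedicated intra-AS S-PMSI exists: advertise its label and
        # keep propagating the <C-S, C-G> membership downstream.
        return {"label": spmsi_label, "propagate_downstream": True}
    # Fall back to the I-PMSI: reuse the label already advertised for
    # the AS MVPN membership and stop propagating <C-S, C-G>.
    return {"label": ipmsi_label, "propagate_downstream": False}
```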

2614	   An AS can instantiate an intra-AS S-PMSI for the inter-AS S-PMSI
2615	   tunnel only if the upstream AS instantiates an S-PMSI. The procedures
2616	   allow each AS to determine whether it wishes to set up an S-PMSI or
2617	   not, and the AS is not forced to set up an S-PMSI just because the
2618	   upstream AS decides to do so.

2620	   The leaves of an intra-AS S-PMSI tunnel will be the PEs that have
2621	   local receivers that are interested in <C-S, C-G> and the ASBRs that
2622	   have received MVPN routing information for <C-S, C-G>. Note that an
2623	   AS can determine these ASBRs as the MVPN routing information is
2624	   propagated and processed by each ASBR on the AS MVPN membership
2625	   spanning tree.

2627	   The C-multicast data traffic is sent on the S-PMSI by the originating
2628	   PE.  When it reaches an ASBR that is on the spanning tree, it is
2629	   delivered to local receivers, if any, and is also forwarded to the
2630	   neighbor ASBR after being encapsulated in the label advertised by the
2631	   neighbor. The neighbor ASBR either transports this packet on the S-
2632	   PMSI for the multicast stream or an I-PMSI, delivering it to the
2633	   ASBRs in its own AS. These ASBRs in turn repeat the procedures of the
2634	   origin AS ASBRs and the multicast packet traverses the spanning tree.

2636	9. Duplicate Packet Detection and Single Forwarder PE

2638	   An egress PE may receive duplicate multicast data packets, from more
2639	   than one ingress PE, for a MVPN when a site that contains C-S or
2640	   C-RP is multihomed to more than one PE. An egress PE may also receive
2641	   duplicate data packets for a MVPN, from two different ingress PEs,
2642	   when the CE-PE routing protocol is PIM-SM and a router or a CE in a
2643	   site switches from the C-RP tree to C-S tree.

2645	   For a given <C-S, C-G> a PE, say PE1, expects to receive C-data
2646	   packets from the upstream PE, say PE2, which PE1 identified as the
2647	   upstream multicast hop in the C-Multicast Routing Update that PE1
2648	   sent in order to join <C-S, C-G>. If PE1 can determine that a data
2649	   packet for <C-S, C-G> was received from the expected upstream PE,
2650	   PE2, PE1 will accept the packet.  Otherwise, PE1 will drop the
2651	   packet.  (But see section 10 for an exception case where PE1 will
2652	   accept a packet even if it is from an unexpected upstream PE.) This
2653	   determination can be performed only if the PMSI on which the packets
2654	   are being received and the tunneling technology used to instantiate
2655	   the PMSI allows the PE to determine the source PE that sent the
2656	   packet. However this determination may not always be possible.

2658	   Therefore, procedures are needed to ensure that packets are received
2659	   at a PE only from a single upstream PE.  This is called single
2660	   forwarder PE selection.

2662	   Single forwarder PE selection is achieved by the following set of
2663	   procedures:

2665	      a. If there is more than one PE within the same AS through which
2666	         C-S or C-RP of a given MVPN could be reached, and in the case
2667	         of C-S not every such PE advertises an S-PMSI for <C-S, C-G>,
2668	         all PEs that have this MVPN MUST send the MVPN routing
2669	         information update for <C-S, C-G> or <C-*, C-G> to the same
2670	         upstream PE.  This is achieved using the following procedure:

2672	         Using the procedure for "RPF determination" specified in
2673	         section 5.1, find (a) the upstream multicast hop for the C-S or
2674	         C-RP, and (b) the route used to reach the upstream multicast
2675	         hop.  Call this route the "installed RPF route" for C-S or C-
2676	         RP.

2678	         If the next-hop interface of the installed RPF route for C-S or
2679	         C-RP is a VRF interface of the PE, then the PE uses that route
2680	         to reach the C-S or C-RP.

2682	         Otherwise, consider the set of all VPN-IP routes that are (a)
2683	         eligible to be imported into the VRF (as determined by their
2684	         Route Targets), (b) are eligible to be used for RPF
2685	         determination (i.e., if RPF determination is done via a non-
2686	         congruent multicast topology, this would include only the
2687	         routes that are part of that topology), and (c) have exactly
2688	         the same IP prefix as the installed RPF route.

2690	         For each route in this set, determine the corresponding
2691	         upstream PE.  If a route has a VRF Route Import Extended
2692	         Community, the route's upstream PE is determined from it. If a
2693	         route does not have a VRF Route Import Extended Community, the
2694	         route's upstream PE is determined from the route's BGP next hop
2695	         attribute.

2697	         This results in a set of pairs of <route, upstream PE>.  The PE
2698	         will select the route whose corresponding upstream PE address
2699	         is numerically highest, where a 32-bit IP address is treated as
2700	         a 32-bit unsigned integer.  Call this the "selected RPF route".
2701	         The PE will use the selected RPF route to reach the C-S or C-
2702	         RP.

2704	      b. The above procedure ensures that if C-S or C-RP is multi-homed
2705	         to PEs within a single AS, a PE will not receive duplicate
2706	         traffic as long as all the PEs in that AS are on either the C-S
2707	         or C-RP tree.

2709	         However the PE may receive duplicate traffic if C-S or C-RP is
2710	         multi-homed to different ASes. In this case the PE can detect
2711	         duplicate traffic as such duplicate traffic will arrive on a
2712	         different tunnel - if the PE was expecting the traffic on an
2713	         inter-AS tunnel, duplicate traffic will arrive on an intra-AS
2714	         tunnel [this is not an intra-AS tunnel segment, of an inter-AS
2715	         tunnel] and vice-versa.

2717	         To achieve the above the PE has to keep track of which (inter-
2718	         AS) auto-discovery route the PE uses for sending MVPN multicast
2719	         routing information towards C-S/C-RP. Then the PE should
2720	         receive (multicast) traffic originated by C-S/C-RP only from
2721	         the (inter-AS) tunnel that was carried in the best source
2722	         auto-discovery route for the MVPN and was originated by the AS
2723	         that contains C-S/C-RP (where "the best" is determined by the
2724	         PE). All other multicast traffic originated by C-S/C-RP, but
2725	         received on any other tunnel should be discarded as duplicates.

2727	         The PE may also receive duplicate traffic during a <C-*, C-G>
2728	         to <C-S, C-G> switch. The issue and the solution are described
2729	         next.

2731	      c. If the tunneling technology in use for a particular MVPN does
2732	         not allow the egress PEs to identify the ingress PE, then
2733	         having all the PEs select the same PE to be the upstream
2734	         multicast hop is not sufficient to prevent packet duplication.
2735	         The reason is that a single tunnel may be carrying traffic on
2736	         both the (C-*, C-G) tree and the (C-S, C-G) tree.  If some of
2737	         the egress PEs have joined the source tree, but others expect
2738	         to receive (S,G) packets from the shared tree, then two copies
2739	         of a data packet will travel on the tunnel, and the egress PEs
2740	         will have no way to determine that only one copy should be
2741	         accepted.

2743	         To avoid this, it is necessary to ensure that once any PE joins
2744	         the (C-S, C-G) tree, any other PE that has joined the (C-*, C-
2745	         G) tree also switches to the (C-S, C-G) tree  (selecting, of
2746	         course, the same upstream multicast hop, as specified above).

2748	         Whenever a PE creates <C-S, C-G> state as a result of
2749	         receiving a C-multicast route for <C-S, C-G> from some other
2750	         PE, and the C-G group is a Sparse Mode group, the PE that
2751	         creates the state MUST originate an auto-discovery route as
2752	         specified below. The route is advertised using the same
2753	         procedures as the MVPN auto-discovery/binding (both intra-AS
2754	         and inter-AS) specified in this document with the following
2755	         modifications:

2757	            1. The Multicast Source field MUST be set to C-S.  The
2758	               Multicast Source Length field is set appropriately to
2759	               reflect this.

2761	            2. The Multicast Group field MUST be set to C-G.  The
2762	               Multicast Group Length field is set appropriately to
2763	               reflect this.

2765	         The route goes to all the PEs of the MVPN. When a PE receives
2766	         this route, it checks whether there are any receivers in the
2767	         MVPN sites attached to the PE for the group carried in the
2768	         route. If yes, then it generates a C-multicast route indicating
2769	         Join for <C-S, C-G>.  This forces all the PEs (in all ASes) to
2770	         switch to the C-S tree for <C-S, C-G> from the C-RP tree.

2772	         This is the same type of A-D route used to report active
2773	         sources in the scenarios described in section 10.

2775	         Note that when a PE thus joins the <C-S, C-G> tree, it may need
2776	         to send a PIM (S,G,RPT-bit) prune to one of its CE PIM
2777	         neighbors, as determined by ordinary PIM procedures.

2779	         Whenever the PE deletes the <C-S, C-G> state that was
2780	         previously created as a result of receiving a C-multicast route
2781	         for <C-S, C-G> from some other PE, the PE that deletes the
2782	         state also withdraws the auto-discovery route that was
2783	         advertised when the state was created.

2785	         N.B.: SINCE ALL PES WITH RECEIVERS FOR GROUP C-G WILL JOIN THE
2786	         C-S SOURCE TREE IF ANY OF THEM DO, IT IS NEVER NECESSARY TO
2787	         DISTRIBUTE A BGP C-MULTICAST ROUTE FOR THE PURPOSE OF PRUNING
2788	         SOURCES FROM THE SHARED TREE.
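
   The selection of the "selected RPF route" in step (a) above can be
   sketched in Python as follows.  The VpnRoute type and its field
   names are hypothetical, the candidate set is assumed to have been
   filtered already as step (a) requires, and only 32-bit (IPv4)
   upstream PE addresses are handled.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class VpnRoute:
    prefix: str
    bgp_next_hop: str
    # PE address taken from the VRF Route Import Extended Community,
    # or None if the route does not carry that community.
    vrf_route_import: Optional[str] = None


def upstream_pe(route: VpnRoute) -> str:
    """Use the VRF Route Import EC if present, else the BGP next hop."""
    if route.vrf_route_import is not None:
        return route.vrf_route_import
    return route.bgp_next_hop


def select_rpf_route(candidates: Iterable[VpnRoute]) -> VpnRoute:
    """Pick the route whose upstream PE address is numerically highest."""
    # The address is compared as a 32-bit unsigned integer, not as text.
    return max(candidates,
               key=lambda r: int(ipaddress.IPv4Address(upstream_pe(r))))
```

   Because every PE applies the same deterministic comparison to the
   same set of routes, all PEs arrive at the same upstream PE.  Note
   that a textual comparison would rank 10.0.0.2 above 10.0.0.10, which
   is why the integer conversion matters.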

2790	   In summary when the CE-PE routing protocol for all PEs that belong to
2791	   a MVPN is not PIM-SM, selection of a consistent upstream PE to reach
2792	   C-S is sufficient to eliminate duplicates when C-S is multi-homed to
2793	   a single AS. When C-S is multi-homed to multiple ASes, duplicate
2794	   packet detection can be performed as the receiver PE can always
2795	   determine whether packets arrived on the wrong tunnel. When the CE-PE
2796	   routing protocol is PIM-SM, additional procedures as described above
2797	   are required to force all PEs within all ASes to switch to the C-S
2798	   tree from the C-RP tree when any PE switches to the C-S tree.

2800	10. Deployment Models

2802	   This section describes some optional deployment models and specific
2803	   procedures for those deployment models.

2805	10.1. Co-locating C-RPs on a PE

2807	   [MVPN-REQ] describes C-RP engineering as an issue when PIM-SM (or
2808	   bidir-PIM) is used in ASM mode on the VPN customer site. To quote
2809	   from [MVPN-REQ]:

2811	   "In some cases this engineering problem is not trivial: for instance,
2812	   if sources and receivers are located in VPN sites that are different
2813	   than that of the RP, then traffic may flow twice through the SP
2814	   network and the CE-PE link of the RP (from source to RP, and then
2815	   from RP to receivers) ; this is obviously not ideal.  A multicast VPN
2816	   solution SHOULD propose a way to help on solving this RP engineering
2817	   issue."

2819	   One of the C-RP deployment models is for the customer to outsource
2820	   the RP to the provider. In this case the provider may co-locate the
2821	   RP on the PE that is connected to the customer site [MVPN-REQ]. This
2822	   model is introduced in [RP-MVPN]. This section describes how
2823	   anycast-RP can be used to achieve this by advertising active
2824	   sources.

2826	10.1.1. Initial Configuration

2828	   For a particular MVPN, one or more PEs that have sites in
2829	   that MVPN act as an RP for the sites of that MVPN connected to these
2830	   PEs.  Within each MVPN all these RPs use the same (anycast) address.
2831	   All these RPs use the Anycast RP technique.

2833	10.1.2. Anycast RP Based on Propagating Active Sources

2835	   This mechanism is based on propagating active sources between RPs.

2837	   [Editor's Note: This is derived from the model in [RP-MVPN].]

2839	10.1.2.1. Receiver(s) Within a Site

2841	   The PE which receives C-Join for (*,G) or (S,G) does not send the
2842	   information that it has receiver(s) for G until it receives
2843	   information about active sources for G from an upstream PE.

2845	   On receiving this (described in the next section), the downstream PE
2846	   will respond with Join for C-(S,G). Sending this information could be
2847	   done using any of the procedures described in section 5. If BGP is
2848	   used, the ingress address is set to the upstream PE's address which
2849	   has triggered the source active information. Only the upstream PE
2850	   will process this information. If unicast PIM is used then a unicast
2851	   PIM message will have to be sent to the upstream PE that has
2852	   triggered the source active information. If an MI-PMSI is used then
2853	   further clarification is needed on the upstream neighbor address of
2854	   the PIM message and will be provided in a future revision.

2856	10.1.2.2. Source Within a Site

2858	   When a PE receives a PIM-Register from a site that belongs to a given
2859	   VPN, the PE follows the normal PIM anycast RP procedures. It then
2860	   advertises the source and group of the multicast data packet carried
2861	   in the PIM-Register message to other PEs in BGP using the following
2862	   information elements:

2864	     - Active source address

2866	     - Active group address

2868	     - Route target of the MVPN.

2870	   This advertisement goes to all the PEs that belong to that MVPN. When
2871	   a PE receives this advertisement, it checks whether there are any
2872	   receivers in the sites attached to the PE for the group carried in
2873	   the source active advertisement. If yes, then it generates an
2874	   advertisement for C-(S,G) as specified in the previous section.
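
   As an informal illustration only (not part of the specification),
   the receiver-side handling of a source active advertisement can be
   sketched in Python.  The function name, the parameter names, and the
   dictionary layout of the returned Join are hypothetical; the draft
   does not define them.

```python
from typing import Dict, Optional, Set


def on_source_active(source: str,
                     group: str,
                     upstream_pe: str,
                     groups_with_local_receivers: Set[str]) -> Optional[Dict[str, str]]:
    """Decide how a PE reacts to a source active advertisement.

    Returns a description of the C-(S,G) Join to send, or None when no
    attached site has receivers for the advertised group.
    """
    # The check is on the group only: a (*,G) or (S,G) C-Join from an
    # attached site makes this PE interested in active sources for G.
    if group not in groups_with_local_receivers:
        return None
    # The Join is directed at the PE that advertised the active source.
    return {
        "type": "C-Join",
        "source": source,
        "group": group,
        "upstream_pe": upstream_pe,
    }
```

   A PE with receivers for the group gets back a Join aimed at the
   advertising PE; any other PE gets None and stays silent.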

2876	   Note that the mechanism described in section 7.3.2 can be leveraged
2877	   to advertise an S-PMSI binding along with the source active messages.

2879	10.1.2.3. Receiver Switching from Shared to Source Tree

2881	   No additional procedures are required when multicast receivers in
2882	   a customer's site shift from shared tree to source tree.

2884	10.2. Using MSDP between a PE and a Local C-RP

2886	   Section 10.1 describes the case where each PE is a C-RP.  This
2887	   enables the PEs to know the active multicast sources for each MVPN,
2888	   and they can then use BGP to distribute this information to each
2889	   other.  As a result, the PEs do not have to join any shared C-trees,
2890	   and this results in a simplification of the PE operation.

2892	   In another deployment scenario, the PEs are not themselves C-RPs, but
2893	   use MSDP to talk to the C-RPs.  In particular, a PE which attaches to
2894	   a site that contains a C-RP becomes an MSDP peer of that C-RP.  That
2895	   PE then uses BGP to distribute the information about the active
2896	   sources to the other PEs.  When the PE determines, by MSDP, that a
2897	   particular source is no longer active, then it withdraws the
2898	   corresponding BGP update.  Then the PEs do not have to join any
2899	   shared C-trees, but they do not have to be C-RPs either.

2901	   MSDP provides the capability for a Source Active message to carry an
2902	   encapsulated data packet.  This capability can be used to allow an
2903	   MSDP speaker to receive the first (or first several) packet(s) of an
2904	   (S,G) flow, even though the MSDP speaker hasn't yet joined the (S,G)
2905	   tree.  (Presumably it will join that tree as a result of receiving
2906	   the SA message which carries the encapsulated data packet.)  If this
2907	   capability is not used, the first several data packets of an (S,G)
2908	   stream may be lost.

2910	   A PE which is talking MSDP to an RP may receive such an encapsulated
2911	   data packet from the RP.  The data packet should be decapsulated and
2912	   transmitted to the other PEs in the MVPN.  If the packet belongs to a
2913	   particular (S,G) flow, and if the PE is a transmitter for some S-PMSI
2914	   to which (S,G) has already been bound, the decapsulated data packet
2915	   should be transmitted on that S-PMSI.  Otherwise, if an I-PMSI exists
2916	   for that MVPN, the decapsulated data packet should be transmitted on
2917	   it.  (If a default MI-PMSI exists, this would typically be used.)  If
2918	   neither of these conditions holds, the decapsulated data packet is
2919	   not transmitted to the other PEs in the MVPN.  The decision as to
2920	   whether and how to transmit the decapsulated data packet does not
2921	   affect the processing of the SA control message itself.
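
   The PMSI selection described in the preceding paragraph can be
   sketched informally in Python.  This is an illustration under stated
   assumptions, not a normative procedure: the function name, the
   mapping from flows to S-PMSIs, and the string identifiers for PMSIs
   are all hypothetical.

```python
from typing import Dict, Optional, Tuple

Flow = Tuple[str, str]  # (C-source address, C-group address)


def pick_pmsi(flow: Flow,
              spmsi_by_flow: Dict[Flow, str],
              ipmsi: Optional[str]) -> Optional[str]:
    """Return the PMSI on which to send a decapsulated packet, or None.

    spmsi_by_flow holds only S-PMSIs for which this PE is a transmitter
    and to which the flow has already been bound.
    """
    # An S-PMSI already bound to (S,G) takes precedence.
    if flow in spmsi_by_flow:
        return spmsi_by_flow[flow]
    # Otherwise use the MVPN's I-PMSI, if one exists.
    if ipmsi is not None:
        return ipmsi
    # Neither exists: the packet is not sent to the other PEs.
    return None
```

   Whatever this function returns, the SA control message itself is
   processed in the usual way.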

2923	   Suppose that PE1 transmits a multicast data packet on a PMSI, where
2924	   that data packet is part of an (S,G) flow, and PE2 receives that
2925	   packet from that PMSI.  According to section 9, if PE1 is not the PE
2926	   that PE2 expects to be transmitting (S,G) packets, then PE2 must
2927	   discard the packet.  If an MSDP-encapsulated data packet is
2928	   transmitted on a PMSI as specified above, this rule from section 9
2929	   would likely result in the packet's getting discarded.  Therefore, if
2930	   MSDP-encapsulated data packets are being decapsulated and transmitted
2931	   on a PMSI, we need to modify the rules of section 9 as follows:

2933	      1. If the receiving PE, PE1, has already joined the (S,G) tree,
2934	         and has chosen PE2 as the upstream PE for the (S,G) tree, but
2935	         this packet does not come from PE2, PE1 must discard the
2936	         packet.

2938	      2. If the receiving PE, PE1, has not already joined the (S,G)
2939	         tree, but has a PIM adjacency to a CE which is downstream on
2940	         the (*,G) tree, the packet should be forwarded to the CE.
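
   The two modified rules can be sketched informally in Python as a
   single decision function.  The function name, the parameters, and
   the returned strings are hypothetical.  The "normal" and
   "not_covered" outcomes are assumptions made for completeness; the
   two rules above do not state them explicitly.

```python
from typing import Optional


def classify_pmsi_packet(joined_sg: bool,
                         chosen_upstream_pe: Optional[str],
                         sending_pe: str,
                         has_downstream_star_g_ce: bool) -> str:
    """Apply the modified section 9 rules at the receiving PE."""
    if joined_sg:
        # Rule 1: an (S,G) packet from any PE other than the chosen
        # upstream PE must be discarded.
        if sending_pe != chosen_upstream_pe:
            return "discard"
        # Assumption: a packet from the chosen upstream PE is handled
        # by the ordinary forwarding procedures.
        return "normal"
    # Rule 2: not yet on the (S,G) tree, but a CE is downstream on the
    # (*,G) tree, so the packet is passed to that CE.
    if has_downstream_star_g_ce:
        return "forward_to_ce"
    # Assumption: the rules above do not address this case.
    return "not_covered"
```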

2942	11. Encapsulations

2944	   The BGP-based auto-discovery procedures will ensure that the PEs in a
2945	   single MVPN only use tunnels that they can all support, and for a
2946	   given kind of tunnel, that they only use encapsulations that they can
2947	   all support.

2949	11.1. Encapsulations for Single PMSI per Tunnel

2951	11.1.1. Encapsulation in GRE

2953	   GRE encapsulation can be used for any PMSI that is instantiated by a
2954	   mesh of unicast tunnels, as well as for any PMSI that is instantiated
2955	   by one or more PIM tunnels of any sort.

2957	   Packets received        Packets in transit      Packets forwarded
2958	   at ingress PE           in the service          by egress PEs
2959	                           provider network

2961	                           +---------------+
2962	                           |  P-IP Header  |
2963	                           +---------------+
2964	                           |      GRE      |
2965	   ++=============++       ++=============++       ++=============++
2966	   || C-IP Header ||       || C-IP Header ||       || C-IP Header ||
2967	   ++=============++ >>>>> ++=============++ >>>>> ++=============++
2968	   || C-Payload   ||       || C-Payload   ||       || C-Payload   ||
2969	   ++=============++       ++=============++       ++=============++

2971	   The IP Protocol Number field in the P-IP Header must be set to 47.
2972	   The Protocol Type field of the GRE Header must be set to 0x0800.

2974	   When an encapsulated packet is transmitted by a particular PE, the
2975	   source IP address in the P-IP header must be the same address as is
2976	   advertised by that PE in the RPF information.

2978	   If the PMSI is instantiated by a PIM tree, the destination IP address
2979	   in the P-IP header is the group P-address associated with that tree.
2980	   The GRE key field value is omitted.

2982	   If the PMSI is instantiated by unicast tunnels, the destination IP
2983	   address is the address of the destination PE, and the optional GRE
2984	   Key field is used to identify a particular MVPN.  In this case, each
2985	   PE would have to advertise a key field value for each MVPN; each PE
2986	   would assign the key field value that it expects to receive.

2988	   [RFC2784] specifies an optional GRE checksum, and [RFC2890] specifies
2989	   an optional GRE sequence number field.

2991	   The GRE sequence number field is not needed because the transport
2992	   layer services for the original application will be provided by the
2993	   C-IP Header.

2995	   The use of GRE checksum field must follow [RFC2784].

2997	   To facilitate high speed implementation, this document recommends
2998	   that the ingress PE routers encapsulate VPN packets without setting
2999	   the checksum or sequence number fields.
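
   As an informal illustration, the GRE header described in this
   section can be built in Python as shown below.  This is a minimal
   sketch, not a normative encoding: the function and constant names
   are hypothetical, and the bit position of the Key Present flag is
   taken from [RFC2890].  The checksum and sequence number fields are
   never set, following the recommendation above.

```python
import struct
from typing import Optional

GRE_IP_PROTOCOL = 47      # Protocol Number carried in the P-IP header
GRE_PROTO_IPV4 = 0x0800   # GRE Protocol Type for an IPv4 C-packet
GRE_FLAG_KEY = 0x2000     # Key Present (K) bit in the first 16 bits


def gre_header(key: Optional[int] = None) -> bytes:
    """Build a GRE header with no checksum and no sequence number.

    key is None for a PMSI instantiated by a PIM tree (Key omitted).
    For unicast tunnels, key is the value the destination PE
    advertised for the MVPN.
    """
    if key is None:
        return struct.pack("!HH", 0, GRE_PROTO_IPV4)
    if not 0 <= key <= 0xFFFFFFFF:
        raise ValueError("GRE key must fit in 32 bits")
    return struct.pack("!HHI", GRE_FLAG_KEY, GRE_PROTO_IPV4, key)
```

   The header is 4 octets without the Key field and 8 octets with it.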

3001	11.1.2. Encapsulation in IP

3003	   IP-in-IP [RFC1853] is also a viable option.  When it is used, the
3004	   IPv4 Protocol Number field is set to 4. The following diagram shows
3005	   the progression of the packet as it enters and leaves the service
3006	   provider network.

3008	   Packets received        Packets in transit      Packets forwarded
3009	   at ingress PE           in the service          by egress PEs
3010	                           provider network

3012	                           +---------------+
3013	                           |  P-IP Header  |
3014	   ++=============++       ++=============++       ++=============++
3015	   || C-IP Header ||       || C-IP Header ||       || C-IP Header ||
3016	   ++=============++ >>>>> ++=============++ >>>>> ++=============++
3017	   || C-Payload   ||       || C-Payload   ||       || C-Payload   ||
3018	   ++=============++       ++=============++       ++=============++

3020	11.1.3. Encapsulation in MPLS

3022	   If the PMSI is instantiated as a P2MP MPLS LSP, MPLS encapsulation is
3023	   used. Penultimate-hop-popping must be disabled for the P2MP MPLS LSP.
3024	   If the PMSI is instantiated as an RSVP-TE P2MP LSP, additional MPLS
3025	   encapsulation procedures are used, as specified in [RSVP-P2MP].

3027	   If other methods of assigning MPLS labels to multicast distribution
3028	   trees are in use, these multicast distribution trees may be used as
3029	   appropriate to instantiate PMSIs, and any additional MPLS
3030	   encapsulation procedures may be used.

3032	   Packets received        Packets in transit      Packets forwarded
3033	   at ingress PE           in the service          by egress PEs
3034	                           provider network

3036	                           +---------------+
3037	                           | P-MPLS Header |
3038	   ++=============++       ++=============++       ++=============++
3039	   || C-IP Header ||       || C-IP Header ||       || C-IP Header ||
3040	   ++=============++ >>>>> ++=============++ >>>>> ++=============++
3041	   || C-Payload   ||       || C-Payload   ||       || C-Payload   ||
3042	   ++=============++       ++=============++       ++=============++

3044	11.2. Encapsulations for Multiple PMSIs per Tunnel

3046	   The encapsulations for transmitting multicast data messages when
3047	   there are multiple PMSIs per tunnel are based on the encapsulation
3048	   for a single PMSI per tunnel, but with an MPLS label used for
3049	   demultiplexing.

3051	   The label is upstream-assigned and distributed via BGP as specified
3052	   in section 4.  The label must enable the receiver to select the
3053	   proper VRF, and may enable the receiver to select a particular
3054	   multicast routing entry within that VRF.

3056	11.2.1. Encapsulation in GRE

3058	   Rather than the IP-in-GRE encapsulation discussed in section 11.1.1,
3059	   we use the MPLS-in-GRE encapsulation.  This is specified in [MPLS-
3060	   IP].  The GRE protocol type MUST be set to 0x8847. [The reason for
3061	   using the unicast rather than the multicast value is specified in
3062	   [MPLS-MCAST-ENCAPS].]
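
   An informal Python sketch of the MPLS-in-GRE framing follows.  It
   assumes a single label stack entry laid out as a 20-bit label, a
   3-bit EXP field, a bottom-of-stack bit, and an 8-bit TTL.  The
   function name, parameter names, and the default TTL of 64 are
   hypothetical; the TTL is a matter of local policy.

```python
import struct

GRE_PROTO_MPLS_UNICAST = 0x8847  # GRE Protocol Type required above


def mpls_in_gre(label: int, payload: bytes,
                ttl: int = 64, exp: int = 0) -> bytes:
    """Prefix payload with a GRE header and one MPLS label stack entry."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= exp < 8 or not 0 <= ttl < 256:
        raise ValueError("exp or ttl out of range")
    # Bottom-of-stack bit is set because only one label is pushed.
    entry = (label << 12) | (exp << 9) | (1 << 8) | ttl
    gre = struct.pack("!HH", 0, GRE_PROTO_MPLS_UNICAST)
    return gre + struct.pack("!I", entry) + payload
```

   The label here stands for the upstream-assigned label that selects
   the VRF at the receiver.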

3064	11.2.2. Encapsulation in IP

3066	   Rather than the IP-in-IP encapsulation discussed in section 11.1.2,
3067	   we use the MPLS-in-IP encapsulation.  This is specified in [MPLS-IP].
3068	   The IP protocol number MUST be set to the value identifying the
3069	   payload as an MPLS unicast packet. [There is no "MPLS multicast
3070	   packet" protocol number.]

3072	11.3. Encapsulations for Unicasting PIM Control Messages

3074	   When PIM control messages are unicast, rather than being sent on an
3075	   MI-PMSI, the receiving PE needs to determine the particular MVPN
3076	   whose multicast routing information is being carried in the PIM
3077	   message.  One method is to use a downstream-assigned MPLS label which
3078	   the receiving PE has allocated for this specific purpose.  The label
3079	   would be distributed via BGP.  This can be used with an MPLS, MPLS-
3080	   in-GRE, or MPLS-in-IP encapsulation.

3082	   A possible alternative is to modify the PIM messages themselves so
3083	   that they carry information which can be used to identify a
3084	   particular MVPN, such as an RT.

3086	   This area is still under consideration.

3088	11.4. General Considerations for IP and GRE Encaps

3090	   These apply also to the MPLS-in-IP and MPLS-in-GRE encapsulations.

3092	11.4.1. MTU

3094	   It is the responsibility of the originator of a C-packet to ensure
3095	   that the packet is small enough to reach all of its destinations,
3096	   even when it is encapsulated within IP or GRE.

3098	   When a packet is encapsulated in IP or GRE, the router that does the
3099	   encapsulation MUST set the DF bit in the outer header.  This ensures
3100	   that the decapsulating router will not need to reassemble the
3101	   encapsulating packets before performing decapsulation.

3103	   In some cases the encapsulating router may know that a particular C-
3104	   packet is too large to reach its destinations.  Procedures by which
3105	   it may know this are outside the scope of the current document.
3106	   However, if this is known, then:

3108	     - If the DF bit is set in the IP header of a C-packet which is
3109	       known to be too large, the router will discard the C-packet as
3110	       being "too large", and follow normal IP procedures (which may
3111	       require the return of an ICMP message to the source).

3113	     - If the DF bit is not set in the IP header of a C-packet which is
3114	       known to be too large, the router MAY fragment the packet before
3115	       encapsulating it, and then encapsulate each fragment separately.
3116	       Alternatively, the router MAY discard the packet.
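
   The handling of an oversized C-packet can be sketched informally in
   Python.  How the encapsulating router learns the tunnel MTU is out
   of scope, so tunnel_mtu is simply an input here.  The function name,
   parameter names, and returned strings are hypothetical, and the
   default overhead assumes an outer IPv4 header without options plus a
   GRE header without optional fields.

```python
OUTER_IPV4_HEADER = 20  # octets, assuming no IP options
GRE_BASE_HEADER = 4     # octets, no checksum, key, or sequence number


def handle_c_packet(length: int, df_set: bool, tunnel_mtu: int,
                    fragment_locally: bool,
                    overhead: int = OUTER_IPV4_HEADER + GRE_BASE_HEADER) -> str:
    """Decide what the encapsulating router does with a C-packet."""
    if length + overhead <= tunnel_mtu:
        return "encapsulate"
    if df_set:
        # Known to be too large and DF is set: discard and follow
        # normal IP procedures (which may require an ICMP message).
        return "discard_too_large"
    # DF is clear: the router MAY fragment before encapsulating each
    # fragment separately, or MAY discard the packet.
    if fragment_locally:
        return "fragment_then_encapsulate"
    return "discard"
```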

3118	   If the router discards a packet as too large, it should maintain OAM
3119	   information related to this behavior, allowing the operator to
3120	   properly troubleshoot the issue.

3122	   Note that if the entire path of the tunnel does not support an MTU
3123	   which is large enough to carry a particular encapsulated C-packet,
3124	   and if the encapsulating router does not do fragmentation, then the
3125	   customer will not receive the expected connectivity.

3127	11.4.2. TTL

3129	   The ingress PE should not copy the TTL field from the payload IP
3130	   header received from a CE router to the delivery IP or MPLS header.
3131	   The setting of the TTL of the delivery header is determined by the
3132	   local policy of the ingress PE router.

3134	11.4.3. Differentiated Services

3136	   The setting of the DS field in the delivery IP header should follow
3137	   the guidelines outlined in [RFC2983].  Setting the EXP field in the
3138	   delivery MPLS header should follow the guidelines in [RFC3270]. An SP
3139	   may also choose to deploy any of the additional mechanisms the PE
3140	   routers support.

3142	11.4.4. Avoiding Conflict with Internet Multicast

3144	   If the SP is providing Internet multicast, distinct from its VPN
3145	   multicast services, and using PIM based P-multicast trees, it must
3146	   ensure that the group P-addresses which it uses in support of MVPN
3147	   services are distinct from any of the group addresses of the Internet
3148	   multicasts it supports.  This is best done by using administratively
3149	   scoped addresses [ADMIN-ADDR].

3151	   The group C-addresses need not be distinct from either the group P-
3152	   addresses or the Internet multicast addresses.

3154	12. Security Considerations

3156	   To be supplied.

3158	13. IANA Considerations

3160	   To be supplied.

3162	14. Other Authors

3164	   Sarveshwar Bandi, Yiqun Cai, Thomas Morin, Yakov Rekhter, IJsbrand
3165	   Wijnands, Seisho Yasukawa

3167	15. Other Contributors

3169	   Significant contributions were made by Arjen Boers, Toerless Eckert,
3170	   Adrian Farrel, Luyuan Fang, Dino Farinacci, Lenny Guiliano, Shankar
3171	   Karuna, Anil Lohiya, Tom Pusateri, Ted Qian, Robert Raszuk, Tony
3172	   Speakman, Dan Tappan.

3174	16. Authors' Addresses

3176	      Rahul Aggarwal (Editor)
3177	      Juniper Networks
3178	      1194 North Mathilda Ave.
3179	      Sunnyvale, CA 94089
3180	      Email: rahul@juniper.net

3182	      Sarveshwar Bandi
3183	      Motorola
3184	      Vanenburg IT park, Madhapur,
3185	      Hyderabad, India
3186	      Email: sarvesh@motorola.com

3188	      Yiqun Cai
3189	      Cisco Systems, Inc.
3190	      170 Tasman Drive
3191	      San Jose, CA, 95134
3192	      E-mail: ycai@cisco.com

3194	      Thomas Morin
3195	      France Telecom R & D
3196	      2, avenue Pierre-Marzin
3197	      22307 Lannion Cedex
3198	      France
3199	      Email: thomas.morin@francetelecom.com

3201	      Yakov Rekhter
3202	      Juniper Networks
3203	      1194 North Mathilda Ave.
3204	      Sunnyvale, CA 94089
3205	      Email: yakov@juniper.net

3206	      Eric C. Rosen (Editor)
3207	      Cisco Systems, Inc.
3208	      1414 Massachusetts Avenue
3209	      Boxborough, MA, 01719
3210	      E-mail: erosen@cisco.com

3212	      IJsbrand Wijnands
3213	      Cisco Systems, Inc.
3214	      170 Tasman Drive
3215	      San Jose, CA, 95134
3216	      E-mail: ice@cisco.com

3218	      Seisho Yasukawa
3219	      NTT Corporation
3220	      9-11, Midori-Cho 3-Chome
3221	      Musashino-Shi, Tokyo 180-8585,
3222	      Japan
3223	      Phone: +81 422 59 4769
3224	      Email: yasukawa.seisho@lab.ntt.co.jp

3226	17. Normative References

3228	   [MVPN-BGP] R. Aggarwal, E. Rosen, T. Morin, Y. Rekhter, C.
3229	   Kodeboniya, "BGP Encodings for Multicast in MPLS/BGP IP VPNs",
3230	   draft-ietf-l3vpn-2547bis-mcast-bgp-02.txt, March 2007

3232	   [MPLS-IP] T. Worster, Y. Rekhter, E. Rosen, "Encapsulating MPLS in IP
3233	   or Generic Routing Encapsulation (GRE)", RFC 4023, March 2005

3235	   [MPLS-MCAST-ENCAPS] T. Eckert, E. Rosen, R. Aggarwal, Y. Rekhter,
3236	   "MPLS Multicast Encapsulations", draft-ietf-mpls-multicast-encaps-
3237	   04.txt, April 2007

3239	   [MPLS-UPSTREAM-LABEL] R. Aggarwal, Y. Rekhter, E. Rosen, "MPLS
3240	   Upstream Label Assignment and Context Specific Label Space", draft-
3241	   ietf-mpls-upstream-label-02.txt, March 2007

3243	   [PIM-SM]  "Protocol Independent Multicast - Sparse Mode (PIM-SM)",
3244	   Fenner, Handley, Holbrook, Kouvelas, August 2006, RFC 4601

3246	   [RFC2119] "Key words for use in RFCs to Indicate Requirement
3247	   Levels.", Bradner, March 1997

3249	   [RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, et al., February 2006

3251	   [RSVP-P2MP] R. Aggarwal, et al., "Extensions to RSVP-TE for Point to
3252	   Multipoint TE LSPs", draft-ietf-mpls-rsvp-te-p2mp-07.txt, January
3253	   2007

3255	18. Informative References

3257	   [ADMIN-ADDR] D. Meyer, "Administratively Scoped IP Multicast", RFC
3258	   2365, July 1998

3260	   [MVPN-REQ] T. Morin, Ed., "Requirements for Multicast in L3
3261	   Provider-Provisioned VPNs", RFC 4834, April 2007

3263	   [MVPN-BASE] R. Aggarwal, A. Lohiya, T. Pusateri, Y. Rekhter, "Base
3264	   Specification for Multicast in MPLS/BGP VPNs", draft-raggarwa-l3vpn-
3265	   2547-mvpn-00.txt

3267	   [RAGGARWA-MCAST] R. Aggarwal, et al., "Multicast in BGP MPLS VPNs
3268	   and VPLS", draft-raggarwa-l3vpn-mvpn-vpls-mcast-01.txt

3270	   [ROSEN-8] E. Rosen, Y. Cai, I. Wijnands, "Multicast in MPLS/BGP IP
3271	   VPNs", draft-rosen-vpn-mcast-08.txt

3273	   [RP-MVPN] S. Yasukawa, et al., "BGP/MPLS IP Multicast VPNs", draft-
3274	   yasukawa-l3vpn-p2mp-mcast-01.txt

3276	   [RFC1853] W. Simpson, "IP in IP Tunneling", October 1995

3278	   [RFC2784] D. Farinacci, et al., "Generic Routing Encapsulation",
3279	   March 2000

3281	   [RFC2890] G. Dommety, "Key and Sequence Number Extensions to GRE",
3282	   September 2000

3284	   [RFC2983] D. Black, "Differentiated Services and Tunnels", October
3285	   2000

3287	   [RFC3270] F. Le Faucheur, et al., "MPLS Support of Differentiated
3288	   Services", May 2002

3290	19. Full Copyright Statement

3292	   Copyright (C) The IETF Trust (2007).

3294	   This document is subject to the rights, licenses and restrictions
3295	   contained in BCP 78, and except as set forth therein, the authors
3296	   retain all their rights.

3298	   This document and the information contained herein are provided on an
3299	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
3300	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
3301	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
3302	   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
3303	   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
3304	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

3306	20. Intellectual Property

3308	   The IETF takes no position regarding the validity or scope of any
3309	   Intellectual Property Rights or other rights that might be claimed to
3310	   pertain to the implementation or use of the technology described in
3311	   this document or the extent to which any license under such rights
3312	   might or might not be available; nor does it represent that it has
3313	   made any independent effort to identify any such rights.  Information
3314	   on the procedures with respect to rights in RFC documents can be
3315	   found in BCP 78 and BCP 79.

3317	   Copies of IPR disclosures made to the IETF Secretariat and any
3318	   assurances of licenses to be made available, or the result of an
3319	   attempt made to obtain a general license or permission for the use of
3320	   such proprietary rights by implementers or users of this
3321	   specification can be obtained from the IETF on-line IPR repository at
3322	   http://www.ietf.org/ipr.

3324	   The IETF invites any interested party to bring to its attention any
3325	   copyrights, patents or patent applications, or other proprietary
3326	   rights that may cover technology that may be required to implement
3327	   this standard.  Please address the information to the IETF at ietf-
3328	   ipr@ietf.org.