idnits 2.17.1

draft-ietf-l3vpn-2547bis-mcast-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 19.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 3757.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 3768.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 3775.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 3781.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     If the S-PMSI is instantiated by a source-initiated P-multicast tree
     (e.g., an RSVP-TE P2MP tunnel), the PE at the root of the tree must
     establish the source-initiated P-multicast tree to the leaves.
     This tree MAY have been established before the leaves receive the
     S-PMSI binding, or MAY be established after the leaves receives the
     binding.  The leaves MUST not switch to the S-PMSI until they
     receive both the binding and the tree signaling message.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment.  If not, you may need to add the pre-RFC5378
     disclaimer.  (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 2007) is 6129 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-08) exists of
     draft-ietf-l3vpn-2547bis-mcast-bgp-03

  == Outdated reference: A later version (-10) exists of
     draft-ietf-mpls-multicast-encaps-06

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mpls-upstream-label-02

  == Outdated reference: A later version (-06) exists of
     draft-ietf-pim-join-attributes-03

  ** Obsolete normative reference: RFC 4601 (ref. 'PIM-SM') (Obsoleted by
     RFC 7761)

  == Outdated reference: A later version (-15) exists of
     draft-rosen-vpn-mcast-08

     Summary: 2 errors (**), 0 flaws (~~), 8 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                   Eric C. Rosen (Editor)
Internet Draft                                             Cisco Systems, Inc.
Expiration Date: January 2008
                                                       Rahul Aggarwal (Editor)
                                                              Juniper Networks

                                                                     July 2007

                        Multicast in MPLS/BGP IP VPNs

                    draft-ietf-l3vpn-2547bis-mcast-05.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   In order for IP multicast traffic within a BGP/MPLS IP VPN (Virtual
   Private Network) to travel from one VPN site to another, special
   protocols and procedures must be implemented by the VPN Service
   Provider.  These protocols and procedures are specified in this
   document.

Table of Contents

   1          Specification of requirements  ........................   5
   2          Introduction  .........................................   5
   2.1        Optimality vs Scalability  ............................   5
   2.1.1      Multicast Distribution Trees  .........................   7
   2.1.2      Ingress Replication through Unicast Tunnels  ..........   8
   2.2        Overview  .............................................   8
   2.2.1      Multicast Routing Adjacencies  ........................   8
   2.2.2      MVPN Definition  ......................................
                                                                         9
   2.2.3      Auto-Discovery  .......................................  10
   2.2.4      PE-PE Multicast Routing Information  ..................  10
   2.2.5      PE-PE Multicast Data Transmission  ....................  11
   2.2.6      Inter-AS MVPNs  .......................................  12
   2.2.7      Optionally Eliminating Shared Tree State  .............  12
   3          Concepts and Framework  ...............................  12
   3.1        PE-CE Multicast Routing  ..............................  12
   3.2        P-Multicast Service Interfaces (PMSIs)  ...............  14
   3.2.1      Inclusive and Selective PMSIs  ........................  15
   3.2.2      Tunnels Instantiating PMSIs  ..........................  16
   3.3        Use of PMSIs for Carrying Multicast Data  .............  18
   3.3.1      MVPNs with Default MI-PMSIs  ..........................  18
   3.3.2      When MI-PMSIs are Required  ...........................  19
   3.3.3      MVPNs That Do Not Use MI-PMSIs  .......................  19
   3.4        PE-PE Transmission of C-Multicast Routing  ............  19
   3.4.1      PIM Peering  ..........................................  19
   3.4.1.1    Full Per-MVPN PIM Peering Across a MI-PMSI  ...........  19
   3.4.1.2    Lightweight PIM Peering Across a MI-PMSI  .............  20
   3.4.1.3    Unicasting of PIM C-Join/Prune Messages  ..............  21
   3.4.2      Using BGP to Carry C-Multicast Routing  ...............  21
   4          BGP-Based Autodiscovery of MVPN Membership  ...........  21
   5          PE-PE Transmission of C-Multicast Routing  ............  24
   5.1        Selecting the Upstream Multicast Hop (UMH)  ...........  25
   5.1.1      Eligible Routes for UMH Selection  ....................  25
   5.1.2      Information Carried by Eligible UMH Routes  ...........  26
   5.1.3      Selecting the Upstream PE  ............................  26
   5.1.4      Selecting the Upstream Multicast Hop  .................  28
   5.2        Details of Per-MVPN Full PIM Peering over MI-PMSI  ....  28
   5.2.1      PIM C-Instance Control Packets  .......................
                                                                        29
   5.2.2      PIM C-instance RPF Determination  .....................  29
   5.2.3      Backwards Compatibility  ..............................  30
   5.3        Use of BGP for Carrying C-Multicast Routing  ..........  30
   5.3.1      Sending BGP Updates  ..................................  30
   5.3.2      Explicit Tracking  ....................................  32
   5.3.3      Withdrawing BGP Updates  ..............................  32
   6          I-PMSI Instantiation  .................................  32
   6.1        MVPN Membership and Egress PE Auto-Discovery  .........  33
   6.1.1      Auto-Discovery for Ingress Replication  ...............  33
   6.1.2      Auto-Discovery for P-Multicast Trees  .................  34
   6.2        C-Multicast Routing Information Exchange  .............  34
   6.3        Aggregation  ..........................................  34
   6.3.1      Aggregate Tree Leaf Discovery  ........................  35
   6.3.2      Aggregation Methodology  ..............................  35
   6.3.3      Encapsulation of the Aggregate Tree  ..................  36
   6.3.4      Demultiplexing C-multicast traffic  ...................  36
   6.4        Mapping Received Packets to MVPNs  ....................  37
   6.4.1      Unicast Tunnels  ......................................  38
   6.4.2      Non-Aggregated P-Multicast Trees  .....................  38
   6.4.3      Aggregate P-Multicast Trees  ..........................  39
   6.5        I-PMSI Instantiation Using Ingress Replication  .......  39
   6.6        Establishing P-Multicast Trees  .......................  40
   6.7        RSVP-TE P2MP LSPs  ....................................  41
   6.7.1      P2MP TE LSP Tunnel - MVPN Mapping  ....................  41
   6.7.2      Demultiplexing C-Multicast Data Packets  ..............  42
   7          Optimizing Multicast Distribution via S-PMSIs  ........  42
   7.1        S-PMSI Instantiation Using Ingress Replication  .......  43
   7.2        Protocol for Switching to S-PMSIs  ....................  44
   7.2.1      A UDP-based Protocol for Switching to S-PMSIs  ........
                                                                        44
   7.2.1.1    Binding a Stream to an S-PMSI  ........................  44
   7.2.1.2    Packet Formats and Constants  .........................  45
   7.2.2      A BGP-based Protocol for Switching to S-PMSIs  ........  47
   7.2.2.1    Advertising C-(S, G) Binding to a S-PMSI using BGP  ...  47
   7.2.2.2    Explicit Tracking  ....................................  49
   7.2.2.3    Switching to S-PMSI  ..................................  49
   7.3        Aggregation  ..........................................  50
   7.4        Instantiating the S-PMSI with a PIM Tree  .............  50
   7.5        Instantiating S-PMSIs using RSVP-TE P2MP Tunnels  .....  51
   8          Inter-AS Procedures  ..................................  51
   8.1        Non-Segmented Inter-AS Tunnels  .......................  52
   8.1.1      Inter-AS MVPN Auto-Discovery  .........................  52
   8.1.2      Inter-AS MVPN Routing Information Exchange  ...........  52
   8.1.3      Inter-AS P-Tunnels  ...................................  53
   8.1.4      PIM-Based Inter-AS P-Multicast Trees  .................  53
   8.2        Segmented Inter-AS Tunnels  ...........................  54
   8.2.1      Inter-AS MVPN Auto-Discovery Routes  ..................  54
   8.2.1.1    Originating Inter-AS MVPN A-D Information  ............  55
   8.2.1.2    Propagating Inter-AS MVPN A-D Information  ............  56
   8.2.1.2.1  Inter-AS Auto-Discovery Route received via EBGP  ......  56
   8.2.1.2.2  Leaf Auto-Discovery Route received via EBGP  ..........  57
   8.2.1.2.3  Inter-AS Auto-Discovery Route received via IBGP  ......  57
   8.2.2      Inter-AS MVPN Routing Information Exchange  ...........  59
   8.2.3      Inter-AS I-PMSI  ......................................  59
   8.2.3.1    Support for Unicast VPN Inter-AS Methods  .............  60
   8.2.4      Inter-AS S-PMSI  ......................................  60
   9          Duplicate Packet Detection and Single Forwarder PE  ...  61
   9.1        Multihomed C-S or C-RP  ...............................
                                                                        62
   9.1.1      Single forwarder PE selection  ........................  63
   9.2        Switching from the C-RP tree to C-S tree  .............  64
   10         Eliminating PE-PE Distribution of (C-*,C-G) State  ....  65
   10.1       Co-locating C-RPs on a PE  ............................  66
   10.1.1     Initial Configuration  ................................  66
   10.1.2     Anycast RP Based on Propagating Active Sources  .......  67
   10.1.2.1   Receiver(s) Within a Site  ............................  67
   10.1.2.2   Source Within a Site  .................................  67
   10.1.2.3   Receiver Switching from Shared to Source Tree  ........  68
   10.2       Using MSDP between a PE and a Local C-RP  .............  68
   11         Encapsulations  .......................................  69
   11.1       Encapsulations for Single PMSI per Tunnel  ............  69
   11.1.1     Encapsulation in GRE  .................................  69
   11.1.2     Encapsulation in IP  ..................................  71
   11.1.3     Encapsulation in MPLS  ................................  71
   11.2       Encapsulations for Multiple PMSIs per Tunnel  .........  72
   11.2.1     Encapsulation in GRE  .................................  72
   11.2.2     Encapsulation in IP  ..................................  72
   11.3       Encapsulations Identifying a Distinguished PE  ........  73
   11.3.1     For MP2MP LSP P-tunnels  ..............................  73
   11.3.2     For Support of PIM-BIDIR C-Groups  ....................  73
   11.4       Encapsulations for Unicasting PIM Control Messages  ...  74
   11.5       General Considerations for IP and GRE Encaps  .........  74
   11.5.1     MTU  ..................................................  74
   11.5.2     TTL  ..................................................  75
   11.5.3     Differentiated Services  ..............................  75
   11.5.4     Avoiding Conflict with Internet Multicast  ............  75
   12         Support for PIM-BIDIR C-Groups  .......................
                                                                        76
   12.1       The VPN Backbone Becomes the RPL  .....................  77
   12.1.1     Control Plane  ........................................  77
   12.1.2     Data Plane  ...........................................  78
   12.2       Partitioned Sets of PEs  ..............................  78
   12.2.1     Partitions  ...........................................  78
   12.2.2     Using PE Labels  ......................................  79
   12.2.3     Mesh of MP2MP P-Tunnels  ..............................  80
   13         Security Considerations  ..............................  80
   14         IANA Considerations  ..................................  80
   15         Other Authors  ........................................  80
   16         Other Contributors  ...................................  80
   17         Authors' Addresses  ...................................  80
   18         Normative References  .................................  82
   19         Informative References  ...............................  83
   20         Full Copyright Statement  .............................  83
   21         Intellectual Property  ................................  84

1. Specification of requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Introduction

   [RFC4364] specifies the set of procedures which a Service Provider
   (SP) must implement in order to provide a particular kind of VPN
   service ("BGP/MPLS IP VPN") for its customers.  The service
   described therein allows IP unicast packets to travel from one
   customer site to another, but it does not provide a way for IP
   multicast traffic to travel from one customer site to another.

   This document extends the service defined in [RFC4364] so that it
   also includes the capability of handling IP multicast traffic.  This
   requires a number of different protocols to work together.
   The document provides a framework describing how the various
   protocols fit together, and also provides a detailed specification
   of some of the protocols.  The detailed specification of the other
   protocols is found in pre-existing documents or in companion
   documents.

2.1. Optimality vs Scalability

   In a "BGP/MPLS IP VPN" [RFC4364], unicast routing of VPN packets is
   achieved without the need to keep any per-VPN state in the core of
   the SP's network (the "P routers").  Routing information from a
   particular VPN is maintained only by the Provider Edge routers (the
   "PE routers", or "PEs") that attach directly to sites of that VPN.
   Customer data travels through the P routers in tunnels from one PE
   to another (usually MPLS Label Switched Paths, LSPs), so to support
   the VPN service the P routers only need to have routes to the PE
   routers.

   The PE-to-PE routing is optimal, and the amount of associated state
   in the P routers depends only on the number of PEs, not on the
   number of VPNs.

   However, in order to provide optimal multicast routing for a
   particular multicast flow, the P routers through which that flow
   travels have to hold state which is specific to that flow.
   Scalability would be poor if the amount of state in the P routers
   were proportional to the number of multicast flows in the VPNs.
   Therefore, when supporting multicast service for a BGP/MPLS IP VPN,
   the optimality of the multicast routing must be traded off against
   the scalability of the P routers.  We explain this below in more
   detail.

   If a particular VPN is transmitting "native" multicast traffic over
   the backbone, we refer to it as an "MVPN".
By "native" multicast
   traffic, we mean packets that a CE sends to a PE, such that the IP
   destination address of the packets is a multicast group address, or
   the packets are multicast control packets addressed to the PE router
   itself, or the packets are IP multicast data packets encapsulated in
   MPLS.

   We say that the backbone multicast routing for a particular
   multicast group in a particular VPN is "optimal" if and only if all
   of the following conditions hold:

     - When a PE router receives a multicast data packet of that group
       from a CE router, it transmits the packet in such a way that the
       packet is received by every other PE router which is on the path
       to a receiver of that group;

     - The packet is not received by any other PEs;

     - While in the backbone, no more than one copy of the packet ever
       traverses any link;

     - While in the backbone, if bandwidth usage is to be optimized,
       the packet traverses minimum cost trees rather than shortest
       path trees.

   Optimal routing for a particular multicast group requires that the
   backbone maintain one or more source trees which are specific to
   that group.  Each such tree requires that state be maintained in all
   the P routers that are in the tree.

   This would potentially require an unbounded amount of state in the P
   routers, since the SP has no control over the number of multicast
   groups in the VPNs that it supports.  Nor does the SP have any
   control over the number of transmitters in each group, or over the
   distribution of the receivers.

   The procedures defined in this document allow an SP to provide
   multicast VPN service without requiring the amount of state
   maintained by the P routers to be proportional to the number of
   multicast data flows in the VPNs.  The amount of state is traded off
   against the optimality of the multicast routing.
Enough flexibility
   is provided so that a given SP can make its own tradeoffs between
   scalability and optimality.  An SP can even allow some multicast
   groups in some VPNs to receive optimal routing, while others do not.
   Of course, the cost of this flexibility is an increase in the number
   of options provided by the protocols.

   The basic technique for providing scalability is to aggregate a
   number of customer multicast flows onto a single multicast
   distribution tree through the P routers.  A number of aggregation
   methods are supported.

   The procedures defined in this document also accommodate an SP that
   does not want to build multicast distribution trees in its backbone
   at all; the ingress PE can replicate each multicast data packet and
   then unicast each replica through a tunnel to each egress PE that
   needs to receive the data.

2.1.1. Multicast Distribution Trees

   This document supports the use of a single multicast distribution
   tree in the backbone to carry all the multicast traffic from a
   specified set of one or more MVPNs.  Such a tree is referred to as
   an "Inclusive Tree".  An Inclusive Tree which carries the traffic of
   more than one MVPN is an "Aggregate Inclusive Tree".  An Inclusive
   Tree contains, as its members, all the PEs that attach to any of the
   MVPNs using the tree.

   With this option, even if each tree supports only one MVPN, the
   upper bound on the amount of state maintained by the P routers is
   proportional to the number of VPNs supported, rather than to the
   number of multicast flows in those VPNs.  If the trees are
   unidirectional, it would be more accurate to say that the state is
   proportional to the product of the number of VPNs and the average
   number of PEs per VPN.  The amount of state maintained by the P
   routers can be further reduced by aggregating more MVPNs onto a
   single tree.
If each such tree supports a set of MVPNs
   (call it an "MVPN aggregation set"), the state maintained by the P
   routers is proportional to the product of the number of MVPN
   aggregation sets and the average number of PEs per MVPN.  Thus the
   state does not grow linearly with the number of MVPNs.

   However, as data from many multicast groups is aggregated together
   onto a single Inclusive Tree, it is likely that some PEs will
   receive multicast data for which they have no need, i.e., some
   degree of optimality has been sacrificed.

   This document also provides procedures which enable a single
   multicast distribution tree in the backbone to be used to carry
   traffic belonging only to a specified set of one or more multicast
   groups, from one or more MVPNs.  Such a tree is referred to as a
   "Selective Tree", and more specifically as an "Aggregate Selective
   Tree" when the multicast groups belong to different MVPNs.  By
   default, traffic from most multicast groups could be carried by an
   Inclusive Tree, while traffic from, e.g., high-bandwidth groups
   could be carried in one of the Selective Trees.  When setting up a
   Selective Tree, one should include only those PEs which need to
   receive multicast data from one or more of the groups assigned to
   the tree.  This provides more optimal routing than can be obtained
   by using only Inclusive Trees, though it requires additional state
   in the P routers.

2.1.2. Ingress Replication through Unicast Tunnels

   This document also provides procedures for carrying MVPN data
   traffic through unicast tunnels from the ingress PE to each of the
   egress PEs.  The ingress PE replicates the multicast data packet
   received from a CE and sends it to each of the egress PEs using the
   unicast tunnels.
This requires no multicast routing state in the P
   routers at all, but it puts the entire replication load on the
   ingress PE router, and it makes no attempt to optimize the multicast
   routing.

2.2. Overview

2.2.1. Multicast Routing Adjacencies

   In BGP/MPLS IP VPNs [RFC4364], each CE ("Customer Edge") router is a
   unicast routing adjacency of a PE router, but CE routers at
   different sites do not become unicast routing adjacencies of each
   other.  This important characteristic is retained for multicast
   routing -- a CE router becomes a multicast routing adjacency of a PE
   router, but CE routers at different sites do not become multicast
   routing adjacencies of each other.

   The multicast routing protocol on the PE-CE link is presumed to be
   PIM.  Sparse Mode, Dense Mode, Single Source Mode, and Bidirectional
   Mode are all supported.  A CE router exchanges "ordinary" PIM
   control messages with the PE router to which it is attached.

   The PEs attaching to a particular MVPN then have to exchange the
   multicast routing information with each other.  Two basic methods
   for doing this are defined: (1) PE-PE PIM, and (2) BGP.  In the
   former case, the PEs need to be multicast routing adjacencies of
   each other.  In the latter case, they do not; for example, each PE
   may be a BGP adjacency of a Route Reflector (RR), and not of any
   other PE.

   To support the "Carrier's Carrier" model of [RFC4364], mLDP or BGP
   can be used on the PE-CE interface.  This will be described in
   subsequent versions of this document.

2.2.2. MVPN Definition

   An MVPN is defined by two sets of sites, the Sender Sites set and
   the Receiver Sites set, with the following properties:

     - Hosts within the Sender Sites set could originate multicast
       traffic for receivers in the Receiver Sites set.

     - Receivers not in the Receiver Sites set should not be able to
       receive this traffic.

     - Hosts within the Receiver Sites set could receive multicast
       traffic originated by any host in the Sender Sites set.

     - Hosts within the Receiver Sites set should not be able to
       receive multicast traffic originated by any host that is not in
       the Sender Sites set.

   A site could be both in the Sender Sites set and the Receiver Sites
   set, which implies that hosts within such a site could both
   originate and receive multicast traffic.  An extreme case is when
   the Sender Sites set is the same as the Receiver Sites set, in which
   case all sites could originate and receive multicast traffic from
   each other.

   Sites within a given MVPN may be either within the same or in
   different organizations, which implies that an MVPN can be either an
   Intranet or an Extranet.

   A given site may be in more than one MVPN, which implies that MVPNs
   may overlap.

   Not all sites of a given MVPN have to be connected to the same
   service provider, which implies that an MVPN can span multiple
   service providers.

   Another way to look at an MVPN is to say that an MVPN is defined by
   a set of administrative policies.  Such policies determine both the
   Sender Sites set and the Receiver Sites set.  Such policies are
   established by MVPN customers, but implemented/realized by MVPN
   Service Providers using the existing BGP/MPLS VPN mechanisms, such
   as Route Targets, with extensions as necessary.

2.2.3. Auto-Discovery

   In order for the PE routers attaching to a given MVPN to exchange
   MVPN control information with each other, each one needs to discover
   all the other PEs that attach to the same MVPN.  (Strictly speaking,
   a PE in the Receiver Sites set need only discover the other PEs in
   the Sender Sites set, and a PE in the Sender Sites set need only
   discover the other PEs in the Receiver Sites set.)  This is referred
   to as "MVPN Auto-Discovery".
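
   The discovery problem just described can be pictured as a simple
   membership registry: each PE advertises, for every MVPN VRF it is
   configured with, its membership in that MVPN, and every PE can then
   compute the set of other PEs attached to the same MVPN.  The sketch
   below is illustrative only; the central collection point stands in
   for BGP (e.g., a Route Reflector), and the class and method names
   are this sketch's own, not taken from this specification.

```python
from collections import defaultdict

class RouteReflector:
    """Illustrative collector of MVPN membership advertisements."""

    def __init__(self):
        self.members = defaultdict(set)  # MVPN name -> set of PE ids

    def advertise(self, pe, mvpns):
        # One membership advertisement per MVPN VRF configured on the PE.
        for mvpn in mvpns:
            self.members[mvpn].add(pe)

    def peers_of(self, pe, mvpn):
        # The other PEs that a given PE discovers for an MVPN.
        return self.members[mvpn] - {pe}

rr = RouteReflector()
rr.advertise("PE1", ["VPN-A", "VPN-B"])
rr.advertise("PE2", ["VPN-A"])
rr.advertise("PE3", ["VPN-B"])

assert rr.peers_of("PE1", "VPN-A") == {"PE2"}
assert rr.peers_of("PE1", "VPN-B") == {"PE3"}
```

   The same registry also illustrates the parenthetical remark above:
   in a full implementation, a receiver-site PE would filter the result
   down to the sender-site PEs, and vice versa.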

   This document discusses two ways of providing MVPN auto-discovery:

     - BGP can be used for discovering and maintaining MVPN membership.
       The PE routers advertise their MVPN membership to other PE
       routers using BGP.  A PE is considered to be a "member" of a
       particular MVPN if it contains a VRF (Virtual Routing and
       Forwarding table, see [RFC4364]) which is configured to contain
       the multicast routing information of that MVPN.  This auto-
       discovery option does not make any assumptions about the methods
       used for transmitting MVPN multicast data packets through the
       backbone.

     - If it is known that the multicast data packets of a particular
       MVPN are to be transmitted (at least, by default) through a
       non-aggregated Inclusive Tree which is to be set up by PIM-SM or
       BIDIR-PIM, and if the PEs attaching to that MVPN are configured
       with the group address corresponding to that tree, then the PEs
       can auto-discover each other simply by joining the tree and then
       multicasting PIM Hellos over the tree.

2.2.4. PE-PE Multicast Routing Information

   The BGP/MPLS IP VPN [RFC4364] specification requires a PE to
   maintain at most one BGP peering with every other PE in the network.
   This peering is used to exchange VPN routing information.  The use
   of Route Reflectors further reduces the number of BGP adjacencies
   maintained by a PE to exchange VPN routing information with other
   PEs.  This document describes various options for exchanging MVPN
   control information between PE routers, based on the use of PIM or
   BGP.  These options have different overheads with respect to the
   number of routing adjacencies that a PE router needs to maintain in
   order to exchange MVPN control information with other PE routers.
   Some of these options retain the unicast BGP/MPLS VPN model,
   letting a PE maintain at most one routing adjacency with other PE
   routers to exchange MVPN control information.
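
   The adjacency overhead just described can be illustrated with simple
   arithmetic: per-MVPN full-mesh PIM peering scales with the number of
   PEs, while BGP through a Route Reflector keeps a constant number of
   sessions per PE.  The function names below are invented for this
   illustration; they do not appear in the specification.

```python
def full_mesh_adjacencies_per_pe(num_pes):
    # PE-PE PIM peering: each PE maintains an adjacency with every
    # other PE attached to the same MVPN.
    return num_pes - 1

def rr_adjacencies_per_pe(num_rrs=1):
    # BGP via Route Reflector(s): a PE needs only its RR session(s),
    # independent of the number of other PEs (or of MVPNs).
    return num_rrs

assert full_mesh_adjacencies_per_pe(100) == 99
assert rr_adjacencies_per_pe() == 1
```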

   The solution in [RFC4364] uses BGP to exchange VPN routing
   information between PE routers.  This document describes various
   solutions for exchanging MVPN control information.  One option is
   the use of BGP, which provides a reliable transport.  Another option
   is the use of the currently existing, "soft state" PIM standard
   [PIM-SM].

2.2.5. PE-PE Multicast Data Transmission

   Like [RFC4364], this document decouples the procedures for
   exchanging routing information from the procedures for transmitting
   data traffic.  Hence a variety of transport technologies may be used
   in the backbone.  For inclusive trees, these transport technologies
   include unicast PE-PE tunnels (using MPLS or IP/GRE encapsulation),
   multicast distribution trees created by PIM-SSM, PIM-SM, or
   BIDIR-PIM (using IP/GRE encapsulation), point-to-multipoint LSPs
   created by RSVP-TE or mLDP, and multipoint-to-multipoint LSPs
   created by mLDP.  (However, techniques for aggregating the traffic
   of multiple MVPNs onto a single multipoint-to-multipoint LSP or onto
   a single bidirectional multicast distribution tree are for further
   study.)  For selective trees, only unicast PE-PE tunnels (using MPLS
   or IP/GRE encapsulation) and unidirectional single-source trees are
   supported, and the supported tree creation protocols are PIM-SSM
   (using IP/GRE encapsulation), RSVP-TE, and mLDP.

   In order to aggregate traffic from multiple MVPNs onto a single
   multicast distribution tree, it is necessary to have a mechanism
   which enables the egresses of the tree to demultiplex the multicast
   traffic received over the tree and to associate each received packet
   with a particular MVPN.  This document specifies a mechanism whereby
   upstream label assignment [MPLS-UPSTREAM-LABEL] is used by the root
   of the tree to assign a label to each flow.  This label is used by
   the receivers to perform the demultiplexing.
This document also
   describes procedures based on BGP that are used by the root of an
   Aggregate Tree to advertise the Inclusive and/or Selective binding
   and the demultiplexing information to the leaves of the tree.

   This document also describes the data plane encapsulations for
   supporting the various SP multicast transport options.

   This document assumes that when SP multicast trees are used, traffic
   for a particular multicast group is transmitted by a particular PE
   on only one SP multicast tree.  The use of multiple SP multicast
   trees for transmitting traffic belonging to a particular multicast
   group is for further study.

2.2.6. Inter-AS MVPNs

   [RFC4364] describes different options for supporting BGP/MPLS IP
   unicast VPNs whose provider backbones contain more than one
   Autonomous System (AS).  These are known as Inter-AS VPNs.  In an
   Inter-AS VPN, the ASes may belong to the same provider or to
   different providers.  This document describes how Inter-AS MVPNs can
   be supported for each of the unicast BGP/MPLS VPN Inter-AS options.
   This document also specifies a model in which Inter-AS MVPN service
   can be offered without requiring a single SP multicast tree to span
   multiple ASes.  In this model, an inter-AS multicast tree consists
   of a number of "segments", one per AS, which are stitched together
   at AS boundary points.  These are known as "segmented inter-AS
   trees".  Each segment of a segmented inter-AS tree may use a
   different multicast transport technology.

   It is also possible to support Inter-AS MVPNs with non-segmented
   source trees that extend across AS boundaries.

2.2.7. Optionally Eliminating Shared Tree State

   This document also discusses some options and protocol extensions
   which can be used to eliminate the need for the PE routers to
   distribute to each other the (*,G) and (*,G,RPT-bit) states when
   there are PIM Sparse Mode multicast groups in the VPNs.
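
   The segmented inter-AS model of Section 2.2.6 can be pictured as a
   chain of per-AS segments, each possibly using a different transport
   technology, stitched together at the AS boundary points.  The AS
   numbers, transport labels, and field names in the sketch below are
   hypothetical and serve only to illustrate that the segments are
   independent of one another.

```python
# Illustrative sketch of a "segmented" inter-AS tree: one segment per
# AS, stitched at AS border routers.  All values here are invented.
segments = [
    {"asn": 65001, "transport": "PIM-SSM/GRE", "stitched_at": "ASBR-A"},
    {"asn": 65002, "transport": "RSVP-TE P2MP", "stitched_at": "ASBR-B"},
    {"asn": 65003, "transport": "mLDP P2MP", "stitched_at": None},  # last AS
]

def stitched_path(segments):
    """Describe the chain of segments a C-multicast packet traverses."""
    hops = [f"AS{s['asn']}:{s['transport']}" for s in segments]
    return " => ".join(hops)

assert stitched_path(segments).startswith("AS65001")
```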
534 3. Concepts and Framework 536 3.1. PE-CE Multicast Routing 538 Support of multicast in BGP/MPLS IP VPNs is modeled closely after 539 support of unicast in BGP/MPLS IP VPNs. That is, a multicast routing 540 protocol will be run on the PE-CE interfaces, such that PE and CE are 541 multicast routing adjacencies on that interface. CEs at different 542 sites do not become multicast routing adjacencies of each other. 544 If a PE attaches to n VPNs for which multicast support is provided 545 (i.e., to n "MVPNs"), the PE will run n independent instances of a 546 multicast routing protocol. We will refer to these multicast routing 547 instances as "VPN-specific multicast routing instances", or more 548 briefly as "multicast C-instances". The notion of a "VRF" ("Virtual 549 Routing and Forwarding Table"), defined in [RFC4364], is extended to 550 include multicast routing entries as well as unicast routing entries. 551 Each multicast routing entry is thus associated with a particular 552 VRF. 554 Whether a particular VRF belongs to an MVPN or not is determined by 555 configuration. 557 In this document, we will not attempt to provide support for every 558 possible multicast routing protocol that could possibly run on the 559 PE-CE link. Rather, we consider multicast C-instances only for the 560 following multicast routing protocols: 562 - PIM Sparse Mode (PIM-SM) 564 - PIM Single Source Mode (PIM-SSM) 566 - PIM Bidirectional Mode (BIDIR-PIM) 568 - PIM Dense Mode (PIM-DM) 570 In order to support the "Carrier's Carrier" model of [RFC4364], mLDP 571 or BGP will also be supported on the PE-CE interface; however, this 572 is not described in this revision. 574 As the document only supports PIM-based C-instances, we will 575 generally use the term "PIM C-instances" to refer to the multicast C- 576 instances. 
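As an illustrative sketch of the model above (the structures and names here are hypothetical, not from this document), a VRF can be represented as holding multicast routing entries alongside unicast ones, with the PE running one independent multicast C-instance per MVPN-enabled VRF:

```python
# A minimal sketch (hypothetical structures, not from this document)
# of a VRF extended to hold multicast routing entries alongside
# unicast ones, with one independent multicast C-instance per
# MVPN-enabled VRF.
from dataclasses import dataclass, field

@dataclass
class VRF:
    name: str
    mvpn_enabled: bool = False                          # set by configuration
    unicast_routes: dict = field(default_factory=dict)  # prefix -> next hop
    mcast_routes: dict = field(default_factory=dict)    # (C-S, C-G) -> state

class PE:
    """A PE runs one PIM C-instance per VRF that belongs to an MVPN."""
    def __init__(self):
        self.vrfs = {}
        self.pim_c_instances = {}

    def add_vrf(self, vrf):
        self.vrfs[vrf.name] = vrf
        if vrf.mvpn_enabled:
            # Placeholder for an independent per-MVPN PIM instance.
            self.pim_c_instances[vrf.name] = object()

pe = PE()
pe.add_vrf(VRF("blue", mvpn_enabled=True))
pe.add_vrf(VRF("red"))
assert list(pe.pim_c_instances) == ["blue"]
```

Note that each multicast routing entry is associated with exactly one VRF, mirroring the unicast model of [RFC4364].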
578 A PE router may also be running a "provider-wide" instance of PIM, (a 579 "PIM P-instance"), in which it has a PIM adjacency with, e.g., each 580 of its IGP neighbors (i.e., with P routers), but NOT with any CE 581 routers, and not with other PE routers (unless another PE router 582 happens to be an IGP adjacency). In this case, P routers would also 583 run the P-instance of PIM, but NOT a C-instance. If there is a PIM 584 P-instance, it may or may not have a role to play in support of VPN 585 multicast; this is discussed in later sections. However, in no case 586 will the PIM P-instance contain VPN-specific multicast routing 587 information. 589 In order to help clarify when we are speaking of the PIM P-instance 590 and when we are speaking of a PIM C-instance, we will also apply the 591 prefixes "P-" and "C-" respectively to control messages, addresses, 592 etc. Thus a P-Join would be a PIM Join which is processed by the PIM 593 P-instance, and a C-Join would be a PIM Join which is processed by a 594 C-instance. A P-group address would be a group address in the SP's 595 address space, and a C-group address would be a group address in a 596 VPN's address space. 598 3.2. P-Multicast Service Interfaces (PMSIs) 600 Multicast data packets received by a PE over a PE-CE interface must 601 be forwarded to one or more of the other PEs in the same MVPN for 602 delivery to one or more other CEs. 604 We define the notion of a "P-Multicast Service Interface" (PMSI). If 605 a particular MVPN is supported by a particular set of PE routers, 606 then there will be a PMSI connecting those PE routers. A PMSI is a 607 conceptual "overlay" on the P network with the following property: a 608 PE in a given MVPN can give a packet to the PMSI, and the packet will 609 be delivered to some or all of the other PEs in the MVPN, such that 610 any PE receiving such a packet will be able to tell which MVPN the 611 packet belongs to. 
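The PMSI property just described can be sketched as follows (a hypothetical model for illustration, not a specification): a sender PE hands a packet to the PMSI, and each delivery is tagged with the MVPN identity so that a receiving PE can tell which MVPN the packet belongs to:

```python
# Hypothetical model of a PMSI as a conceptual overlay: delivery to
# some or all of the other PEs in the MVPN, tagged for demultiplexing.
class PMSI:
    def __init__(self, mvpn, member_pes):
        self.mvpn = mvpn
        self.member_pes = set(member_pes)

    def send(self, from_pe, packet, to_pes=None):
        # An inclusive PMSI reaches all other member PEs; a selective
        # PMSI would restrict delivery to the subset given in 'to_pes'.
        receivers = (self.member_pes if to_pes is None else set(to_pes)) - {from_pe}
        # Each delivery carries the MVPN identity, so a receiving PE
        # can tell which MVPN the packet belongs to.
        return [(pe, self.mvpn, packet) for pe in sorted(receivers)]

pmsi = PMSI("mvpn-A", ["PE1", "PE2", "PE3"])
deliveries = pmsi.send("PE1", b"data")
assert [pe for pe, _, _ in deliveries] == ["PE2", "PE3"]
```

How this conceptual service is actually instantiated by tunnels is the subject of the following sections.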
613 As we discuss below, a PMSI may be instantiated by a number of 614 different transport mechanisms, depending on the particular 615 requirements of the MVPN and of the SP. We will refer to these 616 transport mechanisms as "tunnels". 618 For each MVPN, there are one or more PMSIs that are used for 619 transmitting the MVPN's multicast data from one PE to others. We 620 will use the term "PMSI" such that a single PMSI belongs to a single 621 MVPN. However, the transport mechanism which is used to instantiate 622 a PMSI may allow a single "tunnel" to carry the data of multiple 623 PMSIs. 625 In this document we make a clear distinction between the multicast 626 service (the PMSI) and its instantiation. This allows us to separate 627 the discussion of different services from the discussion of different 628 instantiations of each service. The term "tunnel" is used to refer 629 only to the transport mechanism that instantiates a service. 631 [This is a significant change from previous drafts on the topic of 632 MVPN, which have used the term "Multicast Tunnel" to refer both to 633 the multicast service (what we call here the PMSI) and to its 634 instantiation.] 636 3.2.1. Inclusive and Selective PMSIs 638 We will distinguish between three different kinds of PMSI: 640 - "Multidirectional Inclusive" PMSI (MI-PMSI) 642 A Multidirectional Inclusive PMSI is one which enables ANY PE 643 attaching to a particular MVPN to transmit a message such that it 644 will be received by EVERY other PE attaching to that MVPN. 646 There is at most one MI-PMSI per MVPN. (Though the tunnel which 647 instantiates an MI-PMSI may actually carry the data of more than 648 one PMSI.) 650 An MI-PMSI can be thought of as an overlay broadcast network 651 connecting the set of PEs supporting a particular MVPN. 653 [The "Default MDTs" of rosen-08 provide the transport service of 654 MI-PMSIs, in this terminology.] 
656 - "Unidirectional Inclusive" PMSI (UI-PMSI) 658 A Unidirectional Inclusive PMSI is one which enables a particular 659 PE, attached to a particular MVPN, to transmit a message such 660 that it will be received by all the other PEs attaching to that 661 MVPN. There is at most one UI-PMSI per PE per MVPN, though the 662 "tunnel" which instantiates a UI-PMSI may in fact carry the data 663 of more than one PMSI. 665 - "Selective" PMSI (S-PMSI). 667 A Selective PMSI is one which provides a mechanism wherein a 668 particular PE in an MVPN can multicast messages so that they will 669 be received by a subset of the other PEs of that MVPN. There may 670 be an arbitrary number of S-PMSIs per PE per MVPN. Again, the 671 "tunnel" which instantiates a given S-PMSI may carry data from 672 multiple S-PMSIs. 674 [The "Data MDTs" of earlier drafts provide the transport service 675 of "Selective PMSIs" in the terminology of this draft.] 677 We will see in later sections the role played by these different 678 kinds of PMSI. We will use the term "I-PMSI" when we are not 679 distinguishing between "MI-PMSIs" and "UI-PMSIs". 681 3.2.2. Tunnels Instantiating PMSIs 683 The tunnels which are used to instantiate PMSIs will be referred to 684 as "P-tunnels". A number of different tunnel setup techniques can be 685 used to create the P-tunnels that instantiate the PMSIs. Among these 686 are: 688 - PIM 690 A PMSI can be instantiated as (a set of) Multicast Distribution 691 Trees created by the PIM P-instance ("P-trees"). 693 PIM-SSM, BIDIR-PIM, or PIM-SM can be used to create P-trees. 694 (PIM-DM is not supported for this purpose.) 696 A single MI-PMSI can be instantiated by a single shared P-tree, 697 or by a number of source P-trees (one for each PE of the MI- 698 PMSI). P-trees may be shared by multiple MVPNs (i.e., a given P- 699 tree may be the instantiation of multiple PMSIs), as long as the 700 encapsulation provides some means of demultiplexing the data 701 traffic by MVPN. 
703 Selective PMSIs are most naturally instantiated by source P-trees, and are 704 most naturally created by PIM-SSM, since by definition only one 705 PE is the source of the multicast data on a Selective PMSI. 707 [The "Default MDTs" of [rosen-08] are MI-PMSIs instantiated as 708 PIM trees. The "data MDTs" of [rosen-08] are S-PMSIs 709 instantiated as PIM trees.] 711 - MLDP 713 A PMSI may be instantiated as one or more mLDP Point-to- 714 Multipoint (P2MP) LSPs, or as an mLDP Multipoint-to-Multipoint (MP2MP) 715 LSP. A Selective PMSI or a Unidirectional Inclusive PMSI would 716 be instantiated as a single mLDP P2MP LSP, whereas a 717 Multidirectional Inclusive PMSI could be instantiated either as a 718 set of such LSPs (one for each PE in the MVPN) or as a single 719 MP2MP LSP. 721 MLDP P2MP LSPs can be shared across multiple MVPNs. 723 - RSVP-TE 725 A PMSI may be instantiated as one or more RSVP-TE Point-to- 726 Multipoint (P2MP) LSPs. A Selective PMSI or a Unidirectional 727 Inclusive PMSI would be instantiated as a single RSVP-TE P2MP 728 LSP, whereas a Multidirectional Inclusive PMSI would be 729 instantiated as a set of such LSPs, one for each PE in the MVPN. 730 RSVP-TE P2MP LSPs can be shared across multiple MVPNs. 732 - A Mesh of Unicast Tunnels. 734 If a PMSI is implemented as a mesh of unicast tunnels, a PE 735 wishing to transmit a packet through the PMSI would replicate the 736 packet, and send a copy to each of the other PEs. 738 An MI-PMSI for a given MVPN can be instantiated as a full mesh of 739 unicast tunnels among that MVPN's PEs. A UI-PMSI or an S-PMSI 740 can be instantiated as a partial mesh. 742 - Unicast Tunnels to the Root of a P-Tree. 744 Any type of PMSI can be instantiated through a method in which 745 there is a single P-tree (created, for example, via PIM-SSM or 746 via RSVP-TE), and a PE transmits a packet to the PMSI by sending 747 it in a unicast tunnel to the root of that P-tree. All PEs in 748 the given MVPN would need to be leaves of the tree.
750 When this instantiation method is used, the transmitter of the 751 multicast data may receive its own data back. Methods for 752 avoiding this are for further study. 754 It can be seen that each method of implementing PMSIs has its own 755 area of applicability. This specification therefore allows for the 756 use of any of these methods. At first glance, this may seem like an 757 overabundance of options. However, the history of multicast 758 development and deployment should make it clear that there is no one 759 option which is always acceptable. The use of segmented inter-AS 760 trees does allow each SP to select the option which it finds most 761 applicable in its own environment, without forcing any other SP to 762 choose that same option. 764 Specifying the conditions under which a particular tree building 765 method is applicable is outside the scope of this document. 767 The choice of the tunnel technique belongs to the sender router and 768 is a local policy decision of the router. The procedures defined 769 throughout this document do not mandate that the same tunnel 770 technique be used for all PMSI tunnels going through the same provider 771 backbone. It is, however, expected that any tunnel technique that may 772 be used by a PE for a particular MVPN is also 773 supported by the other PEs having VRFs for that MVPN. Moreover, the use of 774 ingress replication by any PE for an MVPN implies that all other PEs 775 MUST use ingress replication for this MVPN. 777 3.3. Use of PMSIs for Carrying Multicast Data 779 Each PE supporting a particular MVPN must have a way of discovering: 781 - The set of other PEs in its AS that are attached to sites of that 782 MVPN, and the set of other ASes that have PEs attached to sites 783 of that MVPN. However, if segmented inter-AS trees are not used 784 (see section 8.2), then each PE needs to know the entire set of 785 PEs attached to sites of that MVPN.
787 - If segmented inter-AS trees are to be used, the set of border 788 routers in its AS that support inter-AS connectivity for that 789 MVPN, 791 - If the MVPN is configured to use a default MI-PMSI, the 792 information needed to set up and to use the tunnels instantiating 793 the default MI-PMSI, 795 - For each other PE, whether the PE supports Aggregate Trees for 796 the MVPN, and if so, the demultiplexing information which must be 797 provided so that the other PE can determine whether a packet 798 which it received on an aggregate tree belongs to this MVPN. 800 In some cases this information is provided by means of the BGP-based 801 auto-discovery procedures detailed in section 4. In other cases, 802 this information is provided after discovery is complete, by means of 803 procedures defined in section 6.1.2. In either case, the information 804 which is provided must be sufficient to enable the PMSI to be bound 805 to the identified tunnel, to enable the tunnel to be created if it 806 does not already exist, and to enable the different PMSIs which may 807 travel on the same tunnel to be properly demultiplexed. 809 3.3.1. MVPNs with Default MI-PMSIs 811 If an MVPN uses an MI-PMSI, then the MI-PMSI for that MVPN will be 812 created as soon as the necessary information has been obtained. 813 Creating a PMSI means creating the tunnel which carries it (unless 814 that tunnel already exists), as well as binding the PMSI to the 815 tunnel. The MI-PMSI for that MVPN is then used as the default method 816 of transmitting multicast data packets for that MVPN. In effect, all 817 the multicast streams for the MVPN are, by default, aggregated onto 818 the MI-PMSI. 820 If a particular multicast stream from a particular source PE has 821 certain characteristics, it can be desirable to migrate it from the 822 MI-PMSI to an S-PMSI. Procedures for migrating a stream from an MI- 823 PMSI to an S-PMSI are discussed in section 7. 825 3.3.2.
When MI-PMSIs are Required 827 MI-PMSIs are required under the following conditions: 829 - The MVPN is using PIM-DM, or some other protocol (such as BSR) 830 which relies upon flooding. Only with an MI-PMSI can the C-data 831 (or C-control-packets) received from any CE be flooded to all 832 PEs. 834 - If the procedure for carrying C-multicast routes from PE to PE 835 involves the multicasting of P-PIM control messages among the PEs 836 (see sections 3.4.1.1, 3.4.1.2, and 5.2). 838 3.3.3. MVPNs That Do Not Use MI-PMSIs 840 If a particular MVPN does not use a default MI-PMSI, then its 841 multicast data may be sent by default on a UI-PMSI. 843 It is also possible to send all the multicast data on an S-PMSI, 844 omitting any usage of I-PMSIs. This prevents PEs from receiving data 845 which they don't need, at the cost of requiring additional tunnels. 846 However, cost-effective instantiation of S-PMSIs is likely to require 847 Aggregate P-trees, which in turn makes it necessary for the 848 transmitting PE to know which PEs need to receive which multicast 849 streams. This is known as "explicit tracking", and the procedures to 850 enable explicit tracking may themselves impose a cost. This is 851 further discussed in section 7.2.2.2. 853 3.4. PE-PE Transmission of C-Multicast Routing 855 As a PE attached to a given MVPN receives C-Join/Prune messages from 856 its CEs in that MVPN, it must convey the information contained in 857 those messages to other PEs that are attached to the same MVPN. 859 There are several different methods for doing this. As these methods 860 are not interoperable, the method to be used for a particular MVPN 861 must either be configured, or discovered as part of the auto- 862 discovery process. 864 3.4.1. PIM Peering 866 3.4.1.1. Full Per-MVPN PIM Peering Across a MI-PMSI 868 If the set of PEs attached to a given MVPN are connected via a MI- 869 PMSI, the PEs can form "normal" PIM adjacencies with each other. 
871 Since the MI-PMSI functions as a broadcast network, the standard PIM 872 procedures for forming and maintaining adjacencies over a LAN can be 873 applied. 875 As a result, the C-Join/Prune messages which a PE receives from a CE 876 can be multicast to all the other PEs of the MVPN. PIM "join 877 suppression" can be enabled and the PEs can send Asserts as needed. 879 This procedure is fully specified in section 5.2. 881 [This is the procedure specified in [rosen-08].] 883 3.4.1.2. Lightweight PIM Peering Across a MI-PMSI 885 The procedure of the previous section has the following 886 disadvantages: 888 - Periodic Hello messages must be sent by all PEs. 890 Standard PIM procedures require that each PE in a particular MVPN 891 periodically multicast a Hello to all the other PEs in that MVPN. 892 If the number of MVPNs becomes very large, sending and receiving 893 these Hellos can become a substantial overhead for the PE 894 routers. 896 - Periodic retransmission of C-Join/Prune messages. 898 PIM is a "soft-state" protocol, in which reliability is assured 899 through frequent retransmissions (refresh) of control messages. 900 This too can begin to impose a large overhead on the PE routers 901 as the number of MVPNs grows. 903 The first of these disadvantages is easily remedied. The reason for 904 the periodic PIM Hellos is to ensure that each PIM speaker on a LAN 905 knows who all the other PIM speakers on the LAN are. However, in the 906 context of MVPN, PEs in a given MVPN can learn the identities of all 907 the other PEs in the MVPN by means of the BGP-based auto-discovery 908 procedure of section 4. In that case, the periodic Hellos would 909 serve no function, and could simply be eliminated. (Of course, this 910 does imply a change to the standard PIM procedures.) 912 When Hellos are suppressed, we may speak of "lightweight PIM 913 peering". 915 The periodic refresh of the C-Join/Prunes is not as simple to 916 eliminate. 
If and when "refresh reduction" procedures are specified 917 for PIM, it may be useful to incorporate them, so as to make the 918 lightweight PIM peering procedures even more lightweight. 920 Lightweight PIM peering is not specified in this document. 922 3.4.1.3. Unicasting of PIM C-Join/Prune Messages 924 PIM does not require that the C-Join/Prune messages which a PE 925 receives from a CE be multicast to all the other PEs; it allows 926 them to be unicast to a single PE, the one which is upstream on the 927 path to the root of the multicast tree mentioned in the Join/Prune 928 message. Note that when the C-Join/Prune messages are unicast, there 929 is no such thing as "join suppression". Therefore PIM Refresh 930 Reduction may be considered to be a prerequisite for the procedure 931 of unicasting the C-Join/Prune messages. 933 When the C-Join/Prunes are unicast, they are not transmitted on a 934 PMSI at all. Note that the procedure of unicasting the C-Join/Prunes 935 is different from the procedure of transmitting the C-Join/Prunes on 936 an MI-PMSI which is instantiated as a mesh of unicast tunnels. 938 If there are multiple PEs that can be used to reach a given C-source, 939 procedures described in section 9 MUST be used to ensure that, at 940 least within a single AS, all PEs choose the same PE to reach the C- 941 source. 943 Procedures for unicasting the PIM control messages are not further 944 specified in this document. 946 3.4.2. Using BGP to Carry C-Multicast Routing 948 It is possible to use BGP to carry C-multicast routing information 949 from PE to PE, dispensing entirely with the transmission of C- 950 Join/Prune messages from PE to PE. This is specified in section 5.3. 951 Inter-AS procedures are described in section 8. 953 4. BGP-Based Autodiscovery of MVPN Membership 955 BGP-based autodiscovery is done by means of a new address family, the 956 MCAST-VPN address family. (This address family also has other uses, 957 as will be seen later.)
Any PE which attaches to an MVPN must issue 958 a BGP update message containing an NLRI in this address family, along 959 with a specific set of attributes. In this document, we specify the 960 information which must be contained in these BGP updates in order to 961 provide auto-discovery. The encoding details, along with the 962 complete set of detailed procedures, are specified in a separate 963 document [MVPN-BGP]. 965 This section specifies the intra-AS BGP-based autodiscovery 966 procedures. When segmented inter-AS trees are used, additional 967 procedures are needed, as specified in section 8. Further detail may 968 be found in [MVPN-BGP]. (When segmented inter-AS trees are not used, 969 the inter-AS procedures are almost identical to the intra-AS 970 procedures.) 972 BGP-based autodiscovery uses a particular kind of MCAST-VPN route 973 known as an "auto-discovery route", or "A-D route". In particular, 974 it uses two kinds of "A-D routes", the "Intra-AS A-D Route" and the 975 "Inter-AS A-D Route". (There are also additional kinds of A-D 976 routes, such as the Source Active A-D routes which are used for 977 purposes that go beyond auto-discovery. These are discussed in 978 subsequent sections.) 980 The Inter-AS A-D Route is used only when segmented inter-AS tunnels 981 are used, as specified in section 8. 983 The "Intra-AS A-D route" is originated by the PEs that are (directly) 984 connected to the site(s) of an MVPN. It is distributed to other PEs 985 that attach to sites of the MVPN. If segmented Inter-AS Tunnels are 986 used, then the Intra-AS A-D routes are not distributed outside the AS 987 where they originate; if segmented Inter-AS Tunnels are not used, 988 then the Intra-AS A-D routes are, despite their name, distributed to 989 all PEs attached to the VPN, no matter what AS the PEs are in.
991 The NLRI of an Intra-AS A-D route must contain the following 992 information: 994 - The route type (i.e., Intra-AS A-D route) 996 - The IP address of the originating PE 998 - An RD configured locally for the MVPN. This is an RD which can 999 be prepended to that IP address to form a globally unique VPN-IP 1000 address of the PE. 1002 The A-D route must also carry the following attributes: 1004 - One or more Route Target attributes. If any other PE has one of 1005 these Route Targets configured for import into a VRF, it treats 1006 the advertising PE as a member in the MVPN to which the VRF 1007 belongs. This allows each PE to discover the PEs that belong to a 1008 given MVPN. More specifically, it allows a PE in the receiver 1009 sites set to discover the PEs in the sender sites set of the MVPN, 1010 and the PEs in the sender sites set of the MVPN to discover the 1011 PEs in the receiver sites set of the MVPN. The PEs in the 1012 receiver sites set would be configured to import the Route 1013 Targets advertised in the BGP Auto-Discovery routes by PEs in the 1014 sender sites set. The PEs in the sender sites set would be 1015 configured to import the Route Targets advertised in the BGP 1016 Auto-Discovery routes by PEs in the receiver sites set. 1018 - PMSI tunnel attribute. This attribute is present if and only if 1019 a default MI-PMSI is to be used for the MVPN. It contains the 1020 following information: 1022 * whether the MI-PMSI is instantiated by 1024 + A BIDIR-PIM tree, 1026 + a set of PIM-SSM trees, 1028 + a set of PIM-SM trees, 1030 + a set of RSVP-TE point-to-multipoint LSPs, 1032 + a set of mLDP point-to-multipoint LSPs, 1034 + an mLDP multipoint-to-multipoint LSP, 1036 + a set of unicast tunnels, or 1038 + a set of unicast tunnels to the root of a shared tree (in 1039 this case the root must be identified) 1041 * If the PE wishes to set up a default tunnel to instantiate the 1042 I-PMSI, a unique identifier for the tunnel used to 1043 instantiate the I-PMSI.
1045 All the PEs attaching to a given MVPN (within a given AS) 1046 must have been configured with the same PMSI tunnel attribute 1047 for that MVPN. They are also expected to know the 1048 encapsulation to use. 1050 Note that a default tunnel can be identified at discovery 1051 time only if the tunnel already exists (e.g., it was 1052 constructed by means of configuration), or if it can be 1053 constructed without each PE knowing the identities of all 1054 the others (e.g., it is constructed by a receiver-initiated 1055 join technique such as PIM or mLDP). 1057 In other cases, a default tunnel cannot be identified until 1058 the PE has discovered one or more of the other PEs. This 1059 will be the case, for example, if the tunnel is an RSVP-TE 1060 P2MP LSP, which must be set up from the head end. In these 1061 cases, a PE will first send an A-D route without a tunnel 1062 identifier, and then will send another one with a tunnel 1063 identifier after discovering one or more of the other PEs. 1065 All the PEs attaching to a given MVPN must be configured with 1066 information specifying the encapsulation to use. 1068 * Whether the tunnel used to instantiate the I-PMSI for this 1069 MVPN is aggregating I-PMSIs from multiple MVPNs. This will 1070 affect the encapsulation used. If aggregation is to be used, 1071 a demultiplexor value to be carried by packets for this 1072 particular MVPN must also be specified. The demultiplexing 1073 mechanism and signaling procedures are described in section 1074 6. 1076 Further details of the use of this information are provided in 1077 subsequent sections. 1079 Sometimes it is necessary for one PE to advertise an upstream- 1080 assigned MPLS label that identifies another PE.
Under certain 1081 circumstances to be discussed later, a PE which is the root of a 1082 multicast P-tunnel will bind an MPLS label value to one or more 1083 of the PEs that belong to the P-tunnel, and will distribute these 1084 label bindings using A-D routes. The precise details of this 1085 label distribution will be included in the next revision of this 1086 document. We will refer to these as "PE Labels". A packet 1087 traveling on the P-tunnel may carry one of these labels as an 1088 indication that the PE corresponding to that label is special. 1089 See section 11.3 for more details. 1091 5. PE-PE Transmission of C-Multicast Routing 1093 As a PE attached to a given MVPN receives C-Join/Prune messages from 1094 its CEs in that MVPN, it must convey the information contained in 1095 those messages to other PEs that are attached to the same MVPN. This 1096 is known as the "PE-PE transmission of C-multicast routing 1097 information". 1099 This section specifies the procedures used for PE-PE transmission of 1100 C-multicast routing information. Not every procedure mentioned in 1101 section 3.4 is specified here. Rather, this section focuses on two 1102 particular procedures: 1104 - Full PIM Peering. 1106 This procedure is fully specified herein. 1108 - Use of BGP to distribute C-multicast routing 1110 This procedure is described herein, but the full specification 1111 appears in [MVPN-BGP]. 1113 Those aspects of the procedures which apply to both of the above are 1114 also specified fully herein. 1116 Specification of other procedures is for future study. 1118 5.1. Selecting the Upstream Multicast Hop (UMH) 1120 When a PE receives a C-Join/Prune message from a CE, the message 1121 identifies a particular multicast flow as belonging either to a 1122 source tree (S,G) or to a shared tree (*,G). We use the term C- 1123 source to refer to S, in the case of a source tree, or to the 1124 Rendezvous Point (RP) for G, in the case of (*,G).
If the route to 1125 the C-source is across the VPN backbone, then the PE needs to find 1126 the "upstream multicast hop" (UMH) for the (S,G) or (*,G) flow. The 1127 "upstream multicast hop" is either the PE at which (S,G) or (*,G) 1128 data packets enter the VPN backbone, or else is the Autonomous System 1129 Border Router (ASBR) at which those data packets enter the local AS 1130 when traveling through the VPN backbone. The process of finding the 1131 upstream multicast hop for a given C-source is known as "upstream 1132 multicast hop selection". 1134 5.1.1. Eligible Routes for UMH Selection 1136 In the simplest case, the PE does the upstream hop selection by 1137 looking up the C-source in the unicast VRF associated with the PE-CE 1138 interface over which the C-Join/Prune was received. The route that 1139 matches the C-source will contain the information needed to select 1140 the upstream multicast hop. 1142 However, in some cases, the CEs may be distributing to the PEs a 1143 special set of routes that are to be used exclusively for the purpose 1144 of upstream multicast hop selection, and not used for unicast routing 1145 at all. For example, when BGP is the CE-PE unicast routing protocol, 1146 the CEs may be using SAFI 2 to distribute a special set of routes 1147 that are to be used for, and only for, upstream multicast hop 1148 selection. When OSPF is the CE-PE routing protocol, the CE may use 1149 an MT-ID of 1 to distribute a special set of routes that are to be 1150 used for, and only for, upstream multicast hop selection. When a CE 1151 uses one of these mechanisms to distribute to a PE a special set of 1152 routes to be used exclusively for upstream multicast hop selection, 1153 these routes are distributed among the PEs using SAFI 129, as 1154 described in [MVPN-BGP].
1156 Whether the routes used for upstream multicast hop selection are (a) 1157 the "ordinary" unicast routes or (b) a special set of routes that are 1158 used exclusively for upstream multicast hop selection, is a matter of 1159 policy. How that policy is chosen, deployed, or implemented is 1160 outside the scope of this document. In the following, we will simply 1161 refer to the set of routes that are used for upstream multicast hop 1162 selection as the "Eligible UMH routes", with no presumptions about the 1163 policy by which this set of routes was chosen. 1165 5.1.2. Information Carried by Eligible UMH Routes 1167 Every route which is eligible for UMH selection MUST carry a VRF 1168 Route Import Extended Community [MVPN-BGP]. This attribute 1169 identifies the PE that originated the route. 1171 If BGP is used for carrying C-multicast routes, OR if "Segmented 1172 Inter-AS Tunnels" (see section 8.2) are used, then every UMH route 1173 MUST also carry a Source AS Extended Community [MVPN-BGP]. 1175 These two attributes are used in the upstream multicast hop selection 1176 procedures described below. 1178 5.1.3. Selecting the Upstream PE 1180 The first step in selecting the upstream multicast hop for a given C- 1181 source is to select the upstream PE router for that C-source. 1183 The PE that received the C-Join message from a CE looks in the VRF 1184 corresponding to the interface over which the C-Join was received. 1185 It finds the Eligible UMH route which is the best match for the C- 1186 source specified in that C-Join. Call this the "Installed UMH 1187 Route". 1189 Note that the outgoing interface of the Installed UMH Route may be 1190 one of the interfaces associated with the VRF, in which case the 1191 upstream multicast hop is a CE and the route to the C-source is not 1192 across the VPN backbone.
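The lookup just described amounts to a longest-prefix match for the C-source over the Eligible UMH routes in the VRF. A sketch, under assumed data structures (the route representation here is hypothetical, for illustration only):

```python
# Sketch of finding the "Installed UMH Route": the best (longest-prefix)
# match for the C-source among the routes in the VRF that are eligible
# for UMH selection. Route structures are assumed, not from the draft.
import ipaddress

def installed_umh_route(eligible_routes, c_source):
    """eligible_routes: list of dicts with a 'prefix' key (CIDR string).
    Returns the longest-prefix match for c_source, or None."""
    addr = ipaddress.ip_address(c_source)
    best = None
    for route in eligible_routes:
        net = ipaddress.ip_network(route["prefix"])
        if addr in net:
            # Prefer the more specific (longer) prefix.
            if best is None or net.prefixlen > ipaddress.ip_network(best["prefix"]).prefixlen:
                best = route
    return best

routes = [{"prefix": "10.0.0.0/8"}, {"prefix": "10.1.0.0/16"}]
assert installed_umh_route(routes, "10.1.2.3")["prefix"] == "10.1.0.0/16"
```

If the matching route's outgoing interface is a VRF interface, the UMH is a CE and the procedures below do not apply.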
1194 Consider the set of all VPN-IP routes that are: (a) eligible to be 1195 imported into the VRF (as determined by their Route Targets), (b) 1196 eligible to be used for upstream multicast hop selection, and (c) 1197 have exactly the same IP prefix (not necessarily the same RD) as the 1198 installed UMH route. 1200 For each route in this set, determine the corresponding upstream PE 1201 and upstream RD. If a route has a VRF Route Import Extended 1202 Community, the route's upstream PE is determined from it. If a route 1203 does not have a VRF Route Import Extended Community, the route's 1204 upstream PE is determined from the route's BGP next hop attribute. 1205 In either case, the upstream RD is taken from the route's NLRI. 1207 This results in a set of routes, each with a corresponding pair of <upstream PE, upstream RD>. 1208 If the Installed UMH Route is not already in this set, it is added to 1209 the set. Call this the "UMH Route Candidate Set." Then the PE MUST 1210 select a single route from the set to be the "Selected UMH Route". 1211 The corresponding upstream PE is known as the "Selected Upstream PE", 1212 and the corresponding upstream RD is known as the "Selected Upstream 1213 RD". 1215 There are several possible procedures that can be used by a PE to 1216 select a single route from the candidate set. 1218 The default procedure, which MUST be implemented, is to select the 1219 route whose corresponding upstream PE address is numerically highest, 1220 where a 32-bit IP address is treated as a 32-bit unsigned integer. 1221 Call this the "default upstream PE selection". For a given C-source, 1222 provided that the routing information used to create the candidate 1223 set is stable, all PEs will have the same default upstream PE 1224 selection. (Though different default upstream PE selections may be 1225 chosen during a routing transient.) 1227 An alternative procedure which MUST be implemented, but which is 1228 disabled by default, is the following.
This procedure ensures that, except during a routing transient, each PE chooses the same upstream PE for a given combination of C-source and C-G.

1. The PEs in the candidate set are numbered from lowest to highest IP address, starting from 0.

2. The following hash is performed:

- A bytewise exclusive-or of all the bytes in the C-source address and the C-G address is performed.

- The result is taken modulo n, where n is the number of PEs in the candidate set. Call this result N.

The selected upstream PE is then the one that appears in position N in the list of step 1.

Other hashing algorithms are allowed as well, but not required.

The alternative procedure allows a form of "equal-cost load balancing". Suppose, for example, that from egress PEs PE3 and PE4, source C-S can be reached, at equal cost, via ingress PE PE1 or ingress PE PE2. The load balancing procedure makes it possible for PE1 to be the ingress PE for (C-S, C-G1) data traffic while PE2 is the ingress PE for (C-S, C-G2) data traffic.

5.1.4. Selecting the Upstream Multicast Hop

In certain cases, the selected upstream multicast hop is the same as the selected upstream PE. In other cases, the selected upstream multicast hop is the ASBR which is the "BGP next hop" of the Selected UMH Route.

If the selected upstream PE is in the local AS, then the selected upstream PE is also the selected upstream multicast hop. This is the case if either of the following conditions holds:

- The selected UMH route has a Source AS Extended Community, and the Source AS is the same as the local AS, or

- The selected UMH route does not have a Source AS Extended Community, but the route's BGP next hop is the same as the upstream PE.

Otherwise, the selected upstream multicast hop is an ASBR.
The method of determining just which ASBR it is depends on the particular inter-AS signaling method being used (PIM or BGP), and on whether segmented or non-segmented inter-AS tunnels are used. These details are presented in later sections.

5.2. Details of Per-MVPN Full PIM Peering over MI-PMSI

In this section, we assume that inter-AS MVPNs will be supported by means of non-segmented inter-AS trees. Support for segmented inter-AS trees with PIM peering is for further study.

When an MVPN uses an MI-PMSI, the C-instances of that MVPN can treat the MI-PMSI as a LAN interface and form full PIM adjacencies with each other over that "LAN interface".

To form a full PIM adjacency, the PEs execute the PIM LAN procedures, including the generation and processing of PIM Hello, Join/Prune, Assert, DF election, and other PIM control packets. These are executed independently for each C-instance. PIM "join suppression" SHOULD be enabled.

5.2.1. PIM C-Instance Control Packets

All PIM C-instance control packets of a particular MVPN are addressed to the ALL-PIM-ROUTERS (224.0.0.13) IP destination address and transmitted over the MI-PMSI of that MVPN. While in transit in the P-network, the packets are encapsulated as required for the particular kind of tunnel that is being used to instantiate the MI-PMSI. Thus the C-instance control packets are not processed by the P routers, and MVPN-specific PIM routes can be extended from site to site without appearing in the P routers.

As specified in section 5.1.2, when a PE distributes VPN-IP routes which are eligible for use as UMH routes, the PE MUST include a VRF Route Import Extended Community with each route. For a given MVPN, a single IP address MUST be used in this attribute, and that same IP address MUST be used as the source address in all PIM control packets for that MVPN.

5.2.2.
PIM C-Instance RPF Determination

Although the MI-PMSI is treated by PIM as a LAN interface, unicast routing is NOT run over it, and there are no unicast routing adjacencies over it. It is therefore necessary to specify special procedures for determining when the MI-PMSI is to be regarded as the "RPF Interface" for a particular C-address.

The PE follows the procedures of section 5.1 to determine the Selected UMH Route. If that route is NOT a VPN-IP route learned from BGP as described in [RFC4364], or if that route's outgoing interface is one of the interfaces associated with the VRF, then ordinary PIM procedures for determining the RPF interface apply.

However, if the Selected UMH Route is a VPN-IP route whose outgoing interface is not one of the interfaces associated with the VRF, then PIM will consider the RPF interface to be the MI-PMSI associated with the VPN-specific PIM instance.

Once PIM has determined that the RPF interface for a particular C-source is the MI-PMSI, it is necessary for PIM to determine the "RPF neighbor" for that C-source. This will be one of the other PEs with which there is a PIM adjacency over the MI-PMSI. In particular, it will be the "Selected Upstream PE" as defined in section 5.1.

5.2.3. Backwards Compatibility

There are older implementations which do not use the VRF Route Import Extended Community or any explicit mechanism for carrying information to identify the originating PE of a selected UMH route.

For backwards compatibility, when the selected UMH route does not carry any such mechanism, the IP address from the "BGP Next Hop" field of the selected UMH route will be used as the selected UMH address and will be treated as the address of the upstream PE. There is no Selected Upstream RD in this case.
However, use of this backwards compatibility technique presupposes that:

- The PE which originated the selected UMH route placed the same IP address in the BGP Next Hop field that it is using as the source address of the PE-PE PIM control packets for this MVPN.

- The MVPN is not an inter-AS MVPN that uses option (b) from section 10 of [RFC4364].

Should either of these conditions fail, interoperability with the older implementations will not be achieved.

5.3. Use of BGP for Carrying C-Multicast Routing

It is possible to use BGP to carry C-multicast routing information from PE to PE, dispensing entirely with the transmission of C-Join/Prune messages from PE to PE. This section describes the procedures for carrying intra-AS multicast routing information. Inter-AS procedures are described in section 8. The complete specification of both sets of procedures and of the encodings can be found in [MVPN-BGP].

5.3.1. Sending BGP Updates

The MCAST-VPN address family is used for this purpose. MCAST-VPN routes used for the purpose of carrying C-multicast routing information are distinguished from those used for the purpose of carrying auto-discovery information by means of a "route type" field which is encoded into the NLRI. The following information is required in BGP to advertise the MVPN routing information. The NLRI contains:

- The type of C-multicast route.

There are two types:

* source tree join

* shared tree join

- The RD configured, for the MVPN, on the PE that is advertising the information. The RD is required in order to uniquely identify the <C-S, C-G> when different MVPNs have overlapping address spaces.

- The C-Group address.

- The C-Source address.

This field is omitted if the route type is "shared tree join". In the case of a shared tree join, the C-source is a C-RP.
The address of the C-RP corresponding to the C-group address is presumed to be already known (or automatically determinable) by the other PEs, through means that are outside the scope of this specification.

- The Selected Upstream RD corresponding to the C-source address (determined by the procedures of section 5.1).

Whenever a C-multicast route is sent, it must also carry, in a Route Target Extended Community, the Selected Upstream Multicast Hop corresponding to the C-source address (determined by the procedures of section 5.1). The selected upstream multicast hop is identified in an Extended Community attribute to facilitate the optional use of filters which can prevent the distribution of the update to BGP speakers other than the upstream multicast hop. See section 10.1.3 of [MVPN-BGP] for the details.

There is no C-multicast route corresponding to the PIM function of pruning a source off the shared tree when a PE switches from a <C-*, C-G> tree to a <C-S, C-G> tree. Section 9 of this document specifies a mandatory procedure that ensures that if any PE joins a <C-S, C-G> source tree, all other PEs that have joined or will join the <C-*, C-G> shared tree will also join the <C-S, C-G> source tree. This eliminates the need for a C-multicast route that prunes C-S off the shared tree when switching from the <C-*, C-G> tree to the <C-S, C-G> tree.

5.3.2. Explicit Tracking

Note that the upstream multicast hop is NOT part of the NLRI in the C-multicast BGP routes. This means that if several PEs join the same C-tree, the BGP routes they distribute to do so are regarded by BGP as comparable routes, and only one will be installed. If a route reflector is being used, this further means that the PE which is used to reach the C-source will know only that one or more of the other PEs have joined the tree, but it won't know which ones. That is, this BGP update mechanism does not provide "explicit tracking".
Explicit tracking is not provided by default because it increases the amount of state needed and thus decreases scalability. Also, as constructing the C-PIM messages to send "upstream" for a given tree does not depend on knowing all the PEs that are downstream on that tree, there is no reason for the C-multicast route type updates to provide explicit tracking.

There are some cases in which explicit tracking is necessary in order for the PEs to set up certain kinds of P-trees. There are other cases in which explicit tracking is desirable in order to determine how to optimally aggregate multicast flows onto a given aggregate tree. As these functions have to do with the setting up of infrastructure in the P-network, rather than with the dissemination of C-multicast routing information, any explicit tracking that is necessary is handled by sending the "source active" A-D routes that are described in sections 9 and 10. Detailed procedures for turning on explicit tracking can be found in [MVPN-BGP].

5.3.3. Withdrawing BGP Updates

A PE removes itself from a C-multicast tree (shared or source) by withdrawing the corresponding BGP update.

If a PE has pruned a C-source from a shared C-multicast tree and it needs to "unprune" that source from that tree, it does so by withdrawing the route that pruned the source from the tree.

6. I-PMSI Instantiation

This section describes how tunnels in the SP network can be used to instantiate an I-PMSI for an MVPN on a PE. When C-multicast data is delivered on an I-PMSI, the data will go to all PEs that are on the path to receivers for that C-group, but may also go to PEs that are not on the path to receivers for that C-group.

The tunnels which instantiate I-PMSIs can be either PE-PE unicast tunnels or P-multicast trees.
When PE-PE unicast tunnels are used, the PMSI is said to be instantiated using ingress replication. The instantiation of a tunnel for an I-PMSI is a matter of local policy and is not mandatory. Even for a site attached to multicast sources, transport of customer multicast traffic can be accommodated with S-PMSI-bound tunnels only.

[Editor's Note: MD trees described in [ROSEN-8, MVPN-BASE] are an example of P-multicast trees. Also, Aggregate Trees described in [RAGGARWA-MCAST] are an example of P-multicast trees.]

6.1. MVPN Membership and Egress PE Auto-Discovery

As described in section 4, a PE discovers the MVPN membership information of other PEs using BGP auto-discovery mechanisms or using a mechanism that instantiates an MI-PMSI interface. When a PE supports only a UI-PMSI service for an MVPN, it MUST rely on the BGP auto-discovery mechanisms for discovering this information. This information also results in a PE in the sender sites set discovering the leaves of the P-multicast tree, which are the egress PEs that have sites in the receiver sites set in one or more MVPNs mapped onto the tree.

6.1.1. Auto-Discovery for Ingress Replication

In order for a PE to use unicast tunnels to send a C-multicast data packet for a particular MVPN to a set of remote PEs, the remote PEs must be able to correctly decapsulate such packets and to assign each one to the proper MVPN. This requires that the encapsulation used for sending packets through the tunnel have demultiplexing information which the receiver can associate with a particular MVPN.

If ingress replication is being used for an MVPN, the PEs announce this as part of the BGP-based MVPN membership auto-discovery process described in section 4. The PMSI tunnel attribute specifies ingress replication.
The demultiplexor value is a downstream-assigned MPLS label (i.e., assigned by the PE that originated the A-D route, to be used by other PEs when they send multicast packets on a unicast tunnel to that PE).

Other demultiplexing procedures for unicast are under consideration.

6.1.2. Auto-Discovery for P-Multicast Trees

A PE announces the P-multicast technology it supports for a specified MVPN as part of the BGP MVPN membership discovery. This allows other PEs to determine the P-multicast technology they can use for building P-multicast trees to instantiate an I-PMSI. If a PE has a default tree instantiation of an I-PMSI, it also announces the tree identifier as part of the auto-discovery, as well as announcing its aggregation capability.

The announcement of a tree identifier at discovery time is only possible if the tree already exists (e.g., a preconfigured "traffic engineered" tunnel), or if the tree can be constructed dynamically without any PE having to know in advance all the other PEs on the tree (e.g., the tree is created by receiver-initiated joins).

6.2. C-Multicast Routing Information Exchange

When a PE does not support the use of an MI-PMSI for a given MVPN, it MUST either unicast MVPN routing information using PIM or else use BGP for exchanging the MVPN routing information.

6.3. Aggregation

A P-multicast tree can be used to instantiate a PMSI service for only one MVPN or for more than one MVPN. When a P-multicast tree is shared across multiple MVPNs, it is termed an Aggregate Tree [RAGGARWA-MCAST]. The procedures described in this document allow a single SP multicast tree to be shared across multiple MVPNs. The procedures that are specific to aggregation are optional and are explicitly pointed out. Unless otherwise specified, a P-multicast tree technology supports aggregation.
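Sharing one P-multicast tree across MVPNs requires the egress PEs to map each received packet back to its MVPN; section 6.3.4 specifies upstream-assigned MPLS labels for this purpose. A minimal sketch of the egress-side lookup, with hypothetical names and data structures (not a specification of any implementation):

```python
# Illustrative sketch of egress-PE demultiplexing for Aggregate Trees:
# the tree a packet arrives on selects an upstream-assigned label space,
# and the inner MPLS label selects the MVPN within that space.

# Tree identifier -> ingress PE that roots the tree (hypothetical values).
tree_root = {"tree-1": "PE1"}

# Per-ingress-PE label space: upstream-assigned label -> MVPN, as
# learned from the ingress PE's BGP label-binding advertisements.
label_space = {"PE1": {100: "MVPN-A", 101: "MVPN-B"}}

def demux(tree_id, upstream_label):
    """Map (receiving tree, inner label) to the packet's MVPN."""
    root = tree_root[tree_id]           # which ingress PE assigned the label
    return label_space[root][upstream_label]

mvpn = demux("tree-1", 101)             # -> "MVPN-B"
```

Keying the label space by ingress PE rather than by tree corresponds to the option, noted in section 6.3.4, of using one label space for all trees rooted at the same ingress PE.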
Aggregate Trees allow a single P-multicast tree to be used across multiple MVPNs; hence, state in the SP core grows per set of MVPNs and not per MVPN. Depending on the congruence of the aggregated MVPNs, this may come at the cost of less optimal multicast routing.

An Aggregate Tree can be used by a PE to provide a UI-PMSI or MI-PMSI service for more than one MVPN. When this is the case, the Aggregate Tree is said to have an inclusive mapping.

6.3.1. Aggregate Tree Leaf Discovery

BGP MVPN membership discovery allows a PE to determine the different Aggregate Trees that it should create and the MVPNs that should be mapped onto each such tree. The leaves of an Aggregate Tree are determined by the PEs that support aggregation and belong to the MVPNs that are mapped onto the tree.

If an Aggregate Tree is used to instantiate one or more S-PMSIs, then it may be desirable for the PE at the root of the tree to know which PEs (in its MVPN) are receivers on that tree. This enables the PE to decide when to aggregate two S-PMSIs, based on congruence (as discussed in the next section). Thus explicit tracking may be required. Since the procedures for disseminating C-multicast routes do not provide explicit tracking, a type of A-D route known as a "Leaf A-D Route" is used. The PE which wants to assign a particular C-multicast flow to a particular Aggregate Tree can send an A-D route which elicits Leaf A-D routes from the PEs that need to receive that C-multicast flow. This provides the explicit tracking information needed to support the aggregation methodology discussed in the next section.

6.3.2.
Aggregation Methodology

This document does not specify the mandatory implementation of any particular set of rules for determining whether or not the PMSIs of two particular MVPNs are to be instantiated by the same Aggregate Tree. This determination can be made by implementation-specific heuristics, by configuration, or even perhaps by the use of offline tools.

It is the intention of this document that the control procedures will always result in all the PEs of an MVPN agreeing on the PMSIs which are to be used and on the tunnels used to instantiate those PMSIs.

This section discusses potential methodologies with respect to aggregation.

The "congruence" of aggregation is defined by the amount of overlap in the leaves of the customer trees that are aggregated on an SP tree. For Aggregate Trees with an inclusive mapping, the congruence depends on the overlap in the membership of the MVPNs that are aggregated on the tree. If there is complete overlap, i.e., all MVPNs have exactly the same sites, aggregation is perfectly congruent. As the overlap between the aggregated MVPNs is reduced, i.e., as the number of sites that are common across all the MVPNs decreases, the congruence is reduced.

If aggregation is done such that it is not perfectly congruent, a PE may receive traffic for MVPNs to which it doesn't belong. As the amount of multicast traffic in these unwanted MVPNs increases, aggregation becomes less optimal with respect to delivered traffic. Hence there is a tradeoff between reducing state and delivering unwanted traffic.

An implementation should provide knobs to control the congruence of aggregation. These knobs are implementation dependent. Configuring the percentage of sites that MVPNs must have in common in order to be aggregated is an example of such a knob.
This will allow an SP to deploy aggregation depending on the MVPN membership and traffic profiles in its network. If different PEs or servers are setting up Aggregate Trees, this will also allow a service provider to engineer the maximum number of unwanted MVPNs that a particular PE may receive traffic for.

6.3.3. Encapsulation of the Aggregate Tree

An Aggregate Tree may use an IP/GRE encapsulation or an MPLS encapsulation. The protocol type in the IP/GRE header in the former case, and the protocol type in the data link header in the latter, need further explanation. This will be specified in a separate document.

6.3.4. Demultiplexing C-Multicast Traffic

When multiple MVPNs are aggregated onto one P-multicast tree, determining the tree over which a packet is received is not sufficient to determine the MVPN to which the packet belongs. The packet must also carry some demultiplexing information to allow the egress PEs to determine the MVPN to which the packet belongs. Since the packet has been multicast through the P-network, any given demultiplexing value must have the same meaning to all the egress PEs. The demultiplexing value is an MPLS label that corresponds to the multicast VRF to which the packet belongs. This label is placed by the ingress PE immediately beneath the P-multicast tree header. Each of the egress PEs must be able to associate this MPLS label with the same MVPN. If downstream label assignment were used, this would require all the egress PEs in the MVPN to agree on a common label for the MVPN. Instead, the MPLS label is upstream-assigned [MPLS-UPSTREAM-LABEL]. The label bindings are advertised via BGP updates originated by the ingress PEs.

This procedure requires each egress PE to support a separate label space for every other PE.
The egress PEs create a forwarding entry for the upstream-assigned MPLS label, allocated by the ingress PE, in this label space. Hence when the egress PE receives a packet over an Aggregate Tree, it first determines the tree that the packet was received over. The tree identifier determines the label space in which the upstream-assigned MPLS label lookup has to be performed. The same label space may be used for all P-multicast trees rooted at the same ingress PE, or an implementation may decide to use a separate label space for every P-multicast tree.

The encapsulation format is either MPLS or MPLS-in-something (e.g., MPLS-in-GRE [MPLS-IP]). When MPLS is used, this label will appear immediately below the label that identifies the P-multicast tree. When MPLS-in-GRE is used, this label will be the top MPLS label that appears when the GRE header is stripped off.

When IP encapsulation is used for the P-multicast tree, whatever information that particular encapsulation format uses for identifying a particular tunnel is used to determine the label space in which the MPLS label is looked up.

If the P-multicast tree uses MPLS encapsulation, the P-multicast tree is itself identified by an MPLS label. The egress PE MUST NOT advertise IMPLICIT NULL or EXPLICIT NULL for that tree. Once the label representing the tree is popped off the MPLS label stack, the next label is the demultiplexing information that allows the proper MVPN to be determined.

This specification requires that, to support this sort of aggregation, there be at least one upstream-assigned label per MVPN. It does not require that there be only one. For example, an ingress PE could assign a unique label to each C-(S,G). (This could be done using the same technique that is used to assign a particular C-(S,G) to an S-PMSI; see section 7.3.)

6.4.
Mapping Received Packets to MVPNs

When an egress PE receives a C-multicast data packet over a P-multicast tree, it needs to forward the packet to the CEs that have receivers in the packet's C-multicast group. In order to do this, the egress PE needs to determine the tunnel that the packet was received on. The PE can then determine the MVPN that the packet belongs to and, if needed, do any further lookups that are required to forward the packet.

6.4.1. Unicast Tunnels

When ingress replication is used, the MVPN to which a received C-multicast data packet belongs can be determined from the MPLS label that was allocated and distributed by the egress PE.

6.4.2. Non-Aggregated P-Multicast Trees

If a P-multicast tree is associated with only one MVPN, determining the P-multicast tree on which a packet was received is sufficient to determine the packet's MVPN. All that the egress PE needs to know is the MVPN the P-multicast tree is associated with.

There are different ways in which the egress PE can learn this association:

a) Configuration. The P-multicast tree that a particular MVPN belongs to is configured on each PE.

[Editor's Note: PIM-SM Default MD trees in [ROSEN-8] and [MVPN-BASE] are examples of configuring the P-multicast tree and MVPN association.]

b) BGP-based advertisement of the P-multicast tree - MVPN mapping after the root of the tree discovers the leaves of the tree. The root of the tree sets up the tree after discovering each of the PEs that belong to the MVPN. It then advertises the P-multicast tree - MVPN mapping to each of the leaves. This mechanism can be used with both source-initiated trees (e.g., RSVP-TE P2MP LSPs) and receiver-initiated trees (e.g., PIM trees).

[Editor's Note: Aggregate Tree advertisements in [RAGGARWA-MCAST] are examples of this.]
c) BGP-based advertisement of the P-multicast tree - MVPN mapping as part of the MVPN membership discovery. The root of the tree advertises, to each of the other PEs that belong to the MVPN, the P-multicast tree that the MVPN is associated with. This implies that the root doesn't need to know the leaves of the tree beforehand. This is possible only for receiver-initiated trees, e.g., PIM-based trees.

[Editor's Note: PIM-SSM discovery in [ROSEN-8] is an example of the above.]

Both (b) and (c) require the BGP-based advertisement to contain the P-multicast tree identifier. This identifier is encoded as a BGP attribute and contains the following elements:

- Tunnel type.

- Tunnel identifier. The semantics of the identifier is determined by the tunnel type.

6.4.3. Aggregate P-Multicast Trees

Once a PE sets up an Aggregate Tree, it needs to announce the C-multicast groups being mapped to this tree to other PEs in the network. This procedure is referred to as Aggregate Tree discovery. For an Aggregate Tree with an inclusive mapping, this discovery implies announcing:

- The mapping of all MVPNs mapped to the tree.

- For each MVPN mapped onto the tree, the inner label allocated for it by the ingress PE. The use of this label is explained in the demultiplexing procedures of section 6.3.4.

- The P-multicast tree identifier.

The egress PE creates a logical interface corresponding to the tree identifier. This interface is the RPF interface for all the <C-S, C-G> entries mapped to that tree.

When PIM is used to set up P-multicast trees, the egress PE also joins the P-Group Address corresponding to the tree. This results in the setup of the PIM P-multicast tree.

6.5.
I-PMSI Instantiation Using Ingress Replication

As described in section 3, a PMSI can be instantiated using unicast tunnels between the PEs that are participating in the MVPN. In this mechanism the ingress PE replicates a C-multicast data packet belonging to a particular MVPN and sends a copy to all or a subset of the PEs that belong to the MVPN. Each copy of the packet is tunneled to a remote PE over a unicast tunnel to that remote PE. IP/GRE tunnels and MPLS LSPs are examples of unicast tunnels that may be used. Note that the same unicast tunnel can be used to transport packets belonging to different MVPNs.

Ingress replication can be used to instantiate a UI-PMSI. The PE sets up unicast tunnels to each of the remote PEs that support ingress replication. For a given MVPN, all C-multicast data packets are sent to each of the remote PEs in the MVPN that support ingress replication. Hence a remote PE may receive C-multicast data packets for a group even if it doesn't have any receivers in that group.

Ingress replication can also be used to instantiate an MI-PMSI. In this case each PE has a mesh of unicast tunnels to every other PE in that MVPN.

However, when ingress replication is used, it is recommended that only S-PMSIs be used. Instantiation of S-PMSIs with ingress replication is described in section 7.1. Note that this requires the use of explicit tracking, i.e., a PE must know which of the other PEs have receivers for each C-multicast tree.

6.6. Establishing P-Multicast Trees

It is believed that the architecture outlined in this document places no limitations on the protocols used to instantiate P-multicast trees. However, the only protocols being explicitly considered are PIM-SM, PIM-SSM, BIDIR-PIM, RSVP-TE, and mLDP.

A P-multicast tree can be either a source tree or a shared tree.
A source tree is used to carry traffic only for the multicast VRFs that exist locally on the root of the tree, i.e., for which the root has local CEs. The root is a PE router. Source P-multicast trees can be instantiated using PIM-SM, PIM-SSM, RSVP-TE P2MP LSPs, and mLDP P2MP LSPs.

A shared tree, on the other hand, can be used to carry traffic belonging to VRFs that exist on other PEs as well. The root of a shared tree is not necessarily one of the PEs in the MVPN. All PEs that use the shared tree will send MVPN data packets to the root of the shared tree; if PIM is being used as the control protocol, PIM control packets also get sent to the root of the shared tree. This may require a unicast tunnel between each of these PEs and the root. The root will then send the packets on the shared tree, and all the PEs that are leaves of the shared tree will receive them. For example, an RP-based PIM-SM tree would be a shared tree. Shared trees can be instantiated using PIM-SM, PIM-SSM, BIDIR-PIM, RSVP-TE P2MP LSPs, mLDP P2MP LSPs, and mLDP MP2MP LSPs. Aggregation support for bidirectional P-trees (i.e., BIDIR-PIM trees or mLDP MP2MP trees) is for further study. Shared trees require all the PEs to discover the root of the shared tree for an MVPN. To achieve this, the root of a shared tree advertises, as part of the BGP-based MVPN membership discovery:

- The capability to set up a shared tree for a specified MVPN.

- A downstream-assigned label that is to be used by each PE to encapsulate an MVPN data packet when sending that packet to the root of the shared tree.

- A downstream-assigned label that is to be used by each PE to encapsulate an MVPN control packet when sending that packet to the root of the shared tree.

Both a source tree and a shared tree can be used to instantiate an I-PMSI.
If a source tree is used to instantiate a UI-PMSI for an MVPN, all the other PEs that belong to the MVPN must be leaves of the source tree. If a shared tree is used to instantiate a UI-PMSI for an MVPN, all the PEs that are members of the MVPN must be leaves of the shared tree.

6.7. RSVP-TE P2MP LSPs

This section describes procedures that are specific to the usage of RSVP-TE P2MP LSPs for instantiating a UI-PMSI. The RSVP-TE P2MP LSP can be either a source tree or a shared tree. Procedures in [RSVP-P2MP] are used to signal the LSP. The LSP is signaled after the root of the LSP discovers the leaves. The egress PEs are discovered using the MVPN membership procedures described in section 4. RSVP-TE P2MP LSPs can optionally support aggregation.

6.7.1. P2MP TE LSP Tunnel - MVPN Mapping

The P2MP TE LSP Tunnel to MVPN mapping can be learned at the egress PEs using either option (a) or option (b) described in section 6.4.2. Option (b), i.e., BGP-based advertisement of the P2MP TE LSP Tunnel - MVPN mapping, requires that the root of the tree include the P2MP TE LSP Tunnel identifier as the tunnel identifier in the BGP advertisements. This identifier contains the following information elements:

- The type of the tunnel, which is set to RSVP-TE P2MP Tunnel.

- The RSVP-TE P2MP Tunnel's SESSION Object.

- Optionally, the RSVP-TE P2MP LSP's SENDER_TEMPLATE Object. This object is included when it is desired to identify a particular P2MP TE LSP.

6.7.2. Demultiplexing C-Multicast Data Packets

Demultiplexing the C-multicast data packets at the egress PE follows the procedures described in section 6.3.4. The RSVP-TE P2MP LSP Tunnel must be signaled with penultimate-hop popping (PHP) off. Signaling the P2MP TE LSP Tunnel with PHP off requires an extension to RSVP-TE which will be described later.

7.
Optimizing Multicast Distribution via S-PMSIs 1889 Whenever a particular multicast stream is being sent on an I-PMSI, it 1890 is likely that the data of that stream is being sent to PEs that do 1891 not require it. If a particular stream has a significant amount of 1892 traffic, it may be beneficial to move it to an S-PMSI which includes 1893 only those PEs that are transmitters and/or receivers (or at least 1894 includes fewer PEs that are neither). 1896 If explicit tracking is being done, S-PMSI creation can also be 1897 triggered on other criteria. For instance, there could be a "pseudo 1898 wasted bandwidth" criterion: switching to an S-PMSI would be done if 1899 the bandwidth multiplied by the number of uninterested PEs (PEs that 1900 are receiving the stream but have no receivers) is above a specified 1901 threshold. The motivation is that (a) the total bandwidth wasted by 1902 many sparsely subscribed low-bandwidth groups may be large, and (b) 1903 there is no point in moving a high-bandwidth group to an S-PMSI if all 1904 the PEs have receivers for it. 1906 Switching a (C-S, C-G) stream to an S-PMSI may require the root of 1907 the S-PMSI to determine the egress PEs that need to receive the (C-S, 1908 C-G) traffic. This is true in the following cases: 1910 - If the tunnel is a source-initiated tree, such as an RSVP-TE P2MP 1911 Tunnel, the PE needs to know the leaves of the tree before it can 1912 instantiate the S-PMSI. 1914 - If a PE instantiates multiple S-PMSIs, belonging to different 1915 MVPNs, using one P-multicast tree, such a tree is termed an 1916 Aggregate Tree with a selective mapping. The setting up of such 1917 an Aggregate Tree requires the ingress PE to know all the other 1918 PEs that have receivers for multicast groups that are mapped onto 1919 the tree. 1921 The above two cases require that explicit tracking be done for the 1922 (C-S, C-G) stream.
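The "pseudo wasted bandwidth" criterion above amounts to a simple threshold test. The sketch below is illustrative only; the function name, parameters, and threshold semantics are assumptions, not terms defined by this specification:

```python
# Hypothetical sketch of the "pseudo wasted bandwidth" S-PMSI trigger:
# switch when stream rate times the number of uninterested PEs exceeds
# a configured threshold.  All names here are illustrative.

def should_switch_to_spmsi(bandwidth_bps, num_receiving_pes,
                           num_interested_pes, waste_threshold_bps):
    """Return True if the (C-S, C-G) stream should move to an S-PMSI."""
    # "Uninterested" PEs receive the stream but have no receivers for it.
    uninterested = num_receiving_pes - num_interested_pes
    if uninterested <= 0:
        # All PEs have receivers: an S-PMSI saves no bandwidth.
        return False
    return bandwidth_bps * uninterested > waste_threshold_bps
```

This captures motivation (b) above directly: when every PE has receivers, `uninterested` is zero and no switch occurs regardless of the stream's rate.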
The root of the S-PMSI MAY decide to do explicit 1923 tracking of this stream only after it has determined to move the 1924 stream to an S-PMSI, or it MAY have been doing explicit tracking all 1925 along. 1927 If the S-PMSI is instantiated by a P-multicast tree, the PE at the 1928 root of the tree must signal the leaves of the tree that the (C-S, C- 1929 G) stream is now bound to the S-PMSI. Note that the PE could 1930 create the identity of the P-multicast tree prior to the actual 1931 instantiation of the tunnel. 1933 If the S-PMSI is instantiated by a source-initiated P-multicast tree 1934 (e.g., an RSVP-TE P2MP tunnel), the PE at the root of the tree must 1935 establish the source-initiated P-multicast tree to the leaves. This 1936 tree MAY have been established before the leaves receive the S-PMSI 1937 binding, or MAY be established after the leaves receive the binding. 1938 The leaves MUST NOT switch to the S-PMSI until they receive both the 1939 binding and the tree signaling message. 1941 7.1. S-PMSI Instantiation Using Ingress Replication 1943 As described in section 6.1.1, ingress replication can be used to 1944 instantiate a UI-PMSI. However, this can result in a PE receiving 1945 packets for a multicast group for which it doesn't have any 1946 receivers. This can be avoided if the ingress PE tracks the remote 1947 PEs which have receivers in a particular C-multicast group. In order 1948 to do this, it needs to receive C-Joins from each of the remote PEs. 1949 It then replicates the C-multicast data packet and sends it to only 1950 those egress PEs which are on the path to a receiver of that C-group. 1951 It is possible that each PE that is using ingress replication 1952 instantiates only S-PMSIs. It is also possible that some PEs 1953 instantiate UI-PMSIs while others instantiate only S-PMSIs. In both 1954 these cases the PE MUST either unicast MVPN routing information using 1955 PIM or use BGP for exchanging the MVPN routing information.
This is 1956 because there may be no MI-PMSI available for it to exchange MVPN 1957 routing information. 1959 Note that the use of ingress replication doesn't require any extra 1960 procedures for signaling the binding of the S-PMSI from the ingress 1961 PE to the egress PEs. The procedures described for I-PMSIs are 1962 sufficient. 1964 7.2. Protocol for Switching to S-PMSIs 1966 We describe two protocols for switching to S-PMSIs. These protocols 1967 can be used when the tunnel that instantiates the S-PMSI is a P- 1968 multicast tree. 1970 7.2.1. A UDP-based Protocol for Switching to S-PMSIs 1972 This procedure can be used for any MVPN which has an MI-PMSI. 1973 Traffic from all multicast streams in a given MVPN is sent, by 1974 default, on the MI-PMSI. Consider a single multicast stream within a 1975 given MVPN, and consider a PE which is attached to a source of 1976 multicast traffic for that stream. The PE can be configured to move 1977 the stream from the MI-PMSI to an S-PMSI if certain configurable 1978 conditions are met. To do this, it needs to inform all the PEs which 1979 attach to receivers for the stream. These PEs need to start listening 1980 for traffic on the S-PMSI, and the transmitting PE may start sending 1981 traffic on the S-PMSI when it is reasonably certain that all 1982 receiving PEs are listening on the S-PMSI. 1984 7.2.1.1. Binding a Stream to an S-PMSI 1986 When a PE which attaches to a transmitter for a particular multicast 1987 stream notices that the conditions for moving the stream to an S-PMSI 1988 are met, it begins to periodically send an "S-PMSI Join Message" on 1989 the MI-PMSI. The S-PMSI Join is a UDP-encapsulated message whose 1990 destination address is ALL-PIM-ROUTERS (224.0.0.13), and whose 1991 destination port is 3232. 1993 The S-PMSI Join Message contains the following information: 1995 - An identifier for the particular multicast stream which is to be 1996 bound to the S-PMSI. This can be represented as an (S,G) pair.
1998 - An identifier for the particular S-PMSI to which the stream is to 1999 be bound. This identifier is a structured field which includes 2000 the following information: 2002 * The type of tunnel used to instantiate the S-PMSI 2004 * An identifier for the tunnel. The form of the identifier 2005 will depend upon the tunnel type. The combination of tunnel 2006 identifier and tunnel type should contain enough information 2007 to enable all the PEs to "join" the tunnel and receive 2008 messages from it. 2010 * Any demultiplexing information needed by the tunnel 2011 encapsulation protocol to identify the particular S-PMSI. 2012 This allows a single tunnel to aggregate multiple S-PMSIs. 2013 If a particular tunnel is not aggregating multiple S-PMSIs, 2014 then no demultiplexing information is needed. 2016 A PE router which is not connected to a receiver will still receive 2017 the S-PMSI Joins, and MAY cache the information contained therein. 2018 Then if the PE later finds that it is attached to a receiver, it can 2019 immediately start listening to the S-PMSI. 2021 Upon receiving the S-PMSI Join, PE routers connected to receivers for 2022 the specified stream will take whatever action is necessary to start 2023 receiving multicast data packets on the S-PMSI. The precise action 2024 taken will depend upon the tunnel type. 2026 After a configurable delay, the PE router which is sending the S-PMSI 2027 Joins will start transmitting the stream's data packets on the S- 2028 PMSI. 2030 When the pre-configured conditions are no longer met for a particular 2031 stream, e.g. the traffic stops, the PE router connected to the source 2032 stops announcing S-PMSI Joins for that stream. Any PE that does not 2033 receive, over a configurable interval, an S-PMSI Join for a 2034 particular stream will stop listening to the S-PMSI. 2036 7.2.1.2. 
Packet Formats and Constants 2038 The S-PMSI Join message is encapsulated within UDP, and has the 2039 following type/length/value (TLV) encoding: 2041 0 1 2 3 2042 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2043 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2044 | Type | Length | Value | 2045 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2046 | . | 2047 | . | 2048 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2050 Type (8 bits) 2052 Length (16 bits): the total number of octets in the Type, Length, and 2053 Value fields combined 2055 Value (variable length) 2056 Currently only one type of S-PMSI Join is defined. A type 1 S-PMSI 2057 Join is used when the S-PMSI tunnel is a PIM tunnel which is used to 2058 carry a single multicast stream, where the packets of that stream 2059 have IPv4 source and destination IP addresses. 2061 0 1 2 3 2062 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2063 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2064 | Type | Length | Reserved | 2065 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2066 | C-source | 2067 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2068 | C-group | 2069 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2070 | P-group | 2071 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2073 Type (8 bits): 1 2075 Length (16 bits): 16 2077 Reserved (8 bits): This field SHOULD be zero when transmitted, and 2078 MUST be ignored when received. 2080 C-Source (32 bits): the IPv4 address of the traffic source in the 2081 VPN. 2083 C-Group (32 bits): the IPv4 address of the multicast traffic 2084 destination address in the VPN. 2086 P-Group (32 bits): the IPv4 group address that the PE router is going 2087 to use to encapsulate the flow (C-Source, C-Group). 
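As an illustration of the Type 1 layout above (Type, Length, Reserved, then three IPv4 addresses, for a total of 16 octets), the message can be built and parsed with a few lines of Python. This is a sketch for clarity, not a reference implementation, and the function names are invented:

```python
import socket
import struct

# Sketch of the Type 1 S-PMSI Join encoding: Type (8 bits) = 1,
# Length (16 bits) = 16 total octets, Reserved (8 bits) = 0, followed
# by the C-Source, C-Group, and P-Group IPv4 addresses.
# (The message itself is carried in UDP to ALL-PIM-ROUTERS, port 3232.)

def encode_spmsi_join(c_source, c_group, p_group):
    header = struct.pack("!BHB", 1, 16, 0)  # Type, Length, Reserved
    return (header +
            socket.inet_aton(c_source) +
            socket.inet_aton(c_group) +
            socket.inet_aton(p_group))

def decode_spmsi_join(data):
    msg_type, length, _reserved = struct.unpack("!BHB", data[:4])
    if msg_type != 1 or length != 16:
        raise ValueError("not a Type 1 S-PMSI Join")
    # Return (C-Source, C-Group, P-Group) in dotted-quad form.
    return tuple(socket.inet_ntoa(data[i:i + 4]) for i in (4, 8, 12))
```

Note that, per the field definitions above, a decoder ignores the Reserved octet rather than validating it.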
2089 The P-group identifies the S-PMSI tunnel, and the (C-S, C-G) 2090 identifies the multicast flow that is carried in the tunnel. 2092 The protocol uses the following constants. 2094 [S-PMSI_DELAY]: 2096 the PE router which is to transmit onto the S-PMSI will delay 2097 this amount of time before it begins using the S-PMSI. The 2098 default value is 3 seconds. 2100 [S-PMSI_TIMEOUT]: 2102 if a PE (other than the transmitter) does not receive any packets 2103 over the S-PMSI tunnel for this amount of time, the PE will prune 2104 itself from the S-PMSI tunnel, and will expect (C-S, C-G) packets 2105 to arrive on an I-PMSI. The default value is 3 minutes. This 2106 value must be consistent among PE routers. 2108 [S-PMSI_HOLDOWN]: 2110 if the PE that transmits onto the S-PMSI does not see any (C-S, 2111 C-G) packets for this amount of time, it will resume sending (C- 2112 S, C-G) packets on an I-PMSI. 2114 This is used to avoid oscillation when traffic is bursty. The 2115 default value is 1 minute. 2117 [S-PMSI_INTERVAL] 2118 the interval the transmitting PE router uses to periodically send 2119 the S-PMSI Join message. The default value is 60 seconds. 2121 7.2.2. A BGP-based Protocol for Switching to S-PMSIs 2123 This procedure can be used for an MVPN that is using either a UI-PMSI 2124 or an MI-PMSI. Consider a single multicast stream for a C-(S, G) 2125 within a given MVPN, and consider a PE which is attached to a source 2126 of multicast traffic for that stream. The PE can be configured to 2127 move the stream from the MI-PMSI or UI-PMSI to an S-PMSI if certain 2128 configurable conditions are met. Once a PE decides to move the C-(S, 2129 G) for a given MVPN to an S-PMSI, it needs to instantiate the S-PMSI 2130 using a tunnel and announce, to all the egress PEs that are on the 2131 path to receivers of the C-(S, G), the binding of the S-PMSI to 2132 the C-(S, G). The announcement is done using BGP.
Depending on the 2133 tunneling technology used, this announcement may be done before or 2134 after setting up the tunnel. The source and egress PEs have to switch 2135 to using the S-PMSI for the C-(S, G). 2137 7.2.2.1. Advertising C-(S, G) Binding to an S-PMSI using BGP 2139 The ingress PE informs all the PEs that are on the path to receivers 2140 of the C-(S, G) of the binding of the S-PMSI to the C-(S, G). The BGP 2141 announcement is done by sending an update for the MCAST-VPN address 2142 family. An A-D route is used, containing the following information: 2144 a) IP address of the originating PE 2146 b) The RD configured locally for the MVPN. This is required to 2147 uniquely identify the C-(S, G), as the addresses 2148 could overlap between different MVPNs. This is the same RD 2149 value used in the auto-discovery process. 2151 c) The C-Source address. This address can be a prefix in order to 2152 allow a range of C-Source addresses to be mapped to an 2153 Aggregate Tree. 2155 d) The C-Group address. This address can be a range in order to 2156 allow a range of C-Group addresses to be mapped to an Aggregate 2157 Tree. 2159 e) A PE MAY aggregate two or more S-PMSIs originated by the PE 2160 onto the same P-Multicast tree. If the PE already advertises S- 2161 PMSI auto-discovery routes for these S-PMSIs, then aggregation 2162 requires the PE to re-advertise these routes. The re-advertised 2163 routes MUST be the same as the original ones, except for the 2164 PMSI tunnel attribute. If the PE has not previously advertised 2165 S-PMSI auto-discovery routes for these S-PMSIs, then the 2166 aggregation requires the PE to advertise (new) S-PMSI auto- 2167 discovery routes for these S-PMSIs. The PMSI Tunnel attribute 2168 in the newly advertised/re-advertised routes MUST carry the 2169 identity of the P- Multicast tree that aggregates the S-PMSIs.
2170 If at least some of the S-PMSIs aggregated onto the same P- 2171 Multicast tree belong to different MVPNs, then all these routes 2172 MUST carry an MPLS upstream assigned label [MPLS-UPSTREAM- 2173 LABEL, section 6.3.4]. If all these aggregated S-PMSIs belong 2174 to the same MVPN, then the routes MAY carry an MPLS upstream 2175 assigned label [MPLS-UPSTREAM-LABEL]. The labels MUST be 2176 distinct on a per MVPN basis, and MAY be distinct on a per 2177 route basis. 2179 When a PE distributes this information via BGP, it must include the 2180 following: 2182 1. An identifier for the particular S-PMSI to which the stream is 2183 to be bound. This identifier is a structured field which 2184 includes the following information: 2186 * The type of tunnel used to instantiate the S-PMSI 2188 * An identifier for the tunnel. The form of the identifier 2189 will depend upon the tunnel type. The combination of 2190 tunnel identifier and tunnel type should contain enough 2191 information to enable all the PEs to "join" the tunnel and 2192 receive messages from it. 2194 2. Route Target Extended Communities attribute. This is used as 2195 described in section 4. 2197 7.2.2.2. Explicit Tracking 2199 If the PE wants to enable explicit tracking for the specified flow, 2200 it also indicates this in the A-D route it uses to bind the flow to a 2201 particular S-PMSI. Then any PE which receives the A-D route will 2202 respond with a "Leaf A-D Route" in which it identifies itself as a 2203 receiver of the specified flow. The Leaf A-D route will be withdrawn 2204 when the PE is no longer a receiver for the flow. 2206 If the PE needs to enable explicit tracking for a flow before binding 2207 the flow to an S-PMSI, it can do so by sending an A-D route 2208 identifying the flow but not specifying an S-PMSI. This will elicit 2209 the Leaf A-D Routes. This is useful when the PE needs to know the 2210 receivers before selecting an S-PMSI. 2212 7.2.2.3. 
Switching to S-PMSI 2214 After the egress PEs receive the announcement, they set up their 2215 forwarding path to receive traffic on the S-PMSI if they have one or 2216 more receivers interested in the C-(S, G) bound to the S-PMSI. This 2217 involves changing the RPF interface for the relevant C-(S, G) 2218 entries to the interface that is used to instantiate the S-PMSI. If 2219 an Aggregate Tree is used to instantiate an S-PMSI, this also implies 2220 setting up the demultiplexing forwarding entries based on the inner 2221 label as described in section 6.3.4. The egress PEs may perform the 2222 switch to the S-PMSI once the advertisement from the ingress PE is 2223 received or wait for a preconfigured timer to do so. 2225 A source PE may use one of two approaches to decide when to start 2226 transmitting data on the S-PMSI. In the first approach, once the 2227 source PE instantiates the S-PMSI, it starts sending multicast 2228 packets for C-(S, G) entries mapped to the S-PMSI on both that S-PMSI as 2229 well as on the I-PMSI, which is currently used to send traffic for 2230 the C-(S, G). After some preconfigured timer, the PE stops sending 2231 multicast packets for the C-(S, G) on the I-PMSI. In the second 2232 approach, after a certain pre-configured delay after advertising the 2233 C-(S, G) entry bound to an S-PMSI, the source PE begins to send 2234 traffic on the S-PMSI. At this point it stops sending traffic for the 2235 C-(S, G) on the I-PMSI. This traffic is instead transmitted on the 2236 S-PMSI. 2238 7.3. Aggregation 2240 S-PMSIs can be aggregated on a P-multicast tree. The S-PMSI to C-(S, 2241 G) binding advertisement supports aggregation. Furthermore, the 2242 aggregation procedures of section 6.3 apply. It is also possible to 2243 aggregate both S-PMSIs and I-PMSIs on the same P-multicast tree. 2245 7.4. Instantiating the S-PMSI with a PIM Tree 2247 The procedures of section 7.2 tell a PE when it must start listening 2248 and stop listening to a particular S-PMSI.
Those procedures do not 2249 specify the method for instantiating the S-PMSI. In this section, we 2250 provide the procedures to be used when the S-PMSI is instantiated as 2251 a PIM tree. The PIM tree is created by the PIM P-instance. 2253 If a single PIM tree is being used to aggregate multiple S-PMSIs, 2254 then the PIM tree to which a given stream is bound may have already 2255 been joined by a given receiving PE. If the tree does not already 2256 exist, then the appropriate PIM procedures to create it must be 2257 executed in the P-instance. 2259 If the S-PMSI for a particular multicast stream is instantiated as a 2260 PIM-SM or BIDIR-PIM tree, the S-PMSI identifier will specify the RP 2261 and the group P-address, and the PE routers which have receivers for 2262 that stream must build a shared tree toward the RP. 2264 If the S-PMSI is instantiated as a PIM-SSM tree, the PE routers build 2265 a source tree toward the PE router that is advertising the S-PMSI 2266 Join. The IP address of the root of the tree is the same as the source IP 2267 address which appears in the S-PMSI Join. In this case, the tunnel 2268 identifier in the S-PMSI Join will only need to specify a group P- 2269 address. 2271 The above procedures assume that each PE router has a set of group P- 2272 addresses that it can use for setting up the PIM trees. Each PE must 2273 be configured with this set of P-addresses. If PIM-SSM is used to 2274 set up the tunnels, then the PEs may be configured with overlapping sets of 2275 group P-addresses. If PIM-SSM is not used, then each PE must be 2276 configured with a unique set of group P-addresses (i.e., having no 2277 overlap with the set configured at any other PE router). The 2278 management of this set of addresses is thus greatly simplified when 2279 PIM-SSM is used, so the use of PIM-SSM is strongly recommended 2280 whenever PIM trees are used to instantiate S-PMSIs.
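The address-management rule above can be stated as a simple validity check: with PIM-SSM a P-tree is identified by the (root PE, P-group) pair, so per-PE group pools may overlap, whereas otherwise the P-group alone identifies the tree and the pools must be disjoint. The sketch below is illustrative; its names and data shapes are assumptions, not spec terms:

```python
# Illustrative check of per-PE group P-address pools.  With PIM-SSM,
# (root, group) disambiguates trees, so overlap across PEs is harmless;
# without SSM, the same P-group on two PEs would name the same tree.

def pools_are_valid(pe_group_pools, using_ssm):
    """pe_group_pools: dict mapping PE address -> set of group P-addresses."""
    if using_ssm:
        return True  # overlapping pools are acceptable under PIM-SSM
    seen = set()
    for groups in pe_group_pools.values():
        if seen & groups:
            return False  # the same P-group is configured on two PEs
        seen |= groups
    return True
```

The check makes the operational point above concrete: only the non-SSM case forces coordinated, provider-wide address assignment.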
2282 If it is known that all the PEs which need to receive data traffic on 2283 a given S-PMSI can support aggregation of multiple S-PMSIs on a 2284 single PIM tree, then the transmitting PE may, at its discretion, 2285 decide to bind the S-PMSI to a PIM tree which is already bound to one 2286 or more other S-PMSIs, from the same or from different MVPNs. In 2287 this case, appropriate demultiplexing information must be signaled. 2289 7.5. Instantiating S-PMSIs using RSVP-TE P2MP Tunnels 2291 RSVP-TE P2MP Tunnels can be used for instantiating S-PMSIs. 2292 Procedures described in the context of I-PMSIs in section 6.7 apply. 2294 8. Inter-AS Procedures 2296 If an MVPN has sites in more than one AS, it requires one or more 2297 PMSIs to be instantiated by inter-AS tunnels. This document 2298 describes two different types of inter-AS tunnel: 2300 1. "Segmented Inter-AS tunnels" 2302 A segmented inter-AS tunnel consists of a number of independent 2303 segments which are stitched together at the ASBRs. There are 2304 two types of segment: inter-AS segments and intra-AS segments. 2305 The segmented inter-AS tunnel consists of alternating intra-AS 2306 and inter-AS segments. 2308 Inter-AS segments connect adjacent ASBRs of different ASes; 2309 these "one-hop" segments are instantiated as unicast tunnels. 2311 Intra-AS segments connect ASBRs and PEs which are in the same 2312 AS. An intra-AS segment may be of whatever technology is 2313 desired by the SP that administers that AS. Different 2314 intra-AS segments may be of different technologies. 2316 Note that an intra-AS segment of an inter-AS tunnel is distinct 2317 from any intra-AS tunnel in the AS. 2319 A segmented inter-AS tunnel can be thought of as a tree which 2320 is rooted at a particular AS, and which has as its leaves the 2321 other ASes which need to receive multicast data from the root 2322 AS. 2324 2.
"Non-segmented Inter-AS tunnels" 2326 A non-segmented inter-AS tunnel is a single tunnel which spans 2327 AS boundaries. The tunnel technology cannot change from one 2328 point in the tunnel to the next, so all ASes through which the 2329 tunnel passes must support that technology. In essence, AS 2330 boundaries are of no significance to a non-segmented inter-AS 2331 tunnel. 2333 [Editor's Note: This is the model in [ROSEN-8] and [MVPN- 2334 BASE].] 2336 Section 10 of [RFC4364] describes three different options for 2337 supporting unicast Inter-AS BGP/MPLS IP VPNs, known as options A, B, 2338 and C. We describe below how both segmented and non-segmented inter- 2339 AS trees can be supported when option B or option C is used. (Option 2340 A does not pass any routing information through an ASBR at all, so no 2341 special inter-AS procedures are needed.) 2343 8.1. Non-Segmented Inter-AS Tunnels 2345 In this model, the previously described discovery and tunnel setup 2346 mechanisms are used, even though the PEs belonging to a given MVPN 2347 may be in different ASes. 2349 8.1.1. Inter-AS MVPN Auto-Discovery 2351 The previously described BGP-based auto-discovery mechanisms work "as 2352 is" when an MVPN contains PEs that are in different Autonomous 2353 Systems. However, please note that, if non-segmented Inter-AS 2354 Tunnels are to be used, then the "Intra-AS" A-D routes MUST be 2355 distributed across AS boundaries! 2357 8.1.2. Inter-AS MVPN Routing Information Exchange 2359 When non-segmented inter-AS tunnels are used, MVPN C-multicast 2360 routing information may be exchanged by means of PIM peering across 2361 an MI-PMSI, or by means of BGP carrying C-multicast routes. 2363 When PIM peering is used to distribute the C-multicast routing 2364 information, a PE that sends C-PIM Join/Prune messages for a 2365 particular C-(S,G) must be able to identify the PE which is its PIM 2366 adjacency on the path to S. 
This is the "selected upstream PE" 2367 described in section 5.1. 2369 If BGP (rather than PIM) is used to distribute the C-multicast 2370 routing information, and if option b of section 10 of [RFC4364] is in 2371 use, then the C-multicast routes will be installed in the ASBRs along 2372 the path from each multicast source in the MVPN to each multicast 2373 receiver in the MVPN. If option b is not in use, the C-multicast 2374 routes are not installed in the ASBRs. The handling of the C- 2375 multicast routes in either case is thus exactly analogous to the 2376 handling of unicast VPN-IP routes in the corresponding case. 2378 8.1.3. Inter-AS P-Tunnels 2380 The procedures described earlier in this document can be used to 2381 instantiate either an I-PMSI or an S-PMSI with inter-AS P-tunnels. 2382 Specific tunneling techniques require some explanation. 2384 If ingress replication is used, the inter-AS PE-PE tunnels will use 2385 the inter-AS tunneling procedures for the tunneling technology used. 2387 Procedures in [RSVP-P2MP] are used for inter-AS RSVP-TE P2MP P- 2388 Tunnels. 2390 Procedures for using PIM to set up the P-tunnels are discussed in 2391 the next section. 2393 8.1.4. PIM-Based Inter-AS P-Multicast Trees 2395 When PIM is used to set up an inter-AS P-multicast tree, the PIM 2396 Join/Prune messages used to join the tree contain the IP address of 2397 the upstream PE. However, there are two special considerations that 2398 must be taken into account: 2400 - It is possible that the P routers within one or more of the ASes 2401 will not have routes to the upstream PE. For example, if an AS 2402 has a "BGP-free core", the P routers in an AS will not have 2403 routes to addresses outside the AS. 2405 - If the PIM Join/Prune message must travel through several ASes, 2406 it is possible that the ASBRs will not have routes to the PE 2407 routers.
For example, in an inter-AS VPN constructed according 2408 to "option b" of section 10 of [RFC4364], the ASBRs do not 2409 necessarily have routes to the PE routers. 2411 If either of these two conditions obtains, then "ordinary" PIM 2412 Join/Prune messages cannot be routed to the upstream PE. Thus the 2413 following information needs to be added to the PIM Join/Prune 2414 messages: a "Proxy Address", which contains the address of the next 2415 ASBR on the path to the upstream PE. When the PIM Join/Prune arrives 2416 at the ASBR which is identified by the "proxy address", that ASBR 2417 must change the proxy address to identify the next hop ASBR. 2419 This information allows the PIM Join/Prune to be routed through an AS 2420 even if the P routers of that AS do not have routes to the upstream 2421 PE. However, this information is not sufficient to enable the ASBRs 2422 to route the Join/Prune if the ASBRs themselves do not have routes to 2423 the upstream PE. 2425 However, even if the ASBRs do not have routes to the upstream PE, the 2426 procedures of this draft ensure that they will have A-D routes that 2427 lead to the upstream PE. If non-segmented inter-AS MVPNs are being 2428 used, the ASBRs (and PEs) will have Intra-AS A-D routes which have 2429 been distributed inter-AS. 2431 So rather than having the PIM Join/Prune messages routed by the ASBRs 2432 along a route to the upstream PE, the PIM Join/Prune messages MUST 2433 be routed along the path determined by the intra-AS A-D routes. 2435 If the only intra-AS A-D route for a given MVPN is the "Intra-AS I- 2436 PMSI Route", the PIM Join/Prunes will be routed along that. However, 2437 if the PIM Join/Prune message is for a particular P-group address, 2438 and there is an "Intra-AS S-PMSI Route" specifying that particular P- 2439 group address as the P-tunnel for a particular S-PMSI, then the PIM 2440 Join/Prunes MUST be routed along the path determined by those intra- 2441 AS A-D routes. 
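The selection rule above (prefer an Intra-AS S-PMSI A-D route matching the P-group, otherwise fall back to the Intra-AS I-PMSI route, and rewrite the Proxy Address at each ASBR) can be sketched as follows. The data structures and function names are hypothetical, since the actual encoding is deferred to the PIM Join Attribute work cited below:

```python
# Rough sketch of routing a C-PIM Join/Prune toward the upstream PE
# along A-D routes rather than unicast routes.  Each A-D route is a
# dict with 'kind' ("S-PMSI" or "I-PMSI"), an optional 'p_group', and
# the 'next_hop_asbr' derived from the route's propagation path.

def select_ad_route(ad_routes, p_group):
    # Prefer an Intra-AS S-PMSI route for this particular P-group.
    for route in ad_routes:
        if route["kind"] == "S-PMSI" and route.get("p_group") == p_group:
            return route
    # Otherwise fall back to the Intra-AS I-PMSI route, if any.
    for route in ad_routes:
        if route["kind"] == "I-PMSI":
            return route
    return None

def forward_join(join, ad_routes):
    """At each ASBR, rewrite the proxy address to the next-hop ASBR
    taken from the selected A-D route."""
    route = select_ad_route(ad_routes, join["p_group"])
    if route is not None:
        join["proxy_address"] = route["next_hop_asbr"]
    return join
```

This mirrors why the Proxy Address alone is insufficient: the ASBR still needs the A-D routes to know which neighbor leads toward the upstream PE.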
2443 The next revision of this document will provide the following 2444 details: 2446 - encoding of the proxy address in the PIM message (the PIM Join 2447 Attribute [PIM-ATTRIB] will be used) 2449 - encoding of any other information which may be needed in order to 2450 enable the correct intra-AS route to be chosen. 2452 Support for non-segmented inter-AS trees using BIDIR-PIM is for 2453 further study. 2455 8.2. Segmented Inter-AS Tunnels 2457 8.2.1. Inter-AS MVPN Auto-Discovery Routes 2459 The BGP based MVPN membership discovery procedures of section 4 are 2460 used to auto-discover the intra-AS MVPN membership. This section 2461 describes the additional procedures for inter-AS MVPN membership 2462 discovery. It also describes the procedures for constructing 2463 segmented inter-AS tunnels. 2465 In this case, for a given MVPN in an AS, the objective is to form a 2466 spanning tree of MVPN membership, rooted at the AS. The nodes of this 2467 tree are ASes. The leaves of this tree are only those ASes that have 2468 at least one PE with a member in the MVPN. The inter-AS tunnel used 2469 to instantiate an inter-AS PMSI must traverse this spanning tree. A 2470 given AS needs to announce to another AS only the fact that it has 2471 membership in a given MVPN. It doesn't need to announce the 2472 membership of each PE in the AS to other ASes. 2474 This section defines an inter-AS auto-discovery route as a route that 2475 carries information about an AS that has one or more PEs (directly) 2476 connected to the site(s) of that MVPN. Further it defines an inter-AS 2477 leaf auto-discovery route (leaf auto-discovery route) as a route used 2478 to inform the root of an intra-AS segment, of an inter-AS tunnel, of 2479 a leaf of that intra-AS segment. 2481 8.2.1.1. Originating Inter-AS MVPN A-D Information 2483 A PE in a given AS advertises its MVPN membership to all its IBGP 2484 peers. 
This IBGP peer may be a route reflector, which in turn 2485 advertises this information to only its IBGP peers. In this manner, 2486 all the PEs and ASBRs in the AS learn this membership information. 2488 An Autonomous System Border Router (ASBR) may be configured to 2489 support a particular MVPN. If an ASBR is configured to support a 2490 particular MVPN, the ASBR MUST participate in the intra-AS MVPN auto- 2491 discovery/binding procedures for that MVPN within the AS that the 2492 ASBR belongs to, as defined in this document. 2494 Each ASBR then advertises the "AS MVPN membership" to its neighbor 2495 ASBRs using EBGP. This inter-AS auto-discovery route must not be 2496 advertised to the PEs/ASBRs in the same AS as this ASBR. The 2497 advertisement carries the following information elements: 2499 a. A Route Distinguisher for the MVPN. For a given MVPN each ASBR 2500 in the AS must use the same RD when advertising this 2501 information to other ASBRs. To accomplish this, all the ASBRs 2502 within that AS, that are configured to support the MVPN, MUST 2503 be configured with the same RD for that MVPN. This RD MUST be 2504 of Type 0 and MUST embed the autonomous system number of the AS. 2506 b. The announcing ASBR's local address as the next-hop for the 2507 above information elements. 2509 c. By default, the BGP Update message MUST carry the export Route 2510 Targets used by the unicast routing of that VPN. The default 2511 could be modified via configuration by having a set of Route 2512 Targets used for the inter-AS auto-discovery routes being 2513 distinct from the ones used by the unicast routing of that VPN. 2515 8.2.1.2. Propagating Inter-AS MVPN A-D Information 2517 As an inter-AS auto-discovery route originated by an ASBR within a 2518 given AS is propagated via BGP to other ASes, this results in 2519 creation of a data plane tunnel that spans multiple ASes.
This tunnel 2520 is used to carry (multicast) traffic from the MVPN sites connected to 2521 the PEs of the AS to the MVPN sites connected to the PEs that are in 2522 the other ASes. Such a tunnel consists of multiple intra-AS segments 2523 (one per AS) stitched at the ASBR boundaries by single-hop 2524 LSP segments. 2526 An ASBR originates creation of an intra-AS segment when the ASBR 2527 receives an inter-AS auto-discovery route from an EBGP neighbor. 2528 Creation of the segment is completed as a result of distributing via 2529 IBGP this route within the ASBR's own AS. 2531 For a given inter-AS tunnel, each of its intra-AS segments could be 2532 constructed by its own independent mechanism. Moreover, by using 2533 upstream-assigned labels within a given AS, multiple intra-AS segments of 2534 different inter-AS tunnels of either the same or different MVPNs may 2535 share the same P-Multicast Tree. 2537 Since (aggregated) inter-AS auto-discovery routes have granularity of 2538 <MVPN, AS>, an MVPN that is present in N ASes would have a total of N 2539 inter-AS tunnels. Thus for a given MVPN the number of inter-AS 2540 tunnels is independent of the number of PEs that have this MVPN. 2542 The following sections specify procedures for propagation of 2543 (aggregated) inter-AS auto-discovery routes across ASes. 2545 8.2.1.2.1. Inter-AS Auto-Discovery Route received via EBGP 2547 When an ASBR receives from one of its EBGP neighbors a BGP Update 2548 message that carries the inter-AS auto-discovery route, if (a) at 2549 least one of the Route Targets carried in the message matches one of 2550 the import Route Targets configured on the ASBR, and (b) the ASBR 2551 determines that the received route is the best route to the 2552 destination carried in the NLRI of the route, the ASBR: 2554 a) Re-advertises this inter-AS auto-discovery route within its own 2555 AS.
If the ASBR uses ingress replication to instantiate the intra-AS segment of the inter-AS tunnel, the re-advertised route SHOULD carry a Tunnel attribute with the Tunnel Identifier set to Ingress Replication, but no MPLS labels.

If a P-Multicast Tree is used to instantiate the intra-AS segment of the inter-AS tunnel, and the ASBR does not need to know the leaves of the tree beforehand in order to advertise the P-Multicast tree identifier, then the advertising ASBR SHOULD advertise the P-Multicast tree identifier in the Tunnel Identifier of the Tunnel attribute. This, in effect, creates a binding between the inter-AS auto-discovery route and the P-Multicast Tree.

If a P-Multicast Tree is used to instantiate the intra-AS segment of the inter-AS tunnel, and the advertising ASBR needs to know the leaves of the tree beforehand in order to advertise the P-Multicast tree identifier, the ASBR first discovers the leaves using the auto-discovery procedures, as specified further down. It then advertises the binding of the tree to the inter-AS auto-discovery route by re-advertising the original auto-discovery route, with the addition of a Tunnel attribute that contains the type and the identity of the tree (encoded in the Tunnel Identifier of the attribute).

b) Re-advertises the received inter-AS auto-discovery route to its EBGP peers, other than the EBGP neighbor from which the best inter-AS auto-discovery route was received.

c) Advertises to its neighbor ASBR, from which it received the best inter-AS auto-discovery route to the destination carried in the NLRI of the route, a leaf auto-discovery route that carries an ASBR-ASBR tunnel binding with the tunnel identifier set to ingress replication. This binding, as described in section 6, can be used by the neighbor ASBR to send traffic to this ASBR.
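The checks and resulting actions of section 8.2.1.2.1 can be sketched as follows. This is an illustrative sketch only: the dict-based route representation and the action names are hypothetical, not part of any specified encoding.

```python
def process_inter_as_ad_route(route, import_rts, is_best_route):
    """Sketch of an ASBR's handling of an inter-AS auto-discovery route
    received from an EBGP neighbor (section 8.2.1.2.1).

    `route` is a hypothetical dict with keys "route_targets" and
    "from_neighbor"; `import_rts` are the ASBR's configured import RTs.
    """
    # Condition (a): at least one Route Target must match an import RT.
    if not set(route["route_targets"]) & set(import_rts):
        return []
    # Condition (b): the route must be the best route to the NLRI.
    if not is_best_route:
        return []
    return [
        # a) re-advertise within the ASBR's own AS, with a Tunnel
        #    attribute describing the intra-AS segment (three cases above)
        ("readvertise_ibgp", route),
        # b) re-advertise to the other EBGP peers
        ("readvertise_ebgp", route),
        # c) send a leaf auto-discovery route back to the neighbor,
        #    carrying an ASBR-ASBR tunnel binding (ingress replication)
        ("leaf_ad_to_neighbor", route["from_neighbor"]),
    ]
```

Note that both conditions gate all three actions: a route that is not best, or that matches no import RT, triggers nothing.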
8.2.1.2.2. Leaf Auto-Discovery Route received via EBGP

When an ASBR receives via EBGP a leaf auto-discovery route, the ASBR finds the inter-AS auto-discovery route that has the same RD as the leaf auto-discovery route. The MPLS label carried in the leaf auto-discovery route is used to stitch a one-hop ASBR-ASBR LSP to the tail of the intra-AS tunnel segment associated with the inter-AS auto-discovery route.

8.2.1.2.3. Inter-AS Auto-Discovery Route received via IBGP

If a given inter-AS auto-discovery route is advertised within an AS by multiple ASBRs of that AS, the BGP best route selection performed by other PE/ASBR routers within the AS does not require all these PE/ASBR routers to select the route advertised by the same ASBR; on the contrary, different PE/ASBR routers may select routes advertised by different ASBRs.

Further, when a PE/ASBR receives from one of its IBGP neighbors a BGP Update message that carries an inter-AS auto-discovery route, if (a) the route was originated outside of the router's own AS, (b) at least one of the Route Targets carried in the message matches one of the import Route Targets configured on the PE/ASBR, and (c) the PE/ASBR determines that the received route is the best route to the destination carried in the NLRI of the route, then, if the router is an ASBR, the ASBR propagates the route to its EBGP neighbors. In addition the PE/ASBR performs the following.

If the received inter-AS auto-discovery route carries the Tunnel attribute with the Tunnel Identifier set to LDP P2MP LSP, PIM-SSM tree, or PIM-SM tree, the PE/ASBR SHOULD join the P-Multicast tree whose identity is carried in the Tunnel Identifier.
If the received source auto-discovery route carries the Tunnel attribute with the Tunnel Identifier set to RSVP-TE P2MP LSP, then the ASBR that originated the route MUST signal the local PE/ASBR as one of the leaf LSRs of the RSVP-TE P2MP LSP. This signaling MAY have been completed before the local PE/ASBR receives the BGP Update message.

If the NLRI of the route does not carry a label, then this tree is an intra-AS LSP segment that is part of the inter-AS tunnel for the MVPN advertised by the inter-AS auto-discovery route. If the NLRI carries an (upstream) label, then a combination of this tree and the label identifies the intra-AS segment.

If this is an ASBR, this intra-AS segment may further be stitched to an ASBR-ASBR inter-AS segment of the inter-AS tunnel. If the PE/ASBR has local receivers in the MVPN, packets received over the intra-AS segment must be forwarded to the local receivers using the local VRF.

If the received inter-AS auto-discovery route either does not carry the Tunnel attribute, or carries the Tunnel attribute with the Tunnel Identifier set to ingress replication, then the PE/ASBR originates a new auto-discovery route to allow the ASBR from which the auto-discovery route was received to learn of this PE/ASBR as a leaf of the intra-AS tree.

Thus the AS MVPN membership information propagates across multiple ASes along a spanning tree. The BGP AS-Path based loop prevention mechanism prevents loops from forming as this information propagates.

8.2.2. Inter-AS MVPN Routing Information Exchange

All of the MVPN routing information exchange methods specified in section 5 can be supported across ASes.
The objective in this case is to propagate the MVPN routing information to the remote PE that originates the unicast route to C-S/C-RP, in the reverse direction of the AS MVPN membership information announced by the remote PE's origin AS. This information is processed by each ASBR along this reverse path.

To achieve this, the PE that is generating the MVPN routing advertisement first determines the source AS of the unicast route to C-S/C-RP. It then determines, from the AS MVPN membership information received for the source AS, the ASBR that is the next-hop for the best path of the source AS MVPN membership. The BGP MVPN routing update is sent to this ASBR, and the ASBR then further propagates the BGP advertisement. BGP filtering mechanisms ensure that the BGP MVPN routing information updates flow only to the upstream router on the reverse path of the inter-AS MVPN membership tree. Details of this filtering mechanism and the relevant encoding will be specified in a separate document.

8.2.3. Inter-AS I-PMSI

All PEs in a given AS use the same inter-AS heterogeneous tunnel, rooted at the AS, to instantiate an I-PMSI for an inter-AS MVPN service. As explained earlier, the intra-AS tunnel segments that comprise this tunnel can be built using different tunneling technologies. To instantiate an MI-PMSI service for an MVPN, there must be an inter-AS tunnel rooted at each AS that has at least one PE that is a member of the MVPN.

A C-multicast data packet is sent using an intra-AS tunnel segment by the PE that first receives this packet from the MVPN customer site. An ASBR forwards this packet to any locally connected MVPN receivers for the multicast stream. If this ASBR has received a tunnel binding for the AS MVPN membership that it advertised to a neighboring ASBR, it also forwards this packet to the neighboring ASBR.
In this case the packet is encapsulated in the downstream MPLS label received from the neighboring ASBR. The neighboring ASBR delivers this packet to any locally connected MVPN receivers for that multicast stream. It also transports this packet on an intra-AS tunnel segment of the inter-AS MVPN tunnel, and the other PEs and ASBRs in the AS then receive this packet. The other ASBRs then repeat the procedure followed by the ASBR in the origin AS, and the packet traverses the overlay inter-AS tunnel along a spanning tree.

8.2.3.1. Support for Unicast VPN Inter-AS Methods

The above procedures for setting up an inter-AS I-PMSI can be supported for each of the unicast VPN inter-AS models described in [RFC4364]. These procedures do not depend on the method used to exchange unicast VPN routes. For Option B and Option C they do require MPLS encapsulation between the ASBRs.

8.2.4. Inter-AS S-PMSI

An inter-AS tunnel for an S-PMSI is constructed similarly to an inter-AS tunnel for an I-PMSI. Namely, such a tunnel is constructed as a concatenation of tunnel segments. There are two types of tunnel segments: an intra-AS tunnel segment (a segment that spans ASBRs within the same AS), and an inter-AS tunnel segment (a segment that spans adjacent ASBRs in adjacent ASes). ASes that are spanned by a tunnel are not required to use the same tunneling mechanism to construct the tunnel; each AS may pick a tunneling mechanism to construct the intra-AS tunnel segment of the tunnel on its own.

The PE that decides to set up an S-PMSI advertises the S-PMSI tunnel binding, using the procedures in section 7.3.2, to the routers in its own AS. The (C-S, C-G) membership for which the S-PMSI is instantiated is propagated along an inter-AS spanning tree. This spanning tree traverses the same ASBRs as the AS MVPN membership spanning tree.
In addition to the information elements described in section 7.3.2 (Origin AS, RD, next-hop), the C-S and C-G are also advertised.

An ASBR that receives the (C-S, C-G) AS information from its upstream ASBR using EBGP sends back a tunnel binding for the (C-S, C-G) AS information if (a) at least one of the Route Targets carried in the message matches one of the import Route Targets configured on the ASBR, and (b) the ASBR determines that the received route is the best route to the destination carried in the NLRI of the route. If the ASBR instantiates an S-PMSI for the (C-S, C-G), it sends back a downstream label that is used to forward the packet along its intra-AS S-PMSI for the (C-S, C-G). However, the ASBR may decide to use an AS MVPN membership I-PMSI instead, in which case it sends back the same label that it advertised for the AS MVPN membership I-PMSI. If the downstream ASBR instantiates an S-PMSI, it further propagates the (C-S, C-G) membership to its downstream ASes; otherwise it does not.

An AS can instantiate an intra-AS S-PMSI for the inter-AS S-PMSI tunnel only if the upstream AS instantiates an S-PMSI. The procedures allow each AS to determine whether it wishes to set up an S-PMSI or not, and an AS is not forced to set up an S-PMSI just because the upstream AS decides to do so.

The leaves of an intra-AS S-PMSI tunnel will be the PEs that have local receivers that are interested in (C-S, C-G) and the ASBRs that have received MVPN routing information for (C-S, C-G). Note that an AS can determine these ASBRs, as the MVPN routing information is propagated and processed by each ASBR on the AS MVPN membership spanning tree.

The C-multicast data traffic is sent on the S-PMSI by the originating PE.
When it reaches an ASBR that is on the spanning tree, it is delivered to local receivers, if any, and is also forwarded to the neighbor ASBR after being encapsulated in the label advertised by the neighbor. The neighbor ASBR either transports this packet on the S-PMSI for the multicast stream or on an I-PMSI, delivering it to the ASBRs in its own AS. These ASBRs in turn repeat the procedures of the origin AS ASBRs, and the multicast packet traverses the spanning tree.

9. Duplicate Packet Detection and Single Forwarder PE

Consider the case of an egress PE that receives packets of a customer multicast stream (C-S, C-G) over a non-aggregated S-PMSI. The procedures described so far will never cause the PE to receive duplicate copies of any packet in that stream. It is possible that the (C-S, C-G) stream is carried in more than one S-PMSI; this may happen when the site that contains C-S is multihomed to more than one PE. However, a PE that needs to receive (C-S, C-G) packets only joins one of these S-PMSIs, and so only receives one copy of each packet.

However, if the data packets of stream (C-S, C-G) are carried in either an I-PMSI or an aggregated S-PMSI, then the procedures specified so far make it possible for an egress PE to receive more than one copy of each data packet. In this section, we define additional procedures to ensure that an MVPN customer sees no multicast data packet duplication.

This section covers the situation where the customer multicast tree is unidirectional, i.e., the C-G is either a "Sparse Mode" or a "Single Source Mode" group. The case where the customer multicast tree is bidirectional (the C-G is a BIDIR-PIM group) is considered separately in section 12.
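The duplicate-suppression rule developed in section 9.1 below, under which an egress PE forwards a (C-S, C-G) packet only if it arrives from the upstream PE it selected when joining, can be sketched as follows. The class and method names are hypothetical; this is an illustration of the rule, not a specified data model.

```python
class EgressPe:
    """Minimal sketch of an egress PE's duplicate filtering: accept a
    (C-S, C-G) packet only from the PE chosen as the upstream multicast
    hop when the C-multicast Join was sent (section 9.1)."""

    def __init__(self):
        # (c_s, c_g) -> identity of the expected upstream PE
        self.expected_upstream = {}

    def join(self, c_s, c_g, upstream_pe):
        """Record the upstream PE selected in the C-multicast routing
        update sent to join (c_s, c_g)."""
        self.expected_upstream[(c_s, c_g)] = upstream_pe

    def accept(self, c_s, c_g, ingress_pe):
        """Return True if the packet came from the expected upstream PE
        and should be forwarded; duplicates from any other PE are
        dropped."""
        return self.expected_upstream.get((c_s, c_g)) == ingress_pe
```

Whether `ingress_pe` can be determined at all depends on the P-tunnel technology, which is exactly the point developed in the rest of this section.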
The first case when an egress PE may receive duplicate multicast data packets is the case where both (a) an MVPN site that contains C-S or C-RP is multihomed to more than one PE, and (b) either an I-PMSI or an aggregated S-PMSI is used for carrying the packets originated by C-S. In this case, an egress PE may receive one copy of the packet from each PE to which the site is homed.

The second case when an egress PE may receive duplicate multicast data packets is when all of the following are true: (a) the IP destination address of the customer packet is a C-G that is operating in ASM mode, and whose C-multicast tree is set up using PIM-SM, (b) an MI-PMSI is used for carrying the packets, and (c) a router or a CE in a site connected to the egress PE switches from the C-RP tree to the C-S tree. In this case, it is possible to get one copy of a given packet from the ingress PE attached to the C-RP's site, and one from the ingress PE attached to the C-S's site.

9.1. Multihomed C-S or C-RP

In the first case, for a given (C-S, C-G), an egress PE, say PE1, expects to receive C-data packets from the upstream PE, say PE2, which PE1 identified as the upstream multicast hop in the C-Multicast Routing Update that PE1 sent in order to join (C-S, C-G). If PE1 can determine that a data packet for (C-S, C-G) was received from the expected upstream PE, PE2, PE1 will accept and forward the packet. Otherwise, PE1 will drop the packet; this means that the PE will see a duplicate, but the duplicate will not get forwarded. (But see section 10 for an exception case where PE1 will accept a packet even if it is from an unexpected upstream PE.)

The method used by an egress PE to determine the ingress PE for a particular packet, received over a particular PMSI, depends on the P-tunnel technology that is used to instantiate the PMSI. If the P-tunnel is a P2MP LSP, a PIM-SM or PIM-SSM tree, or a unicast tunnel, then the tunnel encapsulation contains information which can be used (possibly along with other state information in the PE) to determine the ingress PE, as long as the P-tunnel is instantiating an intra-AS PMSI, or an inter-AS PMSI which is supported by a non-segmented inter-AS tunnel.

Even when inter-AS segmented tunnels are used, if a UI-PMSI or an aggregated S-PMSI is used for carrying the packets, the P-tunnel encapsulation must have some information which can be used to identify the PMSI, and that in turn implicitly identifies the ingress PE.

If an MI-PMSI is used for carrying the packets, the MI-PMSI spans multiple ASes, and the MI-PMSI is realized via segmented inter-AS tunnels, then if C-S or C-RP is multi-homed to different PEs, as long as each such PE is in a different AS, the egress PE can detect duplicate traffic, as such duplicate traffic will arrive on a different (inter-AS) tunnel. Specifically, if the PE was expecting the traffic on a particular inter-AS tunnel, duplicate traffic will arrive either on an intra-AS tunnel [this is not an intra-AS tunnel segment of an inter-AS tunnel], or on some other inter-AS tunnel. Therefore, to detect duplicates the PE has to keep track of which (inter-AS) auto-discovery route the PE uses for sending MVPN multicast routing information towards C-S/C-RP. Then the PE should receive (multicast) traffic originated by C-S/C-RP only from the (inter-AS) tunnel that was carried in the best inter-AS auto-discovery route for the MVPN and was originated by the AS that contains C-S/C-RP (where "the best" is determined by the PE). The PE should discard, as duplicates, all other multicast traffic originated by C-S/C-RP but received on any other tunnel.

9.1.1.
Single forwarder PE selection

When, for a given MVPN, (a) an MI-PMSI is used for carrying multicast data packets, (b) C-S or C-RP is multi-homed to different PEs, and (c) at least two of such PEs are in the same AS, then, depending on the tunneling technology used by the MI-PMSI, it may not always be possible for the egress PE to determine the upstream PE. Therefore, when this determination may not be possible, procedures are needed to ensure that packets are received on an MI-PMSI at an egress PE from only a single upstream PE. Furthermore, even if the determination is possible, it may be preferable to send only one copy of each packet to each egress PE, rather than sending multiple copies and having the egress PE discard all but one.

Section 5.1 specifies a procedure for choosing a "default upstream PE selection", such that (except during routing transients) all PEs will choose the same default upstream PE. To ensure that duplicate packets are not sent through the backbone (except during routing transients), an ingress PE does not forward to the backbone any (C-S, C-G) multicast data packet it receives from a CE, unless the PE is the default upstream PE selection.

This procedure is optional whenever the P-tunnel technology that is being used to carry the multicast stream in question allows the egress PEs to determine the identity of the ingress PE. This procedure is mandatory if the P-tunnel technology does not make this determination possible.

The above procedure ensures that if C-S or C-RP is multi-homed to PEs within a single AS, a PE will not receive duplicate traffic as long as all the PEs are on either the C-S or the C-RP tree. If some PEs are on the C-S tree and some on the C-RP tree, however, packet duplication is still possible. This is discussed in the next section.

9.2.
Switching from the C-RP tree to C-S tree

If some PEs are on the C-S tree and some on the C-RP tree, then a PE may also receive duplicate traffic during a switch from the C-RP tree to the C-S tree. The issue and the solution are described next.

When for a given MVPN (a) an MI-PMSI is used for carrying multicast data packets, (b) C-S and C-RP are connected to PEs within the same AS, and (c) the MI-PMSI tunneling technology in use does not allow the egress PEs to identify the ingress PE, then having all the PEs select the same PE to be the upstream multicast hop for C-S or C-RP is not sufficient to prevent packet duplication.

The reason is that a single tunnel used by the MI-PMSI may be carrying traffic on both the (C-*, C-G) tree and the (C-S, C-G) tree. If some of the egress PEs have joined the source tree, but others expect to receive (C-S, C-G) packets from the shared tree, then two copies of each data packet will travel on the tunnel, and since, due to the choice of the tunneling technology, the egress PEs have no way to identify the ingress PE, the egress PEs will have no way to determine that only one copy should be accepted.

To avoid this, it is necessary to ensure that once any PE joins the (C-S, C-G) tree, any other PE that has joined the (C-*, C-G) tree also switches to the (C-S, C-G) tree (selecting, of course, the same upstream multicast hop, as specified above).

Whenever a PE creates a (C-S, C-G) state as a result of receiving a C-multicast route for (C-S, C-G) from some other PE, and the C-G group is a Sparse Mode group, the PE that creates the state MUST originate an auto-discovery route as specified below. The route is advertised using the same procedures as the MVPN auto-discovery/binding (both intra-AS and inter-AS) specified in this document, with the following modifications:

1. The Multicast Source field MUST be set to C-S.
The Multicast Source Length field is set appropriately to reflect this.

2. The Multicast Group field MUST be set to C-G. The Multicast Group Length field is set appropriately to reflect this.

The route goes to all the PEs of the MVPN. When a PE receives this route, it checks whether there are any receivers in the MVPN sites attached to the PE for the group carried in the route. If yes, then it generates a C-multicast route indicating Join for (C-S, C-G).

This forces all the PEs (in all ASes) to switch from the C-RP tree to the C-S tree for (C-S, C-G).

This is the same type of A-D route used to report active sources in the scenarios described in section 10.

Note that when a PE thus joins the (C-S, C-G) tree, it may need to send a PIM (S,G,RPT-bit) prune to one of its CE PIM neighbors, as determined by ordinary PIM procedures. Whenever the PE deletes the (C-S, C-G) state that was previously created as a result of receiving a C-multicast route for (C-S, C-G) from some other PE, the PE that deletes the state also withdraws the auto-discovery route that was advertised when the state was created.

N.B.: SINCE ALL PEs WITH RECEIVERS FOR GROUP C-G WILL JOIN THE C-S SOURCE TREE IF ANY OF THEM DO, IT IS NEVER NECESSARY TO DISTRIBUTE A BGP C-MULTICAST ROUTE FOR THE PURPOSE OF PRUNING SOURCES FROM THE SHARED TREE.

10. Eliminating PE-PE Distribution of (C-*,C-G) State

In sparse mode PIM, a node that wants to become a receiver for a particular multicast group G first joins a shared tree, rooted at a rendezvous point. When the receiver detects traffic from a particular source, it has the option of joining a source tree, rooted at that source. If it does so, it has to prune that source from the shared tree, to ensure that it receives packets from that source on only one tree.
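The receiver behavior just described can be sketched as follows. This is a deliberately simplified, hypothetical state model (real PIM-SM keeps per-interface state and timers); it only illustrates the invariant that a source is received on exactly one tree.

```python
class ReceiverState:
    """Sketch of a PIM-SM receiver for group G: starts on the shared
    (*,G) tree and may switch individual sources to (S,G) trees."""

    def __init__(self, group):
        self.group = group
        self.on_shared_tree = True    # joined (*,G) toward the RP
        self.source_trees = set()     # sources joined via (S,G)
        self.pruned_sources = set()   # sources pruned off the shared tree

    def switch_to_source_tree(self, source):
        """On detecting traffic from `source`, join (S,G) and prune S
        from the shared tree, so packets from S arrive on one tree only."""
        self.source_trees.add(source)
        self.pruned_sources.add(source)

    def receives_on_one_tree(self, source):
        # A source is delivered via the source tree XOR the shared tree.
        via_spt = source in self.source_trees
        via_shared = self.on_shared_tree and source not in self.pruned_sources
        return via_spt != via_shared
```

It is exactly the `pruned_sources` state, tracked per branch of the shared tree, that the procedures of this section aim to eliminate from the PE-PE exchange.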
Maintaining the shared tree can require considerable state, as it is necessary not only to know who the upstream and downstream nodes are, but also to know which sources have been pruned off which branches of the shared tree.

The BGP-based signaling procedures defined in this document and in [MVPN-BGP] eliminate the need for PEs to distribute to each other any state having to do with which sources have been pruned off a shared C-tree. Those procedures do still allow multicast data traffic to travel on a shared C-tree, but they do not allow a situation in which some CEs receive (S,G) traffic on a shared tree and some on a source tree. This results in a considerable simplification of the PE-PE procedures, with minimal change to the multicast service seen within the VPN. However, shared C-trees are still supported across the VPN backbone. That is, (C-*, C-G) state is distributed PE-PE, but (C-*, C-G, RPT-bit) state is not.

In this section, we specify a number of optional procedures which go further, and which completely eliminate the support for shared C-trees across the VPN backbone. In these procedures, the PEs keep track of the active sources for each C-G. As soon as a CE tries to join the (*,G) tree, the PEs instead join the (S,G) trees for all the active sources. Thus all distribution of (C-*,C-G) state is eliminated. These procedures are optional because they require some additional support on the part of the VPN customer, and because they are not always appropriate. (E.g., a VPN customer may have his own policy of always using shared trees for certain multicast groups.) There are several different options, described in the following sub-sections.

10.1.
Co-locating C-RPs on a PE

[MVPN-REQ] describes C-RP engineering as an issue when PIM-SM (or BIDIR-PIM) is used in "Any Source Multicast (ASM) mode" [RFC4607] on the VPN customer site. To quote from [MVPN-REQ]:

"In some cases this engineering problem is not trivial: for instance, if sources and receivers are located in VPN sites that are different than that of the RP, then traffic may flow twice through the SP network and the CE-PE link of the RP (from source to RP, and then from RP to receivers); this is obviously not ideal. A multicast VPN solution SHOULD propose a way to help on solving this RP engineering issue."

One of the C-RP deployment models is for the customer to outsource the RP to the provider. In this case the provider may co-locate the RP on the PE that is connected to the customer site [MVPN-REQ]. This model is introduced in [RP-MVPN]. This section describes how anycast-RP can be used to achieve this by advertising active sources.

10.1.1. Initial Configuration

For a particular MVPN, one or more PEs that have sites in that MVPN act as an RP for the sites of that MVPN connected to those PEs. Within each MVPN all these RPs use the same (anycast) address. All these RPs use the Anycast RP technique.

10.1.2. Anycast RP Based on Propagating Active Sources

This mechanism is based on propagating active sources between RPs.

[Editor's Note: This is derived from the model in [RP-MVPN].]

10.1.2.1. Receiver(s) Within a Site

The PE which receives a C-Join for (*,G) or (S,G) does not send the information that it has receiver(s) for G until it receives information about active sources for G from an upstream PE.

On receiving this information (described in the next section), the downstream PE will respond with a Join for C-(S,G).
Sending this information could be done using any of the procedures described in section 5. If BGP is used, the ingress address is set to the address of the upstream PE that has triggered the source active information; only that upstream PE will process this information. If unicast PIM is used, then a unicast PIM message will have to be sent to the upstream PE that has triggered the source active information. If an MI-PMSI is used, then further clarification is needed on the upstream neighbor address of the PIM message; this will be provided in a future revision.

10.1.2.2. Source Within a Site

When a PE receives a PIM-Register from a site that belongs to a given VPN, the PE follows the normal PIM anycast RP procedures. It then advertises the source and group of the multicast data packet carried in the PIM-Register message to other PEs in BGP, using the following information elements:

- Active source address

- Active group address

- Route Target of the MVPN.

This advertisement goes to all the PEs that belong to that MVPN. When a PE receives this advertisement, it checks whether there are any receivers in the sites attached to the PE for the group carried in the source active advertisement. If yes, then it generates an advertisement for C-(S,G) as specified in the previous section.

Note that the mechanism described in section 7.3.2 can be leveraged to advertise an S-PMSI binding along with the source active messages.

10.1.2.3. Receiver Switching from Shared to Source Tree

No additional procedures are required when multicast receivers in a customer site switch from the shared tree to a source tree.

10.2. Using MSDP between a PE and a Local C-RP

Section 10.1 describes the case where each PE is a C-RP.
This enables the PEs to know the active multicast sources for each MVPN, and they can then use BGP to distribute this information to each other. As a result, the PEs do not have to join any shared C-trees, and this results in a simplification of the PE operation.

In another deployment scenario, the PEs are not themselves C-RPs, but use MSDP to talk to the C-RPs. In particular, a PE which attaches to a site that contains a C-RP becomes an MSDP peer of that C-RP. That PE then uses BGP to distribute the information about the active sources to the other PEs. When the PE determines, via MSDP, that a particular source is no longer active, it withdraws the corresponding BGP update. The PEs then do not have to join any shared C-trees, but they do not have to be C-RPs either.

MSDP provides the capability for a Source Active message to carry an encapsulated data packet. This capability can be used to allow an MSDP speaker to receive the first (or first several) packet(s) of an (S,G) flow, even though the MSDP speaker hasn't yet joined the (S,G) tree. (Presumably it will join that tree as a result of receiving the SA message which carries the encapsulated data packet.) If this capability is not used, the first several data packets of an (S,G) stream may be lost.

A PE which is talking MSDP to an RP may receive such an encapsulated data packet from the RP. The data packet should be decapsulated and transmitted to the other PEs in the MVPN. If the packet belongs to a particular (S,G) flow, and if the PE is a transmitter for some S-PMSI to which (S,G) has already been bound, the decapsulated data packet should be transmitted on that S-PMSI. Otherwise, if an I-PMSI exists for that MVPN, the decapsulated data packet should be transmitted on it. (If a default MI-PMSI exists, this would typically be used.)
If neither of these conditions holds, the decapsulated data packet is not transmitted to the other PEs in the MVPN. The decision as to whether and how to transmit the decapsulated data packet does not affect the processing of the SA control message itself.

Suppose that PE1 transmits a multicast data packet on a PMSI, where that data packet is part of an (S,G) flow, and PE2 receives that packet from that PMSI. According to section 9, if PE1 is not the PE that PE2 expects to be transmitting (S,G) packets, then PE2 must discard the packet. If an MSDP-encapsulated data packet is transmitted on a PMSI as specified above, this rule from section 9 would likely result in the packet's getting discarded. Therefore, if MSDP-encapsulated data packets are being decapsulated and transmitted on a PMSI, we need to modify the rules of section 9 as follows:

1. If the receiving PE, PE1, has already joined the (S,G) tree, and has chosen PE2 as the upstream PE for the (S,G) tree, but this packet does not come from PE2, PE1 must discard the packet.

2. If the receiving PE, PE1, has not already joined the (S,G) tree, but has a PIM adjacency to a CE which is downstream on the (*,G) tree, the packet should be forwarded to the CE.

11. Encapsulations

The BGP-based auto-discovery procedures will ensure that the PEs in a single MVPN only use tunnels that they can all support, and, for a given kind of tunnel, that they only use encapsulations that they can all support.

11.1. Encapsulations for Single PMSI per Tunnel

11.1.1. Encapsulation in GRE

GRE encapsulation can be used for any PMSI that is instantiated by a mesh of unicast tunnels, as well as for any PMSI that is instantiated by one or more PIM tunnels of any sort.
    Packets received       Packets in transit       Packets forwarded
     at ingress PE           in the service            by egress PEs
                           provider network

                           +---------------+
                           |  P-IP Header  |
                           +---------------+
                           |      GRE      |
    ++=============++      ++=============++       ++=============++
    || C-IP Header ||      || C-IP Header ||       || C-IP Header ||
    ++=============++ >>>>>> ++=============++ >>>>>> ++=============++
    ||  C-Payload  ||      ||  C-Payload  ||       ||  C-Payload  ||
    ++=============++      ++=============++       ++=============++

The IP Protocol Number field in the P-IP Header must be set to 47.  The Protocol Type field of the GRE Header must be set to 0x800.

When an encapsulated packet is transmitted by a particular PE, the source IP address in the P-IP header must be the same address that the PE uses to identify itself in the VRF Route Import Extended Communities that it attaches to any of the VPN-IP routes eligible for UMH determination that it advertises via BGP (see section 5.1).

If the PMSI is instantiated by a PIM tree, the destination IP address in the P-IP header is the group P-address associated with that tree.  The GRE Key field is omitted.

If the PMSI is instantiated by unicast tunnels, the destination IP address is the address of the destination PE, and the optional GRE Key field is used to identify a particular MVPN.  In this case, each PE would have to advertise a Key field value for each MVPN; each PE would assign the Key field value that it expects to receive.

[RFC2784] specifies an optional GRE checksum field, and [RFC2890] specifies an optional GRE sequence number field.

The GRE sequence number field is not needed, because the transport layer services for the original application will be provided by the C-IP Header.

The use of the GRE checksum field must follow [RFC2784].
To facilitate high-speed implementation, this document recommends that the ingress PE routers encapsulate VPN packets without setting the checksum or sequence fields.

11.1.2. Encapsulation in IP

IP-in-IP [RFC1853] is also a viable option.  When it is used, the IPv4 Protocol Number field is set to 4.  The following diagram shows the progression of the packet as it enters and leaves the service provider network.

    Packets received       Packets in transit       Packets forwarded
     at ingress PE           in the service            by egress PEs
                           provider network

                           +---------------+
                           |  P-IP Header  |
    ++=============++      ++=============++       ++=============++
    || C-IP Header ||      || C-IP Header ||       || C-IP Header ||
    ++=============++ >>>>>> ++=============++ >>>>>> ++=============++
    ||  C-Payload  ||      ||  C-Payload  ||       ||  C-Payload  ||
    ++=============++      ++=============++       ++=============++

When an encapsulated packet is transmitted by a particular PE, the source IP address in the P-IP header must be the same address that the PE uses to identify itself in the VRF Route Import Extended Communities that it attaches to any of the VPN-IP routes eligible for UMH determination that it advertises via BGP (see section 5.1).

11.1.3. Encapsulation in MPLS

If the PMSI is instantiated as a P2MP MPLS LSP, MPLS encapsulation is used.  Penultimate-hop popping must be disabled for the P2MP MPLS LSP.  If the PMSI is instantiated as an RSVP-TE P2MP LSP, additional MPLS encapsulation procedures are used, as specified in [RSVP-P2MP].

If other methods of assigning MPLS labels to multicast distribution trees are in use, these multicast distribution trees may be used as appropriate to instantiate PMSIs, and any additional MPLS encapsulation procedures may be used.
    Packets received       Packets in transit       Packets forwarded
     at ingress PE           in the service            by egress PEs
                           provider network

                           +---------------+
                           | P-MPLS Header |
    ++=============++      ++=============++       ++=============++
    || C-IP Header ||      || C-IP Header ||       || C-IP Header ||
    ++=============++ >>>>>> ++=============++ >>>>>> ++=============++
    ||  C-Payload  ||      ||  C-Payload  ||       ||  C-Payload  ||
    ++=============++      ++=============++       ++=============++

11.2. Encapsulations for Multiple PMSIs per Tunnel

The encapsulations for transmitting multicast data messages when there are multiple PMSIs per tunnel are based on the encapsulations for a single PMSI per tunnel, but with an MPLS label used for demultiplexing.

The label is upstream-assigned and distributed via BGP as specified in section 4.  The label must enable the receiver to select the proper VRF, and may enable the receiver to select a particular multicast routing entry within that VRF.

11.2.1. Encapsulation in GRE

Rather than the IP-in-GRE encapsulation discussed in section 11.1.1, we use the MPLS-in-GRE encapsulation.  This is specified in [MPLS-IP].  The GRE protocol type MUST be set to 0x8847.  (The reason for using the unicast value rather than the multicast value is given in [MPLS-MCAST-ENCAPS].)

11.2.2. Encapsulation in IP

Rather than the IP-in-IP encapsulation discussed in section 11.1.2, we use the MPLS-in-IP encapsulation.  This is specified in [MPLS-IP].  The IP protocol number MUST be set to the value identifying the payload as an MPLS unicast packet.  (There is no "MPLS multicast packet" protocol number.)

11.3. Encapsulations Identifying a Distinguished PE

11.3.1.
For MP2MP LSP P-Tunnels

As discussed in section 9, if a multicast data packet belongs to a Sparse Mode or Single Source Mode multicast group, it is highly desirable for the PE that receives the packet from a PMSI to be able to determine the identity of the PE that transmitted the data packet onto the PMSI.  The encapsulations of the previous sections all provide this information, except in one case: if a PMSI is being instantiated by an MP2MP LSP, the encapsulations discussed so far do not allow one to determine the identity of the PE that transmitted the packet onto the PMSI.

Therefore, when a packet that belongs to a Sparse Mode or Single Source Mode multicast group is traveling on an MP2MP LSP P-tunnel, it MUST carry, as its second label, a label that has been bound to the packet's ingress PE.  This label is an upstream-assigned label that the LSP's root node has bound to the ingress PE and has distributed via an A-D route (see section 4; precise details of this distribution procedure will be included in the next revision of this document).  This label will appear immediately beneath the labels that are discussed in sections 11.1.3 and 11.2.

11.3.2. For Support of PIM-BIDIR C-Groups

As will be discussed in section 12, when a packet belongs to a PIM-BIDIR multicast group, the set of PEs of that packet's VPN can be partitioned into a number of subsets, where exactly one PE in each partition is the upstream PE for that partition.  When such packets are transmitted on a PMSI, then unless the procedures of section 12.2.3 are being used, it is necessary for the packet to carry information identifying a particular partition.  This is done by having the packet carry the PE label corresponding to the upstream PE of one partition.
For a particular P-tunnel, this label will have been advertised by the node that is the root of that P-tunnel.  (Details of the procedure by which the PE labels are advertised will be included in the next revision of this document.)

This label needs to be used whenever a packet belongs to a PIM-BIDIR C-group, no matter what encapsulation is used by the P-tunnel.  Hence the encapsulations of section 11.2 MUST be used.  If the tunnel contains only one PMSI, the PE label replaces the label discussed in section 11.2.  If the tunnel contains multiple PMSIs, the PE label follows the label discussed in section 11.2.

11.4. Encapsulations for Unicasting PIM Control Messages

When PIM control messages are unicast, rather than being sent on an MI-PMSI, the receiving PE needs to determine the particular MVPN whose multicast routing information is being carried in the PIM message.  One method is to use a downstream-assigned MPLS label that the receiving PE has allocated for this specific purpose.  The label would be distributed via BGP.  This can be used with an MPLS, MPLS-in-GRE, or MPLS-in-IP encapsulation.

A possible alternative is to modify the PIM messages themselves so that they carry information that can be used to identify a particular MVPN, such as an RT.

This area is still under consideration.

11.5. General Considerations for IP and GRE Encaps

These considerations apply also to the MPLS-in-IP and MPLS-in-GRE encapsulations.

11.5.1. MTU

It is the responsibility of the originator of a C-packet to ensure that the packet is small enough to reach all of its destinations, even when it is encapsulated within IP or GRE.

When a packet is encapsulated in IP or GRE, the router that does the encapsulation MUST set the DF bit in the outer header.
This ensures that the decapsulating router will not need to reassemble the encapsulating packets before performing decapsulation.

In some cases the encapsulating router may know that a particular C-packet is too large to reach its destinations.  Procedures by which it may know this are outside the scope of the current document.  However, if this is known, then:

- If the DF bit is set in the IP header of a C-packet that is known to be too large, the router will discard the C-packet as being "too large" and follow normal IP procedures (which may require the return of an ICMP message to the source).

- If the DF bit is not set in the IP header of a C-packet that is known to be too large, the router MAY fragment the packet before encapsulating it, and then encapsulate each fragment separately.  Alternatively, the router MAY discard the packet.

If the router discards a packet as too large, it should maintain OAM information related to this behavior, allowing the operator to properly troubleshoot the issue.

Note that if the entire path of the tunnel does not support an MTU that is large enough to carry a particular encapsulated C-packet, and if the encapsulating router does not perform fragmentation, then the customer will not receive the expected connectivity.

11.5.2. TTL

The ingress PE should not copy the TTL field from the payload IP header received from a CE router to the delivery IP or MPLS header.  The setting of the TTL of the delivery header is determined by the local policy of the ingress PE router.

11.5.3. Differentiated Services

The setting of the DS field in the delivery IP header should follow the guidelines outlined in [RFC2983].  Setting the EXP field in the delivery MPLS header should follow the guidelines in [RFC3270].
An SP may also choose to deploy any of the additional mechanisms that the PE routers support.

11.5.4. Avoiding Conflict with Internet Multicast

If the SP is providing Internet multicast, distinct from its VPN multicast services, and using PIM-based P-multicast trees, it must ensure that the group P-addresses that it uses in support of MVPN services are distinct from any of the group addresses of the Internet multicast it supports.  This is best done by using administratively scoped addresses [ADMIN-ADDR].

The group C-addresses need not be distinct from either the group P-addresses or the Internet multicast addresses.

12. Support for PIM-BIDIR C-Groups

In BIDIR-PIM, each multicast group is associated with an RPA (Rendezvous Point Address).  The Rendezvous Point Link (RPL) is the link that attaches to the RPA; usually it is a LAN, where the RPA is in the IP subnet assigned to that LAN.  The root node of a BIDIR-PIM tree is a node that has an interface on the RPL.

On any LAN (other than the RPL) that is a link in a BIDIR-PIM tree, there must be a single node that has been chosen to be the DF (Designated Forwarder).  (More precisely, for each RPA there is a single node that is the DF for that RPA.)  A node that receives traffic from an upstream interface may forward it on a particular downstream interface only if the node is the DF for that downstream interface.  A node that receives traffic from a downstream interface may forward it on an upstream interface only if that node is the DF for the downstream interface.

If, for any period of time, there is a link on which each of two different nodes believes itself to be the DF, data forwarding loops can form.  Loops in a bidirectional multicast tree can be very harmful.  However, any election procedure will have a convergence period.
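The two DF forwarding checks just described can be sketched as follows (a non-normative model with a hypothetical data structure: 'df_for' stands for the set of downstream interfaces on which this node is currently the elected DF for the relevant RPA):

```python
def bidir_may_forward(df_for: set, in_iface: str, out_iface: str,
                      upstream_iface: str) -> bool:
    """Apply the BIDIR-PIM DF forwarding rules described above."""
    if in_iface == upstream_iface:
        # Traffic from the upstream interface may be forwarded on a
        # downstream interface only if this node is the DF there.
        return out_iface in df_for
    if out_iface == upstream_iface:
        # Traffic from a downstream interface may be forwarded upstream
        # only if this node is the DF on that downstream interface.
        return in_iface in df_for
    # Other combinations are outside the two rules stated above.
    return False
```

If, during an election transient, two nodes on the same link each place that link in their 'df_for' sets, each will forward the other's traffic back and forth, which is precisely how the forwarding loops discussed here arise.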
The BIDIR-PIM DF election procedure is very complicated, because it goes to great pains to ensure that if convergence is not extremely fast, then there is no forwarding at all until convergence has taken place.

Other variants of PIM also have a DF election procedure for LANs.  However, as long as the multicast tree is unidirectional, disagreement about who the DF is can result only in duplication of packets, not in loops.  Therefore the time taken to converge on a single DF is of much less concern for unidirectional trees than it is for bidirectional trees.

In the MVPN environment, if PIM signaling is used among the PEs, the standard LAN-based DF election procedure can be used.  However, election procedures that are optimized for a LAN may not work as well in the MVPN environment, so an alternative to DF election would be desirable.

If BGP signaling is used among the PEs, an alternative to DF election is necessary.  One might think that the "single forwarder selection" procedures described in sections 5 and 9 could be used to choose a single PE "DF" for the backbone (for a given RPA in a given MVPN).  However, that is still likely to leave a convergence period of at least several seconds during which loops could form, and there could be a much longer convergence period if there is anything disrupting the smooth flow of BGP updates.  So a simple procedure like that is not sufficient.

The remainder of this section describes two different methods that can be used to support BIDIR-PIM while eliminating the DF election.

12.1. The VPN Backbone Becomes the RPL

On a per-MVPN basis, this method treats the whole service provider infrastructure as a single RPL.  We refer to such an RPL as an "MVPN-RPL".
This eliminates the need for the PEs to engage in any "DF election" procedure, because BIDIR-PIM does not have a DF on the RPL.

However, this method can only be used if the customer is "outsourcing" the RPL/RPA functionality to the SP.

An MVPN-RPL could be realized either via an I-PMSI (this I-PMSI is on a per-MVPN basis and spans all the PEs that have sites of a given MVPN), or via a collection of S-PMSIs, or even via a combination of an I-PMSI and one or more S-PMSIs.

12.1.1. Control Plane

Associated with each MVPN-RPL is an address prefix that is unambiguous within the context of the MVPN associated with the MVPN-RPL.

For a given MVPN, each VRF connected to an MVPN-RPL of that MVPN is configured to advertise to all of its connected CEs the address prefix of the MVPN-RPL.

Since in BIDIR-PIM there is no Designated Forwarder on an RPL, in the context of an MVPN-RPL there is no need to perform the Designated Forwarder election among the PEs.  (Note that it is still necessary to perform the Designated Forwarder election between a PE and its directly attached CEs, but that is done using plain BIDIR-PIM procedures.)

For a given MVPN, a PE connected to an MVPN-RPL of that MVPN should send multicast data (C-S,C-G) on the MVPN-RPL only if at least one other PE connected to the MVPN-RPL has downstream multicast state for C-G.  In the context of MVPN, this is accomplished by requiring a PE that has downstream state for a particular C-G of a particular VRF present on the PE to originate a C-multicast route for (*,C-G).  The RD of this route should be the same as the RD associated with the VRF.  The RT(s) carried by the route should be the same as the one(s) used for VPN-IPv4 routes.  This route will be distributed to all the PEs of the MVPN.

12.1.2.
Data Plane

A PE that receives (C-S,C-G) multicast data from a CE should forward this data on the MVPN-RPL of the MVPN to which the CE belongs only if the PE has received at least one C-multicast route for (*,C-G).  Otherwise, the PE should not forward the data on the RPL/I-PMSI.

When a PE receives a multicast packet with (C-S,C-G) on an MVPN-RPL associated with a given MVPN, the PE forwards this packet to every directly connected CE of that MVPN that has sent a Join (*,C-G) to the PE (i.e., to every CE for which the PE has downstream (*,C-G) state).  The PE does not forward this packet back on the MVPN-RPL.  If a PE has no downstream (*,C-G) state, the PE does not forward the packet.

12.2. Partitioned Sets of PEs

This method does not require the use of the MVPN-RPL, and does not require the customer to outsource the RPA/RPL functionality to the SP.

12.2.1. Partitions

Consider a particular C-RPA, call it C-R, in a particular MVPN.  Consider the set of PEs that attach to sites that have senders or receivers for a BIDIR-PIM group C-G, where C-R is the RPA for C-G.  (As always, we use the "C-" prefix to indicate that we are referring to an address in the VPN's address space rather than in the provider's address space.)

Following the procedures of section 5.1, each PE in the set independently chooses some other PE in the set to be its "upstream PE" for those BIDIR-PIM groups with RPA C-R.  Optionally, they can all choose the "default selection" (described in section 5.1), which ensures that each PE chooses the same upstream PE.  Note that if a PE has a route to C-R via a VRF interface, then the PE may choose itself as the upstream PE.

The set of PEs can now be partitioned into a number of subsets.
We will say that PE1 and PE2 are in the same partition if and only if there is some PE3 such that PE1 and PE2 have each chosen PE3 as the upstream PE for C-R.  Note that each partition has exactly one upstream PE, so it is possible to identify a partition by identifying its upstream PE.

Consider a packet P, and let PE1 be its ingress PE.  PE1 will send the packet on a PMSI so that it reaches the other PEs that need to receive it.  This is done by encapsulating the packet and sending it on a P-tunnel.  If the original packet is part of a PIM-BIDIR group (its ingress PE determines this from the packet's destination address C-G), and if the VPN backbone is not the RPL, then the encapsulation MUST carry information that can be used to identify the partition to which the ingress PE belongs.

When PE2 receives a packet from the PMSI, PE2 must determine, by examining the encapsulation, whether the packet's ingress PE belongs to the same partition (relative to the C-RPA of the packet's C-G) that PE2 itself belongs to.  If not, PE2 discards the packet.  Otherwise, PE2 performs the normal BIDIR-PIM data packet processing.  With this rule in place, harmful loops cannot be introduced by the PEs into the customer's bidirectional tree.

Note that if there is more than one partition, the VPN backbone will not carry a packet from one partition to another.  The only way for a packet to get from one partition to another is for it to go up towards the RPA and then down another path to the backbone.  If this is not considered desirable, then all PEs should choose the same upstream PE for a given C-RPA; then multiple partitions will exist only during routing transients.

12.2.2.
Using PE Labels

If a given P-tunnel is to be used to carry packets belonging to a bidirectional C-group, then, EXCEPT for the case described in section 12.2.3, the packets that travel on that P-tunnel MUST carry a PE label (defined in section 4), using the encapsulation discussed in section 11.3.

When a given PE transmits a given packet of a bidirectional C-group to the P-tunnel, the packet will carry the PE label corresponding to the partition, for the C-group's C-RPA, that contains the transmitting PE.  This is the PE label that has been bound to the upstream PE of that partition; it is not necessarily the label that has been bound to the transmitting PE.

Recall that the PE labels are upstream-assigned labels that are assigned and advertised by the node that is at the root of the P-tunnel.  (Procedures for PE label assignment when the P-tunnel is not a multicast tree will be given in later revisions of this document.)

When a PE receives a packet with a PE label that does not identify the partition of the receiving PE, the receiving PE discards the packet.

Note that this procedure does not require the root of a P-tunnel to assign a PE label for every PE that belongs to the tunnel, but only for those PEs that might become the upstream PEs of some partition.

12.2.3. Mesh of MP2MP P-Tunnels

There is one case in which support for BIDIR-PIM C-groups does not require the use of a PE label.  For a given C-RPA, suppose that a distinct MP2MP LSP is used as the P-tunnel serving each partition.  Then for a given packet, a PE receiving the packet from a P-tunnel can infer the partition from the tunnel, so PE labels are not needed in this case.

13. Security Considerations

To be supplied.

14. IANA Considerations

To be supplied.

15.
Other Authors

Sarveshwar Bandi, Yiqun Cai, Thomas Morin, Yakov Rekhter, IJsbrand Wijnands, Seisho Yasukawa

16. Other Contributors

Significant contributions were made by Arjen Boers, Toerless Eckert, Adrian Farrel, Luyuan Fang, Dino Farinacci, Lenny Guiliano, Shankar Karuna, Anil Lohiya, Tom Pusateri, Ted Qian, Robert Raszuk, Tony Speakman, and Dan Tappan.

17. Authors' Addresses

Rahul Aggarwal (Editor)
Juniper Networks
1194 North Mathilda Ave.
Sunnyvale, CA 94089
Email: rahul@juniper.net

Sarveshwar Bandi
Motorola
Vanenburg IT Park, Madhapur
Hyderabad, India
Email: sarvesh@motorola.com

Yiqun Cai
Cisco Systems, Inc.
170 Tasman Drive
San Jose, CA 95134
E-mail: ycai@cisco.com

Thomas Morin
France Telecom R & D
2, avenue Pierre-Marzin
22307 Lannion Cedex
France
Email: thomas.morin@francetelecom.com

Yakov Rekhter
Juniper Networks
1194 North Mathilda Ave.
Sunnyvale, CA 94089
Email: yakov@juniper.net

Eric C. Rosen (Editor)
Cisco Systems, Inc.
1414 Massachusetts Avenue
Boxborough, MA 01719
E-mail: erosen@cisco.com

IJsbrand Wijnands
Cisco Systems, Inc.
170 Tasman Drive
San Jose, CA 95134
E-mail: ice@cisco.com

Seisho Yasukawa
NTT Corporation
9-11, Midori-Cho 3-Chome
Musashino-Shi, Tokyo 180-8585
Japan
Phone: +81 422 59 4769
Email: yasukawa.seisho@lab.ntt.co.jp

18. Normative References

[MVPN-BGP] R. Aggarwal, E. Rosen, T. Morin, Y. Rekhter, C. Kodeboniya, "BGP Encodings for Multicast in MPLS/BGP IP VPNs", draft-ietf-l3vpn-2547bis-mcast-bgp-03.txt, July 2007

[MPLS-IP] T. Worster, Y. Rekhter, E. Rosen, "Encapsulating MPLS in IP or Generic Routing Encapsulation (GRE)", RFC 4023, March 2005

[MPLS-MCAST-ENCAPS] T. Eckert, E. Rosen, R. Aggarwal, Y.
Rekhter, "MPLS Multicast Encapsulations", draft-ietf-mpls-multicast-encaps-06.txt, July 2007

[MPLS-UPSTREAM-LABEL] R. Aggarwal, Y. Rekhter, E. Rosen, "MPLS Upstream Label Assignment and Context Specific Label Space", draft-ietf-mpls-upstream-label-02.txt, March 2007

[PIM-ATTRIB] A. Boers, IJ. Wijnands, E. Rosen, "Format for Using TLVs in PIM Messages", draft-ietf-pim-join-attributes-03, May 2007

[PIM-SM] B. Fenner, M. Handley, H. Holbrook, I. Kouvelas, "Protocol Independent Multicast - Sparse Mode (PIM-SM)", RFC 4601, August 2006

[RFC2119] S. Bradner, "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, March 1997

[RFC4364] E. Rosen, Y. Rekhter, "BGP/MPLS IP VPNs", RFC 4364, February 2006

[RSVP-P2MP] R. Aggarwal, D. Papadimitriou, S. Yasukawa, et al., "Extensions to RSVP-TE for Point-to-Multipoint TE LSPs", RFC 4875, May 2007

19. Informative References

[ADMIN-ADDR] D. Meyer, "Administratively Scoped IP Multicast", RFC 2365, July 1998

[MVPN-REQ] T. Morin, Ed., "Requirements for Multicast in L3 Provider-Provisioned VPNs", RFC 4834, April 2007

[MVPN-BASE] R. Aggarwal, A. Lohiya, T. Pusateri, Y. Rekhter, "Base Specification for Multicast in MPLS/BGP VPNs", draft-raggarwa-l3vpn-2547-mvpn-00.txt

[RAGGARWA-MCAST] R. Aggarwal, et al., "Multicast in BGP MPLS VPNs and VPLS", draft-raggarwa-l3vpn-mvpn-vpls-mcast-01.txt

[ROSEN-8] E. Rosen, Y. Cai, I. Wijnands, "Multicast in MPLS/BGP IP VPNs", draft-rosen-vpn-mcast-08.txt

[RP-MVPN] S. Yasukawa, et al., "BGP/MPLS IP Multicast VPNs", draft-yasukawa-l3vpn-p2mp-mcast-01.txt

[RFC1853] W. Simpson, "IP in IP Tunneling", RFC 1853, October 1995

[RFC2784] D. Farinacci, et al., "Generic Routing Encapsulation (GRE)", RFC 2784, March 2000

[RFC2890] G. Dommety, "Key and Sequence Number Extensions to GRE", RFC 2890, September 2000

[RFC2983] D.
Black, "Differentiated Services and Tunnels", RFC 2983, October 2000

[RFC3270] F. Le Faucheur, et al., "MPLS Support of Differentiated Services", RFC 3270, May 2002

[RFC4607] H. Holbrook, B. Cain, "Source-Specific Multicast for IP", RFC 4607, August 2006

20. Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

21. Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights.  Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard.  Please address the information to the IETF at ietf-ipr@ietf.org.