2 Network Working Group Eric C. Rosen (Editor) 3 Internet Draft Cisco Systems, Inc. 4 Intended Status: Standards Track 5 Expires: July 14, 2008 Rahul Aggarwal (Editor) 6 Juniper Networks 8 January 14, 2008 10 Multicast in MPLS/BGP IP VPNs 12 draft-ietf-l3vpn-2547bis-mcast-06.txt 14 Status of this Memo 16 By submitting this Internet-Draft, each author represents that any 17 applicable patent or other IPR claims of which he or she is aware 18 have been or will be disclosed, and any of which he or she becomes 19 aware will be disclosed, in accordance with Section 6 of BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF), its areas, and its working groups. Note that 23 other groups may also distribute working documents as Internet- 24 Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. 
It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt. 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 Abstract 39 In order for IP multicast traffic within a BGP/MPLS IP VPN (Virtual 40 Private Network) to travel from one VPN site to another, special 41 protocols and procedures must be implemented by the VPN Service 42 Provider. These protocols and procedures are specified in this 43 document. 45 Table of Contents 47 1 Specification of requirements ......................... 5 48 2 Introduction .......................................... 5 49 2.1 Optimality vs Scalability ............................. 5 50 2.1.1 Multicast Distribution Trees .......................... 7 51 2.1.2 Ingress Replication through Unicast Tunnels ........... 8 52 2.2 Overview .............................................. 8 53 2.2.1 Multicast Routing Adjacencies ......................... 8 54 2.2.2 MVPN Definition ....................................... 9 55 2.2.3 Auto-Discovery ........................................ 10 56 2.2.4 PE-PE Multicast Routing Information ................... 11 57 2.2.5 PE-PE Multicast Data Transmission ..................... 11 58 2.2.6 Inter-AS MVPNs ........................................ 12 59 2.2.7 Optionally Eliminating Shared Tree State .............. 12 60 3 Concepts and Framework ................................ 13 61 3.1 PE-CE Multicast Routing ............................... 13 62 3.2 P-Multicast Service Interfaces (PMSIs) ................ 14 63 3.2.1 Inclusive and Selective PMSIs ......................... 15 64 3.2.2 Tunnels Instantiating PMSIs ........................... 16 65 3.3 Use of PMSIs for Carrying Multicast Data .............. 18 66 3.3.1 MVPNs with MI-PMSIs ................................... 
18 67 3.3.2 When MI-PMSIs are Required ............................ 19 68 3.3.3 MVPNs That Do Not Use MI-PMSIs ........................ 19 69 3.4 PE-PE Transmission of C-Multicast Routing ............. 19 70 3.4.1 PIM Peering ........................................... 20 71 3.4.1.1 Full Per-MVPN PIM Peering Across a MI-PMSI ............ 20 72 3.4.1.2 Lightweight PIM Peering Across a MI-PMSI .............. 20 73 3.4.1.3 Unicasting of PIM C-Join/Prune Messages ............... 21 74 3.4.2 Using BGP to Carry C-Multicast Routing ................ 21 75 4 BGP-Based Autodiscovery of MVPN Membership ............ 22 76 5 PE-PE Transmission of C-Multicast Routing ............. 25 77 5.1 Selecting the Upstream Multicast Hop (UMH) ............ 25 78 5.1.1 Eligible Routes for UMH Selection ..................... 26 79 5.1.2 Information Carried by Eligible UMH Routes ............ 26 80 5.1.3 Selecting the Upstream PE ............................. 27 81 5.1.4 Selecting the Upstream Multicast Hop .................. 29 82 5.2 Details of Per-MVPN Full PIM Peering over MI-PMSI ..... 29 83 5.2.1 PIM C-Instance Control Packets ........................ 30 84 5.2.2 PIM C-instance RPF Determination ...................... 30 85 5.2.3 Backwards Compatibility ............................... 31 86 5.3 Use of BGP for Carrying C-Multicast Routing ........... 31 87 5.3.1 Sending BGP Updates ................................... 31 88 5.3.2 Explicit Tracking ..................................... 33 89 5.3.3 Withdrawing BGP Updates ............................... 33 90 6 I-PMSI Instantiation .................................. 33 91 6.1 MVPN Membership and Egress PE Auto-Discovery .......... 34 92 6.1.1 Auto-Discovery for Ingress Replication ................ 34 93 6.1.2 Auto-Discovery for P-Multicast Trees .................. 34 94 6.2 C-Multicast Routing Information Exchange .............. 35 95 6.3 Aggregation ........................................... 
35 96 6.3.1 Aggregate Tree Leaf Discovery ......................... 35 97 6.3.2 Aggregation Methodology ............................... 36 98 6.3.3 Encapsulation of the Aggregate Tree ................... 37 99 6.3.4 Demultiplexing C-multicast traffic .................... 37 100 6.4 Mapping Received Packets to MVPNs ..................... 38 101 6.4.1 Unicast Tunnels ....................................... 38 102 6.4.2 Non-Aggregated P-Multicast Trees ...................... 39 103 6.4.3 Aggregate P-Multicast Trees ........................... 39 104 6.5 I-PMSI Instantiation Using Ingress Replication ........ 40 105 6.6 Establishing P-Multicast Trees ........................ 41 106 6.7 RSVP-TE P2MP LSPs ..................................... 42 107 6.7.1 P2MP TE LSP Tunnel - MVPN Mapping ..................... 42 108 6.7.2 Demultiplexing C-Multicast Data Packets ............... 42 109 7 Optimizing Multicast Distribution via S-PMSIs ......... 43 110 7.1 S-PMSI Instantiation Using Ingress Replication ........ 44 111 7.2 Protocol for Switching to S-PMSIs ..................... 44 112 7.2.1 A UDP-based Protocol for Switching to S-PMSIs ......... 44 113 7.2.1.1 Binding a Stream to an S-PMSI ......................... 45 114 7.2.1.2 Packet Formats and Constants .......................... 46 115 7.2.2 A BGP-based Protocol for Switching to S-PMSIs ......... 48 116 7.2.2.1 Advertising C-(S, G) Binding to a S-PMSI using BGP .... 48 117 7.2.2.2 Explicit Tracking ..................................... 49 118 7.2.2.3 Switching to S-PMSI ................................... 50 119 7.3 Aggregation ........................................... 50 120 7.4 Instantiating the S-PMSI with a PIM Tree .............. 51 121 7.5 Instantiating S-PMSIs using RSVP-TE P2MP Tunnels ...... 52 122 8 Inter-AS Procedures ................................... 52 123 8.1 Non-Segmented Inter-AS Tunnels ........................ 53 124 8.1.1 Inter-AS MVPN Auto-Discovery .......................... 
53 125 8.1.2 Inter-AS MVPN Routing Information Exchange ............ 53 126 8.1.3 Inter-AS P-Tunnels .................................... 54 127 8.1.3.1 PIM-Based Inter-AS P-Multicast Trees .................. 54 128 8.2 Segmented Inter-AS Tunnels ............................ 55 129 8.2.1 Inter-AS MVPN Auto-Discovery Routes ................... 55 130 8.2.1.1 Originating Inter-AS MVPN A-D Information ............. 56 131 8.2.1.2 Propagating Inter-AS MVPN A-D Information ............. 57 132 8.2.1.2.1 Inter-AS Auto-Discovery Route received via EBGP ....... 57 133 8.2.1.2.2 Leaf Auto-Discovery Route received via EBGP ........... 58 134 8.2.1.2.3 Inter-AS Auto-Discovery Route received via IBGP ....... 58 135 8.2.2 Inter-AS MVPN Routing Information Exchange ............ 60 136 8.2.3 Inter-AS I-PMSI ....................................... 60 137 8.2.3.1 Support for Unicast VPN Inter-AS Methods .............. 61 138 8.2.4 Inter-AS S-PMSI ....................................... 61 139 9 Duplicate Packet Detection and Single Forwarder PE .... 62 140 9.1 Multihomed C-S or C-RP ................................ 63 141 9.1.1 Single forwarder PE selection ......................... 64 142 9.2 Switching from the C-RP tree to C-S tree .............. 65 143 10 Eliminating PE-PE Distribution of (C-*,C-G) State ..... 66 144 10.1 Co-locating C-RPs on a PE ............................. 67 145 10.1.1 Initial Configuration ................................. 68 146 10.1.2 Anycast RP Based on Propagating Active Sources ........ 68 147 10.1.2.1 Receiver(s) Within a Site ............................. 68 148 10.1.2.2 Source Within a Site .................................. 68 149 10.1.2.3 Receiver Switching from Shared to Source Tree ......... 69 150 10.2 Using MSDP between a PE and a Local C-RP .............. 69 151 11 Encapsulations ........................................ 70 152 11.1 Encapsulations for Single PMSI per Tunnel ............. 
70 153 11.1.1 Encapsulation in GRE .................................. 70 154 11.1.2 Encapsulation in IP ................................... 72 155 11.1.3 Encapsulation in MPLS ................................. 72 156 11.2 Encapsulations for Multiple PMSIs per Tunnel .......... 73 157 11.2.1 Encapsulation in GRE .................................. 73 158 11.2.2 Encapsulation in IP ................................... 73 159 11.3 Encapsulations Identifying a Distinguished PE ......... 74 160 11.3.1 For MP2MP LSP P-tunnels ............................... 74 161 11.3.2 For Support of PIM-BIDIR C-Groups ..................... 74 162 11.4 Encapsulations for Unicasting PIM Control Messages .... 75 163 11.5 General Considerations for IP and GRE Encaps .......... 75 164 11.5.1 MTU ................................................... 75 165 11.5.2 TTL ................................................... 76 166 11.5.3 Avoiding Conflict with Internet Multicast ............. 76 167 11.6 Differentiated Services ............................... 76 168 12 Support for PIM-BIDIR C-Groups ........................ 77 169 12.1 The VPN Backbone Becomes the RPL ...................... 78 170 12.1.1 Control Plane ......................................... 78 171 12.1.2 Data Plane ............................................ 79 172 12.2 Partitioned Sets of PEs ............................... 79 173 12.2.1 Partitions ............................................ 79 174 12.2.2 Using PE Labels ....................................... 80 175 12.2.3 Mesh of MP2MP P-Tunnels ............................... 81 176 13 Security Considerations ............................... 81 177 14 IANA Considerations ................................... 82 178 15 Other Authors ......................................... 82 179 16 Other Contributors .................................... 82 180 17 Authors' Addresses .................................... 82 181 18 Normative References .................................. 
84 182 19 Informative References ................................ 85 183 20 Full Copyright Statement .............................. 85 184 21 Intellectual Property ................................. 86 186 1. Specification of requirements 188 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 189 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 190 document are to be interpreted as described in [RFC2119]. 192 2. Introduction 194 [RFC4364] specifies the set of procedures which a Service Provider 195 (SP) must implement in order to provide a particular kind of VPN 196 service ("BGP/MPLS IP VPN") for its customers. The service described 197 therein allows IP unicast packets to travel from one customer site to 198 another, but it does not provide a way for IP multicast traffic to 199 travel from one customer site to another. 201 This document extends the service defined in [RFC4364] so that it 202 also includes the capability of handling IP multicast traffic. This 203 requires a number of different protocols to work together. The 204 document provides a framework describing how the various protocols 205 fit together, and also provides detailed specification of some of the 206 protocols. The detailed specification of some of the other protocols 207 is found in pre-existing documents or in companion documents. 209 2.1. Optimality vs Scalability 211 In a "BGP/MPLS IP VPN" [RFC4364], unicast routing of VPN packets is 212 achieved without the need to keep any per-VPN state in the core of 213 the SP's network (the "P routers"). Routing information from a 214 particular VPN is maintained only by the Provider Edge routers (the 215 "PE routers", or "PEs") that attach directly to sites of that VPN. 216 Customer data travels through the P routers in tunnels from one PE to 217 another (usually MPLS Label Switched Paths, LSPs), so to support the 218 VPN service the P routers only need to have routes to the PE routers. 
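The state-scaling property just described can be illustrated with a toy model. This is purely illustrative and not part of the specification; the function and its parameters are invented here:

```python
# Toy model (invented for illustration, not normative): forwarding state
# held by a single P router under unicast BGP/MPLS IP VPN [RFC4364].
def p_router_state(num_pes: int, num_vpns: int) -> int:
    pe_routes = num_pes   # routes to the PE loopbacks (the tunnel endpoints)
    per_vpn_routes = 0    # per-VPN routes live only in PE VRFs, never in P routers
    return pe_routes + per_vpn_routes

# Growing the number of VPNs leaves P-router state unchanged:
assert p_router_state(num_pes=100, num_vpns=1) == \
       p_router_state(num_pes=100, num_vpns=10_000) == 100
```

The contrast drawn in the following paragraphs is that per-flow multicast state would add a term proportional to the number of customer flows, which the SP cannot bound.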
220 The PE-to-PE routing is optimal, but the amount of associated state 221 in the P routers depends only on the number of PEs, not on the number 222 of VPNs. 224 However, in order to provide optimal multicast routing for a 225 particular multicast flow, the P routers through which that flow 226 travels have to hold state which is specific to that flow. A 227 multicast flow is identified by the (source, group) tuple where the 228 source is the IP address of the sender and the group is the IP 229 multicast group address of the destination. Scalability would be 230 poor if the amount of state in the P routers were proportional to the 231 number of multicast flows in the VPNs. Therefore, when supporting 232 multicast service for a BGP/MPLS IP VPN, the optimality of the 233 multicast routing must be traded off against the scalability of the P 234 routers. We explain this below in more detail. 236 If a particular VPN is transmitting "native" multicast traffic over 237 the backbone, we refer to it as an "MVPN". By "native" multicast 238 traffic, we mean packets that a CE sends to a PE, such that the IP 239 destination address of the packets is a multicast group address, or 240 the packets are multicast control packets addressed to the PE router 241 itself, or the packets are IP multicast data packets encapsulated in 242 MPLS. 244 We say that the backbone multicast routing for a particular multicast 245 group in a particular VPN is "optimal" if and only if all of the 246 following conditions hold: 248 - When a PE router receives a multicast data packet of that group 249 from a CE router, it transmits the packet in such a way that the 250 packet is received by every other PE router which is on the path 251 to a receiver of that group; 253 - The packet is not received by any other PEs; 255 - While in the backbone, no more than one copy of the packet ever 256 traverses any link. 
258 - While in the backbone, if bandwidth usage is to be optimized, the 259 packet traverses minimum cost trees rather than shortest path 260 trees. 262 Optimal routing for a particular multicast group requires that the 263 backbone maintain one or more source-trees which are specific to that 264 flow. Each such tree requires that state be maintained in all the P 265 routers that are in the tree. 267 This would potentially require an unbounded amount of state in the P 268 routers, since the SP has no control of the number of multicast 269 groups in the VPNs that it supports. Nor does the SP have any control 270 over the number of transmitters in each group, nor of the 271 distribution of the receivers. 273 The procedures defined in this document allow an SP to provide 274 multicast VPN service without requiring the amount of state 275 maintained by the P routers to be proportional to the number of 276 multicast data flows in the VPNs. The amount of state is traded off 277 against the optimality of the multicast routing. Enough flexibility 278 is provided so that a given SP can make his own tradeoffs between 279 scalability and optimality. An SP can even allow some multicast 280 groups in some VPNs to receive optimal routing, while others do not. 281 Of course, the cost of this flexibility is an increase in the number 282 of options provided by the protocols. 284 The basic technique for providing scalability is to aggregate a 285 number of customer multicast flows onto a single multicast 286 distribution tree through the P routers. A number of aggregation 287 methods are supported. 289 The procedures defined in this document also accommodate the SP that 290 does not want to build multicast distribution trees in his backbone 291 at all; the ingress PE can replicate each multicast data packet and 292 then unicast each replica through a tunnel to each egress PE that 293 needs to receive the data. 295 2.1.1. 
Multicast Distribution Trees 297 This document supports the use of a single multicast distribution 298 tree in the backbone to carry all the multicast traffic from a 299 specified set of one or more MVPNs. Such a tree is referred to as an 300 "Inclusive Tree". An Inclusive Tree which carries the traffic of more 301 than one MVPN is an "Aggregate Inclusive Tree". An Inclusive Tree 302 contains, as its members, all the PEs that attach to any of the MVPNs 303 using the tree. 305 With this option, even if each tree supports only one MVPN, the upper 306 bound on the amount of state maintained by the P routers is 307 proportional to the number of VPNs supported, rather than to the 308 number of multicast flows in those VPNs. If the trees are 309 unidirectional, it would be more accurate to say that the state is 310 proportional to the product of the number of VPNs and the average 311 number of PEs per VPN. The amount of state maintained by the P 312 routers can be further reduced by aggregating more MVPNs onto a 313 single tree. If each such tree supports a set of MVPNs, (call it an 314 "MVPN aggregation set"), the state maintained by the P routers is 315 proportional to the product of the number of MVPN aggregation sets 316 and the average number of PEs per MVPN. Thus the state does not grow 317 linearly with the number of MVPNs. 319 However, as data from many multicast groups is aggregated together 320 onto a single "Inclusive Tree", it is likely that some PEs will 321 receive multicast data for which they have no need, i.e., some degree 322 of optimality has been sacrificed. 324 This document also provides procedures which enable a single 325 multicast distribution tree in the backbone to be used to carry 326 traffic belonging only to a specified set of one or more multicast 327 groups, from one or more MVPNs. 
Such a tree is referred to as a 328 "Selective Tree" and more specifically as an "Aggregate Selective 329 Tree" when the multicast groups belong to different MVPNs. By 330 default, traffic from most multicast groups could be carried by an 331 Inclusive Tree, while traffic from, e.g., high bandwidth groups could 332 be carried in one of the "Selective Trees". When setting up the 333 Selective Trees, one should include only those PEs which need to 334 receive multicast data from one or more of the groups assigned to the 335 tree. This provides more optimal routing than can be obtained by 336 using only Inclusive Trees, though it requires additional state in 337 the P routers. 339 2.1.2. Ingress Replication through Unicast Tunnels 341 This document also provides procedures for carrying MVPN data traffic 342 through unicast tunnels from the ingress PE to each of the egress 343 PEs. The ingress PE replicates the multicast data packet received 344 from a CE and sends it to each of the egress PEs using the unicast 345 tunnels. This requires no multicast routing state in the P routers 346 at all, but it puts the entire replication load on the ingress PE 347 router, and makes no attempt to optimize the multicast routing. 349 2.2. Overview 351 2.2.1. Multicast Routing Adjacencies 353 In BGP/MPLS IP VPNs [RFC4364], each CE ("Customer Edge") router is a 354 unicast routing adjacency of a PE router, but CE routers at different 355 sites do not become unicast routing adjacencies of each other. This 356 important characteristic is retained for multicast routing -- a CE 357 router becomes a multicast routing adjacency of a PE router, but CE 358 routers at different sites do not become multicast routing 359 adjacencies of each other. 361 The multicast routing protocol on the PE-CE link is presumed to be 362 PIM ("Protocol Independent Multicast") [PIM-SM]. The Sparse Mode, 363 Dense Mode, Single Source Mode, and Bidirectional Modes are 364 supported. 
A CE router exchanges "ordinary" PIM control messages with 365 the PE router to which it is attached. 367 The PEs attaching to a particular MVPN then have to exchange the 368 multicast routing information with each other. Two basic methods for 369 doing this are defined: (1) PE-PE PIM, and (2) BGP. In the former 370 case, the PEs need to be multicast routing adjacencies of each other. 371 In the latter case, they do not. For example, each PE may be a BGP 372 adjacency of a Route Reflector (RR), and not of any other PEs. 374 To support the "Carrier's Carrier" model of [RFC4364], mLDP or BGP 375 can be used on the PE-CE interface. This will be described in 376 subsequent versions of this document. 378 2.2.2. MVPN Definition 380 An MVPN is defined by two sets of sites, Sender Sites set and 381 Receiver Sites set, with the following properties: 383 - Hosts within the Sender Sites set could originate multicast 384 traffic for receivers in the Receiver Sites set. 386 - Receivers not in the Receiver Sites set should not be able to 387 receive this traffic. 389 - Hosts within the Receiver Sites set could receive multicast 390 traffic originated by any host in the Sender Sites set. 392 - Hosts within the Receiver Sites set should not be able to receive 393 multicast traffic originated by any host that is not in the 394 Sender Sites set. 396 A site could be both in the Sender Sites set and Receiver Sites set, 397 which implies that hosts within such a site could both originate and 398 receive multicast traffic. An extreme case is when the Sender Sites 399 set is the same as the Receiver Sites set, in which case all sites 400 could originate and receive multicast traffic from each other. 402 Sites within a given MVPN may be either within the same, or in 403 different organizations, which implies that an MVPN can be either an 404 Intranet or an Extranet. 406 A given site may be in more than one MVPN, which implies that MVPNs 407 may overlap. 
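The two-set definition above can be sketched as follows. This is a hypothetical model; the class and the site names are invented, and nothing here is normative:

```python
# Illustrative sketch (not normative): an MVPN as a pair of site sets
# with the delivery policy stated in the definition above.
from dataclasses import dataclass

@dataclass
class MVPN:
    sender_sites: set
    receiver_sites: set

    def may_deliver(self, src_site: str, dst_site: str) -> bool:
        # Multicast traffic may flow only from a Sender site to a Receiver site.
        return src_site in self.sender_sites and dst_site in self.receiver_sites

# Extreme case from the text: Sender Sites set == Receiver Sites set.
intranet = MVPN(sender_sites={"A", "B"}, receiver_sites={"A", "B"})
assert intranet.may_deliver("A", "B") and intranet.may_deliver("B", "A")

# Asymmetric case: only "HQ" may send (site names invented).
extranet = MVPN(sender_sites={"HQ"}, receiver_sites={"HQ", "Partner"})
assert extranet.may_deliver("HQ", "Partner")
assert not extranet.may_deliver("Partner", "HQ")
```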
409 Not all sites of a given MVPN have to be connected to the same 410 service provider, which implies that an MVPN can span multiple 411 service providers. 413 Another way to look at an MVPN is to say that an MVPN is defined by a 414 set of administrative policies. Such policies determine both the 415 Sender Sites set and the Receiver Sites set. Such policies are established by 416 MVPN customers, but implemented/realized by MVPN Service Providers 417 using the existing BGP/MPLS VPN mechanisms, such as Route Targets, 418 with extensions, as necessary. 420 2.2.3. Auto-Discovery 422 In order for the PE routers attaching to a given MVPN to exchange 423 MVPN control information with each other, each one needs to discover 424 all the other PEs that attach to the same MVPN. (Strictly speaking, 425 a PE in the receiver sites set need only discover the other PEs in 426 the sender sites set and a PE in the sender sites set need only 427 discover the other PEs in the receiver sites set.) This is referred 428 to as "MVPN Auto-Discovery". 430 This document discusses two ways of providing MVPN autodiscovery: 432 - BGP can be used for discovering and maintaining MVPN membership. 433 The PE routers advertise their MVPN membership to other PE 434 routers using BGP. A PE is considered to be a "member" of a 435 particular MVPN if it contains a VRF (Virtual Routing and 436 Forwarding table, see [RFC4364]) which is configured to contain 437 the multicast routing information of that MVPN. This auto- 438 discovery option does not make any assumptions about the methods 439 used for transmitting MVPN multicast data packets through the 440 backbone. 
442 - If it is known that the multicast data packets of a particular 443 MVPN are to be transmitted (at least, by default) through a non- 444 aggregated Inclusive Tree which is to be set up by PIM-SM or 445 BIDIR-PIM, and if the PEs attaching to that MVPN are configured 446 with the group address corresponding to that tree, then the PEs 447 can auto-discover each other simply by joining the tree and then 448 multicasting PIM Hellos over the tree. 450 2.2.4. PE-PE Multicast Routing Information 452 The BGP/MPLS IP VPN [RFC4364] specification requires a PE to maintain 453 at most one BGP peering with every other PE in the network. This 454 peering is used to exchange VPN routing information. The use of Route 455 Reflectors further reduces the number of BGP adjacencies maintained 456 by a PE to exchange VPN routing information with other PEs. This 457 document describes various options for exchanging MVPN control 458 information between PE routers based on the use of PIM or BGP. These 459 options have different overheads with respect to the number of 460 routing adjacencies that a PE router needs to maintain to exchange 461 MVPN control information with other PE routers. Some of these options 462 allow the retention of the unicast BGP/MPLS VPN model letting a PE 463 maintain at most one BGP routing adjacency with other PE routers to 464 exchange MVPN control information. BGP also provides reliable 465 transport and uses incremental updates. Another option is the use of 466 the currently existing, "soft state" PIM standard [PIM-SM] that uses 467 periodic complete updates. 469 2.2.5. PE-PE Multicast Data Transmission 471 Like [RFC4364], this document decouples the procedures for exchanging 472 routing information from the procedures for transmitting data 473 traffic. Hence a variety of transport technologies may be used in the 474 backbone. 
For inclusive trees, these transport technologies include 475 unicast PE-PE tunnels (using MPLS or IP/GRE encapsulation), multicast 476 distribution trees created by PIM-SSM, PIM-SM, or BIDIR-PIM (using 477 IP/GRE encapsulation), point-to-multipoint LSPs created by RSVP-TE or 478 mLDP, and multipoint-to-multipoint LSPs created by mLDP. (However, 479 techniques for aggregating the traffic of multiple MVPNs onto a 480 single multipoint-to-multipoint LSP or onto a single bidirectional 481 multicast distribution tree are for further study.) For selective 482 trees, only unicast PE-PE tunnels (using MPLS or IP/GRE 483 encapsulation) and unidirectional single-source trees are supported, 484 and the supported tree creation protocols are PIM-SSM (using IP/GRE 485 encapsulation), RSVP-TE, and mLDP. 487 In order to aggregate traffic from multiple MVPNs onto a single 488 multicast distribution tree, it is necessary to have a mechanism to 489 enable the egresses of the tree to demultiplex the multicast traffic 490 received over the tree and to associate each received packet with a 491 particular MVPN. This document specifies a mechanism whereby 492 upstream label assignment [MPLS-UPSTREAM-LABEL] is used by the root 493 of the tree to assign a label to each flow. This label is used by 494 the receivers to perform the demultiplexing. This document also 495 describes procedures based on BGP that are used by the root of an 496 Aggregate Tree to advertise the Inclusive and/or Selective binding 497 and the demultiplexing information to the leaves of the tree. 499 This document also describes the data plane encapsulations for 500 supporting the various SP multicast transport options. 502 This document assumes that when SP multicast trees are used, traffic 503 for a particular multicast group is transmitted by a particular PE on 504 only one SP multicast tree. 
The use of multiple SP multicast trees 505 for transmitting traffic belonging to a particular multicast group is 506 for further study. 508 2.2.6. Inter-AS MVPNs 510 [RFC4364] describes different options for supporting BGP/MPLS IP 511 unicast VPNs whose provider backbones contain more than one 512 Autonomous System (AS). These are known as Inter-AS VPNs. In an 513 Inter-AS VPN, the ASes may belong to the same provider or to 514 different providers. This document describes how Inter-AS MVPNs can 515 be supported for each of the unicast BGP/MPLS VPN Inter-AS options. 516 This document also specifies a model where Inter-AS MVPN service can 517 be offered without requiring a single SP multicast tree to span 518 multiple ASes. In this model, an inter-AS multicast tree consists of 519 a number of "segments", one per AS, which are stitched together at AS 520 boundary points. These are known as "segmented inter-AS trees". Each 521 segment of a segmented inter-AS tree may use a different multicast 522 transport technology. 524 It is also possible to support Inter-AS MVPNs with non-segmented 525 source trees that extend across AS boundaries. 527 2.2.7. Optionally Eliminating Shared Tree State 529 The document also discusses some options and protocol extensions 530 which can be used to eliminate the need for the PE routers to 531 distribute to each other the (*, G) and (*, G, RPT-bit) states when 532 there are PIM Sparse Mode multicast groups in the VPNs. 534 3. Concepts and Framework 536 3.1. PE-CE Multicast Routing 538 Support of multicast in BGP/MPLS IP VPNs is modeled closely after 539 support of unicast in BGP/MPLS IP VPNs. That is, a multicast routing 540 protocol will be run on the PE-CE interfaces, such that PE and CE are 541 multicast routing adjacencies on that interface. CEs at different 542 sites do not become multicast routing adjacencies of each other. 
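The adjacency model just described can be sketched as a toy example. The router names and the helper function are invented; this merely restates the rule that each CE peers with its PE and never with a remote CE:

```python
# Illustrative sketch (invented names, not normative): multicast routing
# adjacencies mirror the unicast model -- they exist only on PE-CE links.
def multicast_adjacencies(attachments):
    """attachments: iterable of (pe, ce) PE-CE links; returns the adjacency set."""
    return {frozenset((pe, ce)) for pe, ce in attachments}

adj = multicast_adjacencies([("PE1", "CE-a"), ("PE1", "CE-b"), ("PE2", "CE-c")])
assert frozenset(("PE1", "CE-a")) in adj
# CEs at different sites never become adjacencies of each other:
assert frozenset(("CE-a", "CE-c")) not in adj
```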
544 If a PE attaches to n VPNs for which multicast support is provided 545 (i.e., to n "MVPNs"), the PE will run n independent instances of a 546 multicast routing protocol. We will refer to these multicast routing 547 instances as "VPN-specific multicast routing instances", or more 548 briefly as "multicast C-instances". The notion of a "VRF" ("Virtual 549 Routing and Forwarding Table"), defined in [RFC4364], is extended to 550 include multicast routing entries as well as unicast routing entries. 551 Each multicast routing entry is thus associated with a particular 552 VRF. 554 Whether a particular VRF belongs to an MVPN or not is determined by 555 configuration. 557 In this document, we will not attempt to provide support for every 558 possible multicast routing protocol that could possibly run on the 559 PE-CE link. Rather, we consider multicast C-instances only for the 560 following multicast routing protocols: 562 - PIM Sparse Mode (PIM-SM) 564 - PIM Single Source Mode (PIM-SSM) 566 - PIM Bidirectional Mode (BIDIR-PIM) 568 - PIM Dense Mode (PIM-DM) 570 In order to support the "Carrier's Carrier" model of [RFC4364], mLDP 571 or BGP will also be supported on the PE-CE interface. The use of mLDP 572 on the PE-CE interface is described in [MVPN-BGP]. The use of BGP on 573 the PE-CE interface is not described in this revision. 575 As the document only supports PIM-based C-instances, we will 576 generally use the term "PIM C-instances" to refer to the multicast C- 577 instances. 579 A PE router may also be running a "provider-wide" instance of PIM (a 580 "PIM P-instance"), in which it has a PIM adjacency with, e.g., each 581 of its IGP neighbors (i.e., with P routers), but NOT with any CE 582 routers, and not with other PE routers (unless another PE router 583 happens to be an IGP neighbor). In this case, P routers would also 584 run the P-instance of PIM, but NOT a C-instance.
If there is a PIM 585 P-instance, it may or may not have a role to play in support of VPN 586 multicast; this is discussed in later sections. However, in no case 587 will the PIM P-instance contain VPN-specific multicast routing 588 information. 590 In order to help clarify when we are speaking of the PIM P-instance 591 and when we are speaking of a PIM C-instance, we will also apply the 592 prefixes "P-" and "C-" respectively to control messages, addresses, 593 etc. Thus a P-Join would be a PIM Join which is processed by the PIM 594 P-instance, and a C-Join would be a PIM Join which is processed by a 595 C-instance. A P-group address would be a group address in the SP's 596 address space, and a C-group address would be a group address in a 597 VPN's address space. 599 3.2. P-Multicast Service Interfaces (PMSIs) 601 Multicast data packets received by a PE over a PE-CE interface must 602 be forwarded to one or more of the other PEs in the same MVPN for 603 delivery to one or more other CEs. 605 We define the notion of a "P-Multicast Service Interface" (PMSI). If 606 a particular MVPN is supported by a particular set of PE routers, 607 then there will be a PMSI connecting those PE routers. A PMSI is a 608 conceptual "overlay" on the P network with the following property: a 609 PE in a given MVPN can give a packet to the PMSI, and the packet will 610 be delivered to some or all of the other PEs in the MVPN, such that 611 any PE receiving such a packet will be able to tell which MVPN the 612 packet belongs to. 614 As we discuss below, a PMSI may be instantiated by a number of 615 different transport mechanisms, depending on the particular 616 requirements of the MVPN and of the SP. We will refer to these 617 transport mechanisms as "tunnels". 619 For each MVPN, there are one or more PMSIs that are used for 620 transmitting the MVPN's multicast data from one PE to others. We 621 will use the term "PMSI" such that a single PMSI belongs to a single 622 MVPN. 
However, the transport mechanism which is used to instantiate 623 a PMSI may allow a single "tunnel" to carry the data of multiple 624 PMSIs. 626 In this document we make a clear distinction between the multicast 627 service (the PMSI) and its instantiation. This allows us to separate 628 the discussion of different services from the discussion of different 629 instantiations of each service. The term "tunnel" is used to refer 630 only to the transport mechanism that instantiates a service. 632 3.2.1. Inclusive and Selective PMSIs 634 We will distinguish between three different kinds of PMSI: 636 - "Multidirectional Inclusive" PMSI (MI-PMSI) 638 A Multidirectional Inclusive PMSI is one which enables ANY PE 639 attaching to a particular MVPN to transmit a message such that it 640 will be received by EVERY other PE attaching to that MVPN. 642 There is at most one MI-PMSI per MVPN. (Though the tunnel or 643 tunnels that instantiate an MI-PMSI may actually carry the data 644 of more than one PMSI.) 646 An MI-PMSI can be thought of as an overlay broadcast network 647 connecting the set of PEs supporting a particular MVPN. 649 - "Unidirectional Inclusive" PMSI (UI-PMSI) 651 A Unidirectional Inclusive PMSI is one which enables a particular 652 PE, attached to a particular MVPN, to transmit a message such 653 that it will be received by all the other PEs attaching to that 654 MVPN. There is at most one UI-PMSI per PE per MVPN, though the 655 tunnel which instantiates a UI-PMSI may in fact carry the data of 656 more than one PMSI. 658 - "Selective" PMSI (S-PMSI). 660 A Selective PMSI is one which provides a mechanism wherein a 661 particular PE in an MVPN can multicast messages so that they will 662 be received by a subset of the other PEs of that MVPN. There may 663 be an arbitrary number of S-PMSIs per PE per MVPN. Again, the 664 tunnel which instantiates a given S-PMSI may carry data from 665 multiple S-PMSIs. 
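Since a single tunnel may carry the data of several PMSIs, the egress PE needs a demultiplexing step to associate each received packet with the right MVPN. A minimal sketch of such a table, assuming an upstream-assigned MPLS label is used as the demultiplexor (all identifiers below are hypothetical):

```python
# Hypothetical demultiplexing table at an egress PE. A single P-tunnel
# may carry several PMSIs, so the (tunnel, demultiplexor) pair -- here,
# an upstream-assigned MPLS label -- identifies the MVPN.
demux_table: dict[tuple[str, int], str] = {}

def bind_pmsi(tunnel_id: str, label: int, mvpn: str) -> None:
    """Record the binding advertised by the tunnel's root."""
    demux_table[(tunnel_id, label)] = mvpn

def classify(tunnel_id: str, label: int) -> str:
    """Return the MVPN that a packet received on this tunnel belongs to."""
    return demux_table[(tunnel_id, label)]

bind_pmsi("p2mp-lsp-7", 1001, "mvpn-red")
bind_pmsi("p2mp-lsp-7", 1002, "mvpn-blue")  # same tunnel, different PMSI
```

The bindings themselves would be learned through the signaling procedures discussed in later sections, not configured statically as in this sketch.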
667 We will see in later sections the role played by these different 668 kinds of PMSI. We will use the term "I-PMSI" when we are not 669 distinguishing between "MI-PMSIs" and "UI-PMSIs". 671 3.2.2. Tunnels Instantiating PMSIs 673 The tunnels which are used to instantiate PMSIs will be referred to 674 as "P-tunnels". A number of different tunnel setup techniques can be 675 used to create the P-tunnels that instantiate the PMSIs. Among these 676 are: 678 - PIM 680 A PMSI can be instantiated as (a set of) Multicast Distribution 681 Trees created by the PIM P-instance ("P-trees"). 683 PIM-SSM, BIDIR-PIM, or PIM-SM can be used to create P-trees. 684 (PIM-DM is not supported for this purpose.) 686 A single MI-PMSI can be instantiated by a single shared P-tree, 687 or by a number of source P-trees (one for each PE of the MI- 688 PMSI). P-trees may be shared by multiple MVPNs (i.e., a given P- 689 tree may be the instantiation of multiple PMSIs), as long as the 690 encapsulation provides some means of demultiplexing the data 691 traffic by MVPN. 693 Selective PMSIs are instantiated by source P-trees, and are most 694 naturally created by PIM-SSM, since by definition only one PE is 695 the source of the multicast data on a Selective PMSI. 697 - MLDP 699 A PMSI may be instantiated as one or more mLDP Point-to- 700 Multipoint (P2MP) LSPs, or as an mLDP Multipoint-to- 701 Multipoint (MP2MP) LSP. A Selective PMSI or a Unidirectional 702 Inclusive PMSI would be instantiated as a single mLDP P2MP LSP, 703 whereas a Multidirectional Inclusive PMSI could be instantiated 704 either as a set of such LSPs (one for each PE in the MVPN) or as 705 a single MP2MP LSP. 707 MLDP P2MP LSPs can be shared across multiple MVPNs. 709 - RSVP-TE 711 A PMSI may be instantiated as one or more RSVP-TE Point-to- 712 Multipoint (P2MP) LSPs.
A Selective PMSI or a Unidirectional 713 Inclusive PMSI would be instantiated as a single RSVP-TE P2MP 714 LSP, whereas a Multidirectional Inclusive PMSI would be 715 instantiated as a set of such LSPs, one for each PE in the MVPN. 716 RSVP-TE P2MP LSPs can be shared across multiple MVPNs. 718 - A Mesh of Unicast Tunnels. 720 If a PMSI is implemented as a mesh of unicast tunnels, a PE 721 wishing to transmit a packet through the PMSI would replicate the 722 packet, and send a copy to each of the other PEs. 724 An MI-PMSI for a given MVPN can be instantiated as a full mesh of 725 unicast tunnels among that MVPN's PEs. A UI-PMSI or an S-PMSI 726 can be instantiated as a partial mesh. 728 - Unicast Tunnels to the Root of a P-Tree. 730 Any type of PMSI can be instantiated through a method in which 731 there is a single P-tree (created, for example, via PIM-SSM or 732 via RSVP-TE), and a PE transmits a packet to the PMSI by sending 733 it in a unicast tunnel to the root of that P-tree. All PEs in 734 the given MVPN would need to be leaves of the tree. 736 When this instantiation method is used, the transmitter of the 737 multicast data may receive its own data back. Methods for 738 avoiding this are for further study. 740 It can be seen that each method of implementing PMSIs has its own 741 area of applicability. This specification therefore allows for the 742 use of any of these methods. At first glance, this may seem like an 743 overabundance of options. However, the history of multicast 744 development and deployment should make it clear that there is no one 745 option which is always acceptable. The use of segmented inter-AS 746 trees does allow each SP to select the option which it finds most 747 applicable in its own environment, without forcing any other SP to 748 choose that same option. 750 Specifying the conditions under which a particular tree building 751 method is applicable is outside the scope of this document.
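As an illustration of the mesh-of-unicast-tunnels option above, the transmitting PE simply replicates each packet toward every other PE of the MVPN. This is only a sketch; the send primitive and PE names are hypothetical:

```python
def ingress_replicate(packet: bytes, local_pe: str, mvpn_pes: set[str],
                      send_unicast) -> int:
    """Instantiate a PMSI as a mesh of unicast tunnels: the transmitting
    PE replicates the packet and sends one copy to each other PE."""
    copies = 0
    for pe in sorted(mvpn_pes):
        if pe == local_pe:        # never send a copy back to ourselves
            continue
        send_unicast(pe, packet)  # hypothetical per-tunnel transmit primitive
        copies += 1
    return copies

sent = []
n = ingress_replicate(b"c-multicast-data", "PE1",
                      {"PE1", "PE2", "PE3", "PE4"},
                      lambda pe, pkt: sent.append(pe))
# Three copies are sent: one each toward PE2, PE3, and PE4.
```

The replication cost grows linearly with the number of PEs, which is one reason a multicast P-tree may be preferred for MVPNs with many sites.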
753 The choice of the tunnel technique belongs to the sender router and 754 is a local policy decision of the router. The procedures defined 755 throughout this document do not mandate that the same tunnel 756 technique be used for all PMSI tunnels going through a given provider 757 backbone. It is, however, expected that any tunnel technique that can 758 be used by a PE for a particular MVPN is also supported by the other 759 PEs having VRFs for the MVPN. Moreover, the use of ingress replication 760 by any PE for an MVPN implies that all other PEs MUST use ingress 761 replication for this MVPN. 763 3.3. Use of PMSIs for Carrying Multicast Data 765 Each PE supporting a particular MVPN must have a way of discovering: 767 - The set of other PEs in its AS that are attached to sites of that 768 MVPN, and the set of other ASes that have PEs attached to sites 769 of that MVPN. However, if segmented inter-AS trees are not used 770 (see section 8.2), then each PE needs to know the entire set of 771 PEs attached to sites of that MVPN. 773 - If segmented inter-AS trees are to be used, the set of border 774 routers in its AS that support inter-AS connectivity for that 775 MVPN. 777 - If the MVPN is configured to use an MI-PMSI, the information 778 needed to set up and to use the tunnels instantiating the default 779 MI-PMSI. 781 - For each other PE, whether the PE supports Aggregate Trees for 782 the MVPN, and if so, the demultiplexing information which must be 783 provided so that the other PE can determine whether a packet 784 which it received on an aggregate tree belongs to this MVPN. 786 In some cases this information is provided by means of the BGP-based 787 auto-discovery procedures detailed in section 4. In other cases, 788 this information is provided after discovery is complete, by means of 789 procedures defined in section 6.1.2.
In either case, the information 790 which is provided must be sufficient to enable the PMSI to be bound 791 to the identified tunnel, to enable the tunnel to be created if it 792 does not already exist, and to enable the different PMSIs which may 793 travel on the same tunnel to be properly demultiplexed. 795 3.3.1. MVPNs with MI-PMSIs 797 If an MVPN uses an MI-PMSI, then the MI-PMSI for that MVPN will be 798 created as soon as the necessary information has been obtained. 799 Creating a PMSI means creating the tunnel which carries it (unless 800 that tunnel already exists), as well as binding the PMSI to the 801 tunnel. The MI-PMSI for that MVPN is then used as the default method 802 of transmitting multicast data packets for that MVPN. In effect, all 803 the multicast streams for the MVPN are, by default, aggregated onto 804 the MI-PMSI. 806 If a particular multicast stream from a particular source PE has 807 certain characteristics, it can be desirable to migrate it from the 808 MI-PMSI to an S-PMSI. These characteristics and procedures for 809 migrating a stream from an MI-PMSI to an S-PMSI are discussed in 810 section 7. 812 3.3.2. When MI-PMSIs are Required 814 MI-PMSIs are required under the following conditions: 816 - The MVPN is using PIM-DM, or some other protocol (such as BSR) 817 which relies upon flooding. Only with an MI-PMSI can the C-data 818 (or C-control-packets) received from any CE be flooded to all 819 PEs. 821 - If the procedure for carrying C-multicast routes from PE to PE 822 involves the multicasting of P-PIM control messages among the PEs 823 (see sections 3.4.1.1, 3.4.1.2, and 5.2). 825 3.3.3. MVPNs That Do Not Use MI-PMSIs 827 If a particular MVPN does not use an MI-PMSI, then its multicast data 828 may be sent on a set of UI-PMSIs. 830 It is also possible to send all the multicast data on a set of S- 831 PMSIs, omitting any usage of I-PMSIs.
This prevents PEs from 832 receiving data which they don't need, at the cost of requiring 833 additional tunnels. However, cost-effective instantiation of S-PMSIs 834 is likely to require Aggregate P-trees, which in turn makes it 835 necessary for the transmitting PE to know which PEs need to receive 836 which multicast streams. This is known as "explicit tracking", and 837 the procedures to enable explicit tracking may themselves impose a 838 cost. This is further discussed in section 7.2.2.2. 840 3.4. PE-PE Transmission of C-Multicast Routing 842 As a PE attached to a given MVPN receives C-Join/Prune messages from 843 its CEs in that MVPN, it must convey the information contained in 844 those messages to other PEs that are attached to the same MVPN. 846 There are several different methods for doing this. As these methods 847 are not interoperable, the method to be used for a particular MVPN 848 must either be configured, or discovered as part of the auto- 849 discovery process. 851 3.4.1. PIM Peering 853 3.4.1.1. Full Per-MVPN PIM Peering Across an MI-PMSI 855 If the set of PEs attached to a given MVPN are connected via an MI- 856 PMSI, the PEs can form "normal" PIM adjacencies with each other. 857 Since the MI-PMSI functions as a broadcast network, the standard PIM 858 procedures for forming and maintaining adjacencies over a LAN can be 859 applied. 861 As a result, the C-Join/Prune messages which a PE receives from a CE 862 can be multicast to all the other PEs of the MVPN. PIM "join 863 suppression" can be enabled and the PEs can send Asserts as needed. 865 This procedure is fully specified in section 5.2. 867 3.4.1.2. Lightweight PIM Peering Across an MI-PMSI 869 The procedure of the previous section has the following 870 disadvantages: 872 - Periodic Hello messages must be sent by all PEs. 874 Standard PIM procedures require that each PE in a particular MVPN 875 periodically multicast a Hello to all the other PEs in that MVPN.
876 If the number of MVPNs becomes very large, sending and receiving 877 these Hellos can become a substantial overhead for the PE 878 routers. 880 - Periodic retransmission of C-Join/Prune messages. 882 PIM is a "soft-state" protocol, in which reliability is assured 883 through frequent retransmissions (refresh) of control messages. 884 This too can begin to impose a large overhead on the PE routers 885 as the number of MVPNs grows. 887 The first of these disadvantages is easily remedied. The reason for 888 the periodic PIM Hellos is to ensure that each PIM speaker on a LAN 889 knows who all the other PIM speakers on the LAN are. However, in the 890 context of MVPN, PEs in a given MVPN can learn the identities of all 891 the other PEs in the MVPN by means of the BGP-based auto-discovery 892 procedure of section 4. In that case, the periodic Hellos would 893 serve no function, and could simply be eliminated. (Of course, this 894 does imply a change to the standard PIM procedures.) 896 When Hellos are suppressed, we may speak of "lightweight PIM 897 peering". 899 The periodic refresh of the C-Join/Prunes is not as simple to 900 eliminate. If and when "refresh reduction" procedures are specified 901 for PIM, it may be useful to incorporate them, so as to make the 902 lightweight PIM peering procedures even more lightweight. 904 Lightweight PIM peering is not specified in this document. 906 3.4.1.3. Unicasting of PIM C-Join/Prune Messages 908 PIM does not require that the C-Join/Prune messages which a PE 909 receives from a CE be multicast to all the other PEs; it allows 910 them to be unicast to a single PE, the one which is upstream on the 911 path to the root of the multicast tree mentioned in the Join/Prune 912 message. Note that when the C-Join/Prune messages are unicast, there 913 is no such thing as "join suppression". Therefore PIM Refresh 914 Reduction may be considered to be a prerequisite for the procedure 915 of unicasting the C-Join/Prune messages.
917 When the C-Join/Prunes are unicast, they are not transmitted on a 918 PMSI at all. Note that the procedure of unicasting the C-Join/Prunes 919 is different from the procedure of transmitting the C-Join/Prunes on 920 an MI-PMSI which is instantiated as a mesh of unicast tunnels. 922 If there are multiple PEs that can be used to reach a given C-source, 923 procedures described in section 9 MUST be used to ensure that, at 924 least within a single AS, all PEs choose the same PE to reach the C- 925 source. 927 Procedures for unicasting the PIM control messages are not further 928 specified in this document. 930 3.4.2. Using BGP to Carry C-Multicast Routing 932 It is possible to use BGP to carry C-multicast routing information 933 from PE to PE, dispensing entirely with the transmission of C- 934 Join/Prune messages from PE to PE. This is specified in section 5.3. 935 Inter-AS procedures are described in section 8. 937 4. BGP-Based Autodiscovery of MVPN Membership 939 BGP-based autodiscovery is done by means of a new address family, the 940 MCAST-VPN address family. (This address family also has other uses, 941 as will be seen later.) Any PE which attaches to an MVPN must issue 942 a BGP update message containing an NLRI in this address family, along 943 with a specific set of attributes. In this document, we specify the 944 information which must be contained in these BGP updates in order to 945 provide auto-discovery. The encoding details, along with the 946 complete set of detailed procedures, are specified in a separate 947 document [MVPN-BGP]. 949 This section specifies the intra-AS BGP-based autodiscovery 950 procedures. When segmented inter-AS trees are used, additional 951 procedures are needed, as specified in section 8. Further detail may 952 be found in [MVPN-BGP]. (When segmented inter-AS trees are not used, 953 the inter-AS procedures are almost identical to the intra-AS 954 procedures.)
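The membership-discovery outcome of these procedures can be sketched as follows: a PE treats the originator of any A-D route carrying one of a VRF's import Route Targets as a member of the corresponding MVPN. The types, addresses, and Route Target strings below are hypothetical; the actual NLRI and attribute encodings are specified in [MVPN-BGP]:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntraASADRoute:
    """Minimal model of an Intra-AS A-D route: the originating PE's
    address, a locally configured RD, and the Route Targets carried."""
    originating_pe: str
    rd: str
    route_targets: frozenset

def discover_members(vrf_import_rts: set, ad_routes: list) -> set:
    """Return the PEs whose A-D routes carry at least one of the
    VRF's import Route Targets."""
    return {r.originating_pe for r in ad_routes
            if r.route_targets & vrf_import_rts}

routes = [
    IntraASADRoute("192.0.2.1", "65000:100", frozenset({"RT:65000:100"})),
    IntraASADRoute("192.0.2.2", "65000:100", frozenset({"RT:65000:100"})),
    IntraASADRoute("192.0.2.9", "65000:200", frozenset({"RT:65000:200"})),
]
members = discover_members({"RT:65000:100"}, routes)
# members contains 192.0.2.1 and 192.0.2.2, but not 192.0.2.9
```

In practice the import filtering is performed by BGP itself, and sender and receiver sites sets may import different Route Targets, as described below.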
956 BGP-based autodiscovery uses a particular kind of MCAST-VPN route 957 known as an "auto-discovery route", or "A-D route". In particular, 958 it uses two kinds of "A-D routes", the "Intra-AS A-D Route" and the 959 "Inter-AS A-D Route". (There are also additional kinds of A-D 960 routes, such as the Source Active A-D routes, which are used for 961 purposes that go beyond auto-discovery. These are discussed in 962 subsequent sections.) 964 The Inter-AS A-D Route is used only when segmented inter-AS tunnels 965 are used, as specified in section 8. 967 The "Intra-AS A-D route" is originated by the PEs that are (directly) 968 connected to the site(s) of an MVPN. It is distributed to other PEs 969 that attach to sites of the MVPN. If segmented Inter-AS Tunnels are 970 used, then the Intra-AS A-D routes are not distributed outside the AS 971 where they originate; if segmented Inter-AS Tunnels are not used, 972 then the Intra-AS A-D routes are, despite their name, distributed to 973 all PEs attached to the VPN, no matter what AS the PEs are in. 975 The NLRI of an Intra-AS A-D route must contain the following 976 information: 978 - The route type (i.e., Intra-AS A-D route) 980 - The IP address of the originating PE 981 - An RD configured locally for the MVPN. This is an RD which can 982 be prepended to that IP address to form a globally unique VPN-IP 983 address of the PE. 985 The A-D route must also carry the following attributes: 987 - One or more Route Target attributes. If any other PE has one of 988 these Route Targets configured for import into a VRF, it treats 989 the advertising PE as a member in the MVPN to which the VRF 990 belongs. This allows each PE to discover the PEs that belong to a 991 given MVPN. More specifically, it allows a PE in the receiver 992 sites set to discover the PEs in the sender sites set of the MVPN 993 and the PEs in the sender sites set of the MVPN to discover the 994 PEs in the receiver sites set of the MVPN.
The PEs in the 995 receiver sites set would be configured to import the Route 996 Targets advertised in the BGP Auto-Discovery routes by PEs in the 997 sender sites set. The PEs in the sender sites set would be 998 configured to import the Route Targets advertised in the BGP 999 Auto-Discovery routes by PEs in the receiver sites set. 1001 - PMSI tunnel attribute. This attribute is present if and only if 1002 either MI-PMSI is to be used for the MVPN, or UI-PMSI is to be 1003 used for the MVPN on the PE that originates the intra-AS A-D 1004 route. It contains the following information: 1006 * whether the MI-PMSI is instantiated by 1008 + a BIDIR-PIM tree, 1010 + a set of PIM-SSM trees, 1012 + a set of PIM-SM trees 1014 + a set of RSVP-TE point-to-multipoint LSPs 1016 + a set of mLDP point-to-multipoint LSPs 1018 + an mLDP multipoint-to-multipoint LSP 1020 + a set of unicast tunnels 1022 + a set of unicast tunnels to the root of a shared tree (in 1023 this case the root must be identified) 1025 * If the PE wishes to set up a tunnel to instantiate the I-PMSI, 1026 a unique identifier for the tunnel used to instantiate the I- 1027 PMSI. This identifier depends on the tunnel technology used. 1029 All the PEs attaching to a given MVPN (within a given AS) 1030 must have been configured with the same PMSI tunnel attribute 1031 for that MVPN. They are also expected to know the 1032 encapsulation to use. 1034 Note that a tunnel can be identified at discovery time only 1035 if the tunnel already exists (e.g., it was constructed by 1036 means of configuration), or if it can be constructed without 1037 each PE knowing the identities of all the others. This is 1038 obviously the case when the tunnel is constructed by a 1039 receiver-initiated join technique such as PIM or mLDP. It is 1040 also the case when the tunnel is an RSVP-TE P2MP LSP as the 1041 tunnel identifier can be constructed without the head end 1042 learning the identities of the other PEs.
1044 In other cases, a tunnel cannot be identified until the PE 1045 has discovered one or more of the other PEs. In these cases, 1046 a PE will first send an A-D route without a tunnel 1047 identifier, and then will send another one with a tunnel 1048 identifier after discovering one or more of the other PEs. 1050 All the PEs attaching to a given MVPN must be configured with 1051 information specifying the encapsulation to use. 1053 * Whether the tunnel used to instantiate the I-PMSI for this 1054 MVPN is aggregating I-PMSIs from multiple MVPNs. This will 1055 affect the encapsulation used. If aggregation is to be used, 1056 a demultiplexor value to be carried by packets for this 1057 particular MVPN must also be specified. The demultiplexing 1058 mechanism and signaling procedures are described in section 1059 6. 1061 Further details of the use of this information are provided in 1062 subsequent sections. 1064 Sometimes it is necessary for one PE to advertise an upstream- 1065 assigned MPLS label that identifies another PE. Under certain 1066 circumstances to be discussed later, a PE which is the root of a 1067 multicast P-tunnel will bind an MPLS label value to one or more 1068 of the PEs that belong to the P-tunnel, and will distribute these 1069 label bindings using A-D routes. The precise details of this 1070 label distribution will be included in the next revision of this 1071 document. We will refer to these as "PE Labels". A packet 1072 traveling on the P-tunnel may carry one of these labels as an 1073 indication that the PE corresponding to that label is special. 1074 See section 11.3 for more details. 1076 5. PE-PE Transmission of C-Multicast Routing 1078 As a PE attached to a given MVPN receives C-Join/Prune messages from 1079 its CEs in that MVPN, it must convey the information contained in 1080 those messages to other PEs that are attached to the same MVPN. This 1081 is known as the "PE-PE transmission of C-multicast routing 1082 information". 
1084 This section specifies the procedures used for PE-PE transmission of 1085 C-multicast routing information. Not every procedure mentioned in 1086 section 3.4 is specified here. Rather, this section focuses on two 1087 particular procedures: 1089 - Full PIM Peering. 1091 This procedure is fully specified herein. 1093 - Use of BGP to distribute C-multicast routing 1095 This procedure is described herein, but the full specification 1096 appears in [MVPN-BGP]. 1098 Those aspects of the procedures that apply to both of the above are 1099 also specified fully herein. 1101 Specification of other procedures is for future study. 1103 5.1. Selecting the Upstream Multicast Hop (UMH) 1105 When a PE receives a C-Join/Prune message from a CE, the message 1106 identifies a particular multicast flow as belonging either to a 1107 source tree (S,G) or to a shared tree (*,G). Throughout this 1108 section, we use the term C-source to refer to S, in the case of a 1109 source tree, or to the Rendezvous Point (RP) for G, in the case of 1110 (*,G). If the route to the C-source is across the VPN backbone, then 1111 the PE needs to find the "upstream multicast hop" (UMH) for the (S,G) 1112 or (*,G) flow. The "upstream multicast hop" is either the PE at which 1113 (S,G) or (*,G) data packets enter the VPN backbone, or else is the 1114 Autonomous System Border Router (ASBR) at which those data packets 1115 enter the local AS when traveling through the VPN backbone. The 1116 process of finding the upstream multicast hop for a given C-source is 1117 known as "upstream multicast hop selection". 1119 5.1.1. Eligible Routes for UMH Selection 1121 In the simplest case, the PE does the upstream hop selection by 1122 looking up the C-source in the unicast VRF associated with the PE-CE 1123 interface over which the C-Join/Prune was received. The route that 1124 matches the C-source will contain the information needed to select 1125 the upstream multicast hop.
1127 However, in some cases, the CEs may be distributing to the PEs a 1128 special set of routes that are to be used exclusively for the purpose 1129 of upstream multicast hop selection, and not used for unicast routing 1130 at all. For example, when BGP is the CE-PE unicast routing protocol, 1131 the CEs may be using SAFI 2 to distribute a special set of routes 1132 that are to be used for, and only for, upstream multicast hop 1133 selection. When OSPF is the CE-PE routing protocol, the CE may use 1134 an MT-ID of 1 to distribute a special set of routes that are to be 1135 used for, and only for, upstream multicast hop selection. When a CE 1136 uses one of these mechanisms to distribute to a PE a special set of 1137 routes to be used exclusively for upstream multicast hop selection, 1138 these routes are distributed among the PEs using SAFI 129, as 1139 described in [MVPN-BGP]. 1141 Whether the routes used for upstream multicast hop selection are (a) 1142 the "ordinary" unicast routes or (b) a special set of routes that are 1143 used exclusively for upstream multicast hop selection, is a matter of 1144 policy. How that policy is chosen, deployed, or implemented is 1145 outside the scope of this document. In the following, we will simply 1146 refer to the set of routes that are used for upstream multicast hop 1147 selection as the "Eligible UMH routes", with no presumptions about 1148 the policy by which this set of routes was chosen. 1150 5.1.2. Information Carried by Eligible UMH Routes 1152 Every route which is eligible for UMH selection MUST carry a VRF 1153 Route Import Extended Community [MVPN-BGP]. This attribute 1154 identifies the PE that originated the route. 1156 If BGP is used for carrying C-multicast routes, OR if "Segmented 1157 Inter-AS Tunnels" (see section 8.2) are used, then every UMH route 1158 MUST also carry a Source AS Extended Community [MVPN-BGP].
1160 These two attributes are used in the upstream multicast hop selection 1161 procedures described below. 1163 5.1.3. Selecting the Upstream PE 1165 The first step in selecting the upstream multicast hop for a given C- 1166 source is to select the upstream PE router for that C-source. 1168 The PE that received the C-Join message from a CE looks in the VRF 1169 corresponding to the interface over which the C-Join was received. 1170 It finds the Eligible UMH route which is the best match for the C- 1171 source specified in that C-Join. Call this the "Installed UMH 1172 Route". 1174 Note that the outgoing interface of the Installed UMH Route may be 1175 one of the interfaces associated with the VRF, in which case the 1176 upstream multicast hop is a CE and the route to the C-source is not 1177 across the VPN backbone. 1179 Consider the set of all VPN-IP routes that are: (a) eligible to be 1180 imported into the VRF (as determined by their Route Targets), (b) 1181 eligible to be used for upstream multicast hop selection, and (c) 1182 have exactly the same IP prefix (not necessarily the same RD) as the 1183 installed UMH route. 1185 For each route in this set, determine the corresponding upstream PE 1186 and upstream RD. If a route has a VRF Route Import Extended 1187 Community, the route's upstream PE is determined from it. If a route 1188 does not have a VRF Route Import Extended Community, the route's 1189 upstream PE is determined from the route's BGP next hop attribute. 1190 In either case, the upstream RD is taken from the route's NLRI. 1192 This results in a set of pairs of <upstream PE, upstream RD>. 1194 Call this the "UMH Route Candidate Set." Then the PE MUST select a 1195 single route from the set to be the "Selected UMH Route". The 1196 corresponding upstream PE is known as the "Selected Upstream PE", and 1197 the corresponding upstream RD is known as the "Selected Upstream RD".
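The default and hash-based selection rules specified below can be sketched as follows, assuming IPv4 addresses (the function names and sample addresses are hypothetical):

```python
import ipaddress

def default_upstream_pe(candidates: list[str]) -> str:
    """Default rule: pick the candidate upstream PE whose address is
    numerically highest, treating it as a 32-bit unsigned integer."""
    return max(candidates, key=lambda a: int(ipaddress.IPv4Address(a)))

def hashed_upstream_pe(candidates: list[str],
                       c_source: str, c_group: str) -> str:
    """Alternative rule (disabled by default): number the candidates in
    ascending address order, XOR all bytes of the C-source and C-G
    addresses, and take the result modulo the number of candidates."""
    ordered = sorted(candidates, key=lambda a: int(ipaddress.IPv4Address(a)))
    acc = 0
    for b in (ipaddress.IPv4Address(c_source).packed
              + ipaddress.IPv4Address(c_group).packed):
        acc ^= b
    return ordered[acc % len(ordered)]

pes = ["192.0.2.1", "192.0.2.2"]
# The default selection is the numerically highest PE address.
# The hash selection can place (C-S, C-G1) and (C-S, C-G2) on
# different upstream PEs, giving the load balancing described below.
```

Since every PE computes the same function over the same stable routing information, all PEs arrive at the same choice for a given (C-source, C-G), outside of routing transients.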
1199 There are several possible procedures that can be used by a PE to 1200 select a single route from the candidate set. 1202 The default procedure, which MUST be implemented, is to select the 1203 route whose corresponding upstream PE address is numerically highest, 1204 where a 32-bit IP address is treated as a 32-bit unsigned integer. 1205 Call this the "default upstream PE selection". For a given C-source, 1206 provided that the routing information used to create the candidate 1207 set is stable, all PEs will have the same default upstream PE 1208 selection. (Though different default upstream PE selections may be 1209 chosen during a routing transient.) 1210 An alternative procedure, which MUST be implemented, but which is 1211 disabled by default, is the following. This procedure ensures that, 1212 except during a routing transient, each PE chooses the same upstream 1213 PE for a given combination of C-source and C-G. 1215 1. The PEs in the candidate set are numbered from lowest to highest 1216 IP address, starting from 0. 1218 2. The following hash is performed: 1220 - A bytewise exclusive-or of all the bytes in the C-source 1221 address and the C-G address is performed. 1223 - The result is taken modulo n, where n is the number of PEs 1224 in the candidate set. Call this result N. 1226 The selected upstream PE is then the one that appears in position N 1227 in the list of step 1. 1229 Other hashing algorithms are allowed as well, but not required. 1231 The alternative procedure allows a form of "equal cost load 1232 balancing". Suppose, for example, that from egress PEs PE3 and PE4, 1233 source C-S can be reached, at equal cost, via ingress PE PE1 or 1234 ingress PE PE2. The load balancing procedure makes it possible for 1235 PE1 to be the ingress PE for (C-S, C-G1) data traffic while PE2 is 1236 the ingress PE for (C-S, C-G2) data traffic. 1238 Another procedure, which SHOULD be implemented, is to use the 1239 Installed UMH Route as the Selected UMH Route.
If this procedure is 1240 used, the result is likely to be that a given PE will choose the 1241 upstream PE that is closest to it, according to the routing in the SP 1242 backbone. As a result, for a given C-source, different PEs may 1243 choose different upstream PEs. This is useful if the C-source is an 1244 anycast address, and can also be useful if the C-source is in a 1245 multihomed site (i.e., a site that is attached to multiple PEs). 1246 However, this procedure is more likely to lead to steady-state 1247 duplication of traffic unless (a) PEs discard data traffic which 1248 arrives from the "wrong" upstream PE, or (b) data traffic is carried 1249 only in non-aggregated S-PMSIs. This issue is discussed at length 1250 in section 9. 1252 General policy-based procedures for selecting the UMH route are 1253 allowed but not required, and are not further discussed in this 1254 specification. 1256 5.1.4. Selecting the Upstream Multicast Hop 1258 In certain cases, the selected upstream multicast hop is the same as 1259 the selected upstream PE. In other cases, the selected upstream 1260 multicast hop is the ASBR which is the "BGP next hop" of the Selected 1261 UMH Route. 1263 If the selected upstream PE is in the local AS, then the selected 1264 upstream PE is also the selected upstream multicast hop. This is the 1265 case if any of the following conditions holds: 1267 - The selected UMH route has a Source AS Extended Community, and 1268 the Source AS is the same as the local AS; or 1270 - The selected UMH route does not have a Source AS Extended 1271 Community, but the route's BGP next hop is the same as the 1272 upstream PE. 1274 Otherwise, the selected upstream multicast hop is an ASBR. The 1275 method of determining just which ASBR it is depends on the particular 1276 inter-AS signaling method being used (PIM or BGP), and on whether 1277 segmented or non-segmented inter-AS tunnels are used. These details 1278 are presented in later sections. 1280 5.2.
Details of Per-MVPN Full PIM Peering over MI-PMSI 1282 In this section, we assume that inter-AS MVPNs will be supported by 1283 means of non-segmented inter-AS trees. Support for segmented inter- 1284 AS trees with PIM peering is for further study. 1286 When an MVPN uses an MI-PMSI, the C-instances of that MVPN can treat 1287 the MI-PMSI as a LAN interface, and form full PIM adjacencies 1288 with each other over that "LAN interface". 1290 To form a full PIM adjacency, the PEs execute the PIM LAN procedures, 1291 including the generation and processing of PIM Hello, Join/Prune, 1292 Assert, DF election, and other PIM control packets. These are 1293 executed independently for each C-instance. PIM "join suppression" 1294 SHOULD be enabled. 1296 5.2.1. PIM C-Instance Control Packets 1298 All PIM C-Instance control packets of a particular MVPN are addressed 1299 to the ALL-PIM-ROUTERS (224.0.0.13) IP destination address, and 1300 transmitted over the MI-PMSI of that MVPN. While in transit in the 1301 P-network, the packets are encapsulated as required for the 1302 particular kind of tunnel that is being used to instantiate the MI- 1303 PMSI. Thus the C-instance control packets are not processed by the P 1304 routers, and MVPN-specific PIM routes can be extended from site to 1305 site without appearing in the P routers. 1307 As specified in section 5.1.2, when a PE distributes VPN-IP routes 1308 which are eligible for use as UMH routes, the PE MUST include a VRF 1309 Route Import Extended Community with each route. For a given MVPN, a 1310 single IP address MUST be used in these communities, and that same IP address MUST be 1311 used as the source address in all PIM control packets for that MVPN. 1313 5.2.2. PIM C-instance RPF Determination 1315 Although the MI-PMSI is treated by PIM as a LAN interface, unicast 1316 routing is NOT run over it, and there are no unicast routing 1317 adjacencies over it.
It is therefore necessary to specify special 1318 procedures for determining when the MI-PMSI is to be regarded as the 1319 "RPF Interface" for a particular C-address. 1321 The PE follows the procedures of section 5.1 to determine the 1322 selected UMH route. If that route is NOT a VPN-IP route learned from 1323 BGP as described in [RFC4364], or if that route's outgoing interface 1324 is one of the interfaces associated with the VRF, then ordinary PIM 1325 procedures for determining the RPF interface apply. 1327 However, if the selected UMH route is a VPN-IP route whose outgoing 1328 interface is not one of the interfaces associated with the VRF, then 1329 PIM will consider the RPF interface to be the MI-PMSI associated with 1330 the VPN-specific PIM instance. 1332 Once PIM has determined that the RPF interface for a particular C- 1333 source is the MI-PMSI, it is necessary for PIM to determine the "RPF 1334 neighbor" for that C-source. This will be one of the other PEs that 1335 is a PIM adjacency over the MI-PMSI. In particular, it will be the 1336 "selected upstream PE" as defined in section 5.1. 1338 5.2.3. Backwards Compatibility 1340 There are older implementations which do not use the VRF Route Import 1341 Extended Community or any explicit mechanism for carrying information 1342 to identify the originating PE of a selected UMH route. 1344 For backwards compatibility, when the selected UMH route does not 1345 have any such mechanism, the IP address from the "BGP Next Hop" field 1346 of the selected UMH route will be used as the selected UMH address, 1347 and will be treated as the address of the upstream PE. There is no 1348 selected upstream RD in this case. However, use of this backwards 1349 compatibility technique presupposes that: 1351 - The PE which originated the selected UMH route placed the same IP 1352 address in the BGP Next Hop field that it is using as the source 1353 address of the PE-PE PIM control packets for this MVPN. 
1355 - The MVPN is not an Inter-AS MVPN that uses option b from section 1356 10 of [RFC4364]. 1358 Should either of these conditions fail, interoperability with the 1359 older implementations will not be achieved. 1361 5.3. Use of BGP for Carrying C-Multicast Routing 1363 It is possible to use BGP to carry C-multicast routing information 1364 from PE to PE, dispensing entirely with the transmission of C- 1365 Join/Prune messages from PE to PE. This section describes the 1366 procedures for carrying intra-AS multicast routing information. 1367 Inter-AS procedures are described in section 8. The complete 1368 specification of both sets of procedures and of the encodings can be 1369 found in [MVPN-BGP]. 1371 5.3.1. Sending BGP Updates 1373 The MCAST-VPN address family is used for this purpose. MCAST-VPN 1374 routes used for the purpose of carrying C-multicast routing 1375 information are distinguished from those used for the purpose of 1376 carrying auto-discovery information by means of a "route type" field 1377 which is encoded into the NLRI. The following information is 1378 required in BGP to advertise the MVPN routing information. The NLRI 1379 contains: 1381 - The type of C-multicast route. 1383 There are two types: 1385 * source tree join 1387 * shared tree join 1389 - The RD configured for the MVPN on the PE that is advertising 1390 the information. The RD is required in order to uniquely 1391 identify the <C-Source, C-Group> when different MVPNs have 1392 overlapping address spaces. 1394 - The C-Group address. 1396 - The C-Source address. 1398 This field is omitted if the route type is "shared tree join". 1399 In the case of a shared tree join, the C-source is a C-RP. The 1400 address of the C-RP corresponding to the C-group address is 1401 presumed to be already known (or automatically determinable) by 1402 the other PEs, through means that are outside the scope of this 1403 specification.
1405 - The Selected Upstream RD corresponding to the C-source address 1406 (determined by the procedures of section 5.1). 1408 Whenever a C-multicast route is sent, it must also carry the Selected 1409 Upstream Multicast Hop corresponding to the C-source address 1410 (determined by the procedures of section 5.1). The selected upstream 1411 multicast hop must be encoded as part of a Route Target Extended 1412 Community, to facilitate the optional use of filters which can 1413 prevent the distribution of the update to BGP speakers other than the 1414 upstream multicast hop. See section 10.1.3 of [MVPN-BGP] for the 1415 details. 1417 There is no C-multicast route corresponding to the PIM function of 1418 pruning a source off the shared tree when a PE switches from a <C-*, C-G> tree to a <C-S, C-G> tree. Section 9 of this document specifies 1420 a mandatory procedure that ensures that if any PE joins a <C-S, C-G> 1421 source tree, all other PEs that have joined or will join the <C-*, C-G> shared tree will also join the <C-S, C-G> source tree. This 1423 eliminates the need for a C-multicast route that prunes C-S off the 1424 shared tree when switching from the <C-*, C-G> tree to the <C-S, C-G> 1425 tree. 1427 5.3.2. Explicit Tracking 1429 Note that the upstream multicast hop is NOT part of the NLRI in the 1430 C-multicast BGP routes. This means that if several PEs join the same 1431 C-tree, the BGP routes they distribute to do so are regarded by BGP 1432 as comparable routes, and only one will be installed. If a route 1433 reflector is being used, this further means that the PE which is used 1434 to reach the C-source will know only that one or more of the other 1435 PEs have joined the tree, but it will not know which ones. That is, this 1436 BGP update mechanism does not provide "explicit tracking". Explicit 1437 tracking is not provided by default because it increases the amount 1438 of state needed and thus decreases scalability.
Also, as 1439 constructing the C-PIM messages to send "upstream" for a given tree 1440 does not depend on knowing all the PEs that are downstream on that 1441 tree, there is no reason for the C-multicast route type updates to 1442 provide explicit tracking. 1444 There are some cases in which explicit tracking is necessary in order 1445 for the PEs to set up certain kinds of P-trees. There are other 1446 cases in which explicit tracking is desirable in order to determine 1447 how to optimally aggregate multicast flows onto a given aggregate 1448 tree. As these functions have to do with the setting up of 1449 infrastructure in the P-network, rather than with the dissemination 1450 of C-multicast routing information, any explicit tracking that is 1451 necessary is handled by sending the "source active" A-D routes, which 1452 are described in sections 9 and 10. Detailed procedures for turning 1453 on explicit tracking can be found in [MVPN-BGP]. 1455 5.3.3. Withdrawing BGP Updates 1457 A PE removes itself from a C-multicast tree (shared or source) by 1458 withdrawing the corresponding BGP update. 1460 If a PE has pruned a C-source from a shared C-multicast tree, and it 1461 needs to "unprune" that source from that tree, it does so by 1462 withdrawing the route that pruned the source from the tree. 1464 6. I-PMSI Instantiation 1466 This section describes how tunnels in the SP network can be used to 1467 instantiate an I-PMSI for an MVPN on a PE. When C-multicast data is 1468 delivered on an I-PMSI, the data will go to all PEs that are on the 1469 path to receivers for that C-group, but may also go to PEs that are 1470 not on the path to receivers for that C-group. 1472 The tunnels which instantiate I-PMSIs can be either PE-PE unicast 1473 tunnels or P-multicast trees. When PE-PE unicast tunnels are used, the 1474 PMSI is said to be instantiated using ingress replication.
The 1475 instantiation of a tunnel for an I-PMSI is a local policy 1476 decision and is not mandatory. Even for a site attached to multicast 1477 sources, transport of customer multicast traffic can be accommodated 1478 with S-PMSI-bound tunnels only. 1480 6.1. MVPN Membership and Egress PE Auto-Discovery 1482 As described in section 4, a PE discovers the MVPN membership 1483 information of other PEs using BGP auto-discovery mechanisms or using 1484 a mechanism that instantiates an MI-PMSI interface. When a PE supports 1485 only a UI-PMSI service for an MVPN, it MUST rely on the BGP auto- 1486 discovery mechanisms for discovering this information. This 1487 information also results in a PE in the sender sites set discovering 1488 the leaves of the P-multicast tree, which are the egress PEs that 1489 have sites in the receiver sites set in one or more MVPNs mapped onto 1490 the tree. 1492 6.1.1. Auto-Discovery for Ingress Replication 1494 In order for a PE to use Unicast Tunnels to send a C-multicast data 1495 packet for a particular MVPN to a set of remote PEs, the remote PEs 1496 must be able to correctly decapsulate such packets and to assign each 1497 one to the proper MVPN. This requires that the encapsulation used for 1498 sending packets through the tunnel have demultiplexing information 1499 which the receiver can associate with a particular MVPN. 1501 If ingress replication is being used for an MVPN, the PEs announce 1502 this as part of the BGP based MVPN membership auto-discovery process, 1503 described in section 4. The PMSI tunnel attribute specifies ingress 1504 replication. The demultiplexing value is a downstream-assigned MPLS 1505 label (i.e., assigned by the PE that originated the A-D route, to be 1506 used by other PEs when they send multicast packets on a unicast 1507 tunnel to that PE). 1509 Other demultiplexing procedures for unicast are under consideration. 1511 6.1.2.
Auto-Discovery for P-Multicast Trees 1513 A PE announces the P-multicast technology it supports for a specified 1514 MVPN, as part of the BGP MVPN membership discovery. This allows other 1515 PEs to determine the P-multicast technology they can use for building 1516 P-multicast trees to instantiate an I-PMSI. If a PE has a tree 1517 instantiation of an I-PMSI, it also announces the tree identifier as 1518 part of the auto-discovery, as well as announcing its aggregation 1519 capability. 1521 The announcement of a tree identifier at discovery time is only 1522 possible if the tree already exists (e.g., a preconfigured "traffic 1523 engineered" tunnel), or if the tree can be constructed dynamically 1524 without any PE having to know in advance all the other PEs on the 1525 tree (e.g., the tree is created by receiver-initiated joins). 1527 6.2. C-Multicast Routing Information Exchange 1529 When a PE does not support the use of an MI-PMSI for a given MVPN, it 1530 MUST either unicast MVPN routing information using PIM or else use 1531 BGP for exchanging the MVPN routing information. 1533 6.3. Aggregation 1535 A P-multicast tree can be used to instantiate a PMSI service for only 1536 one MVPN or for more than one MVPN. When a P-multicast tree is shared 1537 across multiple MVPNs, it is termed an "Aggregate Tree". The 1538 procedures described in this document allow a single SP multicast 1539 tree to be shared across multiple MVPNs. The procedures that are 1540 specific to aggregation are optional and are explicitly pointed out. 1541 Unless otherwise specified, a P-multicast tree technology supports 1542 aggregation. 1544 Aggregate Trees allow a single P-multicast tree to be used across 1545 multiple MVPNs, and hence state in the SP core grows per-set-of-MVPNs 1546 and not per MVPN. Depending on the congruence of the aggregated 1547 MVPNs, this may result in trading off optimality of multicast 1548 routing.
1550 An Aggregate Tree can be used by a PE to provide a UI-PMSI or MI- 1551 PMSI service for more than one MVPN. When this is the case, the 1552 Aggregate Tree is said to have an inclusive mapping. 1554 6.3.1. Aggregate Tree Leaf Discovery 1556 BGP MVPN membership discovery allows a PE to determine the different 1557 Aggregate Trees that it should create and the MVPNs that should be 1558 mapped onto each such tree. The leaves of an Aggregate Tree are 1559 determined by the PEs that support aggregation and belong to the 1560 MVPNs that are mapped onto the tree. 1562 If an Aggregate Tree is used to instantiate one or more S-PMSIs, then 1563 it may be desirable for the PE at the root of the tree to know which 1564 PEs (in its MVPN) are receivers on that tree. This enables the PE to 1565 decide when to aggregate two S-PMSIs, based on congruence (as 1566 discussed in the next section). Thus explicit tracking may be 1567 required. Since the procedures for disseminating C-multicast routes 1568 do not provide explicit tracking, a type of A-D route known as a 1569 "Leaf A-D Route" is used. The PE which wants to assign a particular 1570 C-multicast flow to a particular Aggregate Tree can send an A-D route 1571 which elicits Leaf A-D routes from the PEs that need to receive that 1572 C-multicast flow. This provides the explicit tracking information 1573 needed to support the aggregation methodology discussed in the next 1574 section. For more details on Leaf A-D routes, please refer to [MVPN- 1575 BGP]. 1577 6.3.2. Aggregation Methodology 1579 This document does not specify the mandatory implementation of any 1580 particular set of rules for determining whether or not the PMSIs of 1581 two particular MVPNs are to be instantiated by the same Aggregate 1582 Tree. This determination can be made by implementation-specific 1583 heuristics, by configuration, or even perhaps by the use of offline 1584 tools.
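As one hypothetical example of such an implementation-specific heuristic (nothing here is mandated by this document; the function names and the default threshold are illustrative only), two MVPNs might be aggregated onto one tree only when the fraction of PEs they have in common is high enough:

```python
# Hypothetical aggregation heuristic: two MVPNs share an Aggregate Tree only
# when the fraction of common PEs in their combined membership meets a
# configured threshold.  This is an illustrative knob, not a normative rule.

def congruence(pes_a: set, pes_b: set) -> float:
    # Fraction of the combined membership common to both MVPNs;
    # 1.0 means perfectly congruent (identical memberships).
    union = pes_a | pes_b
    return len(pes_a & pes_b) / len(union) if union else 1.0

def may_aggregate(pes_a: set, pes_b: set, threshold: float = 0.8) -> bool:
    # Aggregate only when the overlap is high enough; a lower threshold
    # trades more unwanted traffic at egress PEs for less P-core state.
    return congruence(pes_a, pes_b) >= threshold
```

A lower threshold reduces P-router state at the cost of delivering traffic to PEs with no receivers, which is exactly the tradeoff discussed in the next subsection.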
1586 It is the intention of this document that the control procedures will 1587 always result in all the PEs of an MVPN agreeing on the PMSIs which 1588 are to be used and on the tunnels used to instantiate those PMSIs. 1590 This section discusses potential methodologies with respect to 1591 aggregation. 1593 The "congruence" of aggregation is defined by the amount of overlap 1594 in the leaves of the customer trees that are aggregated on an SP tree. 1595 For Aggregate Trees with an inclusive mapping, the congruence depends 1596 on the overlap in the membership of the MVPNs that are aggregated on 1597 the tree. If there is complete overlap, i.e., all MVPNs have exactly 1598 the same sites, aggregation is perfectly congruent. As the overlap 1599 between the MVPNs that are aggregated reduces, i.e., the number of 1600 sites that are common across all the MVPNs reduces, the congruence 1601 reduces. 1603 If aggregation is done such that it is not perfectly congruent, a PE 1604 may receive traffic for MVPNs to which it does not belong. As the 1605 amount of multicast traffic in these unwanted MVPNs increases, 1606 aggregation becomes less optimal with respect to delivered traffic. 1607 Hence there is a tradeoff between reducing state and delivering 1608 unwanted traffic. 1610 An implementation should provide knobs to control the congruence of 1611 aggregation. These knobs are implementation dependent. Configuring 1612 the percentage of sites that MVPNs must have in common to be 1613 aggregated is an example of such a knob. This will allow an SP to 1614 deploy aggregation depending on the MVPN membership and traffic 1615 profiles in its network. If different PEs or servers are setting up 1616 Aggregate Trees, this will also allow a service provider to engineer 1617 the maximum number of unwanted MVPNs that a particular PE may receive 1618 traffic for. 1620 6.3.3.
Encapsulation of the Aggregate Tree 1622 An Aggregate Tree may use an IP/GRE encapsulation or an MPLS 1623 encapsulation. The protocol type in the IP/GRE header in the former 1624 case and the protocol type in the data link header in the latter need 1625 further explanation. This will be specified in a separate document. 1627 6.3.4. Demultiplexing C-multicast traffic 1629 When multiple MVPNs are aggregated onto one P-Multicast tree, 1630 determining the tree over which the packet is received is not 1631 sufficient to determine the MVPN to which the packet belongs. The 1632 packet must also carry some demultiplexing information to allow the 1633 egress PEs to determine the MVPN to which the packet belongs. Since 1634 the packet has been multicast through the P network, any given 1635 demultiplexing value must have the same meaning to all the egress 1636 PEs. The demultiplexing value is an MPLS label that corresponds to 1637 the multicast VRF to which the packet belongs. This label is placed 1638 by the ingress PE immediately beneath the P-Multicast tree header. 1639 Each of the egress PEs must be able to associate this MPLS label with 1640 the same MVPN. If downstream label assignment were used, this would 1641 require all the egress PEs in the MVPN to agree on a common label for 1642 the MVPN. Instead, the MPLS label is upstream-assigned [MPLS-UPSTREAM- 1643 LABEL]. The label bindings are advertised via BGP updates originated by 1644 the ingress PEs. 1646 This procedure requires each egress PE to support a separate label 1647 space for every other PE. The egress PEs create a forwarding entry 1648 for the upstream assigned MPLS label, allocated by the ingress PE, in 1649 this label space. Hence, when the egress PE receives a packet over an 1650 Aggregate Tree, it first determines the tree that the packet was 1651 received over. The tree identifier determines the label space in 1652 which the upstream assigned MPLS label lookup has to be performed.
1653 The same label space may be used for all P-multicast trees rooted at 1654 the same ingress PE, or an implementation may decide to use a 1655 separate label space for every P-multicast tree. 1657 The support of aggregation for shared trees and MP2MP trees is 1658 discussed in section 6.6. 1660 The encapsulation format is either MPLS or MPLS-in-something (e.g., 1661 MPLS-in-GRE [MPLS-IP]). When MPLS is used, this label will appear 1662 immediately below the label that identifies the P-multicast tree. 1663 When MPLS-in-GRE is used, this label will be the top MPLS label that 1664 appears when the GRE header is stripped off. 1666 When IP encapsulation is used for the P-multicast Tree, whatever 1667 information that particular encapsulation format uses for identifying 1668 a particular tunnel is used to determine the label space in which the 1669 MPLS label is looked up. 1671 If the P-multicast tree uses MPLS encapsulation, the P-multicast tree 1672 is itself identified by an MPLS label. The egress PE MUST NOT 1673 advertise IMPLICIT NULL or EXPLICIT NULL for that tree. Once the 1674 label representing the tree is popped off the MPLS label stack, the 1675 next label is the demultiplexing information that allows the proper 1676 MVPN to be determined. 1678 This specification requires that, to support this sort of 1679 aggregation, there be at least one upstream-assigned label per MVPN. 1680 It does not require that there be only one. For example, an ingress 1681 PE could assign a unique label to each C-(S,G). (This could be done 1682 using the same technique that is used to assign a particular C-(S,G) 1683 to an S-PMSI; see section 7.3.)
In order to do this, the 1690 egress PE needs to determine the tunnel that the packet was received 1691 on. The PE can then determine the MVPN that the packet belongs to and, 1692 if needed, perform any further lookups required to forward the 1693 packet. 1695 6.4.1. Unicast Tunnels 1697 When ingress replication is used, the MVPN to which the received C- 1698 multicast data packet belongs can be determined by the MPLS label 1699 that was allocated by the egress PE. This label is distributed by the 1700 egress PE. 1702 6.4.2. Non-Aggregated P-Multicast Trees 1704 If a P-multicast tree is associated with only one MVPN, determining 1705 the P-multicast tree on which a packet was received is sufficient to 1706 determine the packet's MVPN. All that the egress PE needs to know is 1707 the MVPN the P-multicast tree is associated with. 1709 There are different ways in which the egress PE can learn this 1710 association: 1712 a) Configuration. The P-multicast tree that a particular MVPN 1713 belongs to is configured on each PE. 1715 b) BGP based advertisement of the P-multicast tree - MVPN mapping 1716 after the root of the tree discovers the leaves of the tree. 1717 The root of the tree sets up the tree after discovering each of 1718 the PEs that belong to the MVPN. It then advertises the P- 1719 multicast tree - MVPN mapping to each of the leaves. This 1720 mechanism can be used with both source initiated trees (e.g., 1721 RSVP-TE P2MP LSPs) and receiver initiated trees (e.g., PIM 1722 trees). 1724 c) BGP based advertisement of the P-multicast tree - MVPN mapping 1725 as part of the MVPN membership discovery. The root of the tree 1726 advertises, to each of the other PEs that belong to the MVPN, 1727 the P-multicast tree that the MVPN is associated with. This 1728 implies that the root does not need to know the leaves of the 1729 tree beforehand. This is possible only for receiver initiated 1730 trees, e.g., PIM based trees.
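The lookup chain of section 6.3.4, where the tree a packet arrives on selects a label space and the upstream-assigned MPLS label is then looked up in that space, can be modeled non-normatively as follows; the table layout and names are hypothetical:

```python
# Non-normative model of the egress PE lookup of section 6.3.4: one label
# space per tree (a per-ingress-PE space would work the same way), populated
# from the label bindings the ingress PEs advertise via BGP.

label_spaces: dict = {}   # tree identifier -> {upstream-assigned label -> MVPN}

def bind(tree_id: str, upstream_label: int, mvpn: str) -> None:
    # Install a binding learned from the ingress PE's BGP advertisement.
    label_spaces.setdefault(tree_id, {})[upstream_label] = mvpn

def demux(tree_id: str, upstream_label: int) -> str:
    # Egress PE: first determine the tree the packet was received over,
    # then look the upstream-assigned label up in that tree's label space.
    return label_spaces[tree_id][upstream_label]
```

Because each tree selects its own label space, the same label value can safely identify different MVPNs on different trees, which is why upstream assignment does not require network-wide label agreement.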
1732 Both (b) and (c) above require the BGP based advertisement to contain the 1733 P-multicast tree identifier. This identifier is encoded as a BGP 1734 attribute and contains the following elements: 1736 - Tunnel Type. 1738 - Tunnel identifier. The semantics of the identifier is determined 1739 by the tunnel type. 1741 6.4.3. Aggregate P-Multicast Trees 1743 Once a PE sets up an Aggregate Tree, it needs to announce the C- 1744 multicast groups being mapped to this tree to other PEs in the 1745 network. This procedure is referred to as Aggregate Tree discovery. 1746 For an Aggregate Tree with an inclusive mapping, this discovery 1747 implies announcing: 1749 - The mapping of all MVPNs mapped to the Tree. 1751 - For each MVPN mapped onto the tree, the inner label allocated for 1752 it by the ingress PE. The use of this label is explained in the 1753 demultiplexing procedures of section 6.3.4. 1755 - The P-multicast tree Identifier. 1757 The egress PE creates a logical interface corresponding to the tree 1758 identifier. This interface is the RPF interface for all the <C-S, C-G> entries mapped to that tree. 1761 When PIM is used to set up P-multicast trees, the egress PE also Joins 1762 the P-Group Address corresponding to the tree. This results in the setup 1763 of the PIM P-multicast tree. 1765 6.5. I-PMSI Instantiation Using Ingress Replication 1767 As described in section 3, a PMSI can be instantiated using Unicast 1768 Tunnels between the PEs that are participating in the MVPN. In this 1769 mechanism, the ingress PE replicates a C-multicast data packet 1770 belonging to a particular MVPN and sends a copy to all or a subset of 1771 the PEs that belong to the MVPN. A copy of the packet is tunneled to 1772 a remote PE over a Unicast Tunnel to the remote PE. IP/GRE Tunnels 1773 or MPLS LSPs are examples of unicast tunnels that may be used. Note 1774 that the same Unicast Tunnel can be used to transport packets 1775 belonging to different MVPNs.
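The replication step can be sketched minimally as follows, assuming each remote PE has advertised a downstream-assigned label for the MVPN as in section 6.1.1; the data shapes and names are illustrative, not part of the specification:

```python
# Illustrative sketch of ingress replication (section 6.5): the ingress PE
# emits one copy of the C-multicast packet per remote PE, tagged with the
# downstream-assigned MPLS label that PE advertised for the MVPN, for
# transmission over a unicast tunnel to that PE.

from typing import Dict, List, Tuple

def ingress_replicate(packet: bytes,
                      egress_labels: Dict[str, int]) -> List[Tuple[str, int, bytes]]:
    # egress_labels maps each remote PE address to the label it advertised;
    # each returned triple is one copy to be sent over a unicast tunnel.
    return [(pe, label, packet) for pe, label in egress_labels.items()]
```

The bandwidth cost is visible directly: the ingress PE transmits one copy per remote PE, which is why moving to S-PMSIs (section 7.1) is preferred when only a few PEs have receivers.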
1777 Ingress replication can be used to instantiate a UI-PMSI. The PE sets 1778 up unicast tunnels to each of the remote PEs that support ingress 1779 replication. For a given MVPN, all C-multicast data packets are sent 1780 to each of the remote PEs in the MVPN that support ingress 1781 replication. Hence, a remote PE may receive C-multicast data packets 1782 for a group even if it does not have any receivers in that group. 1784 Ingress replication can also be used to instantiate an MI-PMSI. In 1785 this case, each PE has a mesh of unicast tunnels to every other PE in 1786 that MVPN. 1788 However, when ingress replication is used, it is recommended that only 1789 S-PMSIs be used. Instantiation of S-PMSIs with ingress replication is 1790 described in section 7.1. Note that this requires the use of 1791 explicit tracking, i.e., a PE must know which of the other PEs have 1792 receivers for each C-multicast tree. 1794 6.6. Establishing P-Multicast Trees 1796 It is believed that the architecture outlined in this document places 1797 no limitations on the protocols used to instantiate P-multicast 1798 trees. However, the only protocols being explicitly considered are 1799 PIM-SM, PIM-SSM, BIDIR-PIM, RSVP-TE, and mLDP. 1801 A P-multicast tree can be either a source tree or a shared tree. A 1802 source tree is used to carry traffic only for the multicast VRFs that 1803 exist locally on the root of the tree, i.e., for which the root has 1804 local CEs. The root is a PE router. Source P-multicast trees can be 1805 instantiated using PIM-SM, PIM-SSM, RSVP-TE P2MP LSPs, and mLDP P2MP 1806 LSPs. 1808 A shared tree, on the other hand, can be used to carry traffic 1809 belonging to VRFs that exist on other PEs as well. The root of a 1810 shared tree is not necessarily one of the PEs in the MVPN.
All PEs 1811 that use the shared tree will send MVPN data packets to the root of 1812 the shared tree; if PIM is being used as the control protocol, PIM 1813 control packets also get sent to the root of the shared tree. This 1814 may require a unicast tunnel between each of these PEs and the root. 1815 The root will then send them on the shared tree, and all the PEs that 1816 are leaves of the shared tree will receive the packets. For example, an 1817 RP based PIM-SM tree would be a shared tree. Shared trees can be 1818 instantiated using PIM-SM, PIM-SSM, BIDIR-PIM, RSVP-TE P2MP LSPs, 1819 mLDP P2MP LSPs, and mLDP MP2MP LSPs. Aggregation support for 1820 bidirectional P-trees (i.e., BIDIR-PIM trees or mLDP MP2MP trees) is 1821 for further study. Shared trees require all the PEs to discover the 1822 root of the shared tree for an MVPN. To achieve this, the root of a 1823 shared tree advertises as part of the BGP based MVPN membership 1824 discovery: 1826 - The capability to set up a shared tree for a specified MVPN. 1828 - A downstream assigned label that is to be used by each PE to 1829 encapsulate an MVPN data packet when it sends this packet to the 1830 root of the shared tree. 1832 - A downstream assigned label that is to be used by each PE to 1833 encapsulate an MVPN control packet when it sends this packet to 1834 the root of the shared tree. 1836 Both a source tree and a shared tree can be used to instantiate an I- 1837 PMSI. If a source tree is used to instantiate a UI-PMSI for an MVPN, 1838 all the other PEs that belong to the MVPN must be leaves of the 1839 source tree. If a shared tree is used to instantiate a UI-PMSI for an 1840 MVPN, all the PEs that are members of the MVPN must be leaves of the 1841 shared tree.
Procedures in [RSVP- 1848 P2MP] are used to signal the LSP. The LSP is signaled after the root 1849 of the LSP discovers the leaves. The egress PEs are discovered using 1850 the MVPN membership procedures described in section 4. RSVP-TE P2MP 1851 LSPs can optionally support aggregation. 1853 6.7.1. P2MP TE LSP Tunnel - MVPN Mapping 1855 P2MP TE LSP Tunnel to MVPN mapping can be learned at the egress PEs 1856 using either option (a) or option (b) described in section 6.4.2. 1857 Option (b), i.e., BGP based advertisement of the P2MP TE LSP Tunnel - 1858 MVPN mapping, requires that the root of the tree include the P2MP TE 1859 LSP Tunnel identifier as the tunnel identifier in the BGP 1860 advertisements. This identifier contains the following information 1861 elements: 1863 - The type of the tunnel is set to RSVP-TE P2MP Tunnel. 1865 - RSVP-TE P2MP Tunnel's SESSION Object. 1867 - Optionally, the RSVP-TE P2MP LSP's SENDER_TEMPLATE Object. This object 1868 is included when it is desired to identify a particular P2MP TE 1869 LSP. 1871 6.7.2. Demultiplexing C-Multicast Data Packets 1873 Demultiplexing the C-multicast data packets at the egress PE follows 1874 the procedures described in section 6.3.4. The RSVP-TE P2MP LSP Tunnel 1875 must be signaled with penultimate-hop-popping (PHP) off. Signaling 1876 the P2MP TE LSP Tunnel with PHP off requires an extension to RSVP-TE 1877 which will be described later. 1879 7. Optimizing Multicast Distribution via S-PMSIs 1881 Whenever a particular multicast stream is being sent on an I-PMSI, it 1882 is likely that the data of that stream is being sent to PEs that do 1883 not require it. If a particular stream has a significant amount of 1884 traffic, it may be beneficial to move it to an S-PMSI which includes 1885 only those PEs that are transmitters and/or receivers (or at least 1886 includes fewer PEs that are neither). 1888 If explicit tracking is being done, S-PMSI creation can also be 1889 triggered on other criteria.
For instance, there could be a "pseudo 1890 wasted bandwidth" criterion: switching to an S-PMSI would be done if 1891 the bandwidth multiplied by the number of uninterested PEs (PEs that 1892 are receiving the stream but have no receivers) is above a specified 1893 threshold. The motivation is that (a) the total bandwidth wasted by 1894 many sparsely subscribed low-bandwidth groups may be large, and (b) 1895 there's no point in moving a high-bandwidth group to an S-PMSI if all 1896 the PEs have receivers for it. 1898 Switching a (C-S, C-G) stream to an S-PMSI may require the root of 1899 the S-PMSI to determine the egress PEs that need to receive the (C-S, 1900 C-G) traffic. This is true in the following cases: 1902 - If the tunnel is a source initiated tree, such as an RSVP-TE P2MP 1903 Tunnel, the PE needs to know the leaves of the tree before it can 1904 instantiate the S-PMSI. 1906 - If a PE instantiates multiple S-PMSIs, belonging to different 1907 MVPNs, using one P-multicast tree, such a tree is termed an 1908 Aggregate Tree with a selective mapping. The setting up of such 1909 an Aggregate Tree requires the ingress PE to know all the other 1910 PEs that have receivers for multicast groups that are mapped onto 1911 the tree. 1913 The above two cases require that explicit tracking be done for the 1914 (C-S, C-G) stream. The root of the S-PMSI MAY decide to do explicit 1915 tracking of this stream only after it has determined to move the 1916 stream to an S-PMSI, or it MAY have been doing explicit tracking all 1917 along. 1919 If the S-PMSI is instantiated by a P-multicast tree, the PE at the 1920 root of the tree must signal the leaves of the tree that the (C-S, C- 1921 G) stream is now bound to the S-PMSI. Note that the PE could 1922 create the identity of the P-multicast tree prior to the actual 1923 instantiation of the tunnel.
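The "pseudo wasted bandwidth" criterion described above can be sketched as follows. This is an illustrative model only; the function name, the PE-set representation, and the threshold parameter are not part of this specification:

```python
def should_switch_to_spmsi(bandwidth_bps, receiving_pes, interested_pes,
                           wasted_threshold_bps):
    """Hypothetical 'pseudo wasted bandwidth' test: switch a (C-S, C-G)
    stream to an S-PMSI when the stream bandwidth multiplied by the
    number of uninterested PEs (PEs receiving the stream but having no
    receivers for it) exceeds a configured threshold."""
    uninterested = receiving_pes - interested_pes
    # If every PE has receivers, the product is zero and no switch
    # occurs, however large the stream: an S-PMSI would save nothing.
    return bandwidth_bps * len(uninterested) > wasted_threshold_bps
```

This captures both motivations from the text: many sparsely subscribed low-bandwidth groups can each exceed the threshold, while a high-bandwidth group with receivers behind every PE never does.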
1925 If the S-PMSI is instantiated by a source-initiated P-multicast tree 1926 (e.g., an RSVP-TE P2MP tunnel), the PE at the root of the tree must 1927 establish the source-initiated P-multicast tree to the leaves. This 1928 tree MAY have been established before the leaves receive the S-PMSI 1929 binding, or MAY be established after the leaves receive the binding. 1930 The leaves MUST NOT switch to the S-PMSI until they receive both the 1931 binding and the tree signaling message. 1933 7.1. S-PMSI Instantiation Using Ingress Replication 1935 As described in section 6.1.1, ingress replication can be used to 1936 instantiate a UI-PMSI. However, this can result in a PE receiving 1937 packets for a multicast group for which it doesn't have any 1938 receivers. This can be avoided if the ingress PE tracks the remote 1939 PEs which have receivers in a particular C-multicast group. In order 1940 to do this, it needs to receive C-Joins from each of the remote PEs. 1941 It then replicates the C-multicast data packet and sends it to only 1942 those egress PEs which are on the path to a receiver of that C-group. 1943 It is possible that each PE that is using ingress replication 1944 instantiates only S-PMSIs. It is also possible that some PEs 1945 instantiate UI-PMSIs while others instantiate only S-PMSIs. In both 1946 these cases, the PE MUST either unicast MVPN routing information using 1947 PIM or use BGP for exchanging the MVPN routing information. This is 1948 because there may be no MI-PMSI available for it to exchange MVPN 1949 routing information. 1951 Note that the use of ingress replication doesn't require any extra 1952 procedures for signaling the binding of the S-PMSI from the ingress 1953 PE to the egress PEs. The procedures described for I-PMSIs are 1954 sufficient. 1956 7.2. Protocol for Switching to S-PMSIs 1958 We describe two protocols for switching to S-PMSIs.
These protocols 1959 can be used when the tunnel that instantiates the S-PMSI is a P- 1960 multicast tree. 1962 7.2.1. A UDP-based Protocol for Switching to S-PMSIs 1964 This procedure can be used for any MVPN which has an MI-PMSI. 1965 Traffic from all multicast streams in a given MVPN is sent, by 1966 default, on the MI-PMSI. Consider a single multicast stream within a 1967 given MVPN, and consider a PE which is attached to a source of 1968 multicast traffic for that stream. The PE can be configured to move 1969 the stream from the MI-PMSI to an S-PMSI if certain configurable 1970 conditions are met. To do this, it needs to inform all the PEs which 1971 attach to receivers for the stream. These PEs need to start listening 1972 for traffic on the S-PMSI, and the transmitting PE may start sending 1973 traffic on the S-PMSI when it is reasonably certain that all 1974 receiving PEs are listening on the S-PMSI. 1976 7.2.1.1. Binding a Stream to an S-PMSI 1978 When a PE which attaches to a transmitter for a particular multicast 1979 stream notices that the conditions for moving the stream to an S-PMSI 1980 are met, it begins to periodically send an "S-PMSI Join Message" on 1981 the MI-PMSI. The S-PMSI Join is a UDP-encapsulated message whose 1982 destination address is ALL-PIM-ROUTERS (224.0.0.13), and whose 1983 destination port is 3232. 1985 The S-PMSI Join Message contains the following information: 1987 - An identifier for the particular multicast stream which is to be 1988 bound to the S-PMSI. This can be represented as an (S,G) pair. 1990 - An identifier for the particular S-PMSI to which the stream is to 1991 be bound. This identifier is a structured field which includes 1992 the following information: 1994 * The type of tunnel used to instantiate the S-PMSI 1996 * An identifier for the tunnel. The form of the identifier 1997 will depend upon the tunnel type.
The combination of tunnel 1998 identifier and tunnel type should contain enough information 1999 to enable all the PEs to "join" the tunnel and receive 2000 messages from it. 2002 * Any demultiplexing information needed by the tunnel 2003 encapsulation protocol to identify the particular S-PMSI. 2004 This allows a single tunnel to aggregate multiple S-PMSIs. 2005 If a particular tunnel is not aggregating multiple S-PMSIs, 2006 then no demultiplexing information is needed. 2008 A PE router which is not connected to a receiver will still receive 2009 the S-PMSI Joins, and MAY cache the information contained therein. 2010 Then if the PE later finds that it is attached to a receiver, it can 2011 immediately start listening to the S-PMSI. 2013 Upon receiving the S-PMSI Join, PE routers connected to receivers for 2014 the specified stream will take whatever action is necessary to start 2015 receiving multicast data packets on the S-PMSI. The precise action 2016 taken will depend upon the tunnel type. 2018 After a configurable delay, the PE router which is sending the S-PMSI 2019 Joins will start transmitting the stream's data packets on the S- 2020 PMSI. 2022 When the pre-configured conditions are no longer met for a particular 2023 stream, e.g. the traffic stops, the PE router connected to the source 2024 stops announcing S-PMSI Joins for that stream. Any PE that does not 2025 receive, over a configurable interval, an S-PMSI Join for a 2026 particular stream will stop listening to the S-PMSI. 2028 7.2.1.2. Packet Formats and Constants 2030 The S-PMSI Join message is encapsulated within UDP, and has the 2031 following type/length/value (TLV) encoding: 2033 0 1 2 3 2034 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2035 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2036 | Type | Length | Value | 2037 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2038 | . | 2039 | . 
| 2040 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2042 Type (8 bits) 2044 Length (16 bits): the total number of octets in the Type, Length, and 2045 Value fields combined 2047 Value (variable length) 2049 Currently only one type of S-PMSI Join is defined. A type 1 S-PMSI 2050 Join is used when the S-PMSI tunnel is a PIM tunnel which is used to 2051 carry a single multicast stream, where the packets of that stream 2052 have IPv4 source and destination IP addresses. 2054 0 1 2 3 2055 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2056 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2057 | Type | Length | Reserved | 2058 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2059 | C-source | 2060 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2061 | C-group | 2062 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2063 | P-group | 2064 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2066 Type (8 bits): 1 2068 Length (16 bits): 16 2070 Reserved (8 bits): This field SHOULD be zero when transmitted, and 2071 MUST be ignored when received. 2073 C-Source (32 bits): the IPv4 address of the traffic source in the 2074 VPN. 2076 C-Group (32 bits): the IPv4 address of the multicast traffic 2077 destination address in the VPN. 2079 P-Group (32 bits): the IPv4 group address that the PE router is going 2080 to use to encapsulate the flow (C-Source, C-Group). 2082 The P-group identifies the S-PMSI tunnel, and the (C-S, C-G) 2083 identifies the multicast flow that is carried in the tunnel. 2085 The protocol uses the following constants. 2087 [S-PMSI_DELAY]: 2089 the PE router which is to transmit onto the S-PMSI will delay 2090 this amount of time before it begins using the S-PMSI. The 2091 default value is 3 seconds. 
2093 [S-PMSI_TIMEOUT]: 2095 if a PE (other than the transmitter) does not receive any packets 2096 over the S-PMSI tunnel for this amount of time, the PE will prune 2097 itself from the S-PMSI tunnel, and will expect (C-S, C-G) packets 2098 to arrive on an I-PMSI. The default value is 3 minutes. This 2099 value must be consistent among PE routers. 2101 [S-PMSI_HOLDOWN]: 2103 if the PE that transmits onto the S-PMSI does not see any (C-S, 2104 C-G) packets for this amount of time, it will resume sending (C- 2105 S, C-G) packets on an I-PMSI. 2107 This is used to avoid oscillation when traffic is bursty. The 2108 default value is 1 minute. 2110 [S-PMSI_INTERVAL] 2111 the interval the transmitting PE router uses to periodically send 2112 the S-PMSI Join message. The default value is 60 seconds. 2114 7.2.2. A BGP-based Protocol for Switching to S-PMSIs 2116 This procedure can be used for an MVPN that is using either a UI-PMSI 2117 or an MI-PMSI. Consider a single multicast stream for a C-(S, G) 2118 within a given MVPN, and consider a PE which is attached to a source 2119 of multicast traffic for that stream. The PE can be configured to 2120 move the stream from the MI-PMSI or UI-PMSI to an S-PMSI if certain 2121 configurable conditions are met. Once a PE decides to move the C-(S, 2122 G) for a given MVPN to an S-PMSI, it needs to instantiate the S-PMSI 2123 using a tunnel and announce the binding of the S-PMSI to the C-(S, G) 2124 to all the egress PEs that are on the path to receivers of 2125 the C-(S, G). The announcement is done using BGP. Depending on the 2126 tunneling technology used, this announcement may be done before or 2127 after setting up the tunnel. The source and egress PEs have to switch 2128 to using the S-PMSI for the C-(S, G). 2130 7.2.2.1. Advertising C-(S, G) Binding to an S-PMSI using BGP 2132 The ingress PE informs all the PEs that are on the path to receivers 2133 of the C-(S, G) of the binding of the S-PMSI to the C-(S, G).
The BGP 2134 announcement is done by sending an update for the MCAST-VPN address 2135 family. An A-D route is used, containing the following information: 2137 a) The IP address of the originating PE 2139 b) The RD configured locally for the MVPN. This is required to 2140 uniquely identify the C-(S, G), as the addresses 2141 could overlap between different MVPNs. This is the same RD 2142 value used in the auto-discovery process. 2144 c) The C-Source address. 2146 d) The C-Group address. 2148 e) A PE MAY aggregate two or more S-PMSIs originated by the PE 2149 onto the same P-Multicast tree. If the PE already advertises S- 2150 PMSI auto-discovery routes for these S-PMSIs, then aggregation 2151 requires the PE to re-advertise these routes. The re-advertised 2152 routes MUST be the same as the original ones, except for the 2153 PMSI tunnel attribute. If the PE has not previously advertised 2154 S-PMSI auto-discovery routes for these S-PMSIs, then the 2155 aggregation requires the PE to advertise (new) S-PMSI auto- 2156 discovery routes for these S-PMSIs. The PMSI Tunnel attribute 2157 in the newly advertised/re-advertised routes MUST carry the 2158 identity of the P-Multicast tree that aggregates the S-PMSIs. 2159 If at least some of the S-PMSIs aggregated onto the same P- 2160 Multicast tree belong to different MVPNs, then all these routes 2161 MUST carry an MPLS upstream assigned label [MPLS-UPSTREAM- 2162 LABEL, section 6.3.4]. If all these aggregated S-PMSIs belong 2163 to the same MVPN, then the routes MAY carry an MPLS upstream 2164 assigned label [MPLS-UPSTREAM-LABEL]. The labels MUST be 2165 distinct on a per MVPN basis, and MAY be distinct on a per 2166 route basis. 2168 When a PE distributes this information via BGP, it must include the 2169 following: 2171 1. An identifier for the particular S-PMSI to which the stream is 2172 to be bound.
This identifier is a structured field which 2173 includes the following information: 2175 * The type of tunnel used to instantiate the S-PMSI 2177 * An identifier for the tunnel. The form of the identifier 2178 will depend upon the tunnel type. The combination of 2179 tunnel identifier and tunnel type should contain enough 2180 information to enable all the PEs to "join" the tunnel and 2181 receive messages from it. 2183 2. Route Target Extended Communities attribute. This is used as 2184 described in section 4. 2186 7.2.2.2. Explicit Tracking 2188 If the PE wants to enable explicit tracking for the specified flow, 2189 it also indicates this in the A-D route it uses to bind the flow to a 2190 particular S-PMSI. Then any PE which receives the A-D route will 2191 respond with a "Leaf A-D Route" in which it identifies itself as a 2192 receiver of the specified flow. The Leaf A-D route will be withdrawn 2193 when the PE is no longer a receiver for the flow. 2195 If the PE needs to enable explicit tracking for a flow before binding 2196 the flow to an S-PMSI, it can do so by sending an A-D route 2197 identifying the flow but not specifying an S-PMSI. This will elicit 2198 the Leaf A-D Routes. This is useful when the PE needs to know the 2199 receivers before selecting an S-PMSI. 2201 7.2.2.3. Switching to S-PMSI 2203 After the egress PEs receive the announcement, they set up their 2204 forwarding path to receive traffic on the S-PMSI if they have one or 2205 more receivers interested in the C-(S, G) bound to the S-PMSI. This 2206 involves changing the RPF interface for the relevant C-(S, G) 2207 entries to the interface that is used to instantiate the S-PMSI. If 2208 an Aggregate Tree is used to instantiate an S-PMSI, this also implies 2209 setting up the demultiplexing forwarding entries based on the inner 2210 label as described in section 6.3.4.
The egress PEs may perform the 2211 switch to the S-PMSI once the advertisement from the ingress PE is 2212 received or wait for a preconfigured timer to do so. 2214 A source PE may use one of two approaches to decide when to start 2215 transmitting data on the S-PMSI. In the first approach, once the 2216 source PE instantiates the S-PMSI, it starts sending multicast 2217 packets for the C-(S, G) entries mapped to the S-PMSI on both that S-PMSI as 2218 well as on the I-PMSI, which is currently used to send traffic for 2219 the C-(S, G). After some preconfigured timer, the PE stops sending 2220 multicast packets for the C-(S, G) on the I-PMSI. In the second 2221 approach, after a certain pre-configured delay after advertising the 2222 C-(S, G) entry bound to an S-PMSI, the source PE begins to send 2223 traffic on the S-PMSI. At this point it stops sending traffic for the 2224 C-(S, G) on the I-PMSI. This traffic is instead transmitted on the 2225 S-PMSI. 2227 7.3. Aggregation 2229 S-PMSIs can be aggregated on a P-multicast tree. The S-PMSI to C-(S, 2230 G) binding advertisement supports aggregation. Furthermore, the 2231 aggregation procedures of section 6.3 apply. It is also possible to 2232 aggregate both S-PMSIs and I-PMSIs on the same P-multicast tree. 2234 7.4. Instantiating the S-PMSI with a PIM Tree 2236 The procedures of section 7.2 tell a PE when it must start listening 2237 and stop listening to a particular S-PMSI. Those procedures do not 2238 specify the method for instantiating the S-PMSI. In this section, we 2239 provide the procedures to be used when the S-PMSI is instantiated as 2240 a PIM tree. The PIM tree is created by the PIM P-instance. 2242 If a single PIM tree is being used to aggregate multiple S-PMSIs, 2243 then the PIM tree to which a given stream is bound may have already 2244 been joined by a given receiving PE. If the tree does not already 2245 exist, then the appropriate PIM procedures to create it must be 2246 executed in the P-instance.
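Returning to the UDP-based protocol, the fixed 16-octet Type 1 S-PMSI Join of section 7.2.1.2 can be encoded and decoded as in the following sketch. The function names are illustrative; actually sending the message over UDP to ALL-PIM-ROUTERS (224.0.0.13) port 3232 is not shown:

```python
import socket
import struct

SPMSI_JOIN_TYPE_1 = 1

def pack_spmsi_join_v4(c_source, c_group, p_group):
    """Encode a Type 1 S-PMSI Join: Type (8 bits) = 1, Length (16 bits)
    = 16 (total octets), Reserved (8 bits, zero on transmit), then the
    C-Source, C-Group, and P-Group IPv4 addresses."""
    return struct.pack("!BHB4s4s4s",
                       SPMSI_JOIN_TYPE_1, 16, 0,
                       socket.inet_aton(c_source),
                       socket.inet_aton(c_group),
                       socket.inet_aton(p_group))

def unpack_spmsi_join_v4(data):
    """Decode a Type 1 S-PMSI Join; the Reserved field is ignored on
    receipt, as the format description requires."""
    msg_type, length, _reserved, c_s, c_g, p_g = struct.unpack("!BHB4s4s4s", data)
    if msg_type != SPMSI_JOIN_TYPE_1 or length != 16:
        raise ValueError("not a Type 1 S-PMSI Join")
    return tuple(socket.inet_ntoa(a) for a in (c_s, c_g, p_g))
```

The P-Group field identifies the S-PMSI tunnel, while (C-Source, C-Group) identifies the customer flow carried in it, matching the field descriptions above.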
2248 If the S-PMSI for a particular multicast stream is instantiated as a 2249 PIM-SM or BIDIR-PIM tree, the S-PMSI identifier will specify the RP 2250 and the group P-address, and the PE routers which have receivers for 2251 that stream must build a shared tree toward the RP. 2253 If the S-PMSI is instantiated as a PIM-SSM tree, the PE routers build 2254 a source tree toward the PE router that is advertising the S-PMSI 2255 Join. The IP address of the root of the tree is the same as the source IP 2256 address which appears in the S-PMSI Join. In this case, the tunnel 2257 identifier in the S-PMSI Join will only need to specify a group P- 2258 address. 2260 The above procedures assume that each PE router has a set of group P- 2261 addresses that it can use for setting up the PIM-trees. Each PE must 2262 be configured with this set of P-addresses. If PIM-SSM is used to 2263 set up the tunnels, then the PEs may be configured with overlapping sets of 2264 group P-addresses. If PIM-SSM is not used, then each PE must be 2265 configured with a unique set of group P-addresses (i.e., having no 2266 overlap with the set configured at any other PE router). The 2267 management of this set of addresses is thus greatly simplified when 2268 PIM-SSM is used, so the use of PIM-SSM is strongly recommended 2269 whenever PIM trees are used to instantiate S-PMSIs. 2271 If it is known that all the PEs which need to receive data traffic on 2272 a given S-PMSI can support aggregation of multiple S-PMSIs on a 2273 single PIM tree, then the transmitting PE may, at its discretion, 2274 decide to bind the S-PMSI to a PIM tree which is already bound to one 2275 or more other S-PMSIs, from the same or from different MVPNs. In 2276 this case, appropriate demultiplexing information must be signaled. 2278 7.5. Instantiating S-PMSIs using RSVP-TE P2MP Tunnels 2280 RSVP-TE P2MP Tunnels can be used for instantiating S-PMSIs. 2281 Procedures described in the context of I-PMSIs in section 6.7 apply. 2283 8.
Inter-AS Procedures 2285 If an MVPN has sites in more than one AS, it requires one or more 2286 PMSIs to be instantiated by inter-AS tunnels. This document 2287 describes two different types of inter-AS tunnel: 2289 1. "Segmented Inter-AS tunnels" 2291 A segmented inter-AS tunnel consists of a number of independent 2292 segments which are stitched together at the ASBRs. There are 2293 two types of segment: inter-AS segments and intra-AS segments. 2294 The segmented inter-AS tunnel consists of alternating intra-AS 2295 and inter-AS segments. 2297 Inter-AS segments connect adjacent ASBRs of different ASes; 2298 these "one-hop" segments are instantiated as unicast tunnels. 2300 Intra-AS segments connect ASBRs and PEs which are in the same 2301 AS. An intra-AS segment may be of whatever technology is 2302 desired by the SP that administers that AS. Different 2303 intra-AS segments may be of different technologies. 2305 Note that the intra-AS segments of inter-AS tunnels form a 2306 category of tunnels that is distinct from simple intra-AS 2307 tunnels; we will rely on this distinction later (see Section 2308 9). 2310 A segmented inter-AS tunnel can be thought of as a tree which 2311 is rooted at a particular AS, and which has as its leaves the 2312 other ASes which need to receive multicast data from the root 2313 AS. 2315 2. "Non-segmented Inter-AS tunnels" 2317 A non-segmented inter-AS tunnel is a single tunnel which spans 2318 AS boundaries. The tunnel technology cannot change from one 2319 point in the tunnel to the next, so all ASes through which the 2320 tunnel passes must support that technology. In essence, AS 2321 boundaries are of no significance to a non-segmented inter-AS 2322 tunnel. 2324 Section 10 of [RFC4364] describes three different options for 2325 supporting unicast Inter-AS BGP/MPLS IP VPNs, known as options A, B, 2326 and C.
We describe below how both segmented and non-segmented inter- 2327 AS trees can be supported when option B or option C is used. (Option 2328 A does not pass any routing information through an ASBR at all, so no 2329 special inter-AS procedures are needed.) 2331 8.1. Non-Segmented Inter-AS Tunnels 2333 In this model, the previously described discovery and tunnel setup 2334 mechanisms are used, even though the PEs belonging to a given MVPN 2335 may be in different ASes. 2337 8.1.1. Inter-AS MVPN Auto-Discovery 2339 The previously described BGP-based auto-discovery mechanisms work "as 2340 is" when an MVPN contains PEs that are in different Autonomous 2341 Systems. However, please note that, if non-segmented Inter-AS 2342 Tunnels are to be used, then the "Intra-AS" A-D routes MUST be 2343 distributed across AS boundaries! 2345 8.1.2. Inter-AS MVPN Routing Information Exchange 2347 When non-segmented inter-AS tunnels are used, MVPN C-multicast 2348 routing information may be exchanged by means of PIM peering across 2349 an MI-PMSI, or by means of BGP carrying C-multicast routes. 2351 When PIM peering is used to distribute the C-multicast routing 2352 information, a PE that sends C-PIM Join/Prune messages for a 2353 particular C-(S,G) must be able to identify the PE which is its PIM 2354 adjacency on the path to S. This is the "selected upstream PE" 2355 described in section 5.1. 2357 If BGP (rather than PIM) is used to distribute the C-multicast 2358 routing information, and if option b of section 10 of [RFC4364] is in 2359 use, then the C-multicast routes will be installed in the ASBRs along 2360 the path from each multicast source in the MVPN to each multicast 2361 receiver in the MVPN. If option b is not in use, the C-multicast 2362 routes are not installed in the ASBRs. The handling of the C- 2363 multicast routes in either case is thus exactly analogous to the 2364 handling of unicast VPN-IP routes in the corresponding case. 2366 8.1.3. 
Inter-AS P-Tunnels 2368 The procedures described earlier in this document can be used to 2369 instantiate either an I-PMSI or an S-PMSI with inter-AS P-tunnels. 2370 Specific tunneling techniques require some explanation. 2372 If ingress replication is used, the inter-AS PE-PE tunnels will use 2373 the inter-AS tunneling procedures for the tunneling technology used. 2375 Procedures in [RSVP-P2MP] are used for inter-AS RSVP-TE P2MP P- 2376 Tunnels. 2378 Procedures for using PIM to set up the P-tunnels are discussed in 2379 the next section. 2381 8.1.3.1. PIM-Based Inter-AS P-Multicast Trees 2383 When PIM is used to set up an inter-AS P-multicast tree, the PIM 2384 Join/Prune messages used to join the tree contain the IP address of 2385 the upstream PE. However, there are two special considerations that 2386 must be taken into account: 2388 - It is possible that the P routers within one or more of the ASes 2389 will not have routes to the upstream PE. For example, if an AS 2390 has a "BGP-free core", the P routers in an AS will not have 2391 routes to addresses outside the AS. 2393 - If the PIM Join/Prune message must travel through several ASes, 2394 it is possible that the ASBRs will not have routes to the PE 2395 routers. For example, in an inter-AS VPN constructed according 2396 to "option b" of section 10 of [RFC4364], the ASBRs do not 2397 necessarily have routes to the PE routers. 2399 If either of these two conditions obtains, then "ordinary" PIM 2400 Join/Prune messages cannot be routed to the upstream PE. Thus the 2401 following information needs to be added to the PIM Join/Prune 2402 messages: a "Proxy Address", which contains the address of the next 2403 ASBR on the path to the upstream PE. When the PIM Join/Prune arrives 2404 at the ASBR which is identified by the "proxy address", that ASBR 2405 must change the proxy address to identify the next hop ASBR.
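The proxy-address rewrite just described can be modeled as in the following sketch. This is a hypothetical illustration: it assumes the ordered chain of ASBRs toward the upstream PE is already known (in the protocol, the next hop is determined from routing or A-D state rather than a precomputed list), and all names are illustrative:

```python
def forward_join_toward_upstream_pe(local_asbr, proxy_address, asbr_path):
    """Model of hop-by-hop proxy-address handling for a PIM Join/Prune.
    `asbr_path` is an assumed ordered list of ASBRs ending at the
    upstream PE.  A transit node that is not the named proxy forwards
    the message unchanged; the named ASBR rewrites the proxy address to
    identify the next-hop ASBR (or, at the last ASBR, the PE itself)."""
    if proxy_address != local_asbr:
        return proxy_address            # not addressed to this node
    idx = asbr_path.index(local_asbr)
    return asbr_path[idx + 1]           # rewrite to the next hop
```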
2407 This information allows the PIM Join/Prune to be routed through an AS 2408 even if the P routers of that AS do not have routes to the upstream 2409 PE. However, this information is not sufficient to enable the ASBRs 2410 to route the Join/Prune if the ASBRs themselves do not have routes to 2411 the upstream PE. 2413 However, even if the ASBRs do not have routes to the upstream PE, the 2414 procedures of this draft ensure that they will have A-D routes that 2415 lead to the upstream PE. If non-segmented inter-AS MVPNs are being 2416 used, the ASBRs (and PEs) will have Intra-AS A-D routes which have 2417 been distributed inter-AS. 2419 So rather than having the PIM Join/Prune messages routed by the ASBRs 2420 along a route to the upstream PE, the PIM Join/Prune messages MUST 2421 be routed along the path determined by the intra-AS A-D routes. 2423 If the only intra-AS A-D route for a given MVPN is the "Intra-AS I- 2424 PMSI Route", the PIM Join/Prunes will be routed along that. However, 2425 if the PIM Join/Prune message is for a particular P-group address, 2426 and there is an "Intra-AS S-PMSI Route" specifying that particular P- 2427 group address as the P-tunnel for a particular S-PMSI, then the PIM 2428 Join/Prunes MUST be routed along the path determined by those intra- 2429 AS A-D routes. 2431 The next revision of this document will provide the following 2432 details: 2434 - encoding of the proxy address in the PIM message (the PIM Join 2435 Attribute [PIM-ATTRIB] will be used) 2437 - encoding of any other information which may be needed in order to 2438 enable the correct intra-AS route to be chosen. 2440 Support for non-segmented inter-AS trees using BIDIR-PIM is for 2441 further study. 2443 8.2. Segmented Inter-AS Tunnels 2445 8.2.1. Inter-AS MVPN Auto-Discovery Routes 2447 The BGP based MVPN membership discovery procedures of section 4 are 2448 used to auto-discover the intra-AS MVPN membership. 
This section 2449 describes the additional procedures for inter-AS MVPN membership 2450 discovery. It also describes the procedures for constructing 2451 segmented inter-AS tunnels. 2453 In this case, for a given MVPN in an AS, the objective is to form a 2454 spanning tree of MVPN membership, rooted at the AS. The nodes of this 2455 tree are ASes. The leaves of this tree are only those ASes that have 2456 at least one PE with a member in the MVPN. The inter-AS tunnel used 2457 to instantiate an inter-AS PMSI must traverse this spanning tree. A 2458 given AS needs to announce to another AS only the fact that it has 2459 membership in a given MVPN. It doesn't need to announce the 2460 membership of each PE in the AS to other ASes. 2462 This section defines an inter-AS auto-discovery route as a route that 2463 carries information about an AS that has one or more PEs (directly) 2464 connected to the site(s) of that MVPN. Further, it defines an inter-AS 2465 leaf auto-discovery route in the following way: 2466 - Consider a node which is the root of an intra-AS segment of an 2467 inter-AS tunnel. An inter-AS leaf auto-discovery route is used to 2468 inform such a node of a leaf of that intra-AS segment. 2470 8.2.1.1. Originating Inter-AS MVPN A-D Information 2472 A PE in a given AS advertises its MVPN membership to all its IBGP 2473 peers. An IBGP peer may be a route reflector, which in turn 2474 advertises this information only to its IBGP peers. In this manner, 2475 all the PEs and ASBRs in the AS learn this membership information. 2477 An Autonomous System Border Router (ASBR) may be configured to 2478 support a particular MVPN. If an ASBR is configured to support a 2479 particular MVPN, the ASBR MUST participate in the intra-AS MVPN auto- 2480 discovery/binding procedures for that MVPN within the AS that the 2481 ASBR belongs to, as defined in this document. 2483 Each ASBR then advertises the "AS MVPN membership" to its neighbor 2484 ASBRs using EBGP.
This inter-AS auto-discovery route must not be 2485 advertised to the PEs/ASBRs in the same AS as this ASBR. The 2486 advertisement carries the following information elements: 2488 a. A Route Distinguisher for the MVPN. For a given MVPN, each ASBR 2489 in the AS must use the same RD when advertising this 2490 information to other ASBRs. To accomplish this, all the ASBRs 2491 within that AS, that are configured to support the MVPN, MUST 2492 be configured with the same RD for that MVPN. This RD MUST be 2493 of Type 0 and MUST embed the autonomous system number of the AS. 2495 b. The announcing ASBR's local address as the next-hop for the 2496 above information elements. 2498 c. By default, the BGP Update message MUST carry the export Route 2499 Targets used by the unicast routing of that VPN. The default 2500 could be modified via configuration by using a set of Route 2501 Targets for the inter-AS auto-discovery routes that is 2502 distinct from the ones used by the unicast routing of that VPN. 2504 8.2.1.2. Propagating Inter-AS MVPN A-D Information 2506 As an inter-AS auto-discovery route originated by an ASBR within a 2507 given AS is propagated via BGP to other ASes, this results in 2508 creation of a data plane tunnel that spans multiple ASes. This tunnel 2509 is used to carry (multicast) traffic from the MVPN sites connected to 2510 the PEs of the AS to the MVPN sites connected to the PEs that are in 2511 the other ASes. Such a tunnel consists of multiple intra-AS segments 2512 (one per AS) stitched together at the ASBRs by single-hop 2513 LSP segments. 2515 An ASBR originates creation of an intra-AS segment when the ASBR 2516 receives an inter-AS auto-discovery route from an EBGP neighbor. 2517 Creation of the segment is completed as a result of distributing via 2518 IBGP this route within the ASBR's own AS. 2520 For a given inter-AS tunnel, each of its intra-AS segments could be 2521 constructed by its own independent mechanism.
Moreover, by using 2522 upstream labels within a given AS, multiple intra-AS segments of 2523 different inter-AS tunnels of either the same or different MVPNs may 2524 share the same P-Multicast Tree. 2526 Since (aggregated) inter-AS auto-discovery routes have the granularity of 2527 an <MVPN, AS> pair, an MVPN that is present in N ASes would have a total of N 2528 inter-AS tunnels. Thus, for a given MVPN, the number of inter-AS 2529 tunnels is independent of the number of PEs that have this MVPN. 2531 The following sections specify procedures for propagation of 2532 (aggregated) inter-AS auto-discovery routes across ASes. 2534 8.2.1.2.1. Inter-AS Auto-Discovery Route received via EBGP 2536 When an ASBR receives from one of its EBGP neighbors a BGP Update 2537 message that carries the inter-AS auto-discovery route, if (a) at 2538 least one of the Route Targets carried in the message matches one of 2539 the import Route Targets configured on the ASBR, and (b) the ASBR 2540 determines that the received route is the best route to the 2541 destination carried in the NLRI of the route, the ASBR: 2543 a) Re-advertises this inter-AS auto-discovery route within its own 2544 AS. 2546 If the ASBR uses ingress replication to instantiate the intra- 2547 AS segment of the inter-AS tunnel, the re-advertised route 2548 SHOULD carry a Tunnel attribute with the Tunnel Identifier set 2549 to Ingress Replication, but no MPLS labels. 2551 If a P-Multicast Tree is used to instantiate the intra-AS 2552 segment of the inter-AS tunnel, and in order to advertise the 2553 P-Multicast tree identifier the ASBR doesn't need to know the 2554 leaves of the tree beforehand, then the advertising ASBR SHOULD 2555 advertise the P-Multicast tree identifier in the Tunnel 2556 Identifier of the Tunnel attribute. This, in effect, creates a 2557 binding between the inter-AS auto-discovery route and the P- 2558 Multicast Tree.
      If a P-Multicast Tree is used to instantiate the intra-AS
      segment of the inter-AS tunnel, and the advertising ASBR needs
      to know the leaves of the tree beforehand in order to advertise
      the P-Multicast tree identifier, then the ASBR first discovers
      the leaves using the auto-discovery procedures, as specified
      further down. It then advertises the binding of the tree to the
      inter-AS auto-discovery route by re-sending the original auto-
      discovery route with the addition of a Tunnel attribute that
      carries the type and the identity of the tree (encoded in the
      Tunnel Identifier of the attribute).

   b) Re-advertises the received inter-AS auto-discovery route to its
      EBGP peers, other than the EBGP neighbor from which the best
      inter-AS auto-discovery route was received.

   c) Advertises to its neighbor ASBR, from which it received the
      best inter-AS auto-discovery route to the destination carried
      in the NLRI of the route, a leaf auto-discovery route that
      carries an ASBR-ASBR tunnel binding with the tunnel identifier
      set to ingress replication. As described in section 6, this
      binding can be used by the neighbor ASBR to send traffic to
      this ASBR.

8.2.1.2.2. Leaf Auto-Discovery Route received via EBGP

When an ASBR receives a leaf auto-discovery route via EBGP, the ASBR
finds the inter-AS auto-discovery route that has the same RD as the
leaf auto-discovery route. The MPLS label carried in the leaf auto-
discovery route is used to stitch a one-hop ASBR-ASBR LSP to the tail
of the intra-AS tunnel segment associated with the inter-AS auto-
discovery route.

8.2.1.2.3.
Inter-AS Auto-Discovery Route received via IBGP

If a given inter-AS auto-discovery route is advertised within an AS
by multiple ASBRs of that AS, the BGP best route selection performed
by the other PE/ASBR routers within the AS does not require all these
PE/ASBR routers to select the route advertised by the same ASBR; on
the contrary, different PE/ASBR routers may select routes advertised
by different ASBRs.

Further, when a PE/ASBR receives from one of its IBGP neighbors a BGP
Update message that carries an inter-AS auto-discovery route, if (a)
the route was originated outside of the router's own AS, (b) at least
one of the Route Targets carried in the message matches one of the
import Route Targets configured on the PE/ASBR, and (c) the PE/ASBR
determines that the received route is the best route to the
destination carried in the NLRI of the route, then if the router is
an ASBR, the ASBR propagates the route to its EBGP neighbors. In
addition, the PE/ASBR performs the following.

If the received inter-AS auto-discovery route carries the Tunnel
attribute with the Tunnel Identifier set to LDP P2MP LSP, PIM-SSM
tree, or PIM-SM tree, the PE/ASBR SHOULD join the P-Multicast tree
whose identity is carried in the Tunnel Identifier.

If the received inter-AS auto-discovery route carries the Tunnel
attribute with the Tunnel Identifier set to RSVP-TE P2MP LSP, then
the ASBR that originated the route MUST signal the local PE/ASBR as
one of the leaf LSRs of the RSVP-TE P2MP LSP. This signaling MAY have
been completed before the local PE/ASBR receives the BGP Update
message.

If the NLRI of the route does not carry a label, then this tree is an
intra-AS tunnel segment that is part of the inter-AS tunnel for the
MVPN advertised by the inter-AS auto-discovery route.
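The per-tunnel-type handling just described can be sketched as a
simple dispatch. The tunnel-type strings and returned action labels
are illustrative, not wire encodings.

```python
def pe_action_for_tunnel_attribute(tunnel_type):
    """Non-normative sketch of section 8.2.1.2.3: what a PE/ASBR does
    with the best, RT-matching inter-AS A-D route received via IBGP,
    depending on the route's Tunnel attribute."""
    receiver_initiated = {"ldp-p2mp", "pim-ssm", "pim-sm"}
    if tunnel_type in receiver_initiated:
        # The PE/ASBR joins the advertised P-Multicast tree itself.
        return "join-p-multicast-tree"
    if tunnel_type == "rsvp-te-p2mp":
        # The originating ASBR signals this router as a leaf LSR.
        return "wait-for-rsvp-te-leaf-signaling"
    # No Tunnel attribute, or ingress replication: answer with a
    # leaf auto-discovery route so the ASBR learns of this leaf.
    return "originate-leaf-ad-route"
```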
If the NLRI carries an (upstream) label, then the combination of this
tree and the label identifies the intra-AS segment.

If the router is an ASBR, this intra-AS segment may further be
stitched to the ASBR-ASBR inter-AS segment of the inter-AS tunnel. If
the PE/ASBR has local receivers in the MVPN, packets received over
the intra-AS segment must be forwarded to the local receivers using
the local VRF.

If the received inter-AS auto-discovery route either does not carry
the Tunnel attribute, or carries the Tunnel attribute with the Tunnel
Identifier set to ingress replication, then the PE/ASBR originates a
new auto-discovery route to allow the ASBR from which the auto-
discovery route was received to learn of this PE/ASBR as a leaf of
the intra-AS tree.

Thus, the AS MVPN membership information propagates across multiple
ASes along a spanning tree. The BGP AS-path-based loop prevention
mechanism prevents loops from forming as this information propagates.

8.2.2. Inter-AS MVPN Routing Information Exchange

All of the MVPN routing information exchange methods specified in
section 5 can be supported across ASes.

The objective in this case is to propagate the MVPN routing
information to the remote PE that originates the unicast route to C-
S/C-RP, in the reverse direction of the AS MVPN membership
information announced by the remote PE's origin AS. This information
is processed by each ASBR along this reverse path.

To achieve this, the PE that is generating the MVPN routing
advertisement first determines the source AS of the unicast route to
C-S/C-RP. It then determines, from the received AS MVPN membership
information for the source AS, the ASBR that is the next-hop for the
best path of the source AS MVPN membership. The BGP MVPN routing
update is sent to this ASBR, and the ASBR then further propagates the
BGP advertisement.
BGP filtering mechanisms ensure that the BGP MVPN routing information
updates flow only to the upstream router on the reverse path of the
inter-AS MVPN membership tree. Details of this filtering mechanism
and the relevant encoding will be specified in a separate document.

8.2.3. Inter-AS I-PMSI

All PEs in a given AS use the same inter-AS heterogeneous tunnel,
rooted at the AS, to instantiate an I-PMSI for an inter-AS MVPN
service. As explained earlier, the intra-AS tunnel segments that
comprise this tunnel can be built using different tunneling
technologies. To instantiate an MI-PMSI service for an MVPN, there
must be an inter-AS tunnel rooted at each AS that has at least one PE
that is a member of the MVPN.

A C-multicast data packet is sent on an intra-AS tunnel segment by
the PE that first receives this packet from the MVPN customer site.
An ASBR forwards this packet to any locally connected MVPN receivers
for the multicast stream. If this ASBR has received a tunnel binding
for the AS MVPN membership that it advertised to a neighboring ASBR,
it also forwards this packet to the neighboring ASBR. In this case
the packet is encapsulated in the downstream MPLS label received from
the neighboring ASBR. The neighboring ASBR delivers this packet to
any locally connected MVPN receivers for that multicast stream. It
also transports this packet on an intra-AS tunnel segment of the
inter-AS MVPN tunnel, and the other PEs and ASBRs in the AS then
receive this packet. The other ASBRs then repeat the procedure
followed by the ASBR in the origin AS, and the packet traverses the
overlay inter-AS tunnel along a spanning tree.

8.2.3.1. Support for Unicast VPN Inter-AS Methods

The above procedures for setting up an inter-AS I-PMSI can be
supported for each of the unicast VPN inter-AS models described in
[RFC4364].
These procedures do not depend on the method used to exchange unicast
VPN routes. For Option B and Option C they do require MPLS
encapsulation between the ASBRs.

8.2.4. Inter-AS S-PMSI

An inter-AS tunnel for an S-PMSI is constructed similarly to an
inter-AS tunnel for an I-PMSI. Namely, such a tunnel is constructed
as a concatenation of tunnel segments. There are two types of tunnel
segments: an intra-AS tunnel segment (a segment that spans ASBRs and
PEs within the same AS), and an inter-AS tunnel segment (a segment
that spans adjacent ASBRs in adjacent ASes). The ASes that are
spanned by a tunnel are not required to use the same tunneling
mechanism to construct the tunnel; each AS may pick a tunneling
mechanism of its own to construct the intra-AS tunnel segment of the
tunnel.

The PE that decides to set up an S-PMSI advertises the S-PMSI tunnel
binding, using the procedures in section 7.3.2, to the routers in its
own AS. The (C-S, C-G) membership for which the S-PMSI is
instantiated is propagated along an inter-AS spanning tree. This
spanning tree traverses the same ASBRs as the AS MVPN membership
spanning tree. In addition to the information elements described in
section 7.3.2 (Origin AS, RD, next-hop), the C-S and C-G are also
advertised.

An ASBR that receives the (C-S, C-G) information from its upstream
ASBR using EBGP sends back a tunnel binding for the (C-S, C-G)
information if (a) at least one of the Route Targets carried in the
message matches one of the import Route Targets configured on the
ASBR, and (b) the ASBR determines that the received route is the best
route to the destination carried in the NLRI of the route. If the
ASBR instantiates an S-PMSI for the (C-S, C-G), it sends back a
downstream label that is used to forward the packet along its intra-
AS S-PMSI for the (C-S, C-G).
However, the ASBR may decide to use an AS MVPN membership I-PMSI
instead, in which case it sends back the same label that it
advertised for the AS MVPN membership I-PMSI. If the downstream ASBR
instantiates an S-PMSI, it further propagates the (C-S, C-G)
membership to its downstream ASes; otherwise, it does not.

An AS can instantiate an intra-AS S-PMSI for the inter-AS S-PMSI
tunnel only if the upstream AS instantiates an S-PMSI. The procedures
allow each AS to determine whether it wishes to set up an S-PMSI or
not; an AS is not forced to set up an S-PMSI just because the
upstream AS decides to do so.

The leaves of an intra-AS S-PMSI tunnel will be the PEs that have
local receivers interested in the (C-S, C-G) and the ASBRs that have
received MVPN routing information for the (C-S, C-G). Note that an AS
can determine these ASBRs, as the MVPN routing information is
propagated and processed by each ASBR on the AS MVPN membership
spanning tree.

The C-multicast data traffic is sent on the S-PMSI by the originating
PE. When it reaches an ASBR that is on the spanning tree, it is
delivered to local receivers, if any, and is also forwarded to the
neighbor ASBR after being encapsulated in the label advertised by the
neighbor. The neighbor ASBR either transports this packet on the S-
PMSI for the multicast stream or on an I-PMSI, delivering it to the
ASBRs in its own AS. These ASBRs in turn repeat the procedures of the
origin AS's ASBRs, and the multicast packet traverses the spanning
tree.

9. Duplicate Packet Detection and Single Forwarder PE

Consider the case of an egress PE that receives packets of a customer
multicast stream (C-S, C-G) over a non-aggregated S-PMSI. The
procedures described so far will never cause the PE to receive
duplicate copies of any packet in that stream.
It is possible that the (C-S, C-G) stream is carried in more than one
S-PMSI; this may happen when the site that contains C-S is multihomed
to more than one PE. However, a PE that needs to receive (C-S, C-G)
packets only joins one of these S-PMSIs, and so only receives one
copy of each packet.

However, if the data packets of stream (C-S, C-G) are carried in
either an I-PMSI or in an aggregated S-PMSI, then the procedures
specified so far make it possible for an egress PE to receive more
than one copy of each data packet. In this section, we define
additional procedures to ensure that an MVPN customer sees no
multicast data packet duplication.

This section covers the situation where the customer multicast tree
is unidirectional, i.e., where the C-G is either a "Sparse Mode" or a
"Single Source Mode" group. The case where the customer multicast
tree is bidirectional (the C-G is a BIDIR-PIM group) is considered
separately in section 12.

The first case in which an egress PE may receive duplicate multicast
data packets is the case where both (a) an MVPN site that contains
C-S or C-RP is multihomed to more than one PE, and (b) either an
I-PMSI or an aggregated S-PMSI is used for carrying the packets
originated by C-S. In this case, an egress PE may receive one copy of
the packet from each PE to which the site is homed.

The second case in which an egress PE may receive duplicate multicast
data packets is when all of the following are true: (a) the IP
destination address of the customer packet is a C-G that is operating
in ASM mode, and whose C-multicast tree is set up using PIM-SM, (b)
an MI-PMSI is used for carrying the packets, and (c) a router or a CE
in a site connected to the egress PE switches from the C-RP tree to
the C-S tree.
In this case, it is possible to get one copy of a given packet from
the ingress PE attached to the C-RP's site, and one from the ingress
PE attached to the C-S's site.

9.1. Multihomed C-S or C-RP

In the first case, for a given (C-S, C-G), an egress PE, say PE1,
expects to receive C-data packets from the upstream PE, say PE2,
which PE1 identified as the upstream multicast hop in the C-Multicast
Routing Update that PE1 sent in order to join the (C-S, C-G). If PE1
can determine that a data packet for the (C-S, C-G) was received from
the expected upstream PE, PE2, PE1 will accept and forward the
packet. Otherwise, PE1 will drop the packet; this means that the PE
will see a duplicate, but the duplicate will not get forwarded. (But
see section 10 for an exception case where PE1 will accept a packet
even if it is from an unexpected upstream PE.)

The method used by an egress PE to determine the ingress PE for a
particular packet, received over a particular PMSI, depends on the
P-tunnel technology that is used to instantiate the PMSI. If the
P-tunnel is a P2MP LSP, a PIM-SM or PIM-SSM tree, or a unicast
tunnel, then the tunnel encapsulation contains information which can
be used (possibly along with other state information in the PE) to
determine the ingress PE, as long as the P-tunnel is instantiating an
intra-AS PMSI, or an inter-AS PMSI which is supported by a non-
segmented inter-AS tunnel.

Even when inter-AS segmented tunnels are used, if an aggregated
S-PMSI is used for carrying the packets, the P-tunnel encapsulation
must have some information which can be used to identify the PMSI,
and that in turn implicitly identifies the ingress PE.
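The accept-or-drop rule at the start of section 9.1 can be sketched
as a per-(C-S, C-G) check. Class and method names are illustrative;
the section 10 exception is deliberately not modelled.

```python
class UpstreamCheck:
    """Per (C-S, C-G), an egress PE records the upstream PE it
    selected in its C-Multicast Routing Update, and forwards only
    packets whose determined ingress PE matches; anything else is a
    duplicate and is dropped."""
    def __init__(self):
        self.expected_upstream = {}          # (C-S, C-G) -> PE id

    def join(self, c_s, c_g, upstream_pe):
        self.expected_upstream[(c_s, c_g)] = upstream_pe

    def forward(self, c_s, c_g, ingress_pe):
        # True: accept and forward; False: drop as a duplicate.
        return self.expected_upstream.get((c_s, c_g)) == ingress_pe
```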
If an I-PMSI is used for carrying the packets, the I-PMSI spans
multiple ASes, and the I-PMSI is realized via segmented inter-AS
tunnels, then if C-S or C-RP is multihomed to different PEs, as long
as each such PE is in a different AS, the egress PE can detect
duplicate traffic, as such duplicate traffic will arrive on a
different (inter-AS) tunnel. Specifically, if the PE was expecting
the traffic on a particular inter-AS tunnel, duplicate traffic will
arrive either on an intra-AS tunnel (this is not an intra-AS tunnel
segment of an inter-AS tunnel), or on some other inter-AS tunnel.
Therefore, to detect duplicates the PE has to keep track of which
(inter-AS) auto-discovery route the PE uses for sending MVPN
multicast routing information towards C-S/C-RP. Then the PE should
receive (multicast) traffic originated by C-S/C-RP only from the
(inter-AS) tunnel that was carried in the best inter-AS auto-
discovery route for the MVPN and was originated by the AS that
contains C-S/C-RP (where "the best" is determined by the PE). The PE
should discard, as duplicates, all other multicast traffic originated
by C-S/C-RP but received on any other tunnel.

9.1.1. Single forwarder PE selection

When, for a given MVPN, (a) an MI-PMSI is used for carrying multicast
data packets, (b) C-S or C-RP is multihomed to different PEs, and (c)
at least two of those PEs are in the same AS, then, depending on the
tunneling technology used by the MI-PMSI, it may not always be
possible for the egress PE to determine the upstream PE. Therefore,
when this determination is not possible, procedures are needed to
ensure that packets are received on an MI-PMSI at an egress PE from
only a single upstream PE.
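The single-forwarder rule that addresses this can be sketched as
follows. The default-upstream-PE selection itself is specified in
section 5.1; purely for illustration, it is modelled here under the
assumption that the numerically highest PE address wins, which is not
a claim about the normative procedure.

```python
import socket

def should_forward_to_backbone(local_pe, candidate_pes):
    """Sketch: an ingress PE forwards a (C-S, C-G) packet to the
    backbone only if it is the default upstream PE selection.  The
    "highest PE address" tie-break below is an illustrative stand-in
    for the section 5.1 procedure."""
    default_upstream = max(candidate_pes, key=socket.inet_aton)
    return local_pe == default_upstream
```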
Furthermore, even if the determination is possible, it may be
preferable to send only one copy of each packet to each egress PE,
rather than sending multiple copies and having the egress PE discard
all but one.

Section 5.1 specifies a procedure for choosing a "default upstream PE
selection", such that (except during routing transients) all PEs will
choose the same default upstream PE. To ensure that duplicate packets
are not sent through the backbone (except during routing transients),
an ingress PE does not forward to the backbone any (C-S, C-G)
multicast data packet it receives from a CE, unless the PE is the
default upstream PE selection.

This procedure is optional whenever the P-tunnel technology that is
being used to carry the multicast stream in question allows the
egress PEs to determine the identity of the ingress PE. This
procedure is mandatory if the P-tunnel technology does not make this
determination possible.

The above procedure ensures that if C-S or C-RP is multihomed to PEs
within a single AS, a PE will not receive duplicate traffic as long
as all the PEs are on either the C-S tree or the C-RP tree. If some
PEs are on the C-S tree and some on the C-RP tree, however, packet
duplication is still possible. This is discussed in the next section.

9.2. Switching from the C-RP tree to C-S tree

If some PEs are on the C-S tree and some on the C-RP tree, then a PE
may also receive duplicate traffic during a shared tree to source
tree switch. The issue and the solution are described next.
When, for a given MVPN, (a) an MI-PMSI is used for carrying multicast
data packets, (b) C-S and C-RP are connected to PEs within the same
AS, and (c) the MI-PMSI tunneling technology in use does not allow
the egress PEs to identify the ingress PE, then having all the PEs
select the same PE to be the upstream multicast hop for C-S or C-RP
is not sufficient to prevent packet duplication.

The reason is that a single tunnel used by the MI-PMSI may be
carrying traffic on both the (C-*, C-G) tree and the (C-S, C-G) tree.
If some of the egress PEs have joined the source tree, but others
expect to receive (C-S, C-G) packets from the shared tree, then two
copies of each data packet will travel on the tunnel, and since, due
to the choice of the tunneling technology, the egress PEs have no way
to identify the ingress PE, the egress PEs will have no way to
determine that only one copy should be accepted.

To avoid this, it is necessary to ensure that once any PE joins the
(C-S, C-G) tree, any other PE that has joined the (C-*, C-G) tree
also switches to the (C-S, C-G) tree (selecting, of course, the same
upstream multicast hop, as specified above).

Whenever a PE creates a (C-S, C-G) state as a result of receiving a
C-multicast route for the (C-S, C-G) from some other PE, and the C-G
group is a Sparse Mode group, the PE that creates the state MUST
originate a Source Active auto-discovery route (see [MVPN-BGP]
section 4.5) as specified below. The route is advertised using the
same procedures as the MVPN auto-discovery/binding (both intra-AS and
inter-AS) specified in this document, with the following
modifications:

   1. The Multicast Source field MUST be set to C-S. The Multicast
      Source Length field is set appropriately to reflect this.

   2. The Multicast Group field MUST be set to C-G. The Multicast
      Group Length field is set appropriately to reflect this.
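The two modified fields can be illustrated as below. This is only a
sketch assuming IPv4 C-S and C-G (so both length fields are 32 bits);
the actual Source Active route format is defined in [MVPN-BGP], and
the byte layout here is not a claim about that encoding.

```python
import socket
import struct

def source_active_fields(c_s, c_g):
    """Hypothetical packing of the Multicast Source/Group fields and
    their lengths for an IPv4 (C-S, C-G): length (bits), address,
    length (bits), address."""
    src = socket.inet_aton(c_s)   # C-S as 4 bytes
    grp = socket.inet_aton(c_g)   # C-G as 4 bytes
    return struct.pack("!B4sB4s", 32, src, 32, grp)
```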
The route goes to all the PEs of the MVPN. When, as a result of
receiving a new Source Active auto-discovery route, a PE updates its
VRF with the route, the PE MUST check whether the newly received
route matches any (C-*, C-G) entries. If (a) there is a matching
entry, (b) the PE does not have (C-S, C-G) state in its MVPN-TIB for
the (C-S, C-G) carried in the route, and (c) the received route is
selected as the best (using the BGP route selection procedures), then
the PE sets up its forwarding path to receive (C-S, C-G) traffic from
the tunnel that the originator of the selected Source Active auto-
discovery route uses for sending (C-S, C-G). This procedure forces
all the PEs (in all ASes) to switch from the C-RP tree to the C-S
tree for the (C-S, C-G).

(Additional uses of the Source Active A-D route are discussed in
section 10.)

Note that when a PE thus joins the (C-S, C-G) tree, it may need to
send a PIM (S,G,RPT-bit) prune to one of its CE PIM neighbors, as
determined by ordinary PIM procedures. (This will be the case if the
incoming interface for the (C-*, C-G) tree is one of the VRF
interfaces.) However, before doing this, it SHOULD run a timer to
help ensure that the source is not pruned from the shared tree until
all PEs have had time to receive the Source Active route.

Whenever the PE deletes the (C-S, C-G) state that was previously
created as a result of receiving a C-multicast route for the
(C-S, C-G) from some other PE, the PE that deletes the state also
withdraws the Source Active auto-discovery route that was advertised
when the state was created.

N.B.: SINCE ALL PEs WITH RECEIVERS FOR GROUP C-G WILL JOIN THE C-S
SOURCE TREE IF ANY OF THEM DO, IT IS NEVER NECESSARY TO DISTRIBUTE A
BGP C-MULTICAST ROUTE FOR THE PURPOSE OF PRUNING SOURCES FROM THE
SHARED TREE.
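The three conditions (a)-(c) above can be sketched as a single
predicate; argument names are illustrative.

```python
def switch_to_source_tree(matching_star_g_entry, has_s_g_state,
                          is_best_route):
    """Non-normative sketch of section 9.2: a PE that receives a
    Source Active auto-discovery route moves its (C-S, C-G)
    forwarding path to the originator's tunnel only if (a) a
    (C-*, C-G) entry matches, (b) it has no (C-S, C-G) state yet, and
    (c) the route is selected as best."""
    return matching_star_g_entry and not has_s_g_state and is_best_route
```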
It is worth noting that if a PE joins a source tree as a result of
this procedure, the UMH is not necessarily the same as it would be if
the PE had joined the source tree as a result of receiving a PIM Join
for the same source tree from a directly attached CE.

10. Eliminating PE-PE Distribution of (C-*,C-G) State

In sparse mode PIM, a node that wants to become a receiver for a
particular multicast group G first joins a shared tree, rooted at a
rendezvous point. When the receiver detects traffic from a particular
source, it has the option of joining a source tree, rooted at that
source. If it does so, it has to prune that source from the shared
tree, to ensure that it receives packets from that source on only one
tree.

Maintaining the shared tree can require considerable state, as it is
necessary not only to know who the upstream and downstream nodes are,
but also to know which sources have been pruned off which branches of
the shared tree.

The BGP-based signaling procedures defined in this document and in
[MVPN-BGP] eliminate the need for PEs to distribute to each other any
state having to do with which sources have been pruned off a shared
C-tree. Those procedures do still allow multicast data traffic to
travel on a shared C-tree, but they do not allow a situation in which
some CEs receive (S,G) traffic on a shared tree and some on a source
tree. This results in a considerable simplification of the PE-PE
procedures, with minimal change to the multicast service seen within
the VPN. However, shared C-trees are still supported across the VPN
backbone. That is, (C-*, C-G) state is distributed PE-PE, but
(C-*, C-G, RPT-bit) state is not.

In this section, we specify a number of optional procedures which go
further and completely eliminate the support for shared C-trees
across the VPN backbone.
In these procedures, the PEs keep track of the active sources for
each C-G. As soon as a CE tries to join the (*,G) tree, the PEs
instead join the (S,G) trees for all the active sources. Thus, all
distribution of (C-*,C-G) state is eliminated. These procedures are
optional because they require some additional support on the part of
the VPN customer, and because they are not always appropriate. (E.g.,
a VPN customer may have his own policy of always using shared trees
for certain multicast groups.) There are several different options,
described in the following sub-sections.

10.1. Co-locating C-RPs on a PE

[MVPN-REQ] describes C-RP engineering as an issue when PIM-SM (or
BIDIR-PIM) is used in "Any Source Multicast (ASM) mode" [RFC4607] on
the VPN customer site. To quote from [MVPN-REQ]:

"In some cases this engineering problem is not trivial: for instance,
if sources and receivers are located in VPN sites that are different
than that of the RP, then traffic may flow twice through the SP
network and the CE-PE link of the RP (from source to RP, and then
from RP to receivers); this is obviously not ideal. A multicast VPN
solution SHOULD propose a way to help on solving this RP engineering
issue."

One of the C-RP deployment models is for the customer to outsource
the RP to the provider. In this case the provider may co-locate the
RP on the PE that is connected to the customer site [MVPN-REQ]. This
section describes how anycast-RP can be used to achieve this.

10.1.1. Initial Configuration

For a particular MVPN, one or more of the PEs that have sites in that
MVPN act as an RP for the sites of that MVPN connected to those PEs.
Within each MVPN, all these RPs use the same (anycast) address, and
all these RPs use the Anycast RP technique.

10.1.2.
Anycast RP Based on Propagating Active Sources

This mechanism is based on propagating active sources between RPs.

10.1.2.1. Receiver(s) Within a Site

The PE which receives a C-Join for (*,G) or (S,G) does not send the
information that it has receiver(s) for G until it receives
information about active sources for G from an upstream PE.

On receiving this information (described in the next section), the
downstream PE will respond with a Join for C-(S,G). Sending this
information could be done using any of the procedures described in
section 5. If BGP is used, the ingress address is set to the address
of the upstream PE that triggered the source active information; only
that upstream PE will process the information. If unicast PIM is
used, then a unicast PIM message will have to be sent to the upstream
PE that triggered the source active information. If an MI-PMSI is
used, then further clarification is needed on the upstream neighbor
address of the PIM message, and will be provided in a future
revision.

10.1.2.2. Source Within a Site

When a PE receives a PIM-Register from a site that belongs to a given
VPN, the PE follows the normal PIM anycast RP procedures. It then
advertises the source and group of the multicast data packet carried
in the PIM-Register message to the other PEs in BGP, using the
following information elements:

   - Active source address

   - Active group address

   - Route target of the MVPN.

This advertisement goes to all the PEs that belong to that MVPN. When
a PE receives this advertisement, it checks whether there are any
receivers in the sites attached to the PE for the group carried in
the source active advertisement. If so, it generates an advertisement
for C-(S,G) as specified in the previous section.

Note that the mechanism described in section 7.3.2.
can be leveraged to advertise an S-PMSI binding along with the source
active messages.

10.1.2.3. Receiver Switching from Shared to Source Tree

No additional procedures are required when multicast receivers in a
customer's site switch from the shared tree to the source tree.

10.2. Using MSDP between a PE and a Local C-RP

Section 10.1 describes the case where each PE is a C-RP. This
enables the PEs to know the active multicast sources for each MVPN,
and they can then use BGP to distribute this information to each
other. As a result, the PEs do not have to join any shared C-trees,
and this results in a simplification of the PE operation.

In another deployment scenario, the PEs are not themselves C-RPs, but
use MSDP to talk to the C-RPs. In particular, a PE which attaches to
a site that contains a C-RP becomes an MSDP peer of that C-RP. That
PE then uses BGP to distribute the information about the active
sources to the other PEs. When the PE determines, via MSDP, that a
particular source is no longer active, it withdraws the corresponding
BGP update. The PEs then do not have to join any shared C-trees, and
they do not have to be C-RPs either.

MSDP provides the capability for a Source Active message to carry an
encapsulated data packet. This capability can be used to allow an
MSDP speaker to receive the first (or first several) packet(s) of an
(S,G) flow, even though the MSDP speaker hasn't yet joined the (S,G)
tree. (Presumably it will join that tree as a result of receiving
the SA message which carries the encapsulated data packet.) If this
capability is not used, the first several data packets of an (S,G)
stream may be lost.

A PE which is talking MSDP to an RP may receive such an encapsulated
data packet from the RP. The data packet should be decapsulated and
transmitted to the other PEs in the MVPN.
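The choice of where to transmit such a decapsulated packet can be
sketched as a simple preference order; names are illustrative, and
the normative conditions are those stated in section 10.2.

```python
def pmsi_for_decapsulated_packet(bound_spmsi, ipmsi):
    """Sketch: a data packet decapsulated from an MSDP Source Active
    message goes on an S-PMSI to which the (S,G) is already bound if
    the PE transmits on one; otherwise on an I-PMSI for the MVPN
    (typically the MI-PMSI, if one exists); otherwise it is not
    transmitted to the other PEs (None here)."""
    if bound_spmsi is not None:
        return bound_spmsi
    return ipmsi          # may be None: packet not sent to other PEs
```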
If the packet belongs to a particular (S,G) flow, and if the PE is a
transmitter for some S-PMSI to which (S,G) has already been bound,
the decapsulated data packet should be transmitted on that S-PMSI.
Otherwise, if an I-PMSI exists for that MVPN, the decapsulated data
packet should be transmitted on it. (If an MI-PMSI exists, this would
typically be used.) If neither of these conditions holds, the
decapsulated data packet is not transmitted to the other PEs in the
MVPN. The decision as to whether and how to transmit the
decapsulated data packet does not affect the processing of the SA
control message itself.

Suppose that PE1 transmits a multicast data packet on a PMSI, where
that data packet is part of an (S,G) flow, and PE2 receives that
packet from that PMSI. According to section 9, if PE1 is not the PE
that PE2 expects to be transmitting (S,G) packets, then PE2 must
discard the packet. If an MSDP-encapsulated data packet is
transmitted on a PMSI as specified above, this rule from section 9
would likely result in the packet's getting discarded. Therefore, if
MSDP-encapsulated data packets are being decapsulated and transmitted
on a PMSI, we need to modify the rules of section 9 as follows:

   1. If the receiving PE, PE2, has already joined the (S,G) tree,
      and has chosen PE1 as the upstream PE for the (S,G) tree, but
      this packet does not come from PE1, PE2 must discard the
      packet.

   2. If the receiving PE, PE2, has not already joined the (S,G)
      tree, but has a PIM adjacency to a CE which is downstream on
      the (*,G) tree, the packet should be forwarded to the CE.

11. Encapsulations

The BGP-based auto-discovery procedures will ensure that the PEs in a
single MVPN only use tunnels that they can all support, and, for a
given kind of tunnel, that they only use encapsulations that they can
all support.

11.1.
Encapsulations for Single PMSI per Tunnel 3142 11.1.1. Encapsulation in GRE 3144 GRE encapsulation can be used for any PMSI that is instantiated by a 3145 mesh of unicast tunnels, as well as for any PMSI that is instantiated 3146 by one or more PIM tunnels of any sort. 3148 Packets received Packets in transit Packets forwarded 3149 at ingress PE in the service by egress PEs 3150 provider network 3152 +---------------+ 3153 | P-IP Header | 3154 +---------------+ 3155 | GRE | 3156 ++=============++ ++=============++ ++=============++ 3157 || C-IP Header || || C-IP Header || || C-IP Header || 3158 ++=============++ >>>>> ++=============++ >>>>> ++=============++ 3159 || C-Payload || || C-Payload || || C-Payload || 3160 ++=============++ ++=============++ ++=============++ 3162 The IP Protocol Number field in the P-IP Header must be set to 47. 3163 The Protocol Type field of the GRE Header must be set to 0x800. 3165 When an encapsulated packet is transmitted by a particular PE, the 3166 source IP address in the P-IP header must be the same address that 3167 the PE uses to identify itself in the VRF Route Import Extended 3168 Communities that it attaches to any of the VPN-IP routes eligible for UMH 3169 determination that it advertises via BGP (see section 5.1). 3171 If the PMSI is instantiated by a PIM tree, the destination IP address 3172 in the P-IP header is the group P-address associated with that tree. 3173 The GRE Key field is omitted. 3175 If the PMSI is instantiated by unicast tunnels, the destination IP 3176 address is the address of the destination PE, and the optional GRE 3177 Key field is used to identify a particular MVPN. In this case, each 3178 PE would have to advertise a Key field value for each MVPN; each PE 3179 would assign the Key field value that it expects to receive. 3181 [RFC2784] specifies an optional GRE checksum, and [RFC2890] specifies 3182 an optional GRE sequence number field.
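The GRE header construction described above (no checksum or sequence number; a Key field present only for unicast P-tunnels, where it identifies the MVPN) can be sketched as follows. This is an illustrative sketch based on the [RFC2784]/[RFC2890] header layout, not code from this specification; the function name is invented for illustration.

```python
import struct

GRE_PROTO_IPV4 = 0x0800  # Protocol Type for IP-in-GRE (section 11.1.1)
IP_PROTO_GRE = 47        # IP Protocol Number carried in the P-IP header

def gre_header(key=None):
    """Build a minimal GRE header per [RFC2784]/[RFC2890].

    Checksum and sequence number are left unset, as this document
    recommends.  'key' is supplied only for unicast P-tunnels, where
    the Key field identifies a particular MVPN.
    """
    flags = 0x2000 if key is not None else 0x0000  # K bit ([RFC2890])
    hdr = struct.pack("!HH", flags, GRE_PROTO_IPV4)
    if key is not None:
        hdr += struct.pack("!I", key)  # 32-bit Key field
    return hdr
```

For a PIM-tree P-tunnel the Key field is omitted, so the header is the 4-byte base header; for a unicast P-tunnel the advertised per-MVPN key is appended.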
3184 The GRE sequence number field is not needed because the transport 3185 layer services for the original application will be provided by the 3186 C-IP Header. 3188 The use of the GRE checksum field must follow [RFC2784]. 3190 To facilitate high speed implementation, this document recommends 3191 that the ingress PE routers encapsulate VPN packets without setting 3192 the checksum or sequence number fields. 3194 11.1.2. Encapsulation in IP 3196 IP-in-IP [RFC1853] is also a viable option. When it is used, the 3197 IPv4 Protocol Number field is set to 4. The following diagram shows 3198 the progression of the packet as it enters and leaves the service 3199 provider network. 3201 Packets received Packets in transit Packets forwarded 3202 at ingress PE in the service by egress PEs 3203 provider network 3205 +---------------+ 3206 | P-IP Header | 3207 ++=============++ ++=============++ ++=============++ 3208 || C-IP Header || || C-IP Header || || C-IP Header || 3209 ++=============++ >>>>> ++=============++ >>>>> ++=============++ 3210 || C-Payload || || C-Payload || || C-Payload || 3211 ++=============++ ++=============++ ++=============++ 3213 When an encapsulated packet is transmitted by a particular PE, the 3214 source IP address in the P-IP header must be the same address that 3215 the PE uses to identify itself in the VRF Route Import Extended 3216 Communities that it attaches to any of the VPN-IP routes eligible for UMH 3217 determination that it advertises via BGP (see section 5.1). 3219 11.1.3. Encapsulation in MPLS 3221 If the PMSI is instantiated as a P2MP MPLS LSP or an MP2MP LSP, MPLS 3222 encapsulation is used. Penultimate-hop-popping must be disabled for 3223 the P2MP MPLS LSP. If the PMSI is instantiated as an RSVP-TE P2MP 3224 LSP, additional MPLS encapsulation procedures are used, as specified 3225 in [RSVP-P2MP].
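The P-MPLS header used in this encapsulation is an ordinary MPLS label stack entry as defined in [MPLS-HDR]; since penultimate-hop-popping is disabled, the egress PE receives the packet with the P2MP LSP's label intact and can use it to identify the PMSI. As a hedged illustration (not code from this specification), the 32-bit label stack entry can be packed as:

```python
import struct

def mpls_shim(label, exp=0, s=1, ttl=64):
    """Pack one 32-bit MPLS label stack entry per [MPLS-HDR] (RFC 3032).

    label: 20-bit label value; exp: 3-bit EXP field; s: bottom-of-stack
    bit; ttl: 8-bit time-to-live.  Default values here are illustrative.
    """
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)
```

When multiple labels are carried (e.g., the demultiplexing or PE labels of sections 11.2 and 11.3), only the last entry has the S bit set.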
3227 If other methods of assigning MPLS labels to multicast distribution 3228 trees are in use, these multicast distribution trees may be used as 3229 appropriate to instantiate PMSIs, and appropriate additional MPLS 3230 encapsulation procedures may be used. 3232 Packets received Packets in transit Packets forwarded 3233 at ingress PE in the service by egress PEs 3234 provider network 3236 +---------------+ 3237 | P-MPLS Header | 3238 ++=============++ ++=============++ ++=============++ 3239 || C-IP Header || || C-IP Header || || C-IP Header || 3240 ++=============++ >>>>> ++=============++ >>>>> ++=============++ 3241 || C-Payload || || C-Payload || || C-Payload || 3242 ++=============++ ++=============++ ++=============++ 3244 11.2. Encapsulations for Multiple PMSIs per Tunnel 3246 The encapsulations for transmitting multicast data messages when 3247 there are multiple PMSIs per tunnel are based on the encapsulation 3248 for a single PMSI per tunnel, but with an MPLS label used for 3249 demultiplexing. 3251 The label is upstream-assigned and distributed via BGP as specified 3252 in section 4. The label must enable the receiver to select the 3253 proper VRF, and may enable the receiver to select a particular 3254 multicast routing entry within that VRF. 3256 11.2.1. Encapsulation in GRE 3258 Rather than the IP-in-GRE encapsulation discussed in section 11.1.1, 3259 we use the MPLS-in-GRE encapsulation. This is specified in [MPLS- 3260 IP]. The GRE protocol type MUST be set to 0x8847. [The reason for 3261 using the unicast rather than the multicast value is specified in 3262 [MPLS-MCAST-ENCAPS].] 3264 11.2.2. Encapsulation in IP 3266 Rather than the IP-in-IP encapsulation discussed in section 11.1.2, 3267 we use the MPLS-in-IP encapsulation. This is specified in [MPLS-IP]. 3268 The IP protocol number MUST be set to the value identifying the 3269 payload as an MPLS unicast packet. [There is no "MPLS multicast 3270 packet" protocol number.] 3272 11.3.
Encapsulations Identifying a Distinguished PE 3274 11.3.1. For MP2MP LSP P-tunnels 3276 As discussed in section 9, if a multicast data packet belongs to a 3277 Sparse Mode or Single Source Mode multicast group, it is highly 3278 desirable for the PE that receives the packet from a PMSI to be able 3279 to determine the identity of the PE that transmitted the data packet 3280 onto the PMSI. The encapsulations of the previous sections all 3281 provide this information, except in one case. If a PMSI is being 3282 instantiated by an MP2MP LSP, then the encapsulations discussed so far 3283 do not allow one to determine the identity of the PE that transmitted 3284 the packet onto the PMSI. 3286 Therefore, when a packet that belongs to a Sparse Mode or Single 3287 Source Mode multicast group is traveling on an MP2MP LSP P-tunnel, it 3288 MUST carry, as its second label, a label which has been bound to the 3289 packet's ingress PE. This label is an upstream-assigned label that 3290 the LSP's root node has bound to the ingress PE and has distributed 3291 via an A-D Route (see section 4; precise details of this distribution 3292 procedure will be included in the next revision of this document). 3293 This label will appear immediately beneath the labels that are 3294 discussed in sections 11.1.3 and 11.2. 3296 11.3.2. For Support of PIM-BIDIR C-Groups 3298 As will be discussed in section 12, when a packet belongs to a PIM- 3299 BIDIR multicast group, the set of PEs of that packet's VPN can be 3300 partitioned into a number of subsets, where exactly one PE in each 3301 partition is the upstream PE for that partition. When such packets 3302 are transmitted on a PMSI, then unless the procedures of section 3303 12.2.3 are being used, it is necessary for the packet to carry 3304 information identifying a particular partition. This is done by 3305 having the packet carry the PE label corresponding to the upstream PE 3306 of one partition.
For a particular P-tunnel, this label will have 3307 been advertised by the node which is the root of that P-tunnel. 3308 (Details of the procedure by which the PE labels are advertised will 3309 be included in the next revision of this document.) 3311 This label needs to be used whenever a packet belongs to a PIM-BIDIR 3312 C-group, no matter what encapsulation is used by the P-tunnel. Hence 3313 the encapsulations of section 11.2 MUST be used. If the tunnel 3314 contains only one PMSI, the PE label replaces the label discussed in 3315 section 11.2. If the tunnel contains multiple PMSIs, the PE label 3316 follows the label discussed in section 11.2. 3318 11.4. Encapsulations for Unicasting PIM Control Messages 3320 When PIM control messages are unicast, rather than being sent on an 3321 MI-PMSI, the receiving PE needs to determine the particular MVPN 3322 whose multicast routing information is being carried in the PIM 3323 message. One method is to use a downstream-assigned MPLS label which 3324 the receiving PE has allocated for this specific purpose. The label 3325 would be distributed via BGP. This can be used with an MPLS, MPLS- 3326 in-GRE, or MPLS-in-IP encapsulation. 3328 A possible alternative is to modify the PIM messages themselves so that 3329 they carry information which can be used to identify a particular 3330 MVPN, such as an RT. 3332 This area is still under consideration. 3334 11.5. General Considerations for IP and GRE Encaps 3336 These apply also to the MPLS-in-IP and MPLS-in-GRE encapsulations. 3338 11.5.1. MTU 3340 It is the responsibility of the originator of a C-packet to ensure 3341 that the packet is small enough to reach all of its destinations, 3342 even when it is encapsulated within IP or GRE. 3344 When a packet is encapsulated in IP or GRE, the router that does the 3345 encapsulation MUST set the DF bit in the outer header.
This ensures 3346 that the decapsulating router will not need to reassemble the 3347 encapsulating packets before performing decapsulation. 3349 In some cases the encapsulating router may know that a particular C- 3350 packet is too large to reach its destinations. Procedures by which 3351 it may know this are outside the scope of the current document. 3352 However, if this is known, then: 3354 - If the DF bit is set in the IP header of a C-packet which is 3355 known to be too large, the router will discard the C-packet as 3356 being "too large", and follow normal IP procedures (which may 3357 require the return of an ICMP message to the source). 3359 - If the DF bit is not set in the IP header of a C-packet which is 3360 known to be too large, the router MAY fragment the packet before 3361 encapsulating it, and then encapsulate each fragment separately. 3362 Alternatively, the router MAY discard the packet. 3364 If the router discards a packet as too large, it should maintain OAM 3365 information related to this behavior, allowing the operator to 3366 properly troubleshoot the issue. 3368 Note that if the entire path of the tunnel does not support an MTU 3369 which is large enough to carry a particular encapsulated C- 3370 packet, and if the encapsulating router does not do fragmentation, 3371 then the customer will not receive the expected connectivity. 3373 11.5.2. TTL 3375 The ingress PE should not copy the TTL field from the payload IP 3376 header received from a CE router to the delivery IP or MPLS header. 3377 The setting of the TTL of the delivery header is determined by the 3378 local policy of the ingress PE router. 3380 11.5.3.
Avoiding Conflict with Internet Multicast 3382 If the SP is providing Internet multicast, distinct from its VPN 3383 multicast services, and using PIM based P-multicast trees, it must 3384 ensure that the group P-addresses which it uses in support of MVPN 3385 services are distinct from any of the group addresses of the Internet 3386 multicasts it supports. This is best done by using administratively 3387 scoped addresses [ADMIN-ADDR]. 3389 The group C-addresses need not be distinct from either the group P- 3390 addresses or the Internet multicast addresses. 3392 11.6. Differentiated Services 3394 The setting of the DS field in the delivery IP header should follow 3395 the guidelines outlined in [RFC2983]. Setting the EXP field in the 3396 delivery MPLS header should follow the guidelines in [RFC3270]. An SP 3397 may also choose to deploy any additional Differentiated Services 3398 mechanisms that the PE routers support for the encapsulation in use. 3399 Note that the type of encapsulation determines the set of 3400 Differentiated Services mechanisms that may be deployed. 3402 12. Support for PIM-BIDIR C-Groups 3404 In BIDIR-PIM, each multicast group is associated with an RPA 3405 (Rendezvous Point Address). The Rendezvous Point Link (RPL) is the 3406 link that attaches to the RPA. Usually it is a LAN where the RPA is 3407 in the IP subnet assigned to the LAN. The root node of a BIDIR-PIM 3408 tree is a node which has an interface on the RPL. 3410 On any LAN (other than the RPL) which is a link in a PIM-bidir tree, 3411 there must be a single node that has been chosen to be the DF. (More 3412 precisely, for each RPA there is a single node which is the DF for 3413 that RPA.) A node which receives traffic from an upstream interface 3414 may forward it on a particular downstream interface only if the node 3415 is the DF for that downstream interface.
A node which receives 3416 traffic from a downstream interface may forward it on an upstream 3417 interface only if that node is the DF for the downstream interface. 3419 If, for any period of time, there is a link on which each of two 3420 different nodes believes itself to be the DF, data forwarding loops 3421 can form. Loops in a bidirectional multicast tree can be very 3422 harmful. However, any election procedure will have a convergence 3423 period. The BIDIR-PIM DF election procedure is very complicated, 3424 because it goes to great pains to ensure that if convergence is not 3425 extremely fast, then there is no forwarding at all until convergence 3426 has taken place. 3428 Other variants of PIM also have a DF election procedure for LANs. 3429 However, as long as the multicast tree is unidirectional, 3430 disagreement about who the DF is can result only in duplication of 3431 packets, not in loops. Therefore the time taken to converge on a 3432 single DF is of much less concern for unidirectional trees than it 3433 is for bidirectional trees. 3435 In the MVPN environment, if PIM signaling is used among the PEs, the 3436 standard LAN-based DF election procedure can be used. 3437 However, election procedures that are optimized for a LAN may not 3438 work as well in the MVPN environment. So an alternative to DF 3439 election would be desirable. 3441 If BGP signaling is used among the PEs, an alternative to DF election 3442 is necessary. One might think that the "single forwarder 3443 selection" procedures described in sections 5 and 9 could be used to 3444 choose a single PE "DF" for the backbone (for a given RPA in a given 3445 MVPN). However, that is still likely to leave a convergence period 3446 of at least several seconds during which loops could form, and there 3447 could be a much longer convergence period if there is anything 3448 disrupting the smooth flow of BGP updates. So a simple procedure 3449 like that is not sufficient.
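The two BIDIR-PIM forwarding rules quoted above both reduce to the same check: whichever direction traffic crosses a downstream interface, the forwarding node must be the elected DF on that interface. A minimal sketch of just those two rules (interface names and the `df_interfaces` set are illustrative, not from the specification):

```python
def may_forward(df_interfaces, upstream_iface, in_iface, out_iface):
    """Return True if traffic arriving on in_iface may be forwarded on
    out_iface, per the two BIDIR-PIM DF rules quoted above.

    df_interfaces: the set of interfaces on which this node won the DF
    election for the relevant RPA; upstream_iface: the interface toward
    the RPA.
    """
    if in_iface == upstream_iface:
        # Rule 1: upstream -> downstream requires being DF on the
        # outgoing (downstream) interface.
        return out_iface in df_interfaces
    if out_iface == upstream_iface:
        # Rule 2: downstream -> upstream requires being DF on the
        # incoming (downstream) interface.
        return in_iface in df_interfaces
    # Other interface combinations are outside the two rules sketched
    # here.
    return False
```

The sketch makes concrete why two nodes both believing themselves DF on the same link is dangerous: both would return True for the same traffic, and in a bidirectional tree that creates a forwarding loop rather than mere duplication.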
3451 The remainder of this section describes two different methods that 3452 can be used to support BIDIR-PIM while eliminating the DF election. 3454 12.1. The VPN Backbone Becomes the RPL 3456 On a per MVPN basis, this method treats the whole service provider 3457 infrastructure as a single RPL (RP Link). We refer to such an RPL as 3458 an "MVPN-RPL". This eliminates the need for the PEs to engage in any 3459 "DF election" procedure, because PIM-bidir does not have a DF on the 3460 RPL. 3462 However, this method can only be used if the customer is 3463 "outsourcing" the RPL/RPA functionality to the SP. 3465 An MVPN-RPL could be realized either via an I-PMSI (this I-PMSI is on 3466 a per MVPN basis and spans all the PEs that have sites of a given 3467 MVPN), or via a collection of S-PMSIs, or even via a combination of 3468 an I-PMSI and one or more S-PMSIs. 3470 12.1.1. Control Plane 3472 Associated with each MVPN-RPL is an address prefix that is 3473 unambiguous within the context of the MVPN associated with the MVPN- 3474 RPL. 3476 For a given MVPN, each VRF connected to an MVPN-RPL of that MVPN is 3477 configured to advertise to all of its connected CEs the address 3478 prefix of the MVPN-RPL. 3480 Since in PIM Bidir there is no Designated Forwarder on an RPL, in the 3481 context of MVPN-RPL there is no need to perform the Designated 3482 Forwarder election among the PEs (note that it is still necessary to 3483 perform the Designated Forwarder election between a PE and its 3484 directly attached CEs, but that is done using plain PIM Bidir 3485 procedures). 3487 For a given MVPN, a PE connected to an MVPN-RPL of that MVPN should 3488 send multicast data (C-S,C-G) on the MVPN-RPL only if at least one 3489 other PE connected to the MVPN-RPL has a downstream multicast state 3490 for C-G.
In the context of MVPN this is accomplished by requiring a PE 3491 that has a downstream state for a particular C-G of a particular VRF 3492 present on the PE to originate a C-multicast route for (*, C-G). The 3493 RD of this route should be the same as the RD associated with the 3494 VRF. The RT(s) carried by the route should be the same as the one(s) 3495 used for VPN-IPv4 routes. This route will be distributed to all the 3496 PEs of the MVPN. 3498 12.1.2. Data Plane 3500 A PE that receives (C-S,C-G) multicast data from a CE should forward 3501 this data on the MVPN-RPL of the MVPN the CE belongs to only if the 3502 PE has received at least one C-multicast route for (*, C-G). Otherwise, 3503 the PE should not forward the data on the RPL/I-PMSI. 3505 When a PE receives a multicast packet with (C-S,C-G) on an MVPN-RPL 3506 associated with a given MVPN, the PE forwards this packet to every 3507 directly connected CE of that MVPN that has sent a Join 3508 (*,C-G) to the PE (i.e., for which the PE has downstream (*,C-G) 3509 state). The PE does not forward this packet back on the MVPN-RPL. If 3510 a PE has no downstream (*,C-G) state, the PE does not forward the 3511 packet. 3513 12.2. Partitioned Sets of PEs 3515 This method does not require the use of the MVPN-RPL, and does not 3516 require the customer to outsource the RPA/RPL functionality to the 3517 SP. 3519 12.2.1. Partitions 3521 Consider a particular C-RPA, call it C-R, in a particular MVPN. 3522 Consider the set of PEs that attach to sites that have senders or 3523 receivers for a BIDIR-PIM group C-G, where C-R is the RPA for C-G. 3524 (As always, we use the "C-" prefix to indicate that we are referring 3525 to an address in the VPN's address space rather than in the 3526 provider's address space.) 3528 Following the procedures of section 5.1, each PE in the set 3529 independently chooses some other PE in the set to be its "upstream 3530 PE" for those BIDIR-PIM groups with RPA C-R.
Optionally, they can 3531 all choose the "default selection" (described in section 5.1), to 3532 ensure that each PE chooses the same upstream PE. Note that if a 3533 PE has a route to C-R via a VRF interface, then the PE may choose 3534 itself as the upstream PE. 3536 The set of PEs can now be partitioned into a number of subsets. 3537 We'll say that PE1 and PE2 are in the same partition if and only if 3538 there is some PE3 such that PE1 and PE2 have each chosen PE3 as the 3539 upstream PE for C-R. Note that each partition has exactly one 3540 upstream PE. So it is possible to identify the partition by 3541 identifying its upstream PE. 3543 Consider a packet P, and let PE1 be its ingress PE. PE1 will send the 3544 packet on a PMSI so that it reaches the other PEs that need to 3545 receive it. This is done by encapsulating the packet and sending it 3546 on a P-tunnel. If the original packet is part of a PIM-BIDIR group 3547 (its ingress PE determines this from the packet's destination address 3548 C-G), and if the VPN backbone is not the RPL, then the encapsulation 3549 MUST carry information that can be used to identify the partition to 3550 which the ingress PE belongs. 3552 When PE2 receives a packet from the PMSI, PE2 must determine, by 3553 examining the encapsulation, whether the packet's ingress PE belongs 3554 to the same partition (relative to the C-RPA of the packet's C-G) 3555 that PE2 itself belongs to. If not, PE2 discards the packet. 3556 Otherwise PE2 performs the normal BIDIR-PIM data packet processing. 3557 With this rule in place, harmful loops cannot be introduced by the 3558 PEs into the customer's bidirectional tree. 3560 Note that if there is more than one partition, the VPN backbone will 3561 not carry a packet from one partition to another. The only way for a 3562 packet to get from one partition to another is for it to go up 3563 towards the RPA and then to go down another path to the backbone.
If 3564 this is not considered desirable, then all PEs should choose the same 3565 upstream PE for a given C-RPA. Then multiple partitions will only 3566 exist during routing transients. 3568 12.2.2. Using PE Labels 3570 If a given P-tunnel is to be used to carry packets belonging to a 3571 bidirectional C-group, then, EXCEPT for the case described in section 3572 12.2.3, the packets that travel on that P-tunnel MUST carry a PE label 3573 (defined in section 4), using the encapsulation discussed in section 3574 11.3. 3576 When a given PE transmits a given packet of a bidirectional C-group 3577 to the P-tunnel, the packet will carry the PE label corresponding to 3578 the partition, for the C-group's C-RPA, that contains the 3579 transmitting PE. This is the PE label that has been bound to the 3580 upstream PE of that partition; it is not necessarily the label that 3581 has been bound to the transmitting PE. 3583 Recall that the PE labels are upstream-assigned labels that are 3584 assigned and advertised by the node which is at the root of the P- 3585 tunnel. (Procedures for PE label assignment when the P-tunnel is not 3586 a multicast tree will be given in later revisions of this document.) 3588 When a PE receives a packet with a PE label that does not identify 3589 the partition of the receiving PE, the receiving PE discards the 3590 packet. 3592 Note that this procedure does not require the root of a P-tunnel to 3593 assign a PE label for every PE that belongs to the tunnel, but only 3594 for those PEs that might become the upstream PEs of some partition. 3596 12.2.3. Mesh of MP2MP P-Tunnels 3598 There is one case in which support for BIDIR-PIM C-groups does not 3599 require the use of a PE label. For a given C-RPA, suppose that each 3600 partition is served by a distinct MP2MP LSP P-tunnel. Then for a 3601 given packet, a PE receiving the packet from a P-tunnel can infer 3602 the partition from the tunnel.
So PE labels are not needed in this 3603 case. 3605 13. Security Considerations 3607 This document describes an extension to the procedures of [RFC4364], 3608 and hence shares the security considerations described in [RFC4364] 3609 and [RFC4365]. 3611 When GRE encapsulation is used, the security considerations of [MPLS- 3612 IP] are also relevant. The security considerations of [RFC4797] are 3613 also relevant, as it discusses implications on packet spoofing in the 3614 context of 2547 VPNs. 3616 The security considerations of [MPLS-HDR] apply when MPLS 3617 encapsulation is used. 3619 This document makes use of a number of control protocols: PIM [PIM- 3620 SM], BGP [MVPN-BGP], mLDP [MLDP], and RSVP-TE [RSVP-P2MP]. Security 3621 considerations relevant to each protocol are discussed in the 3622 respective protocol specifications. 3624 If one uses the UDP-based protocol for switching to S-PMSI (as 3625 specified in Section 7.2.1), then by default each PE router MUST 3626 install packet filters that would result in discarding all UDP 3627 packets with the destination port 3232 that the PE router receives 3628 from the CE routers connected to the PE router. 3630 The various procedures for P-tunnel construction have security issues 3631 that are specific to the way in which the P-tunnels are used in this 3632 document. When P-tunnels are constructed via such techniques as 3633 PIM, mLDP, or RSVP-TE, it is important for each P or PE router 3634 receiving a control message to be sure that the control message comes 3635 from another P or PE router, not from a CE router. This should not 3636 be a problem, because mLDP or PIM or RSVP-TE control messages from CE 3637 routers will never be interpreted as referring to P-tunnels. 3639 An ASBR may receive, from one SP's domain, an mLDP, PIM, or RSVP-TE 3640 control message that attempts to extend a multicast distribution tree 3641 from one SP's domain into another SP's domain.
The ASBR should not 3642 allow this unless explicitly configured to do so. 3644 14. IANA Considerations 3646 Section 7.2.1.1 defines the "S-PMSI Join Message", which is carried 3647 in a UDP datagram whose port number is 3232. This port number is 3648 already assigned by IANA to "MDT port". IANA should now have that 3649 assignment reference this document. 3651 IANA should create a registry for the "S-PMSI Join Message Type 3652 Field". The value 1 should be registered with a reference to this 3653 document. The description should read "PIM IPv4 S-PMSI 3654 (unaggregated)". 3656 15. Other Authors 3658 Sarveshwar Bandi, Yiqun Cai, Thomas Morin, Yakov Rekhter, IJsbrand 3659 Wijnands, Seisho Yasukawa 3661 16. Other Contributors 3663 Significant contributions were made by Arjen Boers, Toerless Eckert, 3664 Adrian Farrel, Luyuan Fang, Dino Farinacci, Lenny Guiliano, Shankar 3665 Karuna, Anil Lohiya, Tom Pusateri, Ted Qian, Robert Raszuk, Tony 3666 Speakman, and Dan Tappan. 3668 17. Authors' Addresses 3670 Rahul Aggarwal (Editor) 3671 Juniper Networks 3672 1194 North Mathilda Ave. 3673 Sunnyvale, CA 94089 3674 Email: rahul@juniper.net 3675 Sarveshwar Bandi 3676 Motorola 3677 Vanenburg IT park, Madhapur, 3678 Hyderabad, India 3679 Email: sarvesh@motorola.com 3681 Yiqun Cai 3682 Cisco Systems, Inc. 3683 170 Tasman Drive 3684 San Jose, CA, 95134 3685 E-mail: ycai@cisco.com 3687 Thomas Morin 3688 France Telecom R & D 3689 2, avenue Pierre-Marzin 3690 22307 Lannion Cedex 3691 France 3692 Email: thomas.morin@francetelecom.com 3694 Yakov Rekhter 3695 Juniper Networks 3696 1194 North Mathilda Ave. 3697 Sunnyvale, CA 94089 3698 Email: yakov@juniper.net 3700 Eric C. Rosen (Editor) 3701 Cisco Systems, Inc. 3702 1414 Massachusetts Avenue 3703 Boxborough, MA, 01719 3704 E-mail: erosen@cisco.com 3706 IJsbrand Wijnands 3707 Cisco Systems, Inc.
3708 170 Tasman Drive 3709 San Jose, CA, 95134 3710 E-mail: ice@cisco.com 3711 Seisho Yasukawa 3712 NTT Corporation 3713 9-11, Midori-Cho 3-Chome 3714 Musashino-Shi, Tokyo 180-8585, 3715 Japan 3716 Phone: +81 422 59 4769 3717 Email: yasukawa.seisho@lab.ntt.co.jp 3719 18. Normative References 3721 [MLDP] I. Minei, K. Kompella, I. Wijnands, B. Thomas, "Label 3722 Distribution Protocol Extensions for Point-to-Multipoint and 3723 Multipoint-to-Multipoint Label Switched Paths", draft-ietf-mpls-ldp- 3724 p2mp-03, July 2007 3726 [MPLS-HDR] E. Rosen, et al., "MPLS Label Stack Encoding", RFC 3032, 3727 January 2001 3729 [MPLS-IP] T. Worster, Y. Rekhter, E. Rosen, "Encapsulating MPLS in IP 3730 or Generic Routing Encapsulation (GRE)", RFC 4023, March 2005 3732 [MPLS-MCAST-ENCAPS] T. Eckert, E. Rosen, R. Aggarwal, Y. Rekhter, 3733 "MPLS Multicast Encapsulations", draft-ietf-mpls-multicast- 3734 encaps-06.txt, July 2007 3736 [MPLS-UPSTREAM-LABEL] R. Aggarwal, Y. Rekhter, E. Rosen, "MPLS 3737 Upstream Label Assignment and Context Specific Label Space", draft- 3738 ietf-mpls-upstream-label-02.txt, March 2007 3740 [MVPN-BGP] R. Aggarwal, E. Rosen, T. Morin, Y. Rekhter, C. 3741 Kodeboniya, "BGP Encodings for Multicast in MPLS/BGP IP VPNs", draft- 3742 ietf-l3vpn-2547bis-mcast-bgp-04.txt, November 2007 3744 [PIM-ATTRIB] A. Boers, IJ. Wijnands, E. Rosen, "Format for Using 3745 TLVs in PIM Messages", draft-ietf-pim-join-attributes-03, May 2007 3747 [PIM-SM] "Protocol Independent Multicast - Sparse Mode (PIM-SM)", 3748 Fenner, Handley, Holbrook, Kouvelas, August 2006, RFC 4601 3750 [RFC2119] "Key words for use in RFCs to Indicate Requirement 3751 Levels.", Bradner, March 1997 3753 [RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, et al., February 2006 3755 [RSVP-P2MP] R. Aggarwal, D. Papadimitriou, S. Yasukawa, et al., 3756 "Extensions to RSVP-TE for Point-to-Multipoint TE LSPs", RFC 4875, 3757 May 2007 3759 19. Informative References 3761 [ADMIN-ADDR] D.
Meyer, "Administratively Scoped IP Multicast", RFC 3762 2365, July 1998 3764 [MVPN-REQ] T. Morin, Ed., "Requirements for Multicast in L3 Provider- 3765 Provisioned VPNs", RFC 4834, April 2007 3767 [RFC1853] W. Simpson, "IP in IP Tunneling", October 1995 3769 [RFC2784] D. Farinacci, et al., "Generic Routing Encapsulation", 3770 March 2000 3772 [RFC2890] G. Dommety, "Key and Sequence Number Extensions to GRE", 3773 September 2000 3775 [RFC2983] D. Black, "Differentiated Services and Tunnels", October 3776 2000 3778 [RFC3270] F. Le Faucheur, et al., "MPLS Support of Differentiated 3779 Services", May 2002 3781 [RFC4365] E. Rosen, "Applicability Statement for BGP/MPLS IP 3782 Virtual Private Networks (VPNs)", February 2006 3784 [RFC4607] H. Holbrook, B. Cain, "Source-Specific Multicast for IP", 3785 August 2006 3787 [RFC4797] Y. Rekhter, R. Bonica, E. Rosen, "Use of Provider Edge to 3788 Provider Edge (PE-PE) Generic Routing Encapsulation (GRE) or IP in 3789 BGP/MPLS IP Virtual Private Networks", January 2007 3791 20. Full Copyright Statement 3793 Copyright (C) The IETF Trust (2008). 3795 This document is subject to the rights, licenses and restrictions 3796 contained in BCP 78, and except as set forth therein, the authors 3797 retain all their rights. 3799 This document and the information contained herein are provided on an 3800 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 3801 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 3802 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 3803 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 3804 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 3805 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 3807 21.
Intellectual Property 3809 The IETF takes no position regarding the validity or scope of any 3810 Intellectual Property Rights or other rights that might be claimed to 3811 pertain to the implementation or use of the technology described in 3812 this document or the extent to which any license under such rights 3813 might or might not be available; nor does it represent that it has 3814 made any independent effort to identify any such rights. Information 3815 on the procedures with respect to rights in RFC documents can be 3816 found in BCP 78 and BCP 79. 3818 Copies of IPR disclosures made to the IETF Secretariat and any 3819 assurances of licenses to be made available, or the result of an 3820 attempt made to obtain a general license or permission for the use of 3821 such proprietary rights by implementers or users of this 3822 specification can be obtained from the IETF on-line IPR repository at 3823 http://www.ietf.org/ipr. 3825 The IETF invites any interested party to bring to its attention any 3826 copyrights, patents or patent applications, or other proprietary 3827 rights that may cover technology that may be required to implement 3828 this standard. Please address the information to the IETF at 3829 ietf-ipr@ietf.org.