Network Working Group                            Rahul Aggarwal (Editor)
Internet Draft                                          Juniper Networks
Expiration Date: February 2005

                  Multicast in BGP/MPLS VPNs and VPLS

             draft-raggarwa-l3vpn-mvpn-vpls-mcast-00.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document describes a solution framework for overcoming the
limitations of existing Multicast VPN (MVPN) and VPLS multicast
solutions. It describes procedures for enhancing the scalability of
multicast for BGP/MPLS VPNs. It also describes procedures for VPLS
multicast that utilize multicast trees in the service provider (SP)
network. The procedures described here reduce the overhead of the PIM
neighbor relationships that a PE router needs to maintain for
BGP/MPLS VPNs. They also reduce the state (and the overhead of
maintaining that state) in the SP network by removing the need to
maintain at least one dedicated multicast tree per VPN in the SP
network.

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [KEYWORDS].

1. Contributors

Rahul Aggarwal
Yakov Rekhter
Anil Lohiya
Tom Pusateri
Lenny Giuliano
Chaitanya Kodeboniya
Juniper Networks

2. Terminology

This document uses terminology described in [MVPN-PIM], [VPLS-BGP]
and [VPLS-LDP].

3. Introduction

[MVPN-PIM] describes the minimal set of procedures that are required
to build multi-vendor interoperable implementations of multicast for
BGP/MPLS VPNs. However, the solution described in [MVPN-PIM] has
undesirable scaling properties.
[ROSEN] describes additional procedures for multicast in BGP/MPLS
VPNs, and these too have undesirable scaling properties.

[VPLS-BGP] and [VPLS-LDP] describe a solution for VPLS multicast that
relies on ingress replication. This solution has certain limitations
for some VPLS multicast traffic profiles.

This document describes a solution framework to overcome the
limitations of existing MVPN [MVPN-PIM, ROSEN] solutions. It also
extends VPLS multicast to provide a solution that can utilize
multicast trees in the SP network.

4. Existing Scalability Issues in BGP/MPLS MVPNs

The solution described in [MVPN-PIM] and [ROSEN] has three
fundamental scalability issues.

4.1. PIM Neighbor Adjacencies Overhead

The solution for unicast in BGP/MPLS VPNs [2547] requires a PE to
maintain at most one BGP peering with every other PE in the network
that is participating in BGP/MPLS VPNs. The use of Route Reflectors
further reduces the number of BGP adjacencies maintained by a PE.

On the other hand, for multicast in BGP/MPLS VPNs [MVPN-PIM, ROSEN],
for a particular MVPN a PE has to maintain PIM neighbor adjacencies
with every other PE that has a site in that MVPN. Thus for a given
PE-PE pair multiple PIM adjacencies are required, one per MVPN that
the PEs have in common. This implies that the number of PIM neighbor
adjacencies that a PE has to maintain is equal to the product of the
number of MVPNs the PE belongs to and the average number of sites in
each of these MVPNs.

For each such PIM neighbor adjacency the PE has to send and receive
PIM Hello packets, which are transmitted periodically at a default
interval of 30 seconds. For example, on a PE router with 1000 VPNs
and 100 sites per VPN, a scenario that is not uncommon in L3VPN
deployments today, the PE router would have to maintain 100,000 PIM
neighbors.
With a default hello interval of 30 seconds, this would result in an
average of 3,333 Hellos per second.

It is highly desirable to reduce the overhead due to PIM adjacencies
that a PE router needs to maintain in support of multicast with
BGP/MPLS VPNs.

4.2. Periodic PIM Join/Prune Messages

PIM [PIM-SM] is a soft-state protocol. It requires PIM Join/Prune
messages to be transmitted periodically. Hence each PE participating
in MVPNs has to periodically refresh the PIM C-Join messages. It is
desirable to reduce the overhead of these periodic PIM control
messages. The overhead of PIM C-Join messages increases when PIM
Join suppression is disabled. There is a need to disable PIM Join
suppression, as described in section 6.5.2. This in turn further
justifies the need to reduce the overhead of periodic PIM C-Join
messages.

4.3. State in the SP Core

Unicast in BGP/MPLS VPNs [2547] requires no per-VPN state in the SP
core. The core maintains state only for PE-to-PE transport tunnels.
VPN routing information is maintained only by the PEs participating
in the VPN service.

On the other hand, [MVPN-PIM] specifies a solution that requires the
SP core to maintain per-MVPN state. This is because an RP-rooted
shared tree is set up in the SP core using PIM-SM, by default, for
each MVPN. Based on configuration, receiver PEs may also switch to a
source-rooted tree for a particular MVPN, which further increases the
number of multicast trees in the SP core. [ROSEN] specifies the use
of PIM-SSM for setting up SP multicast trees. The use of PIM-SSM
instead of PIM-SM increases the amount of per-MVPN state maintained
in the SP core. Use of Data MDTs as specified in [ROSEN] further
increases the overhead resulting from this state.

It is desirable to remove the need to maintain per-MVPN state in the
SP core.

5. Existing Limitations of VPLS Multicast

The VPLS multicast solutions described in [VPLS-BGP] and [VPLS-LDP]
rely on ingress replication. The ingress PE replicates a multicast
packet for each egress PE and sends it to that egress PE using a
unicast tunnel. With appropriate IGMP or PIM snooping it is possible
to send the packet only to the PEs that have receivers for that
traffic, rather than to all the PEs in the VPLS instance.

This is a reasonable model when the bandwidth of the multicast
traffic is low and/or the average number of replications performed on
each outgoing interface for a particular customer VPLS multicast
packet is small. If this is not the case it is desirable to utilize
multicast trees in the SP core to transmit VPLS multicast packets.
Note that unicast packets that are flooded to each of the egress PEs,
before the ingress PE performs learning for those unicast packets,
will still use ingress replication.

6. MVPN Solution Framework

This section describes the framework for the MVPN solution. This
framework makes it possible to overcome the existing scalability
limitations described in section 4.

6.1. PIM Neighbor Maintenance using BGP

This document proposes the use of BGP for discovering and maintaining
PIM neighbors in a given MVPN. All the PE routers advertise their
MVPN membership, i.e. the VRFs configured for multicast, to the other
PE routers using BGP. This allows each PE router in the SP network to
have a complete view of the MVPN membership of the other PE routers.
A PE that belongs to an MVPN considers all the other PEs that
advertise membership in that MVPN to be PIM neighbors for that MVPN.
However, the PE does not have to perform PIM neighbor adjacency
management, as PIM neighbor discovery is performed using BGP. This
eliminates the PIM Hello processing required for maintaining the PIM
neighbors.

6.2. PIM Refresh Reduction

As described in section 4.2, PIM is a soft-state protocol. To
eliminate the need to periodically refresh PIM control messages there
is a need to build a refresh reduction mechanism into PIM. The
detailed procedures for this will be specified later.

6.3. Separation of Customer Control Messages and Data Traffic

BGP/MPLS VPN unicast [2547] maintains a separation between the
exchange of customer routing information and the transmission of
customer data, i.e. VPN unicast traffic. VPN routing information is
exchanged using BGP while VPN data traffic is encapsulated in
PE-to-PE tunnels. This makes the exchange of VPN routing information
agnostic of the unicast tunneling technology. This, in turn, provides
the flexibility to support various tunneling technologies without
impacting the procedures for the exchange of VPN routing information.

[MVPN-PIM], on the other hand, uses Multicast Domain (MD) tunnels for
sending both C-Join messages and C-Data traffic. This creates an
undesirable dependency between the exchange of customer control
information and the multicast transport technology.

The procedures described in section 6.1 make the discovery and
maintenance of PIM neighbors independent of the multicast transport
technology in the SP network. The other piece is the exchange of
customer multicast control information. This document proposes that a
PE use a PE-to-PE tunnel to send the customer multicast control
information to the upstream PE that is the PIM neighbor. The C-Join
packets are encapsulated in an MPLS label before being encapsulated
in the PE-to-PE tunnel. This label specifies the context of the
C-Join, i.e. the MVPN the C-Join is intended for. Section 9 specifies
how this label is learned. The destination address of the C-Join is
still the ALL-PIM-ROUTERS multicast group address.
Thus a C-Join packet is tunnelled to the PE that is the PIM neighbor
for that packet. A beneficial side effect of this is that C-Join
suppression is disabled. As described in section 6.5.2, it is
desirable to disable C-Join suppression.

6.4. Transport of Customer Multicast Data Packets

This document describes two mechanisms to transport customer
multicast data packets over the SP network. One is ingress
replication and the other is the use of multicast trees in the SP
network.

6.4.1. Ingress Replication

In this mechanism the ingress PE replicates a customer multicast data
packet of a particular group and sends it to each egress PE that is
on the path to a receiver of that group. The packet is sent to an
egress PE using a unicast tunnel. This has the advantage of
operational simplicity, as the SP network doesn't need to run a
multicast routing protocol. It also has the advantage of minimizing
state in the SP network. With C-Join suppression disabled, it has the
advantage of sending the traffic only to the PEs that have receivers
for that traffic. This is a reasonable model when the bandwidth of
the multicast traffic is low and/or the number of replications
performed by the ingress PE on each outgoing interface for a
particular customer multicast data packet is small.

6.4.2. Multicast Trees in the SP Network

This mechanism uses multicast trees in the SP network for
transporting customer multicast data packets. The MD trees described
in [MVPN-PIM] are an example of such multicast trees. The use of
multicast trees in the SP network can be beneficial when the
bandwidth of the multicast traffic is high or when it is desirable to
optimize the number of copies of a multicast packet transmitted by
the ingress. This comes at the cost of the operational overhead of
building multicast trees in the SP core, and of the state in the SP
core. This document places no restrictions on the protocols used to
build SP multicast trees.

6.5. Sharing a Single SP Multicast Tree across Multiple MVPNs

This document describes procedures for sharing a single SP multicast
tree across multiple MVPNs.

6.5.1. Aggregate Trees

An Aggregate Tree is an SP multicast tree that can be shared across
multiple MVPNs and is set up by discovering the egress PEs, i.e. the
leaves of the tree, using BGP.

PIM neighbor discovery and maintenance using BGP allows a PE or an RP
to learn the MVPN membership information of other PEs. This in turn
allows the creation of one or more Aggregate Trees, where each
Aggregate Tree is mapped to one or more MVPNs. The leaves of the
Aggregate Tree are determined by the PEs that belong to the MVPNs
that are mapped onto the Aggregate Tree. Aggregate Trees remove the
need to maintain per-MVPN state in the SP core, as a single SP
multicast tree can be used across multiple VPNs.

Note that, like the default MDTs described in [MVPN-PIM], Aggregate
MDTs may result in a multicast data packet for a particular group
being delivered to PE routers that do not have receivers for that
multicast group.

6.5.2. Aggregate Data Trees

An Aggregate Data Tree is an SP multicast tree that can be shared
across multiple MVPNs and is set up by discovering the egress PEs,
i.e. the leaves of the tree, using C-Join messages. The reason for
having Aggregate Data Trees is to give a PE the ability to create
separate SP multicast trees for high-bandwidth multicast groups. This
allows traffic for these multicast groups to reach only those PE
routers that have receivers in these groups, and avoids flooding the
other PE routers in the MVPN. More than one such multicast group can
be mapped onto the same SP multicast tree.
The multicast groups that are mapped to this SP multicast tree may
also belong to different MVPNs.

Setting up Aggregate Data Trees requires the ingress PE to know all
the other PEs that have receivers for the multicast groups that are
mapped onto the Aggregate Data Trees. This is learned from the
C-Joins received by the ingress PE, and it requires that C-Join
suppression be disabled. The procedures used for C-Join propagation
described in section 6.3 ensure that Join suppression is not enabled.

Note that [ROSEN] describes a limited solution for building Data
MDTs, in which a Data MDT cannot be shared across different VPNs.

6.5.3. Setting up Aggregate Trees and Aggregate Data Trees

This document does not place any restrictions on the multicast
technology used to set up Aggregate Trees or Aggregate Data Trees.

When PIM is used to set up multicast trees in the SP core, an
Aggregate Tree is termed an "Aggregate MDT" and an Aggregate Data
Tree is termed an "Aggregate Data MDT". The Aggregate MDT may be a
shared tree, rooted at the RP, or a shortest path tree. An Aggregate
Data MDT is rooted at the PE that is connected to the multicast
traffic source. The root of the Aggregate MDT or the Aggregate Data
MDT has to advertise the P-Group address it chose for the MDT to the
PEs that are the leaves of the MDT. These other PEs can then Join
this MDT. The announcement of this address is done as part of the
discovery procedures described in section 6.5.5.

6.5.4. Demultiplexing Aggregate Tree and Aggregate Data Tree
Multicast Traffic

Aggregate Trees and Aggregate Data Trees require a mechanism for the
egress PEs to demultiplex the multicast traffic received over the
tree. This is because traffic belonging to multiple MVPNs can be
carried over the same tree. Hence there is a need to identify the
MVPN a packet belongs to.
This is done using an inner label that corresponds to the multicast
VRF for which the packet is intended. The ingress PE uses this label
as the inner label when encapsulating a customer multicast data
packet. Each of the egress PEs must be able to associate this inner
label with the same MVPN and use it to demultiplex the traffic
received over the Aggregate Tree or the Aggregate Data Tree. If
downstream label assignment were used, this would require all the
egress PEs in the MVPN to agree on a common label for the MVPN.

We propose a solution that uses upstream label assignment by the
ingress PE. Hence the inner label is allocated by the ingress PE.
Each egress PE has a separate label space for every Aggregate Tree or
Aggregate Data Tree for which the egress PE is a leaf node. The inner
VPN label allocated by the ingress PE can be programmed in this label
space by the egress PEs. Hence when the egress PE receives a packet
over an Aggregate Tree (or an Aggregate Data Tree), the Aggregate
Tree Identifier (or Aggregate Data Tree Identifier) specifies the
label space in which to perform the inner label lookup. An
implementation may create a logical interface corresponding to an
Aggregate Tree (or an Aggregate Data Tree). In that case the label
space in which to look up the inner label is an interface-based label
space, where the interface corresponds to the tree.

When Aggregate MDTs (or Aggregate Data MDTs) are used, the root PE
source address and the Aggregate MDT (or Aggregate Data MDT) P-group
address identify the MDT. The label space corresponding to the MDT
interface is the label space in which to perform the inner label
lookup. A lookup in this label space identifies the multicast VRF in
which the customer multicast lookup needs to be done.

The ingress PE informs the egress PEs about the inner label as part
of the discovery procedures described in the next section.
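As a concrete illustration of the per-tree upstream-assigned label
spaces described above, the following Python sketch shows how an
egress PE could program and look up inner labels. All names, label
values and addresses here are hypothetical, not taken from the draft.

```python
class EgressPE:
    """Illustrative model of an egress PE's per-tree label spaces."""

    def __init__(self):
        # One label space per Aggregate (Data) Tree the PE is a leaf of,
        # keyed by the tree identifier; for an MDT this is the
        # (root source address, P-group address) pair.
        self.tree_label_spaces = {}

    def learn_mapping(self, tree_id, upstream_label, vrf):
        # Program the inner label advertised by the tree root (the
        # ingress PE) into the label space of that tree.
        self.tree_label_spaces.setdefault(tree_id, {})[upstream_label] = vrf

    def demultiplex(self, tree_id, inner_label):
        # The tree a packet arrived on selects the label space; the
        # inner label then selects the multicast VRF (or VSI).
        return self.tree_label_spaces[tree_id][inner_label]


pe = EgressPE()
# Two MVPNs sharing one Aggregate MDT; the root allocated labels 16 and 17.
tree = ("192.0.2.1", "239.1.1.1")
pe.learn_mapping(tree, 16, "vrf-red")
pe.learn_mapping(tree, 17, "vrf-blue")
assert pe.demultiplex(tree, 17) == "vrf-blue"
```

Because the labels are assigned upstream by the root, the same label
value may safely be used by different roots; the tree identifier
provides the disambiguating context.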
6.5.5. Aggregate Tree and Aggregate Data Tree Discovery

Once a PE sets up an Aggregate Tree or an Aggregate Data Tree it
needs to announce to the other PEs in the network the customer
multicast groups being mapped to this tree. This procedure is
referred to as Aggregate Tree or Aggregate Data Tree discovery. For
an Aggregate Tree this discovery implies announcing the mapping of
all the MVPNs mapped to the Aggregate Tree. The inner label allocated
by the ingress PE for each MVPN is included along with the Aggregate
Tree Identifier. For an Aggregate Data Tree this discovery implies
announcing all the specific <C-S, C-G> entries mapped to this tree
along with the Aggregate Data Tree Identifier. The inner label
allocated for each <C-S, C-G> is included along with the Aggregate
Data Tree Identifier.

The egress PE creates a logical interface corresponding to the
Aggregate Tree or Aggregate Data Tree identifier. This interface is
the RPF interface for all the <C-S, C-G> entries mapped to that tree.
An Aggregate Tree by definition maps to all the <C-S, C-G> entries
belonging to all the MVPNs associated with the Aggregate Tree. An
Aggregate Data Tree maps to the specific <C-S, C-G> entries
associated with it.

When PIM is used to set up SP multicast trees, the egress PE also
Joins the P-Group Address corresponding to the Aggregate MDT or the
Aggregate Data MDT. This results in the setup of the PIM SP tree.

7. VPLS Multicast

This document proposes the use of SP multicast trees for VPLS
multicast. This gives an SP an option when ingress replication as
described in [VPLS-BGP] and [VPLS-LDP] is not the best fit for the
customer multicast traffic profile.

The Aggregate Trees and Aggregate Data Trees described in section 6
can be used as SP multicast trees for VPLS multicast. No restriction
is placed on the protocols used for building SP Aggregate Trees for
VPLS.
VPLS auto-discovery as described in [VPLS-BGP] is used to map VPLS
instances onto Aggregate Trees. IGMP and PIM snooping are required
for mapping multicast groups to Aggregate Data Trees. Detailed
procedures for this will be specified in the next revision.

8. BGP Advertisements

The procedures in this document use BGP for MVPN membership
discovery, for Aggregate Tree discovery and for Aggregate Data Tree
discovery. A new Subsequent Address Family Identifier (SAFI), called
the MVPN SAFI, is defined. The format of the NLRI associated with
this SAFI is:

+---------------------------------+
|  Length (2 octets)              |
+---------------------------------+
|  MPLS Labels (variable)         |
+---------------------------------+
|  RD (8 octets)                  |
+---------------------------------+
|  Multicast Source (4 octets)    |
+---------------------------------+
|  Multicast Group (4 octets)     |
+---------------------------------+

The RD corresponds to the multicast-enabled VRF or the VPLS instance.
The BGP next hop advertised with this NLRI contains an IPv4 address,
which is the same as the BGP next hop advertised with the unicast VPN
routes.

When a PE distributes this NLRI via BGP, it must include a Route
Target Extended Communities attribute. This RT must be an "Import RT"
[2547] of each VRF in the MVPN or of each VSI in the VPLS. The BGP
distribution procedures used by [2547] will then ensure that the
advertised information gets associated with the right VRFs or VSIs.

A new optional transitive attribute called the
Multicast_Tree_Attribute is defined to signal the Aggregate Tree or
the Aggregate Data Tree. This attribute is a TLV. Currently a single
Tree Identifier type is defined:

1. PIM MDT.

When the type is set to PIM MDT, the attribute contains a PIM
P-Multicast Group address.
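Purely as an illustration, the NLRI layout above could be packed as
follows. The draft does not specify the encoding of the
variable-length MPLS Labels field, so this sketch assumes a single
label in the common 3-octet BGP form (label << 4, bottom-of-stack bit
set) and a type-0 RD; the function name and field choices are
this sketch's own, not the draft's.

```python
import socket
import struct

def encode_mvpn_nlri(label, rd, source, group):
    """Pack one MVPN SAFI NLRI: Length (2), MPLS Label (3, assumed),
    RD (8), Multicast Source (4), Multicast Group (4).
    The Length field is assumed to count the octets that follow it."""
    label_field = struct.pack("!I", (label << 4) | 1)[1:]  # 3 octets
    body = (label_field + rd
            + socket.inet_aton(source)
            + socket.inet_aton(group))
    return struct.pack("!H", len(body)) + body

# Type-0 RD: 2-octet ASN 65000, assigned number 1.
rd = struct.pack("!HHI", 0, 65000, 1)
# Group set to 0 signals MVPN membership, per section 9.
nlri = encode_mvpn_nlri(100, rd, "10.0.0.1", "0.0.0.0")
assert len(nlri) == 2 + 3 + 8 + 4 + 4
```

A membership advertisement and a <C-S, C-G> advertisement differ only
in the source/group fields, which is why one NLRI format serves both
discovery procedures.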
Hence the MP_REACH attribute identifies the set of VPN customer
multicast trees, the Multicast_Tree_Attribute identifies a particular
SP tree (the Aggregate Tree or Aggregate Data Tree), and the
advertisement of both in a single BGP Update creates a
binding/mapping between the SP tree and the set of VPN customer
trees.

9. MVPN Neighbor Discovery and Maintenance

The BGP NLRI described in section 8 is used for MVPN neighbor
discovery and maintenance. Each PE advertises its multicast VPN
membership information using BGP. For the purpose of MVPN membership
distribution, the NLRI contains the Route Distinguisher (RD), an MPLS
label and the PE source address. The group address is set to 0. The
RD corresponds to the multicast-enabled VRF. The MPLS label is used
by other PEs to send PIM Join/Prune messages to this PE. This label
identifies the multicast VRF for which the Join/Prune is intended.
When ingress replication is used, this label must also be present
when sending customer multicast traffic.

When a PE distributes this NLRI via BGP, it must include a Route
Target Extended Communities attribute. This RT must be an "Import RT"
[2547] of each VRF in the MVPN. The BGP distribution procedures used
by [2547] will then ensure that each PE learns the other PEs in the
MVPN, and that this information gets associated with the right VRFs.
This allows the MVPN PIM instance in a PE to discover all the PIM
neighbors in that MVPN.

The advertisement of the NLRI described above by a PE implies that
the PIM module on that PE that deals with the MVPN corresponding to
the NLRI is fully functional. When that module becomes non-functional
(for whatever reason) the PE MUST withdraw the advertisement.

The neighbor discovery described here is applicable only to BGP/MPLS
VPNs, and is not applicable to VPLS.

9.1. PIM Hello Options

PIM Hellos allow PIM neighbors to exchange various optional
capabilities. The use of BGP for discovering and maintaining PIM
neighbors may imply that some of these optional capabilities need to
be supported in the BGP-based discovery procedures. Exchanging these
capabilities via BGP will be described if and when the need for
supporting them arises.

10. Aggregate MDT

An Aggregate MDT can be created by an RP or an ingress PE. It results
in the creation of an MD tree that can be shared by multiple MVPNs or
VPLS instances. The MD group address associated with the Aggregate
MDT is assigned by the router that creates the Aggregate MDT. This
address, along with the source address of that router, forms the
Aggregate MDT Identifier. Once the RP or an ingress PE maps one or
more MVPNs or VPLS instances to an Aggregate MDT, it needs to
advertise this mapping to the egress PEs that belong to these MVPNs
or VPLS instances. This requires advertising one or more MVPNs/VPLS
instances and the corresponding Aggregate MDT Identifier. The MVPNs
or VPLS instances can be advertised using the BGP procedures
described in section 8. The Aggregate MDT Identifier is encoded using
a TLV in the Multicast_Tree_Attribute. Each NLRI also encodes the
upstream label assigned by the Aggregate MDT root for that MVPN or
VPLS instance.

This information allows the egress PE to associate an Aggregate MDT
with one or more MVPNs or VPLS instances. The Aggregate MDT
Identifier identifies the label space in which to look up the inner
label. The inner label identifies the VRF or VSI in which to do the
multicast lookup after a packet is received from the Aggregate MDT.
The Aggregate MDT interface is used for the multicast RPF check for
the customer packet. On receipt of this information each egress PE
can Join the Aggregate MDT.
This results in the setup of the Aggregate MDT in the SP network.

11. Aggregate Data MDT

An Aggregate Data MDT is created by an ingress PE. It is created for
one or more customer multicast groups that the PE wishes to move to a
dedicated SP tree. These groups may belong to different MVPNs or VPLS
instances. It may be desirable that the sets of PEs that have
receivers belonging to these groups be exactly the same; however, the
procedures for setting up Aggregate Data MDTs do not require this.
Mapping an Aggregate Data MDT Identifier to <C-S, C-G> entries
requires the source PE to know the PE routers that have receivers in
these groups. For MVPNs this is learned using the C-Join information.
For VPLS, IGMP snooping or PIM snooping is required at the source PE.

The mapping of the Aggregate Data MDT Identifier to the <C-S, C-G>
entries is advertised by the ingress PE to the egress PEs using the
procedures described in section 8. The source address in the NLRI is
set to the C-Source address and the group address is set to the
C-Group address. The Aggregate Data MDT Identifier is encoded in the
Multicast_Tree_Attribute. Each NLRI also encodes the upstream label
assigned by the Aggregate Data MDT root for the MVPN or VPLS instance
corresponding to the <C-S, C-G> encoded in the NLRI. A single BGP
Update may carry multiple <C-S, C-G> addresses as long as they all
belong to the same VPN.

This information allows the egress PE to associate an Aggregate Data
MDT with one or more <C-S, C-G>s. On receipt of this information each
egress PE can Join the Aggregate Data MDT. This results in the setup
of the Aggregate Data MDT in the SP network. The inner label is used
to identify the VRF or VSI in which to do the multicast lookup after
a packet is received from the Aggregate Data MDT. It is also needed
for the multicast RPF check for MVPNs.
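The procedure above can be sketched as follows: an ingress PE picks
the leaves of an Aggregate Data MDT as the union of the PEs that
joined any of the customer entries mapped onto the tree. This is an
illustrative sketch only; the function and PE names are hypothetical.

```python
def data_mdt_leaves(mapped_entries, receivers_by_entry):
    """mapped_entries: <C-S, C-G> tuples mapped onto one Aggregate
    Data MDT. receivers_by_entry: entry -> set of PEs with receivers
    for it (learned from C-Joins for MVPNs, or IGMP/PIM snooping for
    VPLS). Returns the leaf set of the shared tree."""
    leaves = set()
    for entry in mapped_entries:
        leaves |= receivers_by_entry.get(entry, set())
    return leaves


receivers = {
    ("10.1.1.1", "232.0.0.1"): {"PE2", "PE3"},
    # May belong to a different MVPN or VPLS instance:
    ("10.2.2.2", "232.0.0.2"): {"PE3", "PE4"},
}
leaves = data_mdt_leaves(list(receivers), receivers)
assert leaves == {"PE2", "PE3", "PE4"}
```

Note that PE4 receives traffic for (10.1.1.1, 232.0.0.1) that it did
not join, which is the trade-off the text above accepts when the
receiver sets of the mapped entries are not exactly the same.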
   Note that the procedures for signaling Aggregate Data MDTs are the
   same as the procedures for signaling Aggregate MDTs described in
   section 10.

12. Data Forwarding

   The following diagram shows the progression of the packet as it
   enters and leaves the SP network when the Aggregate MDT or Aggregate
   Data MDTs are being used for multiple MVPNs or multiple VPLS
   instances. MPLS-in-GRE [MPLS-IP] encapsulation is used to encapsulate
   the customer multicast packets.

    Packets received         Packets in transit       Packets forwarded
     at ingress PE            in the service            by egress PEs
                              provider network

                             +---------------+
                             |  P-IP Header  |
                             +---------------+
                             |      GRE      |
                             +---------------+
                             |   VPN Label   |
    ++=============++        ++=============++        ++=============++
    || C-IP Header ||        || C-IP Header ||        || C-IP Header ||
    ++=============++ >>>>>  ++=============++ >>>>>  ++=============++
    ||  C-Payload  ||        ||  C-Payload  ||        ||  C-Payload  ||
    ++=============++        ++=============++        ++=============++

   The P-IP header contains the Aggregate MDT (or Aggregate Data MDT)
   P-group address as the destination address and the root PE address as
   the source address. The receiver PE does a lookup on the P-IP header
   and determines the MPLS forwarding table in which to look up the
   inner MPLS label. This table is specific to the Aggregate MDT (or
   Aggregate Data MDT) label space. The inner label is unique within the
   context of the root of the MDT (as it is assigned by the root of the
   MDT, without any coordination with any other nodes). Thus it is not
   unique across multiple roots. So, to unambiguously identify a
   particular VPN one has to know the label and the context within which
   that label is unique. The context is provided by the P-IP header.

   The P-IP header and the GRE header are stripped.
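The context-specific lookup described above can be sketched as follows. This is a hypothetical model (real forwarding operates on packet headers and hardware tables, not dictionaries); it shows why the same inner label value can safely identify different VPNs under different roots.

```python
# Hypothetical sketch of the egress PE lookup chain in section 12.
# The inner label is meaningful only in the context of the MDT root,
# so forwarding state is keyed first by the P-IP header fields.

# Per-MDT label spaces, keyed by (P-source, P-group) from the P-IP header.
# Two roots may assign the same label value without ambiguity.
label_spaces = {
    ("192.0.2.1", "239.1.1.1"): {16: "vrf-red"},
    ("192.0.2.2", "239.1.1.1"): {16: "vsi-blue"},  # same label, other root
}


def egress_lookup(p_source: str, p_group: str, vpn_label: int):
    """Return the VRF/VSI for the C-multicast lookup, or None if unknown."""
    # 1. The lookup on the P-IP header selects the MPLS forwarding table
    #    specific to this Aggregate (Data) MDT's label space.
    space = label_spaces.get((p_source, p_group))
    if space is None:
        return None
    # 2. The P-IP and GRE headers are stripped, and the inner VPN label
    #    is looked up in that table to find the VRF or VSI.
    return space.get(vpn_label)
```

Note how label value 16 resolves to a different VRF/VSI depending on which root's P-IP header carried it; that is exactly the context the prose describes.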
The lookup of the resulting VPN MPLS label determines the VRF or the
   VSI in which the receiver PE needs to do the C-multicast data packet
   lookup. It then strips the inner MPLS label and sends the packet to
   the VRF/VSI for multicast data forwarding.

13. Security Considerations

   Security considerations discussed in [2547], [MVPN-PIM], [VPLS-BGP]
   and [VPLS-LDP] apply to this document.

14. Acknowledgments

   TBD

15. Normative References

   [PIM-SM] B. Fenner, M. Handley, H. Holbrook, I. Kouvelas, "Protocol
   Independent Multicast - Sparse Mode (PIM-SM)",
   draft-ietf-pim-sm-v2-new-08.txt, October 2003

   [2547] E. Rosen, Y. Rekhter, et al., "BGP/MPLS VPNs",
   draft-ietf-l3vpn-rfc2547bis-01.txt, September 2003

   [MVPN-PIM] R. Aggarwal, A. Lohiya, T. Pusateri, Y. Rekhter, "Base
   Specification for Multicast in MPLS/BGP VPNs",
   draft-raggarwa-l3vpn-2547-mvpn-00.txt

   [RFC2119] S. Bradner, "Key words for use in RFCs to Indicate
   Requirement Levels", RFC 2119, March 1997

   [RFC3107] Y. Rekhter, E. Rosen, "Carrying Label Information in
   BGP-4", RFC 3107

   [VPLS-BGP] K. Kompella, Y. Rekhter, "Virtual Private LAN Service",
   draft-ietf-l2vpn-vpls-bgp-02.txt

   [VPLS-LDP] M. Lasserre, V. Kompella, "Virtual Private LAN Services
   over MPLS", draft-ietf-l2vpn-vpls-ldp-03.txt

16. Informative References

   [ROSEN] E. Rosen, Y. Cai, I. Wijnands, "Multicast in MPLS/BGP IP
   VPNs", draft-rosen-vpn-mcast-07.txt

17. Author Information

17.1. Editor Information

   Rahul Aggarwal
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: rahul@juniper.net

17.2. Contributor Information

   Yakov Rekhter
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: yakov@juniper.net

   Anil Lohiya
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: alohiya@juniper.net

   Tom Pusateri
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: pusateri@juniper.net

   Lenny Giuliano
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: lenny@juniper.net

   Chaitanya Kodeboniya
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: ck@juniper.net

18. Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

19. Full Copyright Statement

   Copyright (C) The Internet Society (2004). This document is subject
   to the rights, licenses and restrictions contained in BCP 78 and
   except as set forth therein, the authors retain all their rights.
   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

20. Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.