idnits 2.17.1 

draft-ietf-l2vpn-vpls-bgp-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1330.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1307.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1314.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1320.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 37 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 168 instances of too long lines in the document, the longest
     one being 1 character in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 21, 2006) is 6518 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2385 (ref. '2') (Obsoleted by RFC 5925)

  -- Obsolete informational reference (is this intentional?): RFC 2796 (ref.
     '8') (Obsoleted by RFC 4456)

  == Outdated reference: A later version (-09) exists of
     draft-ietf-l3vpn-bgpvpn-auto-07

  == Outdated reference: A later version (-10) exists of
     draft-kompella-l2vpn-l2vpn-01


     Summary: 5 errors (**), 0 flaws (~~), 5 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                                   K. Kompella, Ed.
2	Internet-Draft                                           Y. Rekhter, Ed.
3	Expires: December 23, 2006                              Juniper Networks
4	                                                            June 21, 2006

6	   Virtual Private LAN Service (VPLS) Using BGP for Auto-discovery and
7	                                Signaling
8	                       draft-ietf-l2vpn-vpls-bgp-08

10	Status of this Memo

12	    By submitting this Internet-Draft, each author represents that any
13	    applicable patent or other IPR claims of which he or she is aware
14	    have been or will be disclosed, and any of which he or she becomes
15	    aware will be disclosed, in accordance with Section 6 of BCP 79.

17	    Internet-Drafts are working documents of the Internet Engineering
18	    Task Force (IETF), its areas, and its working groups.  Note that
19	    other groups may also distribute working documents as Internet-
20	    Drafts.

22	    Internet-Drafts are draft documents valid for a maximum of six months
23	    and may be updated, replaced, or obsoleted by other documents at any
24	    time.  It is inappropriate to use Internet-Drafts as reference
25	    material or to cite them other than as "work in progress."

27	    The list of current Internet-Drafts can be accessed at
28	    http://www.ietf.org/ietf/1id-abstracts.txt.

30	    The list of Internet-Draft Shadow Directories can be accessed at
31	    http://www.ietf.org/shadow.html.

33	    This Internet-Draft will expire on December 23, 2006.

35	Copyright Notice

37	    Copyright (C) The Internet Society (2006).

39	Abstract

41	    Virtual Private LAN (Local Area Network) Service (VPLS), also known
42	    as Transparent LAN Service, and Virtual Private Switched Network
43	    service, is a useful Service Provider offering.  The service offers a
44	    Layer 2 Virtual Private Network (VPN); however, in the case of VPLS,
45	    the customers in the VPN are connected by a multipoint Ethernet LAN,
46	    in contrast to the usual Layer 2 VPNs, which are point-to-point in
47	    nature.

49	    This document describes the functions required to offer VPLS, a
50	    mechanism for signaling a VPLS, and rules for forwarding VPLS frames
51	    across a packet switched network.

53	Table of Contents

55	    1.  Introduction
56	      1.1.  Scope of this Document
57	      1.2.  Conventions used in this document
58	      1.3.  Changes from version 06 to 07
59	      1.4.  Changes from version 05 to 06
60	      1.5.  Changes from version 04 to 05
61	      1.6.  Changes from version 03 to 04
62	    2.  Functional Model
63	      2.1.  Terminology
64	      2.2.  Assumptions
65	      2.3.  Interactions
66	    3.  Control Plane
67	      3.1.  Autodiscovery
68	        3.1.1.  Functions
69	        3.1.2.  Protocol Specification
70	      3.2.  Signaling
71	        3.2.1.  Label Blocks
72	        3.2.2.  VPLS BGP NLRI
73	        3.2.3.  PW Setup and Teardown
74	        3.2.4.  Signaling PE Capabilities
75	      3.3.  BGP VPLS Operation
76	      3.4.  Multi-AS VPLS
77	        3.4.1.  a) VPLS-to-VPLS connections at the ASBRs.
78	        3.4.2.  b) EBGP redistribution of VPLS information between
79	                ASBRs.
80	        3.4.3.  c) Multi-hop EBGP redistribution of VPLS
81	                information between ASes.
82	        3.4.4.  Allocation of VE IDs Across Multiple ASes
83	      3.5.  Multi-homing and Path Selection
84	      3.6.  Hierarchical BGP VPLS
85	    4.  Data Plane
86	      4.1.  Encapsulation
87	      4.2.  Forwarding
88	        4.2.1.  MAC address learning
89	        4.2.2.  Aging
90	        4.2.3.  Flooding
91	        4.2.4.  Broadcast and Multicast
92	        4.2.5.  "Split Horizon" Forwarding
93	        4.2.6.  Qualified and Unqualified Learning
94	        4.2.7.  Class of Service
95	    5.  Deployment Options
96	    6.  Security Considerations
97	    7.  IANA Considerations
98	    8.  References
99	      8.1.  Normative References
100	      8.2.  Informative References
101	    Appendix A.  Contributors
102	    Appendix B.  Acknowledgements
103	    Authors' Addresses
104	    Intellectual Property and Copyright Statements

106	1.  Introduction

108	    Virtual Private LAN Service (VPLS), also known as Transparent LAN
109	    Service, and Virtual Private Switched Network service, is a useful
110	    service offering.  A Virtual Private LAN appears in (almost) all
111	    respects as an Ethernet LAN to customers of a Service Provider.
112	    However, in a VPLS, the customers are not all connected to a single
113	    LAN; the customers may be spread across a metro or wide area.  In
114	    essence, a VPLS glues together several individual LANs across a
115	    packet-switched network to appear and function as a single LAN ([9]).
116	    This is accomplished by incorporating MAC address learning, flooding
117	    and forwarding functions in the context of pseudowires that connect
118	    these individual LANs across the packet-switched network.

120	    This document details the functions needed to offer VPLS, and then
121	    goes on to describe a mechanism for the autodiscovery of the
122	    endpoints of a VPLS as well as for signaling a VPLS.  It also
123	    describes how VPLS frames are transported over tunnels across a
124	    packet switched network.  The autodiscovery and signaling mechanism
125	    uses BGP as the control plane protocol.  This document also briefly
126	    discusses deployment options, in particular, the notion of decoupling
127	    functions across devices.

129	    Alternative approaches include: [14], which allows one to build a
130	    Layer 2 VPN with Ethernet as the interconnect; and [13], which
131	    allows one to set up an Ethernet connection across a packet-switched
132	    network.  Both of these, however, offer point-to-point Ethernet
133	    services.  What distinguishes VPLS from the above two is that a VPLS
134	    offers a multipoint service.  A mechanism for setting up pseudowires
135	    for VPLS using the Label Distribution Protocol (LDP) is defined in
136	    [10].

138	1.1.  Scope of this Document

140	    This document has four major parts: defining a VPLS functional model;
141	    defining a control plane for setting up VPLS; defining the data plane
142	    for VPLS (encapsulation and forwarding of data); and defining various
143	    deployment options.

145	    The functional model underlying VPLS is laid out in Section 2.  This
146	    describes the service being offered, the network components that
147	    interact to provide the service, and at a high level their
148	    interactions.

150	    The control plane described in this document uses Multiprotocol BGP
151	    [4] to establish VPLS service, i.e., for the autodiscovery of VPLS
152	    members and for the setup and teardown of the pseudowires that
153	    constitute a given VPLS instance.  Section 3 focuses on this, and
154	    also describes how a VPLS that spans Autonomous System boundaries is
155	    set up, as well as how multi-homing is handled.  Using BGP as the
156	    control plane for VPNs is not new (see [14], [6] and [11]): what is
157	    described here is based on the mechanisms proposed in [6].

159	    The forwarding plane and the actions that a participating Provider
160	    Edge (PE) router offering the VPLS service must take are described
161	    in Section 4.

163	    In Section 5, the notion of 'decoupled' operation is defined, and the
164	    interaction of decoupled and non-decoupled PEs is described.
165	    Decoupling allows for more flexible deployment of VPLS.

167	1.2.  Conventions used in this document

169	    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
170	    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
171	    document are to be interpreted as described in RFC 2119 ([1]).

173	1.3.  Changes from version 06 to 07

175	    [NOTE to RFC Editor: this section is to be removed before
176	    publication.]

178	    Note: the DISCUSSes below are referred to by id; they can be accessed
179	    at
180	    https://datatracker.ietf.org/public/pidtracker.cgi?command=view_comment&id=[ID]

182	    Updated title of doc to reflect use of BGP.  (Fenner's DISCUSS id
183	    44901).

185	    Addressed Russ Housley's DISCUSSes on Figure 6 and Section 6 (ids
186	    44778 and 44779).

188	    Addressed Sam Hartman's DISCUSS on the Security Considerations (id
189	    48432).

191	    Resolution of Kessens' DISCUSS (id 44870):

193	    1.  Reference to RFC 4364 has been made normative.  There is no
194	        normative text in ref draft-kompella-l2vpn-l2vpn -- any such text
195	        has long since been incorporated directly into this document.

197	    2.  Description and IANA section updated.

199	    3.  Expanded section (b) of Section 3.4 to clarify the data plane
200	        operation for option b.

202	    4.  Updated Section 3.5 to clarify that a VPLS customer can run STP
203	        independent of whether the SP uses multi-homing or not.

205	    5.  P bit text deleted (left over from an earlier edit.)

207	    6.  Addressed (hopefully) by Sam's DISCUSS.

209	    7.  Updated Security Considerations to incorporate the techniques
210	        described in RFC 4364 for inter-AS VPNs.  Also, added a paragraph
211	        stating that misconfiguration could cause inter-VPLS connections,
212	        just as can happen with RFC 4364.

214	    Updated references; added reference to RFC 4023.

216	1.4.  Changes from version 05 to 06

218	    [NOTE to RFC Editor: this section is to be removed before
219	    publication.]

221	    Changes in response to GenART review.

223	    Updated Abstract and Introduction to make it clear that VPLS is an
224	    Ethernet-based service.

226	    Added sections on Aging, Broadcast and Multicast, Qualified and
227	    Unqualified learning and CoS.  Also added a section on scaling the
228	    BGP control plane.  These were requested for consistency between the
229	    BGP and LDP VPLS documents.

231	    Added a section clarifying the concepts of label blocks, why they are
232	    necessary and how they are used.

234	    For multi-AS operation, added a short introduction to the three
235	    options, comparing their usage.

237	    Lots of clean-up: consistent usage of terms, expansion of acronyms
238	    before use, references.

240	1.5.  Changes from version 04 to 05

242	    [NOTE to RFC Editor: this section is to be removed before
243	    publication.]

245	    Updated IANA section to reflect agreement with authors of [11] that
246	    the two docs should use the same AFI for L2VPN information.

248	    Addressed comments received from Alex Zinin.  No technical changes,
249	    but a more complete description to cover the issues that Alex raised:

251	    1.  encoding of BGP NEXT_HOP for the new AFI/SAFI is not described

253	    2.  VE ID, Block offset, Block size, Label base are not described
254	        anywhere

256	    3.  no information on how the receiving PE chooses the PW label

258	    4.  section 3.2.2 talks about PE capabilities all of a sudden and
259	        introduces a L2 Info Community, whose fields and use are not
260	        described

262	    Changes to address these:

264	    1.  Broke up section 3.2.1 into "Concepts" and "PW Setup".

266	    2.  Expanded section on "Signaling PE Capabilities".

268	    3.  Added a new section 3.3 "BGP VPLS Operation".

270	    4.  Minor tweaking, e.g. to fix section number references.

272	1.6.  Changes from version 03 to 04

274	    [NOTE to RFC Editor: this section is to be removed before
275	    publication.]

277	    Incorporated IDR review comments from Eric Ji, Chaitanya Kodeboyina,
278	    and Mike Loomis.  Most changes are clarifications and rewording for
279	    better readability.  The substantive changes are to remove several
280	    flags from the control field.

282	2.  Functional Model

284	    The functional model is described with reference to Figure 1.

286	                                                        -----
287	                                                       /  A1 \
288	         ----                                     ____CE1     |
289	        /    \          --------       --------  /    |       |
290	       |  A2 CE2-      /        \     /        PE1     \     /
291	        \    /   \    /          \___/          | \     -----
292	         ----     ---PE2                        |  \
293	                     |                          |   \   -----
294	                     | Service Provider Network |    \ /     \
295	                     |                          |     CE5  A5 |
296	                     |            ___           |   /  \     /
297	              |----|  \          /   \         PE4_/    -----
298	              |u-PE|--PE3       /     \       /
299	              |----|    --------       -------
300	       ----  /   |    ----
301	      /    \/    \   /    \               CE = Customer Edge Device
302	     |  A3 CE3    --CE4 A4 |              PE = Provider Edge Router
303	      \    /         \    /               u-PE = Layer 2 Aggregation
304	       ----           ----                A<n> = Customer site n

306	    Figure 1: Example of a VPLS

308	2.1.  Terminology

310	    Terminology similar to that in [6] is used: a Service Provider (SP)
311	    network with P (Provider-only) and PE (Provider Edge) routers, and
312	    customers with CE (Customer Edge) devices.  Here, however, there is
313	    an additional concept, that of a "u-PE", a Layer 2 PE device used for
314	    Layer 2 aggregation.  The notion of u-PE is described further in
315	    Section 5.  PE and u-PE devices are "VPLS-aware", which means that
316	    they know that a VPLS service is being offered.  We will call these
317	    VPLS edge devices, which could be either a PE or a u-PE, a VE.

319	    In contrast, the CE device (which may be owned and operated by either
320	    the SP or the customer) is VPLS-unaware; as far as the CE is
321	    concerned, it is connected to the other CEs in the VPLS via a Layer 2
322	    switched network.  This means that there should be no changes to a CE
323	    device, either to the hardware or the software, in order to offer
324	    VPLS.

326	    A CE device may be connected to a PE or a u-PE via Layer 2 switches
327	    that are VPLS-unaware.  From a VPLS point of view, such Layer 2
328	    switches are invisible, and hence will not be discussed further.
329	    Furthermore, a u-PE may be connected to a PE via Layer 2 and Layer 3
330	    devices; this will be discussed further in a later section.

332	    The term "demultiplexor" refers to an identifier in a data packet
333	    that identifies both the VPLS to which the packet belongs as well as
334	    the ingress PE.  In this document, the demultiplexor is an MPLS
335	    label.

337	    The term "VPLS" will refer to the service as well as a particular
338	    instantiation of the service (i.e., an emulated LAN); it should be
339	    clear from the context which usage is intended.

341	2.2.  Assumptions

343	    The Service Provider Network is a packet switched network.  The PEs
344	    are assumed to be (logically) fully meshed with tunnels over which
345	    packets that belong to a service (such as VPLS) are encapsulated and
346	    forwarded.  These tunnels can be IP tunnels, such as GRE, or MPLS
347	    tunnels, established by RSVP-TE or LDP.  These tunnels are
348	    established independently of the services offered over them; the
349	    signaling and establishment of these tunnels are not discussed in
350	    this document.

352	    "Flooding" and MAC address "learning" (see Section 4) are an integral
353	    part of VPLS.  However, these activities are private to an SP device,
354	    i.e., in the VPLS described below, no SP device requests another SP
355	    device to flood packets or learn MAC addresses on its behalf.

357	    All the PEs participating in a VPLS are assumed to be fully meshed in
358	    the data plane, i.e., there is a bidirectional pseudowire between
359	    every pair of PEs participating in that VPLS, and thus every
360	    (ingress) PE can send a VPLS packet to the egress PE(s) directly,
361	    without the need for an intermediate PE (see Section 4.2.5.)  This
362	    requires that VPLS PEs are logically fully meshed in the control
363	    plane so that a PE can send a message to another PE to set up the
364	    necessary pseudowires.  See Section 3.6 for a discussion on
365	    alternatives to achieve a logical full mesh in the control plane.
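
    As a small illustration of the full-mesh assumption above, the
    following sketch enumerates the pseudowires needed among the PEs of
    a VPLS.  The function name and the use of plain strings for PEs are
    assumptions made for this example only; they are not defined by this
    document.

```python
from itertools import combinations


def full_mesh_pseudowires(pes):
    """Return one (PE, PE) pair per bidirectional pseudowire in a full mesh."""
    return list(combinations(sorted(pes), 2))


pes = ["PE1", "PE2", "PE3", "PE4"]  # the four PEs shown in Figure 1
pairs = full_mesh_pseudowires(pes)
print(len(pairs))  # N * (N - 1) / 2 pseudowires for N PEs
```

    Each added PE contributes one new pseudowire per existing PE, which
    is why manual configuration of the mesh grows quickly with the
    number of PEs.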

367	2.3.  Interactions

369	    VPLS is a "LAN Service" in that CE devices that belong to VPLS V can
370	    interact through the SP network as if they were connected by a LAN.
371	    VPLS is "private" in that CE devices that belong to different VPLSs
372	    cannot interact.  VPLS is "virtual" in that multiple VPLSs can be
373	    offered over a common packet switched network.

375	    PE devices interact to "discover" all the other PEs participating in
376	    the same VPLS, and to exchange demultiplexors.  These interactions
377	    are control-driven, not data-driven.

379	    u-PEs interact with PEs to establish connections with remote PEs or
380	    u-PEs in the same VPLS.  This interaction is control-driven.

382	    PE devices can participate simultaneously in both VPLS and IP VPNs
383	    ([6]).  These are independent services, and the information exchanged
384	    for each type of service is kept separate as the Network Layer
385	    Reachability Information (NLRI) used for this exchange have different
386	    Address Family Identifiers (AFI) and Subsequent Address Family
387	    Identifiers (SAFI).  Consequently, an implementation MUST maintain a
388	    separate routing storage for each service.  However, multiple
389	    services can use the same underlying tunnels; the VPLS or VPN label
390	    is used to demultiplex the packets belonging to different services.

392	3.  Control Plane

394	    There are two primary functions of the VPLS control plane:
395	    autodiscovery, and setup and teardown of the pseudowires that
396	    constitute the VPLS, often called signaling.  Section 3.1 and
397	    Section 3.2 describe these functions.  Both of these functions are
398	    accomplished with a single BGP Update advertisement; Section 3.3
399	    describes how this is done by detailing BGP protocol operation for
400	    VPLS.  Section 3.4 describes the setting up of pseudowires that span
401	    Autonomous Systems.  Section 3.5 describes how multi-homing is
402	    handled.

404	3.1.  Autodiscovery

406	    Discovery refers to the process of finding all the PEs that
407	    participate in a given VPLS instance.  A PE can either be configured
408	    with the identities of all the other PEs in a given VPLS, or the PE
409	    can use some protocol to discover the other PEs.  The latter is
410	    called autodiscovery.

412	    The former approach is fairly configuration-intensive, especially
413	    since it is required that the PEs participating in a given VPLS are
414	    fully meshed (i.e., that every PE in a given VPLS establish
415	    pseudowires to every other PE in that VPLS).  Furthermore, when the
416	    topology of a VPLS changes (i.e., a PE is added to, or removed from
417	    the VPLS), the VPLS configuration on all PEs in that VPLS must be
418	    changed.

420	    In the autodiscovery approach, each PE "discovers" which other PEs
421	    are part of a given VPLS by means of some protocol, in this case BGP.
422	    This allows each PE's configuration to consist only of the identity
423	    of the VPLS instance established on this PE, not the identity of
424	    every other PE in that VPLS instance -- that is auto-discovered.
425	    Moreover, when the topology of a VPLS changes, only the affected PE's
426	    configuration changes; other PEs automatically find out about the
427	    change and adapt.

429	3.1.1.  Functions

431	    A PE that participates in a given VPLS instance V must be able to
432	    tell all other PEs in VPLS V that it is also a member of V.  A PE must
433	    also have a means of declaring that it no longer participates in a
434	    VPLS.  To do both of these, the PE must have a means of identifying a
435	    VPLS and a means by which to communicate to all other PEs.

437	    u-PE devices also need to know what constitutes a given VPLS;
438	    however, they don't need the same level of detail.  The PE (or PEs)
439	    to which a u-PE is connected gives the u-PE an abstraction of the
440	    VPLS; this is described in Section 5.

442	3.1.2.  Protocol Specification

444	    The specific mechanism for autodiscovery described here is based on
445	    [14] and [6]; it uses BGP extended communities [5] to identify
446	    members of a VPLS, in particular, the Route Target community, whose
447	    format is described in [5].  The semantics of the use of Route
448	    Targets is described in [6]; their use in VPLS is identical.

450	    As it has been assumed that VPLSs are fully meshed, a single Route
451	    Target RT suffices for a given VPLS V, and in effect that RT is the
452	    identifier for VPLS V.

454	    A PE announces (typically via I-BGP) that it belongs to VPLS V by
455	    annotating its NLRIs for V (see next subsection) with Route Target
456	    RT, and acts on this by accepting NLRIs from other PEs that have
457	    Route Target RT.  A PE announces that it no longer participates in V
458	    by withdrawing all NLRIs that it had advertised with Route Target RT.
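
    The Route Target based membership described above can be sketched
    as follows.  This is a minimal illustration, not a BGP
    implementation: the class names, the string form of the Route
    Target, and the way a route is keyed are all assumptions made for
    this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VplsRoute:
    """Hypothetical stand-in for a received VPLS NLRI and its Route Targets."""
    origin_pe: str
    ve_id: int
    route_targets: frozenset


@dataclass
class VplsInstance:
    """Local view of one VPLS, identified by a single Route Target."""
    route_target: str
    members: dict = field(default_factory=dict)  # (origin_pe, ve_id) -> route

    def on_update(self, route: VplsRoute) -> bool:
        """Accept the route only if it carries this VPLS's Route Target."""
        if self.route_target not in route.route_targets:
            return False
        self.members[(route.origin_pe, route.ve_id)] = route
        return True

    def on_withdraw(self, route: VplsRoute) -> None:
        """Forget a member whose NLRI has been withdrawn."""
        self.members.pop((route.origin_pe, route.ve_id), None)

    def member_pes(self) -> set:
        """PEs currently known to participate in this VPLS."""
        return {pe for pe, _ in self.members}
```

    In this sketch a PE's configuration reduces to the Route Target of
    the VPLS instance; the set of member PEs is derived entirely from
    received and withdrawn NLRIs.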

460	3.2.  Signaling

462	    Once discovery is done, each pair of PEs in a VPLS must be able to
463	    establish (and tear down) pseudowires to each other, i.e., exchange
464	    (and withdraw) demultiplexors.  This process is known as signaling.
465	    Signaling is also used to transmit certain characteristics of the
466	    pseudowires that a PE sets up for a given VPLS.

468	    Recall that a demultiplexor is used to distinguish among several
469	    different streams of traffic carried over a tunnel, each stream
470	    possibly representing a different service.  In the case of VPLS, the
471	    demultiplexor not only says to which specific VPLS a packet belongs,
472	    but also identifies the ingress PE.  The former information is used
473	    for forwarding the packet; the latter information is used for
474	    learning MAC addresses.  The demultiplexor described here is an MPLS
475	    label.  However, note that the PE-to-PE tunnels need not be MPLS
476	    tunnels.

478	    Using a distinct BGP Update message to send a demultiplexor to each
479	    remote PE would require the originating PE to send N such messages
480	    for N remote PEs.  The solution described in this document allows a
481	    PE to send a single (common) Update message that contains
482	    demultiplexors for all the remote PEs, instead of N individual
483	    messages.  Doing this reduces the control plane load both on the
484	    originating PE as well as on the BGP Route Reflectors that may be
485	    involved in distributing this Update to other PEs.

487	3.2.1.  Label Blocks

489	    To accomplish this, we introduce the notion of "label blocks".  A
490	    label block, defined by a label base LB and a VE block size VBS, is a
491	    contiguous set of labels {LB, LB+1, ..., LB+VBS-1}.  Here's how label
492	    blocks work.  All PEs within a given VPLS are assigned unique VE IDs
493	    as part of their configuration.  A PE X wishing to send a VPLS update
494	    sends the same label block information to all other PEs.  Each
495	    receiving PE infers the label intended for PE X by adding its
496	    (unique) VE ID to the label base.  In this manner, each receiving PE
497	    gets a unique demultiplexor for PE X for that VPLS.
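
    The simple scheme of the preceding paragraph (before any block
    offset is considered) can be sketched as below.  The function name
    is hypothetical, and the bounds check assumes that VE IDs in this
    simple scheme range over 0 .. VBS-1, which follows from the label
    set {LB, ..., LB+VBS-1} but is not stated explicitly in the text.

```python
def simple_label_for(label_base: int, block_size: int, ve_id: int) -> int:
    """Label a receiving PE with the given VE ID infers from one label block."""
    if not 0 <= ve_id < block_size:
        raise ValueError("VE ID is not covered by this label block")
    return label_base + ve_id


# Every receiving PE sees the same (LB, VBS) but derives a distinct label.
labels = {ve_id: simple_label_for(1000, 4, ve_id) for ve_id in range(4)}
```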

499	    This simple notion is enhanced with the concept of a VE block offset
500	    VBO.  A label block defined by <LB, VBO, VBS> is the set
501	    {LB+VBO, LB+VBO+1, ..., LB+VBO+VBS-1}.  Thus, instead of a single
502	    large label block to cover all VE IDs in a VPLS, one can have
503	    several label blocks, each with a different label base.  This makes
504	    label block management easier, and also allows PE X to cater
505	    gracefully to a PE joining a VPLS with a VE ID that is not covered by
506	    the set of label blocks that PE X has already advertised.

508	    When a PE starts up, or is configured with a new VPLS instance, the
509	    BGP process may wish to wait to receive several advertisements for
510	    that VPLS instance from other PEs to improve the efficiency of label
511	    block allocation.

513	3.2.2.  VPLS BGP NLRI

515	    The VPLS BGP NLRI described below, with a new AFI and SAFI (see [4]),
516	    is used to exchange VPLS membership and demultiplexors.

518	    A VPLS BGP NLRI has the following information elements: a VE ID, a VE
519	    Block Offset, a VE Block Size and a label base.  The format of the
520	    VPLS NLRI is given below.  The AFI is the L2VPN AFI (to be assigned
521	    by IANA), and the SAFI is the VPLS SAFI (65).  The Length field is in
522	    octets.

524	       +------------------------------------+
525	       |  Length (2 octets)                 |
526	       +------------------------------------+
527	       |  Route Distinguisher  (8 octets)   |
528	       +------------------------------------+
529	       |  VE ID (2 octets)                  |
530	       +------------------------------------+
531	       |  VE Block Offset (2 octets)        |
532	       +------------------------------------+
533	       |  VE Block Size (2 octets)          |
534	       +------------------------------------+
535	       |  Label Base (3 octets)             |
536	       +------------------------------------+

538	    Figure 2: BGP NLRI for VPLS Information
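
    The layout in Figure 2 can be sketched with fixed-width packing as
    shown below.  Two points are assumptions of this example rather than
    statements of this section: the Length field is taken to count the
    17 octets that follow it, and the 3-octet Label Base field is
    carried as a raw 24-bit integer without interpreting how the label
    is placed within it.  The names VplsNlri, encode and decode are
    hypothetical.

```python
import struct
from typing import NamedTuple

# RD (8), VE ID (2), VE Block Offset (2), VE Block Size (2), Label Base (3),
# all in network byte order: 17 octets in total.
_BODY = struct.Struct("!8sHHH3s")


class VplsNlri(NamedTuple):
    rd: bytes               # 8-octet Route Distinguisher, kept opaque
    ve_id: int
    ve_block_offset: int
    ve_block_size: int
    label_base_field: int   # raw 24-bit value of the Label Base field


def encode(n: VplsNlri) -> bytes:
    """Serialize the fields in the order shown in Figure 2."""
    if len(n.rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    body = _BODY.pack(n.rd, n.ve_id, n.ve_block_offset, n.ve_block_size,
                      n.label_base_field.to_bytes(3, "big"))
    # Assumption: Length counts only the octets after the Length field.
    return struct.pack("!H", len(body)) + body


def decode(data: bytes) -> VplsNlri:
    """Parse one NLRI; reject anything that is not the 17-octet body."""
    (length,) = struct.unpack_from("!H", data, 0)
    if length != _BODY.size or len(data) < 2 + length:
        raise ValueError("unexpected VPLS NLRI length")
    rd, ve_id, vbo, vbs, lb = _BODY.unpack_from(data, 2)
    return VplsNlri(rd, ve_id, vbo, vbs, int.from_bytes(lb, "big"))
```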

540	    A PE participating in a VPLS must have at least one VE ID.  If the PE
541	    is the VE, it typically has one VE ID.  If the PE is connected to
542	    several u-PEs, it has a distinct VE ID for each u-PE.  It may
543	    additionally have a VE ID for itself, if it itself acts as a VE for
544	    that VPLS.  In what follows, we will call the PE announcing the VPLS
545	    NLRI PE-a, and we will assume that PE-a owns VE ID V (either
546	    belonging to PE-a itself, or to a u-PE connected to PE-a).

548	    VE IDs are typically assigned by the network administrator.  Their
549	    scope is local to a VPLS.  A given VE ID should belong to only one
550	    PE, unless a CE is multi-homed (see Section 3.5).

552	    A label block is a set of demultiplexor labels used to reach a given
553	    VE ID.  A VPLS BGP NLRI with VE ID V, VE Block Offset VBO, VE Block
554	    Size VBS and label base LB communicates to its peers the following:

556	        label block for V: labels from LB to (LB + VBS - 1), and

558	        remote VE set for V: from VBO to (VBO + VBS - 1).

560	    There is a one-to-one correspondence between the remote VE set and
561	    the label block: VE ID (VBO + n) corresponds to label (LB + n).
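
For example, with VBO = 1, VBS = 8 and LB = 1000, the remote VE set is
VE IDs 1 through 8 and the label block is labels 1000 through 1007; VE
ID 3 corresponds to label 1002. A minimal sketch of this mapping
follows; the function names are hypothetical.

```python
from typing import Optional


def label_block(lb: int, vbs: int) -> range:
    """Labels LB .. LB + VBS - 1."""
    return range(lb, lb + vbs)


def remote_ve_set(vbo: int, vbs: int) -> range:
    """VE IDs VBO .. VBO + VBS - 1."""
    return range(vbo, vbo + vbs)


def label_for_ve(ve_id: int, vbo: int, vbs: int, lb: int) -> Optional[int]:
    """Label for VE ID (VBO + n) is (LB + n); None if outside the set."""
    n = ve_id - vbo
    return lb + n if 0 <= n < vbs else None
```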

563	3.2.3.  PW Setup and Teardown

565	    Suppose PE-a is part of VPLS foo, and makes an announcement with VE
566	    ID V, VE Block Offset VBO, VE Block Size VBS and label base LB.  If
567	    PE-b is also part of VPLS foo, and has VE ID W, PE-b does the
568	    following:

570	    1.  checks if W is part of PE-a's 'remote VE set': if VBO <= W < VBO
571	        + VBS, then W is part of PE-a's remote VE set.  If not, PE-b
572	        ignores this message, and skips the rest of this procedure.

574	    2.  sets up a PW to PE-a: the demultiplexor label to send traffic
575	        from PE-b to PE-a is computed as (LB + W - VBO).

577	    3.  checks if V is part of any 'remote VE set' that PE-b announced,
578	        i.e., PE-b checks if V belongs to some remote VE set that PE-b
579	        announced, say with VE Block Offset VBO', VE Block Size VBS' and
580	        label base LB'.  If not, PE-b MUST make a new announcement as
581	        described in Section 3.3.

583	    4.  sets up a PW from PE-a: the demultiplexor label over which PE-b
584	        should expect traffic from PE-a is computed as: (LB' + V - VBO').
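
The four steps above can be sketched as follows. This is only an
illustration: Block and process_announcement are hypothetical names,
and a return value of None for the receive label stands in for "PE-b
must make a new announcement" in step 3.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Block:
    vbo: int   # VE Block Offset
    vbs: int   # VE Block Size
    lb: int    # label base

    def label_for(self, ve_id: int) -> Optional[int]:
        n = ve_id - self.vbo
        return self.lb + n if 0 <= n < self.vbs else None


def process_announcement(
    w: int, own_blocks: Iterable[Block], v: int, remote: Block
) -> Optional[Tuple[int, Optional[int]]]:
    """PE-b (VE ID w) handles PE-a's announcement (VE ID v, block remote).

    Returns None if the message is ignored (step 1). Otherwise returns
    (tx_label, rx_label); rx_label is None when no block announced by
    PE-b covers v, i.e. a new announcement is needed (step 3).
    """
    tx = remote.label_for(w)        # steps 1 and 2: LB + W - VBO
    if tx is None:
        return None
    rx = None
    for blk in own_blocks:          # steps 3 and 4: LB' + V - VBO'
        rx = blk.label_for(v)
        if rx is not None:
            break
    return tx, rx
```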

586	    If PE Y withdraws an NLRI for V that PE X was using, then X MUST
587	    tear down its ends of the pseudowire between X and Y.

589	3.2.4.  Signaling PE Capabilities

591	    The following extended attribute, the "Layer2 Info Extended
592	    Community", is used to signal control information about the
593	    pseudowires to be set up for a given VPLS.  The extended community
594	    value is to be allocated by IANA (currently used value is 0x800A).
595	    This information includes the Encaps Type (type of encapsulation on
596	    the pseudowires), Control Flags (control information regarding the
597	    pseudowires) and the Maximum Transmission Unit (MTU) to be used on
598	    the pseudowires.

600	    The Encaps Type for VPLS is 19.

602	       +------------------------------------+
603	       | Extended community type (2 octets) |
604	       +------------------------------------+
605	       |  Encaps Type (1 octet)             |
606	       +------------------------------------+
607	       |  Control Flags (1 octet)           |
608	       +------------------------------------+
609	       |  Layer-2 MTU (2 octets)            |
610	       +------------------------------------+
611	       |  Reserved (2 octets)               |
612	       +------------------------------------+

614	    Figure 3: Layer2 Info Extended Community

615	        0 1 2 3 4 5 6 7
616	       +-+-+-+-+-+-+-+-+
617	       |   MBZ     |C|S|      (MBZ = MUST Be Zero)
618	       +-+-+-+-+-+-+-+-+

620	    Figure 4: Control Flags Bit Vector

622	    With reference to Figure 4, the following bits in the Control Flags
623	    are defined; the remaining bits, designated MBZ, MUST be set to zero
624	    when sending and MUST be ignored when receiving this community.

626	         Name   Meaning
627	            C   A Control word ([7]) MUST or MUST NOT be present when
630	                sending VPLS packets to this PE, depending on whether C
631	                is 1 or 0, respectively
632	            S   Sequenced delivery of frames MUST or MUST NOT be used
633	                when sending VPLS packets to this PE, depending on
634	                whether S is 1 or 0, respectively
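
The following sketch shows one way to build and parse the community in
Figure 3. It assumes the usual IETF convention that bit 0 in Figure 4
is the most significant bit, so C is mask 0x02 and S is mask 0x01; the
type value 0x800A is the "currently used" value quoted above, not a
final IANA allocation. L2Info, encode_l2info and decode_l2info are
hypothetical names.

```python
import struct
from typing import NamedTuple

L2_INFO_TYPE = 0x800A  # currently used value per the text above
ENCAPS_VPLS = 19
FLAG_C = 0x02          # assumption: bit 6 of Figure 4, bit 0 being the MSB
FLAG_S = 0x01          # assumption: bit 7 of Figure 4


class L2Info(NamedTuple):
    encaps: int
    control_word: bool
    sequenced: bool
    mtu: int


def encode_l2info(info: L2Info) -> bytes:
    """Pack type, Encaps Type, Control Flags, MTU and a zero Reserved field."""
    flags = (FLAG_C if info.control_word else 0) | (FLAG_S if info.sequenced else 0)
    return struct.pack("!HBBHH", L2_INFO_TYPE, info.encaps, flags, info.mtu, 0)


def decode_l2info(data: bytes) -> L2Info:
    """Parse the 8-octet community; MBZ bits are ignored on receipt."""
    ctype, encaps, flags, mtu, _reserved = struct.unpack("!HBBHH", data)
    if ctype != L2_INFO_TYPE:
        raise ValueError("not a Layer2 Info Extended Community")
    return L2Info(encaps, bool(flags & FLAG_C), bool(flags & FLAG_S), mtu)
```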

636	3.3.  BGP VPLS Operation

638	    To create a new VPLS, say VPLS foo, a network administrator must pick
639	    a RT for VPLS foo, say RT-foo.  This will be used by all PEs that
640	    serve VPLS foo.  To configure a given PE, say PE-a, to be part of
641	    VPLS foo, the network administrator only has to choose a VE ID V for
642	    PE-a.  (If PE-a is connected to u-PEs, PE-a may be configured with
643	    more than one VE ID; in that case, the following is done for each VE
644	    ID).  The PE may also be configured with a Route Distinguisher (RD);
645	    if not, it generates a unique RD for VPLS foo.  Say the RD is
646	    RD-foo-a.  PE-a then generates an initial label block and a remote VE
647	    set for V, defined by VE Block Offset VBO, VE Block Size VBS and
648	    label base LB.  These may be empty.

650	    PE-a then creates a VPLS BGP NLRI with RD RD-foo-a, VE ID V, VE Block
651	    Offset VBO, VE Block Size VBS and label base LB.  To this, it
652	    attaches a Layer2 Info Extended Community and a RT, RT-foo.  It sets
653	    the BGP Next Hop for this NLRI as itself, and announces this NLRI to
654	    its peers.  The Network Layer protocol associated with the Network
655	    Address of the Next Hop for the combination <AFI=L2VPN AFI, SAFI=VPLS
656	    SAFI> is IP; this association is required by [4], Section 5.  If the
657	    value of the Length of the Next Hop field is 4, then the Next Hop
658	    contains an IPv4 address.  If this value is 16, then the Next Hop
659	    contains an IPv6 address.
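
A receiver might interpret the Next Hop field as in the short sketch
below. The function name parse_next_hop is hypothetical, and rejecting
every other length is an assumption of this sketch rather than a rule
stated here.

```python
import ipaddress
from typing import Union


def parse_next_hop(raw: bytes) -> Union[ipaddress.IPv4Address, ipaddress.IPv6Address]:
    """Length 4 means IPv4, length 16 means IPv6."""
    if len(raw) == 4:
        return ipaddress.IPv4Address(raw)
    if len(raw) == 16:
        return ipaddress.IPv6Address(raw)
    raise ValueError("unsupported Next Hop length: %d" % len(raw))
```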

661	    If PE-a hears from another PE, say PE-b, a VPLS BGP announcement with
662	    RT-foo and VE ID W, then PE-a knows that PE-b is a member of the same
663	    VPLS (autodiscovery).  PE-a then has to set up its part of a VPLS
664	    pseudowire between PE-a and PE-b, using the mechanisms in
665	    Section 3.2.  Similarly, PE-b will have discovered that PE-a is in
666	    the same VPLS, and PE-b must set up its part of the VPLS pseudowire.
667	    Thus, signaling and pseudowire setup are also achieved with the same
668	    Update message.

670	    If W is not in any remote VE set that PE-a announced for VE ID V in
671	    VPLS foo, PE-b will not be able to set up its part of the pseudowire
672	    to PE-a.  To address this, PE-a can choose to withdraw the old
673	    announcement(s) it made for VPLS foo, and announce a new Update with
674	    a larger remote VE set and corresponding label block that covers all
675	    VE IDs that are in VPLS foo.  This, however, may cause some service
676	    disruption.  An alternative for PE-a is to create a new remote VE set
677	    and corresponding label block, and announce them in a new Update,
678	    without withdrawing previous announcements.

680	    If PE-a's configuration is changed to remove VE ID V from VPLS foo,
681	    then PE-a MUST withdraw all its announcements for VPLS foo that
682	    contain VE ID V. If all of PE-a's links to its CEs in VPLS foo go
683	    down, then PE-a SHOULD either withdraw all its NLRIs for VPLS foo, or
684	    let other PEs in the VPLS foo know in some way that PE-a is no longer
685	    connected to its CEs.

687	3.4.  Multi-AS VPLS

689	    As in [14] and [6], the above autodiscovery and signaling functions
690	    are typically announced via I-BGP.  This assumes that all the sites
691	    in a VPLS are connected to PEs in a single Autonomous System (AS).

693	    However, sites in a VPLS may connect to PEs in different ASes.  This
694	    leads to two issues: 1) there would not be an I-BGP connection
695	    between those PEs, so some means of signaling across ASes is needed;
696	    and 2) there may not be PE-to-PE tunnels between the ASes.

698	    A similar problem is solved in [6], Section 10.  Three methods are
699	    suggested to address issue (1); all these methods have analogs in
700	    multi-AS VPLS.

702	    Here is a diagram for reference:

704	      __________       ____________       ____________       __________
705	     /          \     /            \     /            \     /          \
706	                 \___/        AS 1  \   /  AS 2        \___/
707	                                     \ /
708	       +-----+           +-------+    |    +-------+           +-----+
709	       | PE1 | ---...--- | ASBR1 | ======= | ASBR2 | ---...--- | PE2 |
710	       +-----+           +-------+    |    +-------+           +-----+
711	                  ___                / \                ___
712	                 /   \              /   \              /   \
713	     \__________/     \____________/     \____________/     \__________/

715	    Figure 6: Inter-AS VPLS

717	    As in the above reference, three methods for signaling inter-provider
718	    VPLS are given; these are presented in order of increasing
719	    scalability.  Method (a) is the easiest to understand conceptually,
720	    and the easiest to deploy; however, it requires an Ethernet
721	    interconnect between the ASes, and both VPLS control and data plane
722	    state on the AS border routers (ASBRs).  Method (b) requires VPLS
723	    control plane state on the ASBRs and MPLS on the AS-AS interconnect
724	    (which need not be Ethernet).  Method (c) requires MPLS on the AS-AS
725	    interconnect, but no VPLS state of any kind on the ASBRs.

727	3.4.1.  a) VPLS-to-VPLS connections at the ASBRs.

729	    In this method, an AS Border Router (ASBR1) acts as a PE for all
730	    VPLSs that span AS1 and an AS to which ASBR1 is connected, such as
731	    AS2 here.  The ASBR on the neighboring AS (ASBR2) is viewed by ASBR1
732	    as a CE for the VPLSs that span AS1 and AS2; similarly, ASBR2 acts as
733	    a PE for this VPLS from AS2's point of view, and views ASBR1 as a CE.

735	    This method does not require MPLS on the ASBR1-ASBR2 link, but does
736	    require that this link carry Ethernet traffic, and that there be a
737	    separate VLAN sub-interface for each VPLS traversing this link.  It
738	    further requires that ASBR1 does the PE operations (discovery,
739	    signaling, MAC address learning, flooding, encapsulation, etc.) for
740	    all VPLSs that traverse ASBR1.  This imposes a significant burden on
741	    ASBR1, both on the control plane and the data plane, which limits the
742	    number of multi-AS VPLSs.

744	    Note that in general, there will be multiple connections between a
745	    pair of ASes, for redundancy.  In this case, the Spanning Tree
746	    Protocol (STP) ([15]), or some other means of loop detection and
747	    prevention, must be run on each VPLS that spans these ASes, so that a
748	    loop-free topology can be constructed in each VPLS.  This imposes a
749	    further burden on the ASBRs and PEs participating in those VPLSs, as
750	    these devices would need to run a loop detection algorithm for each
751	    such VPLS.  How this may be achieved is outside the scope of this
752	    document.

754	3.4.2.  b) EBGP redistribution of VPLS information between ASBRs.

756	    This method requires I-BGP peerings between the PEs in AS1 and ASBR1
757	    in AS1 (perhaps via route reflectors), an E-BGP peering between ASBR1
758	    and ASBR2 in AS2, and I-BGP peerings between ASBR2 and the PEs in
759	    AS2.  In the above example, PE1 sends a VPLS NLRI to ASBR1 with a
760	    label block and itself as the BGP nexthop; ASBR1 sends the NLRI to
761	    ASBR2 with new labels and itself as the BGP nexthop; and ASBR2 sends
762	    the NLRI to PE2 with new labels and itself as the nexthop.
763	    Correspondingly, there are three tunnels: T1 from PE1 to ASBR1, T2
764	    from ASBR1 to ASBR2, and T3 from ASBR2 to PE2.  Within each tunnel,
765	    the VPLS label to be used is determined by the receiving device;
766	    e.g., the VPLS label within T1 is a label from the label block that
767	    ASBR1 sent to PE1.  The ASBRs are responsible for receiving VPLS
768	    packets encapsulated in a tunnel, and performing the appropriate
769	    label swap operations described next so that the next receiving
770	    device can correctly identify and forward the packet.

772	    The VPLS NLRI that ASBR1 sends to ASBR2 (and the NLRI that ASBR2
773	    sends to PE2) is identical to the VPLS NLRI that PE1 sends to ASBR1,
774	    except for the label block.  To be precise, the Length, the Route
775	    Distinguisher, the VE ID, the VE Block Offset, and the VE Block Size
776	    MUST be the same; the Label Base may be different.  Furthermore,
777	    ASBR1 must also update its forwarding path as follows: if the Label
778	    Base sent by PE1 is L1, the Label-block Size is N, the Label Base
779	    sent by ASBR1 is L2, and the tunnel label from ASBR1 to PE1 is T,
780	    then ASBR1 must install the following in the forwarding path:

782	       swap L2 with L1 and push T,

784	       swap L2+1 with L1+1 and push T, ...

786	       swap L2+N-1 with L1+N-1 and push T.

788	    ASBR2 must act similarly, except that it may not need a tunnel label
789	    if it is directly connected with ASBR1.
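
The forwarding entries that ASBR1 installs can be generated
mechanically from L1, L2, N and T. The sketch below is illustrative
only; asbr_swap_entries is a hypothetical name, and a tunnel label of
None models the directly connected case mentioned for ASBR2.

```python
from typing import Dict, Optional, Tuple


def asbr_swap_entries(
    l1: int, l2: int, n: int, tunnel_label: Optional[int] = None
) -> Dict[int, Tuple[int, Optional[int]]]:
    """Map incoming label L2+i to (outgoing label L1+i, tunnel label to push)."""
    return {l2 + i: (l1 + i, tunnel_label) for i in range(n)}
```

For instance, with L1 = 100, L2 = 500, N = 3 and T = 77, the entries
are 500 -> (100, 77), 501 -> (101, 77) and 502 -> (102, 77).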

791	    When PE2 wants to send a VPLS packet to PE1, PE2 uses its VE ID to
792	    get the right VPLS label from ASBR2's label block for PE1, and uses a
793	    tunnel label to reach ASBR2.  ASBR2 swaps the VPLS label with the
794	    label from ASBR1; ASBR1 then swaps the VPLS label with the label from
795	    PE1, and pushes a tunnel label to reach PE1.

797	    In this method, one needs MPLS on the ASBR1-ASBR2 interface, but
798	    there is no requirement that the link layer be Ethernet.
799	    Furthermore, the ASBRs take part in distributing VPLS information.
800	    However, the data plane requirements of the ASBRs are much simpler
801	    than in method (a), being limited to label operations.  Finally, the
802	    construction of loop-free VPLS topologies is done by routing
803	    decisions, viz., BGP path and nexthop selection, so there is no need
804	    to run the Spanning Tree Protocol on a per-VPLS basis.  Thus, this
805	    method is considerably more scalable than method (a).

807	3.4.3.  c) Multi-hop EBGP redistribution of VPLS information between
808	         ASes.

810	    In this method, there is a multi-hop E-BGP peering between the PEs
811	    (or preferably, a Route Reflector) in AS1 and the PEs (or Route
812	    Reflector) in AS2.  PE1 sends a VPLS NLRI with labels and nexthop
813	    self to PE2; if this is via route reflectors, the BGP nexthop is not
814	    changed.  This requires that there be a tunnel LSP from PE1 to PE2.
815	    This tunnel LSP can be created exactly as in [6], section 10 (c), for
816	    example using E-BGP to exchange labeled IPv4 routes for the PE
817	    loopbacks.

819	    When PE1 wants to send a VPLS packet to PE2, it pushes the VPLS label
820	    corresponding to its own VE ID onto the packet.  It then pushes the
821	    tunnel label(s) to reach PE2.

823	    This method requires no VPLS information (in either the control or
824	    the data plane) on the ASBRs.  The ASBRs only need to set up PE-to-PE
825	    tunnel LSPs in the control plane, and do label operations in the data
826	    plane.  Again, as in the case of method (b), the construction of
827	    loop-free VPLS topologies is done by routing decisions, i.e., BGP
828	    path and nexthop selection, so there is no need to run the Spanning
829	    Tree Protocol on a per-VPLS basis.  This option is likely to be the
830	    most scalable of the three methods presented here.

832	3.4.4.  Allocation of VE IDs Across Multiple ASes

834	    In order to ease the allocation of VE IDs for a VPLS that spans
835	    multiple ASes, one can allocate ranges for each AS.  For example, AS1
836	    uses VE IDs in the range 1 to 100, AS2 from 101 to 200, etc.  If
837	    there are 10 sites attached to AS1 and 20 to AS2, the allocated VE
838	    IDs could be 1-10 and 101 to 120.  This minimizes the number of VPLS
839	    NLRIs that are exchanged while ensuring that VE IDs are kept unique.

841	    In the above example, if AS1 needed more than 100 sites, then another
842	    range can be allocated to AS1.  The only caveat is that there be no
843	    overlap between VE ID ranges among ASes.  The exception to this rule
844	    is multi-homing, which is dealt with below.
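
An operator could check a proposed allocation for overlap with a
helper along the following lines. This is a sketch under stated
assumptions: ranges are inclusive (low, high) pairs keyed by an AS
name, find_overlaps is a hypothetical name, and the multi-homing
exception is not modeled.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Range = Tuple[str, int, int]  # (AS name, low VE ID, high VE ID), inclusive


def find_overlaps(ranges_by_as: Dict[str, List[Tuple[int, int]]]) -> List[Tuple[Range, Range]]:
    """Return every pair of ranges from different ASes that share a VE ID."""
    flat = [(asn, lo, hi) for asn, rs in ranges_by_as.items() for lo, hi in rs]
    return [
        (a, b)
        for a, b in combinations(flat, 2)
        if a[0] != b[0] and a[1] <= b[2] and b[1] <= a[2]
    ]
```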

846	3.5.  Multi-homing and Path Selection

848	    It is often desired to multi-home a VPLS site, i.e., to connect it to
849	    multiple PEs, perhaps even in different ASes.  In such a case, the
850	    PEs connected to the same site can either be configured with the same
851	    VE ID or with different VE IDs.  In the latter case, it is mandatory
852	    to run STP on the CE device, and possibly on the PEs, to construct a
853	    loop-free VPLS topology.  How this can be accomplished is outside the
854	    scope of this document; however, the rest of this section will
855	    describe in some detail the former case.  Note that multi-homing by
856	    the SP and STP on the CEs can co-exist; thus it is recommended that
857	    the VPLS customer run STP if the CEs are able to.

859	    In the case where the PEs connected to the same site are assigned the
860	    same VE ID, a loop-free topology is constructed by routing
861	    mechanisms, in particular, by BGP path selection.  When a BGP speaker
862	    receives two equivalent NLRIs (see below for the definition), it
863	    applies standard path selection criteria such as Local Preference and
864	    AS Path Length to determine which NLRI to choose; it MUST pick only
865	    one.  If the chosen NLRI is subsequently withdrawn, the BGP speaker
866	    applies path selection to the remaining equivalent VPLS NLRIs to pick
867	    another; if none remain, the forwarding information associated with
868	    that NLRI is removed.

870	    Two VPLS NLRIs are considered equivalent from a path selection point
871	    of view if the Route Distinguisher, the VE ID and the VE Block Offset
872	    are the same.  If two PEs are assigned the same VE ID in a given
873	    VPLS, they MUST use the same Route Distinguisher, and they SHOULD
874	    announce the same VE Block Size for a given VE Block Offset.
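
The selection rule can be sketched as grouping NLRIs by the key (Route
Distinguisher, VE ID, VE Block Offset) and keeping one route per key.
The sketch below models only Local Preference and AS Path Length; the
final tie-break on the next hop string is an assumption made to keep
the result deterministic and is not the full BGP decision process.
Route and select_best are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Route(NamedTuple):
    rd: str
    ve_id: int
    vbo: int
    next_hop: str
    local_pref: int
    as_path_len: int


def select_best(routes: Iterable[Route]) -> Dict[Tuple[str, int, int], Route]:
    """Pick exactly one route among each set of equivalent NLRIs."""
    groups = defaultdict(list)
    for r in routes:
        groups[(r.rd, r.ve_id, r.vbo)].append(r)
    # Higher Local Preference wins, then the shorter AS path.
    return {
        key: min(rs, key=lambda r: (-r.local_pref, r.as_path_len, r.next_hop))
        for key, rs in groups.items()
    }
```

Re-running select_best on the remaining routes after a withdrawal
models the reselection step described above.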

876	3.6.  Hierarchical BGP VPLS

878	    This section discusses how one can scale the VPLS control plane when
879	    using BGP.  There are at least three aspects of scaling the control
880	    plane:

882	    1.  alleviating the full mesh connectivity requirement among VPLS BGP
883	        speakers;

885	    2.  limiting BGP VPLS message passing to just the interested speakers
886	        rather than all BGP speakers; and

888	    3.  simplifying the addition and deletion of BGP speakers, whether
889	        for VPLS or other applications.

891	    Fortunately, the use of BGP for Internet routing as well as for IP
892	    VPNs has yielded several good solutions for all these problems.  The
893	    basic technique is hierarchy, using BGP Route Reflectors (RRs) ([8]).

895	    The idea is to designate a small set of Route Reflectors which are
896	    themselves fully meshed, and then establish a BGP session between
897	    each BGP speaker and one or more RRs.  In this way, there is no need
898	    of direct full mesh connectivity among all the BGP speakers.  If the
899	    particular scaling needs of a provider require a large number of
900	    RRs, then this technique can be applied recursively: the full mesh
901	    connectivity among the RRs can be brokered by yet another level of
902	    RRs.  The use of RRs solves problems 1 and 3 above.

904	    It is important to note that RRs, as used for VPLS and VPNs, are
905	    purely a control plane technique.  The use of RRs introduces no data
906	    plane state and no data plane forwarding requirements on the RRs, and
907	    does not in any way change the forwarding path of VPLS traffic.  This
908	    is in contrast to the technique of Hierarchical VPLS defined in [10].

910	    Another consequence of this approach is that it is not required that
911	    one set of RRs handle all BGP messages, or that a particular RR
912	    handle all messages from a given PE.  One can define several sets of
913	    RRs, for example a set to handle VPLS, another to handle IP VPNs and
914	    another for Internet routing.  Another partitioning could be to have
915	    some subset of VPLSs and IP VPNs handled by one set of RRs, and
916	    another subset of VPLSs and IP VPNs handled by another set of RRs;
917	    the use of Route Target Filtering (RTF), described in [12], can make
918	    this simpler and more effective.

920	    Finally, problem 2 (that of limiting BGP VPLS message passing to just
921	    the interested BGP speakers) is addressed by the use of RTF.  This
922	    technique is orthogonal to the use of RRs, but works well in
923	    conjunction with RRs.  RTF is also very effective in inter-AS VPLS;
924	    more details on how RTF works and its benefits are provided in [12].

926	    It is worth mentioning an aspect of the control plane that is often a
927	    source of confusion.  No MAC addresses are exchanged via BGP.  All
928	    MAC address learning and aging is done in the data plane individually
929	    by each PE.  The only task of BGP VPLS message exchange is
930	    autodiscovery and label exchange.

932	    Thus, BGP processing for VPLS occurs when

934	    1.  a PE joins or leaves a VPLS; or

936	    2.  a failure occurs in the network, bringing down a PE-PE tunnel or
937	        a PE-CE link.

939	    These events are relatively rare, and typically, each such event
940	    causes one BGP update to be generated.  Coupled with BGP's messaging
941	    efficiency when used for signaling VPLS, these observations lead to
942	    the conclusion that BGP as a control plane for VPLS will scale quite
943	    well both in terms of processing and memory requirements.

945	4.  Data Plane

947	    This section discusses two aspects of the data plane for PEs and
948	    u-PEs implementing VPLS: encapsulation and forwarding.

950	4.1.  Encapsulation

952	    Ethernet frames received from CE devices are encapsulated for
953	    transmission over the packet switched network connecting the PEs.
954	    The encapsulation is as in [7].

956	4.2.  Forwarding

958	    VPLS packets are classified as belonging to a given service instance
959	    and associated forwarding table based on the interface over which the
960	    packet is received.  Packets are forwarded in the context of the
961	    service instance based on the destination MAC address.  The former
962	    mapping is determined by configuration.  The latter is the focus of
963	    this section.

965	4.2.1.  MAC address learning

967	    As was mentioned earlier, the key distinguishing feature of VPLS is
968	    that it is a multipoint service.  This means that the entire Service
969	    Provider network should appear as a single logical learning bridge
970	    for each VPLS that the SP network supports.  The logical ports for
971	    the SP "bridge" are the customer ports as well as the pseudowires on
972	    a VE.  Just as a learning bridge learns MAC addresses on its ports,
973	    the SP bridge must learn MAC addresses at its VEs.

975	    Learning consists of associating source MAC addresses of packets with
976	    the (logical) ports on which they arrive; this association is the
977	    Forwarding Information Base (FIB).  The FIB is used for forwarding
978	    packets.  For example, suppose the bridge receives a packet with
979	    source MAC address S on (logical) port P.  If the bridge later
980	    receives a packet with destination MAC address S, it knows that it
981	    should send the packet out on port P.

983	    If a VE learns a source MAC address S on logical port P, then later
984	    sees S on a different port P', then the VE MUST update its FIB to
985	    reflect the new port P'.  A VE MAY implement a mechanism to damp
986	    flapping of source ports for a given MAC address.
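
A minimal sketch of the learning rule for one VPLS instance is given
below. The class name Fib and the use of strings for MAC addresses and
logical ports are assumptions of the sketch; damping of port flapping
is not modeled.

```python
from typing import Dict, Optional


class Fib:
    """Per-VPLS table mapping a source MAC address to a logical port."""

    def __init__(self) -> None:
        self._port_for: Dict[str, str] = {}

    def learn(self, src_mac: str, port: str) -> bool:
        """Record src_mac on port; return True if the entry is new or moved."""
        changed = self._port_for.get(src_mac) != port
        self._port_for[src_mac] = port   # a move to P' overwrites P
        return changed

    def lookup(self, dst_mac: str) -> Optional[str]:
        """Port to forward on, or None if the destination is unknown."""
        return self._port_for.get(dst_mac)
```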

988	4.2.2.  Aging

990	    VPLS PEs SHOULD have an aging mechanism to remove a MAC address
991	    associated with a logical port, much the same as learning bridges do.
992	    This is required so that a MAC address can be relearned if it "moves"
993	    from a logical port to another logical port, either because the
994	    station to which that MAC address belongs really has moved, or
995	    because of a topology change in the LAN that causes this MAC address
996	    to arrive on a new port.  In addition, aging reduces the size of a
997	    VPLS MAC table to just the active MAC addresses, rather than all MAC
998	    addresses in that VPLS.

1000	    The "age" of a source MAC address S on a logical port P is the time
1001	    since it was last seen as a source MAC on port P. If the age exceeds
1002	    the aging time T, S MUST be flushed from the FIB.  This of course
1003	    means that every time S is seen as a source MAC address on port P,
1004	    S's age is reset.
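
The aging rule can be illustrated as follows. Timestamps are passed in
explicitly so the behavior is easy to follow; AgingFib and its method
names are hypothetical, and a real implementation would likely use
timers instead of a periodic scan.

```python
from typing import Dict, List, Optional, Tuple


class AgingFib:
    def __init__(self, aging_time: float) -> None:
        self.aging_time = aging_time                      # the aging time T
        self._entries: Dict[str, Tuple[str, float]] = {}  # mac -> (port, last seen)

    def learn(self, mac: str, port: str, now: float) -> None:
        """Seeing mac as a source on port resets its age."""
        self._entries[mac] = (port, now)

    def expire(self, now: float) -> List[str]:
        """Flush and return every MAC whose age exceeds T."""
        stale = [m for m, (_, seen) in self._entries.items()
                 if now - seen > self.aging_time]
        for m in stale:
            del self._entries[m]
        return stale

    def lookup(self, mac: str) -> Optional[str]:
        entry = self._entries.get(mac)
        return entry[0] if entry else None
```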

1006	    An implementation SHOULD provide a configurable knob to set the aging
1007	    time T on a per-VPLS basis.  In addition, an implementation MAY
1008	    accelerate aging of all MAC addresses in a VPLS if it detects certain
1009	    situations, such as a Spanning Tree topology change in that VPLS.

1011	4.2.3.  Flooding

1013	    When a bridge receives a packet to a destination that is not in its
1014	    FIB, it floods the packet on all the other ports.  Similarly, a VE
1015	    will flood packets to an unknown destination to all other VEs in the
1016	    VPLS.

1018	    In Figure 1 above, if CE2 sent an Ethernet frame to PE2, and the
1019	    destination MAC address on the frame was not in PE2's FIB (for that
1020	    VPLS), then PE2 would be responsible for flooding that frame to every
1021	    other PE in the same VPLS.  On receiving that frame, PE1 would be
1022	    responsible for further flooding the frame to CE1 and CE5 (unless PE1
1023	    knew which CE "owned" that MAC address).

1025	    On the other hand, if PE3 received the frame, it could delegate
1026	    further flooding of the frame to its u-PE.  If PE3 was connected to 2
1027	    u-PEs, it would announce that it has two u-PEs.  PE3 could either
1028	    announce that it is incapable of flooding, in which case it would
1029	    receive two frames, one for each u-PE, or it could announce that it
1030	    is capable of flooding, in which case it would receive one copy of
1031	    the frame, which it would then send to both u-PEs.

1033	4.2.4.  Broadcast and Multicast

1035	    There is a well-known broadcast MAC address.  An Ethernet frame whose
1036	    destination MAC address is the broadcast MAC address must be sent to
1037	    all stations in that VPLS.  This can be accomplished by the same
1038	    means that is used for flooding.

1040	    There is also an easily recognized set of "multicast" MAC addresses.

1042	    Ethernet frames with a destination multicast MAC address MAY be
1043	    broadcast to all stations; a VE MAY also use certain techniques to
1044	    restrict transmission of multicast frames to a smaller set of
1045	    receivers, those that have indicated interest in the corresponding
1046	    multicast group.  Discussion of this is outside the scope of this
1047	    document.

1049	4.2.5.  "Split Horizon" Forwarding

1051	    When a PE capable of flooding (say PEx) receives a broadcast Ethernet
1052	    frame, or one with an unknown destination MAC address, it must flood
1053	    the frame.  If the frame arrived from an attached CE, PEx must send a
1054	    copy of the frame to every other attached CE, as well as to all other
1055	    PEs participating in the VPLS.  If, on the other hand, the frame
1056	    arrived from another PE (say PEy), PEx must send a copy of the frame
1057	    only to attached CEs.  PEx MUST NOT send the frame to other PEs,
1058	    since PEy would have already done so.  This notion has been termed
1059	    "split horizon" forwarding, and is a consequence of the PEs being
1060	    logically fully meshed for VPLS.

1062	    Split horizon forwarding rules apply to broadcast and multicast
1063	    packets, as well as packets to an unknown MAC address.
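
The split horizon rule amounts to choosing the flood set from the type
of port on which the frame arrived. In the sketch below, the function
name and the representation of ports as strings are assumptions.

```python
from typing import List, Sequence


def flood_targets(ingress: str, ce_ports: Sequence[str],
                  pw_ports: Sequence[str]) -> List[str]:
    """Ports to which PEx copies a broadcast, multicast or unknown frame."""
    if ingress in pw_ports:
        # Arrived from another PE: attached CEs only, never other PEs.
        return list(ce_ports)
    # Arrived from a CE: every other attached CE plus all other PEs.
    return [p for p in ce_ports if p != ingress] + list(pw_ports)
```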

1065	4.2.6.  Qualified and Unqualified Learning

1067	    The key for normal Ethernet MAC learning is usually just the
1068	    (6-octet) MAC address.  This is called "unqualified learning".
1069	    However, it is also possible that the key for learning includes the
1070	    VLAN tag when present; this is called "qualified learning".

1072	    In the case of VPLS, learning is done in the context of a VPLS
1073	    instance, which typically corresponds to a customer.  If the customer
1074	    uses VLAN tags, one can make the same distinctions of qualified and
1075	    unqualified learning.  If the key for learning within a VPLS is just
1076	    the MAC address, then this VPLS is operating under unqualified
1077	    learning.  If the key for learning is (customer VLAN tag + MAC
1078	    address), then this VPLS is operating under qualified learning.
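
The difference between the two modes is only the lookup key, as the
following sketch suggests. The names learning_key and learn are
hypothetical, and the FIB is modeled as a plain dictionary.

```python
from typing import Dict, Hashable, Optional


def learning_key(mac: str, vlan: Optional[int], qualified: bool) -> Hashable:
    """(customer VLAN tag, MAC) under qualified learning, else just the MAC."""
    return (vlan, mac) if qualified else mac


def learn(fib: Dict[Hashable, str], mac: str, vlan: Optional[int],
          port: str, qualified: bool) -> None:
    fib[learning_key(mac, vlan, qualified)] = port
```

With the same MAC address seen on VLANs 10 and 20, a qualified FIB
holds two entries while an unqualified FIB holds one, which is why
qualified learning needs larger MAC tables.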

1080	    Choosing between qualified and unqualified learning involves several
1081	    factors, the most important of which is whether one wants a single
1082	    global broadcast domain (unqualified), or a broadcast domain per VLAN
1083	    (qualified).  The latter makes flooding and broadcasting more
1084	    efficient, but requires larger MAC tables.  These considerations
1085	    apply equally to normal Ethernet forwarding and to VPLS.

1087	4.2.7.  Class of Service

1089	    In order to offer different Classes of Service within a VPLS, an
1090	    implementation MAY choose to map 802.1p bits in a customer Ethernet
1091	    frame with a VLAN tag to an appropriate setting of EXP bits in the
1092	    pseudowire and/or tunnel label, allowing for differential treatment
1093	    of VPLS frames in the packet-switched network.

1095	    To be useful, an implementation SHOULD allow this mapping function to
1096	    be different for each VPLS, as each VPLS customer may have their own
1097	    view of the required behavior for a given setting of 802.1p bits.
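
A per-VPLS mapping could be represented as a small table, as in the
sketch below. Both the 802.1p field and the EXP field are 3 bits wide;
the identity default and the names IDENTITY and exp_bits are
assumptions of this sketch, not behavior required by this document.

```python
from typing import Dict

# Assumption: when a VPLS has no table of its own, copy the bits unchanged.
IDENTITY: Dict[int, int] = {pcp: pcp for pcp in range(8)}


def exp_bits(pcp: int, vpls: str, per_vpls_maps: Dict[str, Dict[int, int]]) -> int:
    """EXP setting for a frame with 802.1p value pcp in the given VPLS."""
    if not 0 <= pcp <= 7:
        raise ValueError("802.1p value must be in the range 0 to 7")
    return per_vpls_maps.get(vpls, IDENTITY)[pcp]
```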

1099	5.  Deployment Options

1101	    In deploying a network that supports VPLS, the SP must decide what
1102	    functions the VPLS-aware device closest to the customer (the VE)
1103	    supports.  The default case described in this document is that the VE
1104	    is a PE.  However, there are a number of reasons that the VE might be
1105	    a device that does all the Layer 2 functions (such as MAC address
1106	    learning and flooding), and a limited set of Layer 3 functions (such
1107	    as communicating to its PE), but, for example, doesn't do full-
1108	    fledged discovery and PE-to-PE signaling.  Such a device is called a
1109	    "u-PE".

1111	    As both of these cases have benefits, one would like to be able to
1112	    "mix and match" these scenarios.  The signaling mechanism presented
1113	    here allows this.  For example, in a given provider network, one PE
1114	    may be directly connected to CE devices; another may be connected to
1115	    u-PEs that are connected to CEs; and a third may be connected
1116	    directly to a customer over some interfaces and to u-PEs over others.
1117	    All these PEs perform discovery and signaling in the same manner.
1118	    How they do learning and forwarding depends on whether or not there
1119	    is a u-PE; however, this is a local matter, and is not signaled.
1120	    The details of the operation of a u-PE and its interactions
1121	    with PEs and other u-PEs are beyond the scope of this document.

1123	6.  Security Considerations

1125	    The focus in Virtual Private LAN Service is the privacy of data,
1126	    i.e., that data in a VPLS is only distributed to other nodes in that
1127	    VPLS and not to any external agent or other VPLS.  Note that VPLS
1128	    does not offer confidentiality, integrity, or authentication: VPLS
1129	    packets are sent in the clear in the packet-switched network, and a
1130	    man-in-the-middle can eavesdrop, and may be able to inject packets
1131	    into the data stream.  If security is desired, the PE-to-PE tunnels
1132	    can be IPsec tunnels.  For more security, the end systems in the VPLS
1133	    sites can use appropriate means of encryption to secure their data
1134	    even before it enters the Service Provider network.

1136	    There are two aspects to achieving data privacy in a VPLS: securing
1137	    the control plane, and protecting the forwarding path.  Compromise of
1138	    the control plane could result in a PE sending data belonging to some
1139	    VPLS to another VPLS, or blackholing VPLS data, or even sending it to
1140	    an eavesdropper, none of which are acceptable from a data privacy
1141	    point of view.  Since all control plane exchanges are via BGP,
1142	    techniques such as in [2] help authenticate BGP messages, making it
1143	    harder to spoof updates (which can be used to divert VPLS traffic to
1144	    the wrong VPLS), or withdraws (denial of service attacks).  In the
1145	    multi-AS options (b) and (c), this also means protecting the inter-AS
1146	    BGP sessions between the ASBRs, the PEs, or the Route Reflectors.
1147	    One can also use the techniques described in section 10 (b) and (c)
1148	    of [6], both for the control plane and the data plane.  Note that [2]
1149	    will not help in keeping VPLS labels private -- knowing the labels,
1150	    one can eavesdrop on VPLS traffic.  However, this requires access to
1151	    the data path within a Service Provider network.

1153	    There can also be misconfiguration leading to unintentional
1154	    connection of CEs in different VPLSs.  This can be caused, for
1155	    example, by associating the wrong Route Target with a VPLS instance.
1156	    This problem, shared by [6], is for further study.

1158	    Protecting the data plane requires ensuring that PE-to-PE tunnels are
1159	    well-behaved (this is outside the scope of this document), and that
1160	    VPLS labels are accepted only from valid interfaces.  For a PE, valid
1161	    interfaces comprise links from P routers.  For an ASBR, a valid
1162	    interface is a link from an ASBR in an AS that is part of a given
1163	    VPLS.  It is especially important in the case of multi-AS VPLSs that
1164	    one accept VPLS packets only from valid interfaces.
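
The interface rule above can be sketched as a small predicate. This is an illustrative sketch only, not part of the specification: the names `LinkKind`, `accepts_vpls_label`, and `peer_as_in_vpls` are hypothetical, and the sketch follows the text literally (a PE accepts VPLS-labeled packets only on links from P routers; an ASBR accepts them only on links from an ASBR in an AS that is part of the VPLS).

```python
from enum import Enum


class LinkKind(Enum):
    """Hypothetical classification of the link a labeled packet arrived on."""
    P_ROUTER = "p-router"
    PEER_ASBR = "peer-asbr"
    CUSTOMER = "customer"


def accepts_vpls_label(node_role: str, link_kind: LinkKind,
                       peer_as_in_vpls: bool = False) -> bool:
    """Return True if a VPLS-labeled packet may be accepted on this link.

    node_role is "PE" or "ASBR" (assumed labels for this sketch).
    peer_as_in_vpls says whether the neighboring AS is part of the VPLS;
    it only matters for ASBR-to-ASBR links.
    """
    if node_role == "PE":
        # For a PE, valid interfaces comprise links from P routers.
        return link_kind is LinkKind.P_ROUTER
    if node_role == "ASBR":
        # For an ASBR, the link must come from an ASBR in a participating AS.
        return link_kind is LinkKind.PEER_ASBR and peer_as_in_vpls
    # Unknown roles accept nothing.
    return False
```

A real implementation would derive the link classification from configuration, not from the packet itself; the sketch only shows the accept/drop decision.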

1166	    MPLS-in-IP and MPLS-in-GRE tunneling are specified in [3].  If it is
1167	    desired to use such tunnels to carry VPLS packets, then the security
1168	    considerations described in Section 8 of that document must be fully
1169	    understood.  Any implementation of VPLS that allows VPLS packets to
1170	    be tunneled as described in that document MUST contain an
1171	    implementation of IPsec that can be used as therein described.  If
1172	    the tunnel is not secured by IPsec, then the technique of IP address
1173	    filtering at the border routers, described in Section 8.2 of that
1174	    document, is the only means of ensuring that a packet that exits the
1175	    tunnel at a particular egress PE was actually placed in the tunnel by
1176	    the proper tunnel head node (i.e., that the packet does not have a
1177	    spoofed source address).  Since border routers frequently filter only
1178	    source addresses, packet filtering may not be effective unless the
1179	    egress PE can check the IP source address of any tunneled packet it
1180	    receives, and compare it to a list of IP addresses that are valid
1181	    tunnel head addresses.  Any implementation that allows MPLS-in-IP
1182	    and/or MPLS-in-GRE tunneling to be used without IPsec MUST allow the
1183	    egress PE to validate in this manner the IP source address of any
1184	    tunneled packet that it receives.
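
The egress-PE source address check described above might look like the following minimal sketch. It assumes the list of valid tunnel head addresses is available as configuration; the class name `TunnelHeadFilter` and its methods are hypothetical and are not defined by this document or by [3].

```python
from ipaddress import ip_address


class TunnelHeadFilter:
    """Hypothetical egress-PE check of the outer IP source address."""

    def __init__(self, valid_heads):
        # Parse once so that textual variants of the same address compare equal.
        self._valid = frozenset(ip_address(a) for a in valid_heads)

    def accept(self, outer_src: str) -> bool:
        """Return True only if outer_src is a configured tunnel head address."""
        try:
            src = ip_address(outer_src)
        except ValueError:
            # A malformed source address is never a valid tunnel head.
            return False
        return src in self._valid


# Example with documentation-range addresses (assumed values).
heads = TunnelHeadFilter(["192.0.2.1", "2001:db8::1"])
accepted = heads.accept("192.0.2.1")
rejected = heads.accept("198.51.100.7")
```

This check only compares the source address with the configured list. Without IPsec it does not authenticate the sender, which is why the text also relies on address filtering at the border routers.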

1186	7.  IANA Considerations

1188	    IANA is asked to allocate an AFI for L2VPN information (suggested
1189	    value: 25).  This should be the same as the AFI requested by [11].

1191	    IANA is asked to allocate an extended community value for the Layer2
1192	    Info Extended Community (suggested value: 0x800a).
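
As an illustration of how the two requested code points would appear on the wire, the sketch below packs them with Python's `struct` module. The values 25 and 0x800a are only the suggested values from the text above, not assigned ones. The field layout used for the Layer2 Info Extended Community (2-octet type, 1-octet encapsulation type, 1-octet control flags, 2-octet Layer-2 MTU, 2 reserved octets) is an assumption taken from that community's definition elsewhere in this document; the function names are hypothetical.

```python
import struct

# Suggested (not yet assigned) code points from the IANA request above.
L2VPN_AFI = 25
LAYER2_INFO_EXT_COMM_TYPE = 0x800A


def encode_afi(afi: int = L2VPN_AFI) -> bytes:
    """Encode an AFI as the 2-octet, network-byte-order field used in MP-BGP."""
    return struct.pack("!H", afi)


def encode_layer2_info(encaps_type: int, control_flags: int, mtu: int) -> bytes:
    """Encode an 8-octet Layer2 Info Extended Community (assumed layout).

    Fields, in order: type (2 octets), encaps type (1), control flags (1),
    Layer-2 MTU (2), reserved (2, sent as zero).
    """
    return struct.pack("!HBBHH", LAYER2_INFO_EXT_COMM_TYPE,
                       encaps_type, control_flags, mtu, 0)
```

If IANA assigns different values, only the two constants need to change.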

1194	8.  References

1196	8.1.  Normative References

1198	    [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
1199	         Levels", BCP 14, RFC 2119, March 1997.

1201	    [2]  Heffernan, A., "Protection of BGP Sessions via the TCP MD5
1202	         Signature Option", RFC 2385, August 1998.

1204	    [3]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating MPLS in
1205	         IP or Generic Routing Encapsulation (GRE)", RFC 4023,
1206	         March 2005.

1208	    [4]  Bates, T., "Multiprotocol Extensions for BGP-4",
1209	         draft-ietf-idr-rfc2858bis-10 (work in progress), March 2006.

1211	    [5]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
1212	         Communities Attribute", RFC 4360, February 2006.

1214	    [6]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks
1215	         (VPNs)", RFC 4364, February 2006.

1217	    [7]  Martini, L., Rosen, E., El-Aawar, N., and G. Heron,
1218	         "Encapsulation Methods for Transport of Ethernet over MPLS
1219	         Networks", RFC 4448, April 2006.

1221	8.2.  Informative References

1223	    [8]   Bates, T., Chandra, R., and E. Chen, "BGP Route Reflection - An
1224	          Alternative to Full Mesh IBGP", RFC 2796, April 2000.

1226	    [9]   Andersson, L. and E. Rosen, "Framework for Layer 2 Virtual
1227	          Private Networks (L2VPNs)", draft-ietf-l2vpn-l2-framework-05
1228	          (work in progress), June 2004.

1230	    [10]  Lasserre, M. and V. Kompella, "Virtual Private LAN Services
1231	          Using LDP", draft-ietf-l2vpn-vpls-ldp-09 (work in progress),
1232	          June 2006.

1234	    [11]  Ould-Brahim, H., "Using BGP as an Auto-Discovery Mechanism for
1235	          VR-based Layer-3 VPNs", draft-ietf-l3vpn-bgpvpn-auto-07 (work
1236	          in progress), April 2006.

1238	    [12]  Marques, P., "Constrained VPN Route Distribution",
1239	          draft-ietf-l3vpn-rt-constrain-02 (work in progress), June 2005.

1241	    [13]  Martini, L., "Pseudowire Setup and Maintenance using the Label
1242	          Distribution Protocol", draft-ietf-pwe3-control-protocol-17
1243	          (work in progress), June 2005.

1245	    [14]  Kompella, K., "Layer 2 VPNs Over Tunnels",
1246	          draft-kompella-l2vpn-l2vpn-01 (work in progress), January 2006.

1248	    [15]  Institute of Electrical and Electronics Engineers, "Information
1249	          technology - Telecommunications and information exchange
1250	          between systems - Local and metropolitan area networks - Common
1251	          specifications - Part 3: Media Access Control (MAC) Bridges:
1252	          Revision. This is a revision of ISO/IEC 10038: 1993, 802.1j-
1253	          1992 and 802.6k-1992.  It incorporates P802.11c, P802.1p and
1254	          P802.12e.  ISO/IEC 15802-3: 1998.", IEEE Standard 802.1D,
1255	          July 1998.

1257	Appendix A.  Contributors

1259	    The following contributed to this document:

1261	            Javier Achirica, Telefonica
1262	            Loa Andersson, Acreo
1263	            Chaitanya Kodeboyina, Juniper
1264	            Giles Heron, Tellabs
1265	            Sunil Khandekar, Alcatel
1266	            Vach Kompella, Alcatel
1267	            Marc Lasserre, Riverstone
1268	            Pierre Lin
1269	            Pascal Menezes
1270	            Ashwin Moranganti, Appian
1271	            Hamid Ould-Brahim, Nortel
1272	            Seo Yeong-il, Korea Tel

1274	Appendix B.  Acknowledgements

1276	    Thanks to Joe Regan and Alfred Nothaft for their contributions.  Many
1277	    thanks too to Eric Ji, Chaitanya Kodeboyina, Mike Loomis and Elwyn
1278	    Davies for their detailed reviews.

1280	Authors' Addresses

1282	    Kireeti Kompella (editor)
1283	    Juniper Networks
1284	    1194 N. Mathilda Ave.
1285	    Sunnyvale, CA  94089
1286	    US

1288	    Email: kireeti@juniper.net

1290	    Yakov Rekhter (editor)
1291	    Juniper Networks
1292	    1194 N. Mathilda Ave.
1293	    Sunnyvale, CA  94089
1294	    US

1296	    Email: yakov@juniper.net

1298	Intellectual Property Statement

1300	    The IETF takes no position regarding the validity or scope of any
1301	    Intellectual Property Rights or other rights that might be claimed to
1302	    pertain to the implementation or use of the technology described in
1303	    this document or the extent to which any license under such rights
1304	    might or might not be available; nor does it represent that it has
1305	    made any independent effort to identify any such rights.  Information
1306	    on the procedures with respect to rights in RFC documents can be
1307	    found in BCP 78 and BCP 79.

1309	    Copies of IPR disclosures made to the IETF Secretariat and any
1310	    assurances of licenses to be made available, or the result of an
1311	    attempt made to obtain a general license or permission for the use of
1312	    such proprietary rights by implementers or users of this
1313	    specification can be obtained from the IETF on-line IPR repository at
1314	    http://www.ietf.org/ipr.

1316	    The IETF invites any interested party to bring to its attention any
1317	    copyrights, patents or patent applications, or other proprietary
1318	    rights that may cover technology that may be required to implement
1319	    this standard.  Please address the information to the IETF at
1320	    ietf-ipr@ietf.org.

1322	Disclaimer of Validity

1324	    This document and the information contained herein are provided on an
1325	    "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1326	    OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1327	    ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1328	    INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1329	    INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1330	    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1332	Copyright Statement

1334	    Copyright (C) The Internet Society (2006).  This document is subject
1335	    to the rights, licenses and restrictions contained in BCP 78, and
1336	    except as set forth therein, the authors retain all their rights.

1338	Acknowledgment

1340	    Funding for the RFC Editor function is currently provided by the
1341	    Internet Society.