idnits 2.17.1 

draft-ietf-l2vpn-vpls-bgp-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1241.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1218.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1225.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1231.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 3 instances of too long lines in the document, the longest one
     being 2 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (December 28, 2005) is 6687 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2385 (ref. '2') (Obsoleted by RFC 5925)

  == Outdated reference: A later version (-10) exists of
     draft-ietf-idr-rfc2858bis-07

  -- Obsolete informational reference (is this intentional?): RFC 2796 (ref.
     '6') (Obsoleted by RFC 4456)

  == Outdated reference: A later version (-09) exists of
     draft-ietf-l2vpn-vpls-ldp-08

  == Outdated reference: A later version (-09) exists of
     draft-ietf-l3vpn-bgpvpn-auto-06

  == Outdated reference: A later version (-10) exists of
     draft-kompella-l2vpn-l2vpn-00


     Summary: 5 errors (**), 0 flaws (~~), 6 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                   K. Kompella, Ed.
3	Internet-Draft                                           Y. Rekhter, Ed.
4	Expires: July 1, 2006                                   Juniper Networks
5	                                                       December 28, 2005

7	                      Virtual Private LAN Service
8	                      draft-ietf-l2vpn-vpls-bgp-06

10	Status of this Memo

12	   By submitting this Internet-Draft, each author represents that any
13	   applicable patent or other IPR claims of which he or she is aware
14	   have been or will be disclosed, and any of which he or she becomes
15	   aware will be disclosed, in accordance with Section 6 of BCP 79.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt.

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	   This Internet-Draft will expire on July 1, 2006.

35	Copyright Notice

37	   Copyright (C) The Internet Society (2005).

39	Abstract

41	   Virtual Private LAN (Local Area Network) Service (VPLS), also known
42	   as Transparent LAN Service and Virtual Private Switched Network
43	   service, is a useful Service Provider offering.  The service offers a
44	   Layer 2 Virtual Private Network (VPN); however, in the case of VPLS,
45	   the customers in the VPN are connected by a multipoint Ethernet LAN,
46	   in contrast to the usual Layer 2 VPNs, which are point-to-point in
47	   nature.

49	   This document describes the functions required to offer VPLS, a
50	   mechanism for signaling a VPLS, and rules for forwarding VPLS frames
51	   across a packet switched network.

53	Table of Contents

55	   1.  Introduction
56	     1.1.  Scope of this Document
57	     1.2.  Conventions used in this document
58	     1.3.  Changes from version 05 to 06
59	     1.4.  Changes from version 04 to 05
60	     1.5.  Changes from version 03 to 04
61	   2.  Functional Model
62	     2.1.  Terminology
63	     2.2.  Assumptions
64	     2.3.  Interactions
65	   3.  Control Plane
66	     3.1.  Autodiscovery
67	       3.1.1.  Functions
68	       3.1.2.  Protocol Specification
69	     3.2.  Signaling
70	       3.2.1.  Label Blocks
71	       3.2.2.  VPLS BGP NLRI
72	       3.2.3.  PW Setup and Teardown
73	       3.2.4.  Signaling PE Capabilities
74	     3.3.  BGP VPLS Operation
75	     3.4.  Multi-AS VPLS
76	       3.4.1.  a) VPLS-to-VPLS connections at the ASBRs.
77	       3.4.2.  b) EBGP redistribution of VPLS information between
78	               ASBRs.
79	       3.4.3.  c) Multi-hop EBGP redistribution of VPLS
80	               information between ASes.
81	       3.4.4.  Allocation of VE IDs Across Multiple ASes
82	     3.5.  Multi-homing and Path Selection
83	     3.6.  Hierarchical BGP VPLS
84	   4.  Data Plane
85	     4.1.  Encapsulation
86	     4.2.  Forwarding
87	       4.2.1.  MAC address learning
88	       4.2.2.  Aging
89	       4.2.3.  Flooding
90	       4.2.4.  Broadcast and Multicast
91	       4.2.5.  "Split Horizon" Forwarding
92	       4.2.6.  Qualified and Unqualified Learning
93	       4.2.7.  Class of Service
94	   5.  Deployment Options
95	   6.  Security Considerations
96	   7.  IANA Considerations
97	   8.  References
98	     8.1.  Normative References
99	     8.2.  Informative References
100	   Appendix A.  Contributors
101	   Appendix B.  Acknowledgements
102	   Authors' Addresses
103	   Intellectual Property and Copyright Statements

105	1.  Introduction

107	   Virtual Private LAN Service (VPLS), also known as Transparent LAN
108	   Service and Virtual Private Switched Network service, is a useful
109	   service offering.  A Virtual Private LAN appears in (almost) all
110	   respects as an Ethernet LAN to customers of a Service Provider.
111	   However, in a VPLS, the customers are not all connected to a single
112	   LAN; the customers may be spread across a metro or wide area.  In
113	   essence, a VPLS glues together several individual LANs across a
114	   packet-switched network to appear and function as a single LAN ([7]).
115	   This is accomplished by incorporating MAC address learning, flooding
116	   and forwarding functions in the context of pseudowires that connect
117	   these individual LANs across the packet-switched network.

119	   This document details the functions needed to offer VPLS, and then
120	   goes on to describe a mechanism for the autodiscovery of the
121	   endpoints of a VPLS as well as for signaling a VPLS.  It also
122	   describes how VPLS frames are transported over tunnels across a
123	   packet switched network.  The autodiscovery and signaling mechanism
124	   uses BGP as the control plane protocol.  This document also briefly
125	   discusses deployment options, in particular, the notion of decoupling
126	   functions across devices.

128	   Alternative approaches include: [13], which allows one to build a
129	   Layer 2 VPN with Ethernet as the interconnect; and [12], which
130	   allows one to set up an Ethernet connection across a packet-switched
131	   network.  Both of these, however, offer point-to-point Ethernet
132	   services.  What distinguishes VPLS from the above two is that a VPLS
133	   offers a multipoint service.  A mechanism for setting up pseudowires
134	   for VPLS using the Label Distribution Protocol (LDP) is defined in
135	   [8].

137	1.1.  Scope of this Document

139	   This document has four major parts: defining a VPLS functional model;
140	   defining a control plane for setting up VPLS; defining the data plane
141	   for VPLS (encapsulation and forwarding of data); and defining various
142	   deployment options.

144	   The functional model underlying VPLS is laid out in Section 2.  This
145	   describes the service being offered, the network components that
146	   interact to provide the service, and at a high level their
147	   interactions.

149	   The control plane described in this document uses Multiprotocol BGP
150	   [3] to establish VPLS service, i.e., for the autodiscovery of VPLS
151	   members and for the setup and teardown of the pseudowires that
152	   constitute a given VPLS instance.  Section 3 focuses on this, and
153	   also describes how a VPLS that spans Autonomous System boundaries is
154	   set up, as well as how multi-homing is handled.  Using BGP as the
155	   control plane for VPNs is not new (see [13], [10] and [9]): what is
156	   described here is based on the mechanisms proposed in [10].

158	   The forwarding plane and the actions that a participating Provider
159	   Edge (PE) router offering the VPLS service must take are described in
160	   Section 4.

162	   In Section 5, the notion of 'decoupled' operation is defined, and the
163	   interaction of decoupled and non-decoupled PEs is described.
164	   Decoupling allows for more flexible deployment of VPLS.

166	1.2.  Conventions used in this document

168	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
169	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
170	   document are to be interpreted as described in RFC 2119 ([1]).

172	1.3.  Changes from version 05 to 06

174	   [NOTE to RFC Editor: this section is to be removed before
175	   publication.]

177	   Changes in response to GenART review.

179	   Updated Abstract and Introduction to make it clear that VPLS is an
180	   Ethernet-based service.

182	   Added sections on Aging, Broadcast and Multicast, Qualified and
183	   Unqualified learning and CoS.  Also added a section on scaling the
184	   BGP control plane.  These were requested for consistency between the
185	   BGP and LDP VPLS documents.

187	   Added a section clarifying the concepts of label blocks, why they are
188	   necessary and how they are used.

190	   For multi-AS operation, added a short introduction to the three
191	   options, comparing their usage.

193	   Lots of clean-up: consistent usage of terms, expansion of acronyms
194	   before use, references.

196	1.4.  Changes from version 04 to 05

198	   [NOTE to RFC Editor: this section is to be removed before
199	   publication.]
200	   Updated IANA section to reflect agreement with authors of [9] that
201	   the two docs should use the same AFI for L2VPN information.

203	   Addressed comments received from Alex Zinin.  No technical changes,
204	   but a more complete description to cover the issues that Alex raised:

206	   1.  encoding of BGP NEXT_HOP for the new AFI/SAFI is not described

208	   2.  VE ID, Block offset, Block size, Label base are not described
209	       anywhere

211	   3.  no information on how the receiving PE chooses the PW label

213	   4.  section 3.2.2 talks about PE capabilities all of a sudden and
214	       introduces a L2 Info Community, whose fields and use are not
215	       described

217	   Changes to address these:

219	   1.  Broke up section 3.2.1 into "Concepts" and "PW Setup".

221	   2.  Expanded section on "Signaling PE Capabilities".

223	   3.  Added a new section 3.3 "BGP VPLS Operation".

225	   4.  Minor tweaking, e.g. to fix section number references.

227	1.5.  Changes from version 03 to 04

229	   [NOTE to RFC Editor: this section is to be removed before
230	   publication.]

232	   Incorporated IDR review comments from Eric Ji, Chaitanya Kodeboyina,
233	   and Mike Loomis.  Most changes are clarifications and rewording for
234	   better readability.  The substantive changes are to remove several
235	   flags from the control field.

237	2.  Functional Model

239	   The model is described with reference to the following figure.

241	                                                       -----
242	                                                      /  A1 \
243	        ----                                     ____CE1     |
244	       /    \          --------       --------  /    |       |
245	      |  A2 CE2-      /        \     /        PE1     \     /
246	       \    /   \    /          \___/          | \     -----
247	        ----     ---PE2                        |  \
248	                    |                          |   \   -----
249	                    | Service Provider Network |    \ /     \
250	                    |                          |     CE5  A5 |
251	                    |            ___           |   /  \     /
252	             |----|  \          /   \         PE4_/    -----
253	             |u-PE|--PE3       /     \       /
254	             |----|    --------       -------
255	      ----  /   |    ----
256	     /    \/    \   /    \               CE = Customer Edge Device
257	    |  A3 CE3    --CE4 A4 |              PE = Provider Edge Router
258	     \    /         \    /               u-PE = Layer 2 Aggregation
259	      ----           ----                A<n> = Customer site n

261	   Figure 1: Example of a VPLS

263	2.1.  Terminology

265	   Terminology similar to that in [10] is used: a Service Provider (SP)
266	   network with P (Provider-only) and PE (Provider Edge) routers, and
267	   customers with CE (Customer Edge) devices.  Here, however, there is
268	   an additional concept, that of a "u-PE", a Layer 2 PE device used for
269	   Layer 2 aggregation.  The notion of u-PE is described further in
270	   Section 5.  PE and u-PE devices are "VPLS-aware", which means that
271	   they know that a VPLS service is being offered.  We will call these
272	   VPLS edge devices, which could be either a PE or a u-PE, a VE.

274	   In contrast, the CE device (which may be owned and operated by either
275	   the SP or the customer) is VPLS-unaware; as far as the CE is
276	   concerned, it is connected to the other CEs in the VPLS via a Layer 2
277	   switched network.  This means that there should be no changes to a CE
278	   device, either to the hardware or the software, in order to offer
279	   VPLS.

281	   A CE device may be connected to a PE or a u-PE via Layer 2 switches
282	   that are VPLS-unaware.  From a VPLS point of view, such Layer 2
283	   switches are invisible, and hence will not be discussed further.
284	   Furthermore, a u-PE may be connected to a PE via Layer 2 and Layer 3
285	   devices; this will be discussed further in a later section.

287	   The term "demultiplexor" refers to an identifier in a data packet
288	   that identifies both the VPLS to which the packet belongs as well as
289	   the ingress PE.  In this document, the demultiplexor is an MPLS
290	   label.

292	   The term "VPLS" will refer to the service as well as a particular
293	   instantiation of the service (i.e., an emulated LAN); it should be
294	   clear from the context which usage is intended.

296	2.2.  Assumptions

298	   The Service Provider Network is a packet switched network.  The PEs
299	   are assumed to be (logically) fully meshed with tunnels over which
300	   packets that belong to a service (such as VPLS) are encapsulated and
301	   forwarded.  These tunnels can be IP tunnels, such as GRE, or MPLS
302	   tunnels, established by RSVP-TE or LDP.  These tunnels are
303	   established independently of the services offered over them; the
304	   signaling and establishment of these tunnels are not discussed in
305	   this document.

307	   "Flooding" and MAC address "learning" (see Section 4) are an integral
308	   part of VPLS.  However, these activities are private to an SP device,
309	   i.e., in the VPLS described below, no SP device requests another SP
310	   device to flood packets or learn MAC addresses on its behalf.

312	   All the PEs participating in a VPLS are assumed to be fully meshed in
313	   the data plane, i.e., there is a bidirectional pseudowire between
314	   every pair of PEs participating in that VPLS, and thus every
315	   (ingress) PE can send a VPLS packet to the egress PE(s) directly,
316	   without the need for an intermediate PE (see Section 4.2.5).  This
317	   requires that VPLS PEs are logically fully meshed in the control
318	   plane so that a PE can send a message to another PE to set up the
319	   necessary pseudowires.  See Section 3.6 for a discussion on
320	   alternatives to achieve a logical full mesh in the control plane.
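
   As an illustration only, the cost of the data plane full mesh can
   be sketched as follows.  The function name is hypothetical and the
   PE names are borrowed from Figure 1; the sketch simply enumerates
   the unordered PE pairs, each of which needs one bidirectional
   pseudowire.

```python
from itertools import combinations


def full_mesh_pseudowires(pes):
    """Return every unordered PE pair; each pair needs one bidirectional PW."""
    return list(combinations(sorted(pes), 2))


pes = ["PE1", "PE2", "PE3", "PE4"]  # illustrative names taken from Figure 1
pairs = full_mesh_pseudowires(pes)
print(len(pairs))  # N * (N - 1) / 2 pairs for N PEs
```

   For N PEs the number of pairs grows as N * (N - 1) / 2, which is
   one reason the per-PE configuration burden matters.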

322	2.3.  Interactions

324	   VPLS is a "LAN Service" in that CE devices that belong to VPLS V can
325	   interact through the SP network as if they were connected by a LAN.
326	   VPLS is "private" in that CE devices that belong to different VPLSs
327	   cannot interact.  VPLS is "virtual" in that multiple VPLSs can be
328	   offered over a common packet switched network.

330	   PE devices interact to "discover" all the other PEs participating in
331	   the same VPLS, and to exchange demultiplexors.  These interactions
332	   are control-driven, not data-driven.

334	   u-PEs interact with PEs to establish connections with remote PEs or
335	   u-PEs in the same VPLS.  This interaction is control-driven.

337	   PE devices can participate simultaneously in both VPLS and IP VPNs
338	   ([10]).  These are independent services, and the information
339	   exchanged for each type of service is kept separate as the Network
340	   Layer Reachability Information (NLRI) used for this exchange have
341	   different Address Family Identifiers (AFI) and Subsequent Address
342	   Family Identifiers (SAFI).  Consequently, an implementation MUST
343	   maintain a separate routing storage for each service.  However,
344	   multiple services can use the same underlying tunnels; the VPLS or
345	   VPN label is used to demultiplex the packets belonging to different
346	   services.
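
   A minimal sketch of separate routing storage per service, keyed by
   (AFI, SAFI), is given below.  It is not a prescribed design.  The
   class and constant names are hypothetical; the VPLS SAFI (65) is
   taken from Section 3.2.2, while the L2VPN AFI value (25) and the
   IP VPN family (1, 128) are assumptions made for the example, since
   this document leaves the AFI to be assigned by IANA.

```python
from collections import defaultdict

# (AFI, SAFI) pairs.  SAFI 65 comes from Section 3.2.2; the AFI value 25
# and the IP VPN family (1, 128) are assumptions made for this example.
VPLS_FAMILY = (25, 65)
IPVPN_FAMILY = (1, 128)


class RoutingStore:
    """Keeps one independent table per (AFI, SAFI) address family."""

    def __init__(self):
        self._tables = defaultdict(dict)

    def install(self, family, key, route):
        self._tables[family][key] = route

    def lookup(self, family, key):
        return self._tables[family].get(key)


store = RoutingStore()
store.install(VPLS_FAMILY, "rd-1", "vpls route")
store.install(IPVPN_FAMILY, "rd-1", "ip vpn route")
print(store.lookup(VPLS_FAMILY, "rd-1"))
```

   The same key can be installed for both services without either
   entry affecting the other.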

348	3.  Control Plane

350	   There are two primary functions of the VPLS control plane:
351	   autodiscovery, and setup and teardown of the pseudowires that
352	   constitute the VPLS, often called signaling.  Section 3.1 and
353	   Section 3.2 describe these functions.  Both of these functions are
354	   accomplished with a single BGP Update advertisement; Section 3.3
355	   describes how this is done by detailing BGP protocol operation for
356	   VPLS.  Section 3.4 describes the setting up of pseudowires that span
357	   Autonomous Systems.  Section 3.5 describes how multi-homing is
358	   handled.

360	3.1.  Autodiscovery

362	   Discovery refers to the process of finding all the PEs that
363	   participate in a given VPLS instance.  A PE can either be configured
364	   with the identities of all the other PEs in a given VPLS, or the PE
365	   can use some protocol to discover the other PEs.  The latter is
366	   called autodiscovery.

368	   The former approach is fairly configuration-intensive, especially
369	   since it is required that the PEs participating in a given VPLS are
370	   fully meshed (i.e., that every PE in a given VPLS establish
371	   pseudowires to every other PE in that VPLS).  Furthermore, when the
372	   topology of a VPLS changes (i.e., a PE is added to, or removed from
373	   the VPLS), the VPLS configuration on all PEs in that VPLS must be
374	   changed.

376	   In the autodiscovery approach, each PE "discovers" which other PEs
377	   are part of a given VPLS by means of some protocol, in this case BGP.
378	   This allows each PE's configuration to consist only of the identity
379	   of the VPLS instance established on this PE, not the identity of
380	   every other PE in that VPLS instance -- that is auto-discovered.
381	   Moreover, when the topology of a VPLS changes, only the affected PE's
382	   configuration changes; other PEs automatically find out about the
383	   change and adapt.

385	3.1.1.  Functions

387	   A PE that participates in a given VPLS instance V must be able to
388	   tell all other PEs in VPLS V that it is also a member of V.  A PE must
389	   also have a means of declaring that it no longer participates in a
390	   VPLS.  To do both of these, the PE must have a means of identifying a
391	   VPLS and a means by which to communicate to all other PEs.

393	   u-PE devices also need to know what constitutes a given VPLS;
394	   however, they don't need the same level of detail.  The PE (or PEs)
395	   to which a u-PE is connected gives the u-PE an abstraction of the
396	   VPLS; this is described in Section 5.

398	3.1.2.  Protocol Specification

400	   The specific mechanism for autodiscovery described here is based on
401	   [13] and [10]; it uses BGP extended communities [4] to identify
402	   members of a VPLS, in particular, the Route Target community, whose
403	   format is described in [4].  The semantics of the use of Route
404	   Targets is described in [10]; their use in VPLS is identical.

406	   As it has been assumed that VPLSs are fully meshed, a single Route
407	   Target RT suffices for a given VPLS V, and in effect that RT is the
408	   identifier for VPLS V.

410	   A PE announces (typically via I-BGP) that it belongs to VPLS V by
411	   annotating its NLRIs for V (see next subsection) with Route Target
412	   RT, and acts on this by accepting NLRIs from other PEs that have
413	   Route Target RT.  A PE announces that it no longer participates in V
414	   by withdrawing all NLRIs that it had advertised with Route Target RT.
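
   The Route Target based membership rule above can be sketched as
   follows.  This is an illustration under stated assumptions, not a
   BGP implementation: the class, its methods, and the string used as
   a Route Target are all hypothetical, and an NLRI is represented by
   an opaque key.

```python
class VplsMembership:
    """Tracks which received NLRIs belong to the locally configured VPLS."""

    def __init__(self, route_target):
        self.route_target = route_target
        self.members = {}  # NLRI key -> advertising PE

    def on_update(self, pe, nlri_key, route_targets):
        # Accept only NLRIs annotated with this VPLS's Route Target.
        if self.route_target in route_targets:
            self.members[nlri_key] = pe
            return True
        return False

    def on_withdraw(self, nlri_key):
        # A withdrawn NLRI no longer contributes to membership.
        self.members.pop(nlri_key, None)

    def remote_pes(self):
        return set(self.members.values())


vpls = VplsMembership("RT-foo")
vpls.on_update("PE-a", "nlri-1", {"RT-foo"})
vpls.on_update("PE-c", "nlri-2", {"RT-bar"})  # different VPLS, not accepted
print(vpls.remote_pes())
```

   Once a PE has withdrawn every NLRI it advertised with the Route
   Target, it no longer appears in the set of remote PEs.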

416	3.2.  Signaling

418	   Once discovery is done, each pair of PEs in a VPLS must be able to
419	   establish (and tear down) pseudowires to each other, i.e., exchange
420	   (and withdraw) demultiplexors.  This process is known as signaling.
421	   Signaling is also used to transmit certain characteristics of the
422	   pseudowires that a PE sets up for a given VPLS.

424	   Recall that a demultiplexor is used to distinguish among several
425	   different streams of traffic carried over a tunnel, each stream
426	   possibly representing a different service.  In the case of VPLS, the
427	   demultiplexor not only says to which specific VPLS a packet belongs,
428	   but also identifies the ingress PE.  The former information is used
429	   for forwarding the packet; the latter information is used for
430	   learning MAC addresses.  The demultiplexor described here is an MPLS
431	   label.  However, note that the PE-to-PE tunnels need not be MPLS
432	   tunnels.

434	   Using a distinct BGP Update message to send a demultiplexor to each
435	   remote PE would require the originating PE to send N such messages
436	   for N remote PEs.  The solution described in this document allows a
437	   PE to send a single (common) Update message that contains
438	   demultiplexors for all the remote PEs, instead of N individual
439	   messages.  Doing this reduces the control plane load both on the
440	   originating PE as well as on the BGP Route Reflectors that may be
441	   involved in distributing this Update to other PEs.

443	3.2.1.  Label Blocks

445	   To accomplish this, we introduce the notion of "label blocks".  A
446	   label block, defined by a label base LB and a VE block size VBS, is a
447	   contiguous set of labels {LB, LB+1, ..., LB+VBS-1}.  Here's how label
448	   blocks work.  All PEs within a given VPLS are assigned unique VE IDs
449	   as part of their configuration.  A PE X wishing to send a VPLS update
450	   sends the same label block information to all other PEs.  Each
451	   receiving PE infers the label intended for PE X by adding its
452	   (unique) VE ID to the label base.  In this manner, each receiving PE
453	   gets a unique demultiplexor for PE X for that VPLS.

455	   This simple notion is enhanced with the concept of a VE block offset
456	   VBO.  A label block defined by <LB, VBO, VBS> is the set {LB+VBO,
457	   LB+VBO+1, ..., LB+VBO+VBS-1}.  Thus, instead of a single large label
458	   block to cover all VE IDs in a VPLS, one can have several label
459	   blocks, each with a different label base.  This makes label block
460	   management easier, and also allows PE X to cater gracefully to a PE
461	   joining a VPLS with a VE ID that is not covered by the set of label
462	   blocks that PE X has already advertised.

464	   When a PE starts up, or is configured with a new VPLS instance, the
465	   BGP process may wish to wait to receive several advertisements for
466	   that VPLS instance from other PEs to improve the efficiency of label
467	   block allocation.

469	3.2.2.  VPLS BGP NLRI

471	   The VPLS BGP NLRI described below, with a new AFI and SAFI (see [3])
472	   is used to exchange VPLS membership and demultiplexors.

474	   A VPLS BGP NLRI has the following information elements: a VE ID, a VE
475	   Block Offset, a VE Block Size and a label base.  The format of the
476	   VPLS NLRI is given below.  The AFI is the L2VPN AFI (to be assigned
477	   by IANA), and the SAFI is the VPLS SAFI (65).  The Length field is in
478	   octets.

480	      +------------------------------------+
481	      |  Length (2 octets)                 |
482	      +------------------------------------+
483	      |  Route Distinguisher  (8 octets)   |
484	      +------------------------------------+
485	      |  VE ID (2 octets)                  |
486	      +------------------------------------+
487	      |  VE Block Offset (2 octets)        |
488	      +------------------------------------+
489	      |  VE Block Size (2 octets)          |
490	      +------------------------------------+
491	      |  Label Base (3 octets)             |
492	      +------------------------------------+

494	   Figure 2: BGP NLRI for VPLS Information
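
   The layout in Figure 2 can be sketched as a pair of encode and
   decode helpers.  This is a sketch under assumptions that the text
   above does not settle: the Length field is taken to count the 17
   octets that follow it, and the Label Base is packed as a plain
   24-bit big-endian integer.  The function names are hypothetical.

```python
import struct

NLRI_BODY_LEN = 8 + 2 + 2 + 2 + 3  # RD + VE ID + VBO + VBS + Label Base = 17


def pack_vpls_nlri(rd, ve_id, vbo, vbs, label_base):
    """rd is 8 raw octets; label_base is packed as a 24-bit integer (assumption)."""
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    if not 0 <= label_base < 1 << 24:
        raise ValueError("label base must fit in 3 octets")
    body = rd + struct.pack("!HHH", ve_id, vbo, vbs) + label_base.to_bytes(3, "big")
    # Assumption: Length counts the octets after the Length field itself.
    return struct.pack("!H", len(body)) + body


def unpack_vpls_nlri(data):
    (length,) = struct.unpack_from("!H", data, 0)
    if length != NLRI_BODY_LEN or len(data) < 2 + length:
        raise ValueError("unexpected VPLS NLRI length")
    rd = data[2:10]
    ve_id, vbo, vbs = struct.unpack_from("!HHH", data, 10)
    label_base = int.from_bytes(data[16:19], "big")
    return rd, ve_id, vbo, vbs, label_base
```

   A round trip through both helpers returns the original field
   values, which is a convenient sanity check for the offsets.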

496	   A PE participating in a VPLS must have at least one VE ID.  If the PE
497	   is the VE, it typically has one VE ID.  If the PE is connected to
498	   several u-PEs, it has a distinct VE ID for each u-PE.  It may
499	   additionally have a VE ID for itself, if it itself acts as a VE for
500	   that VPLS.  In what follows, we will call the PE announcing the VPLS
501	   NLRI PE-a, and we will assume that PE-a owns VE ID V (either
502	   belonging to PE-a itself, or to a u-PE connected to PE-a).

504	   VE IDs are typically assigned by the network administrator.  Their
505	   scope is local to a VPLS.  A given VE ID should belong to only one
506	   PE, unless a CE is multi-homed (see Section 3.5).

508	   A label block is a set of demultiplexor labels used to reach a given
509	   VE ID.  A VPLS BGP NLRI with VE ID V, VE Block Offset VBO, VE Block
510	   Size VBS and label base LB communicates to its peers the following:

512	       label block for V: labels from LB to (LB + VBS - 1), and

514	       remote VE set for V: from VBO to (VBO + VBS - 1).

516	   There is a one-to-one correspondence between the remote VE set and
517	   the label block: VE ID (VBO + n) corresponds to label (LB + n).
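
   The one-to-one correspondence stated above can be written as a
   small function.  The function name and the sample values are
   illustrative only; the arithmetic follows the rule that VE ID
   (VBO + n) corresponds to label (LB + n).

```python
def label_for_ve(ve_id, vbo, vbs, lb):
    """Return the label for ve_id from the announced block, or None."""
    # The VE ID must fall within the remote VE set [VBO, VBO + VBS - 1].
    if vbo <= ve_id < vbo + vbs:
        return lb + (ve_id - vbo)
    return None


# Illustrative values: a block covering VE IDs 1..8 with labels 1000..1007.
print(label_for_ve(3, vbo=1, vbs=8, lb=1000))  # 1002
print(label_for_ve(9, vbo=1, vbs=8, lb=1000))  # None
```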

519	3.2.3.  PW Setup and Teardown

521	   Suppose PE-a is part of VPLS foo, and makes an announcement with VE
522	   ID V, VE Block Offset VBO, VE Block Size VBS and label base LB.  If
523	   PE-b is also part of VPLS foo, and has VE ID W, PE-b does the
524	   following:

526	   1.  checks if W is part of PE-a's 'remote VE set': if VBO <= W < VBO
527	       + VBS, then W is part of PE-a's remote VE set.  If not, PE-b
528	       ignores this message, and skips the rest of this procedure.

530	   2.  sets up a PW to PE-a: the demultiplexor label to send traffic
531	       from PE-b to PE-a is computed as (LB + W - VBO).

533	   3.  checks if V is part of any 'remote VE set' that PE-b announced,
534	       i.e., PE-b checks if V belongs to some remote VE set that PE-b
535	       announced, say with VE Block Offset VBO', VE Block Size VBS' and
536	       label base LB'.  If not, PE-b MUST make a new announcement as
537	       described in Section 3.3.

539	   4.  sets up a PW from PE-a: the demultiplexor label over which PE-b
540	       should expect traffic from PE-a is computed as: (LB' + V - VBO').
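
   Steps 1 through 4 can be sketched as follows.  This is a minimal
   illustration, not a normative procedure: the Announcement class,
   the function name, and the convention of returning None are all
   hypothetical, and the new announcement required by step 3 is only
   signaled to the caller rather than performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    ve_id: int  # V
    vbo: int    # VE Block Offset
    vbs: int    # VE Block Size
    lb: int     # label base

    def covers(self, ve_id):
        return self.vbo <= ve_id < self.vbo + self.vbs

    def label_for(self, ve_id):
        return self.lb + ve_id - self.vbo


def process_announcement(remote, local_ve_id, local_blocks):
    """Run steps 1-4 as PE-b (VE ID W) on receiving PE-a's announcement.

    Returns None if the announcement is ignored (step 1).  Otherwise
    returns (tx_label, rx_label); rx_label is None when no block that
    PE-b has announced covers V, so a new announcement is needed first.
    """
    # Step 1: is W part of PE-a's remote VE set?
    if not remote.covers(local_ve_id):
        return None
    # Step 2: label PE-b uses to send traffic to PE-a.
    tx_label = remote.label_for(local_ve_id)
    # Step 3: does any block that PE-b announced cover V?
    own = next((b for b in local_blocks if b.covers(remote.ve_id)), None)
    if own is None:
        return tx_label, None
    # Step 4: label on which PE-b expects traffic from PE-a.
    return tx_label, own.label_for(remote.ve_id)
```

   For example, with PE-a announcing <V=1, VBO=1, VBS=8, LB=1000> and
   PE-b (W=2) having announced <VBO=1, VBS=8, LB=2000>, PE-b sends on
   label 1001 and expects traffic from PE-a on label 2000.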

542	   If PE-a withdraws an NLRI for V that PE-b was using, then PE-b MUST
543	   tear down its ends of the pseudowire between PE-a and PE-b.

545	3.2.4.  Signaling PE Capabilities

547	   The following extended attribute, the "Layer2 Info Extended
548	   Community", is used to signal control information about the
549	   pseudowires to be set up for a given VPLS.  This information includes
550	   the Encaps Type (type of encapsulation on the pseudowires), Control
551	   Flags (control information regarding the pseudowires) and the Maximum
552	   Transmission Unit (MTU) to be used on the pseudowires.

554	   The Encaps Type for VPLS is 19.

556	      +------------------------------------+
557	      | Extended community type (2 octets) |
558	      +------------------------------------+
559	      |  Encaps Type (1 octet)             |
560	      +------------------------------------+
561	      |  Control Flags (1 octet)           |
562	      +------------------------------------+
563	      |  Layer-2 MTU (2 octets)            |
564	      +------------------------------------+
565	      |  Reserved (2 octets)               |
566	      +------------------------------------+

568	   Figure 3: Layer2 Info Extended Community

570	       0 1 2 3 4 5 6 7
571	      +-+-+-+-+-+-+-+-+
572	      |   MBZ     |C|S|      (MBZ = MUST Be Zero)
573	      +-+-+-+-+-+-+-+-+

575	   Figure 4: Control Flags Bit Vector

577	   With reference to Figure 4, the following bits in the Control Flags
578	   are defined; the remaining bits, designated MBZ, MUST be set to zero
579	   when sending and MUST be ignored when receiving this community.

581	        Name   Meaning
582	           C   A Control word ([5]) MUST or MUST NOT be present when
583	               sending VPLS packets to this PE, depending on whether C
584	               is 1 or 0, respectively
585	           S   Sequenced delivery of frames MUST or MUST NOT be used
586	               when sending VPLS packets to this PE, depending on
587	               whether S is 1 or 0, respectively
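
   As a rough illustration of the 8-octet layout in Figure 3, the
   community could be packed and parsed as below.  Two assumptions are
   made that this section does not state: bit 0 in Figure 4 is taken to
   be the most significant bit (so C is 0x02 and S is 0x01), and the
   extended community type is passed in by the caller, since its value
   is not given here.  The function names are hypothetical.

```python
import struct

ENCAPS_VPLS = 19  # Encaps Type for VPLS, as given in the text
FLAG_C = 0x02     # Control word bit (assumes bit 0 is the MSB)
FLAG_S = 0x01     # Sequenced delivery bit (assumes bit 0 is the MSB)


def pack_l2_info(ext_type: int, control_word: bool, sequenced: bool,
                 mtu: int) -> bytes:
    """Build the community; MBZ bits and the Reserved field are zero."""
    flags = (FLAG_C if control_word else 0) | (FLAG_S if sequenced else 0)
    # type (2), Encaps Type (1), Control Flags (1), MTU (2), Reserved (2)
    return struct.pack("!HBBHH", ext_type, ENCAPS_VPLS, flags, mtu, 0)


def unpack_l2_info(data: bytes) -> dict:
    """Parse the community; MBZ bits are ignored on receipt."""
    ext_type, encaps, flags, mtu, _reserved = struct.unpack("!HBBHH", data)
    return {
        "type": ext_type,
        "encaps": encaps,
        "control_word": bool(flags & FLAG_C),
        "sequenced": bool(flags & FLAG_S),
        "mtu": mtu,
    }
```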

589	3.3.  BGP VPLS Operation

591	   To create a new VPLS, say VPLS foo, a network administrator must pick
592	   a RT for VPLS foo, say RT-foo.  This will be used by all PEs that
593	   serve VPLS foo.  To configure a given PE, say PE-a, to be part of
594	   VPLS foo, the network administrator only has to choose a VE ID V for
595	   PE-a.  (If PE-a is connected to u-PEs, PE-a may be configured with
596	   more than one VE ID; in that case, the following is done for each VE
597	   ID).  The PE may also be configured with a Route Distinguisher (RD);
598	   if not, it generates a unique RD for VPLS foo.  Say the RD is
599	   RD-foo-a.  PE-a then generates an initial label block and a remote VE
600	   set for V, defined by VE Block Offset VBO, VE Block Size VBS and
601	   label base LB.  These may be empty.

603	   PE-a then creates a VPLS BGP NLRI with RD RD-foo-a, VE ID V, VE Block
604	   Offset VBO, VE Block Size VBS and label base LB.  To this, it
605	   attaches a Layer2 Info Extended Community and a RT, RT-foo.  It sets
606	   the BGP Next Hop for this NLRI as itself, and announces this NLRI to
607	   its peers.  The Network Layer protocol associated with the Network
608	   Address of the Next Hop for the combination <AFI=L2VPN AFI, SAFI=VPLS
609	   SAFI> is IP; this association is required by [3], Section 5.  If the
610	   value of the Length of the Next Hop field is 4, then the Next Hop
611	   contains an IPv4 address.  If this value is 16, then the Next Hop
612	   contains an IPv6 address.
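
   The Next Hop length rule can be illustrated with a short helper.
   The function name parse_next_hop is hypothetical, and rejecting any
   other length with an error is an assumption of this sketch.

```python
import ipaddress


def parse_next_hop(raw: bytes):
    """Interpret the Next Hop field according to its length in octets."""
    if len(raw) == 4:
        return ipaddress.IPv4Address(raw)
    if len(raw) == 16:
        return ipaddress.IPv6Address(raw)
    # Assumption: other lengths are treated as malformed.
    raise ValueError(f"unexpected Next Hop length {len(raw)}")
```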

614	   If PE-a hears from another PE, say PE-b, a VPLS BGP announcement with
615	   RT-foo and VE ID W, then PE-a knows that PE-b is a member of the same
616	   VPLS (autodiscovery).  PE-a then has to set up its part of a VPLS
617	   pseudowire between PE-a and PE-b, using the mechanisms in
618	   Section 3.2.  Similarly, PE-b will have discovered that PE-a is in
619	   the same VPLS, and PE-b must set up its part of the VPLS pseudowire.
620	   Thus, signaling and pseudowire setup is also achieved with the same
621	   Update message.

623	   If W is not in any remote VE set that PE-a announced for VE ID V in
624	   VPLS foo, PE-b will not be able to set up its part of the pseudowire
625	   to PE-a.  To address this, PE-a can choose to withdraw the old
626	   announcement(s) it made for VPLS foo, and announce a new Update with
627	   a larger remote VE set and corresponding label block that covers all
628	   VE IDs that are in VPLS foo.  This, however, may cause some service
629	   disruption.  An alternative for PE-a is to create a new remote VE set
630	   and corresponding label block, and announce them in a new Update,
631	   without withdrawing previous announcements.
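
   The second alternative, adding a block without withdrawing earlier
   ones, might look like the sketch below.  The alignment policy (blocks
   start at 1, 1 + size, 1 + 2*size, ...) and the caller-supplied
   next_free_label are assumptions of the example; this document does
   not prescribe how a PE chooses offsets or labels.  The sketch also
   assumes earlier blocks followed the same alignment, so a new block
   does not overlap an old one.

```python
from typing import List, Optional, Tuple

Block = Tuple[int, int, int]  # (VBO, VBS, LB)


def additional_block(announced: List[Block], w: int, block_size: int,
                     next_free_label: int) -> Optional[Block]:
    """Return a new (VBO, VBS, LB) covering VE ID w, or None if covered."""
    for vbo, vbs, _lb in announced:
        if vbo <= w < vbo + vbs:
            return None  # W is already in an announced remote VE set
    # Assumed policy: aligned blocks, with VE IDs starting at 1.
    vbo = ((w - 1) // block_size) * block_size + 1
    return (vbo, block_size, next_free_label)
```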

633	   If PE-a's configuration is changed to remove VE ID V from VPLS foo,
634	   then PE-a MUST withdraw all its announcements for VPLS foo that
635	   contain VE ID V.  If all of PE-a's links to its CEs in VPLS foo go
636	   down, then PE-a SHOULD either withdraw all its NLRIs for VPLS foo, or
637	   let other PEs in the VPLS foo know in some way that PE-a is no longer
638	   connected to its CEs.

640	3.4.  Multi-AS VPLS

642	   As in [13] and [10], the above autodiscovery and signaling functions
643	   are typically announced via I-BGP.  This assumes that all the sites
644	   in a VPLS are connected to PEs in a single Autonomous System (AS).

646	   However, sites in a VPLS may connect to PEs in different ASes.  This
647	   leads to two issues: 1) there would not be an I-BGP connection
648	   between those PEs, so some means of signaling across ASes is needed;
649	   and 2) there may not be PE-to-PE tunnels between the ASes.

651	   A similar problem is solved in [10], Section 10.  Three methods are
652	   suggested to address issue (1); all these methods have analogs in
653	   multi-AS VPLS.

655	   Here is a diagram for reference:

657	        __________       ____________       ____________       __________
658	       /          \     /            \     /            \     /          \
659	                   \___/        AS 1  \   /  AS 2        \___/
660	                                       \ /
661	         +-----+           +-------+    |    +-------+           +-----+
662	         | PE1 | ---...--- | ASBR1 | ======= | ASBR2 | ---...--- | PE2 |
663	         +-----+           +-------+    |    +-------+           +-----+
664	                    ___                / \                ___
665	                   /   \              /   \              /   \
666	       \__________/     \____________/     \____________/     \__________/

668	   Figure 6: Inter-AS VPLS

669	   As in [10], three methods for signaling inter-provider
670	   VPLS are given; these are presented in order of increasing
671	   scalability.  Method (a) is the easiest to understand conceptually,
672	   and the easiest to deploy; however, it requires an Ethernet
673	   interconnect between the ASes, and both VPLS control and data plane
674	   state on the AS border routers (ASBRs).  Method (b) requires VPLS
675	   control plane state on the ASBRs and MPLS on the AS-AS interconnect
676	   (which need not be Ethernet).  Method (c) requires MPLS on the AS-AS
677	   interconnect, but no VPLS state of any kind on the ASBRs.

679	3.4.1.  a) VPLS-to-VPLS connections at the ASBRs.

681	   In this method, an AS Border Router (ASBR1) acts as a PE for all
682	   VPLSs that span AS1 and an AS to which ASBR1 is connected, such as
683	   AS2 here.  The ASBR on the neighboring AS (ASBR2) is viewed by ASBR1
684	   as a CE for the VPLSs that span AS1 and AS2; similarly, ASBR2 acts as
685	   a PE for this VPLS from AS2's point of view, and views ASBR1 as a CE.

687	   This method does not require MPLS on the ASBR1-ASBR2 link, but does
688	   require that this link carry Ethernet traffic, and that there be a
689	   separate VLAN sub-interface for each VPLS traversing this link.  It
690	   further requires that ASBR1 does the PE operations (discovery,
691	   signaling, MAC address learning, flooding, encapsulation, etc.) for
692	   all VPLSs that traverse ASBR1.  This imposes a significant burden on
693	   ASBR1, both on the control plane and the data plane, which limits the
694	   number of multi-AS VPLSs.

696	   Note that in general, there will be multiple connections between a
697	   pair of ASes, for redundancy.  In this case, the Spanning Tree
698	   Protocol (STP) ([14]), or some other means of loop detection and
699	   prevention, must be run on each VPLS that spans these ASes, so that a
700	   loop-free topology can be constructed in each VPLS.  This imposes a
701	   further burden on the ASBRs and PEs participating in those VPLSs, as
702	   these devices would need to run a loop detection algorithm for each
703	   such VPLS.  How this may be achieved is outside the scope of this
704	   document.

706	3.4.2.  b) EBGP redistribution of VPLS information between ASBRs.

708	   This method requires I-BGP peerings between the PEs in AS1 and ASBR1
709	   in AS1 (perhaps via route reflectors), an E-BGP peering between ASBR1
710	   and ASBR2 in AS2, and I-BGP peerings between ASBR2 and the PEs in
711	   AS2.  In the above example, PE1 sends a VPLS NLRI to ASBR1 with a
712	   label block and itself as the BGP nexthop; ASBR1 sends the NLRI to
713	   ASBR2 with new labels and itself as the BGP nexthop; and ASBR2 sends
714	   the NLRI to PE2 with new labels and itself as the nexthop.

716	   The VPLS NLRI that ASBR1 sends to ASBR2 (and the NLRI that ASBR2
717	   sends to PE2) is identical to the VPLS NLRI that PE1 sends to ASBR1,
718	   except for the label block.  To be precise, the Length, the Route
719	   Distinguisher, the VE ID, the VE Block Offset, and the VE Block Size
720	   MUST be the same; the Label Base may be different.  Furthermore,
721	   ASBR1 must also update its forwarding path as follows: if the Label
722	   Base sent by PE1 is L1, the Label-block Size is N, the Label Base
723	   sent by ASBR1 is L2, and the tunnel label from ASBR1 to PE1 is T,
724	   then ASBR1 must install the following in the forwarding path:

726	      swap L2 with L1 and push T,

728	      swap L2+1 with L1+1 and push T, ...

730	      swap L2+N-1 with L1+N-1 and push T.

732	   ASBR2 must act similarly, except that it may not need a tunnel label
733	   if it is directly connected with ASBR1.
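
   The forwarding state described above amounts to one swap entry per
   label in the block.  A minimal sketch follows; the function name and
   the dictionary representation are assumptions, and a tunnel label of
   None stands for the directly connected case where no tunnel label is
   pushed.

```python
from typing import Dict, Optional, Tuple


def asbr_swap_entries(l1: int, l2: int, n: int,
                      tunnel: Optional[int]
                      ) -> Dict[int, Tuple[int, Optional[int]]]:
    """Map incoming label L2+i to (outgoing label L1+i, label to push)."""
    return {l2 + i: (l1 + i, tunnel) for i in range(n)}
```

   With L1 = 1000, L2 = 5000, N = 4 and T = 77, incoming label 5003 is
   swapped to 1003 and label 77 is pushed.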

735	   When PE2 wants to send a VPLS packet to PE1, PE2 uses its VE ID to
736	   get the right VPLS label from ASBR2's label block for PE1, and uses a
737	   tunnel label to reach ASBR2.  ASBR2 swaps the VPLS label with the
738	   label from ASBR1; ASBR1 then swaps the VPLS label with the label from
739	   PE1, and pushes a tunnel label to reach PE1.

741	   In this method, one needs MPLS on the ASBR1-ASBR2 interface, but
742	   there is no requirement that the link layer be Ethernet.
743	   Furthermore, the ASBRs take part in distributing VPLS information.
744	   However, the data plane requirements of the ASBRs are much simpler
745	   than in method (a), being limited to label operations.  Finally, the
746	   construction of loop-free VPLS topologies is done by routing
747	   decisions, viz., BGP path and nexthop selection, so there is no need
748	   to run the Spanning Tree Protocol on a per-VPLS basis.  Thus, this
749	   method is considerably more scalable than method (a).

751	3.4.3.  c) Multi-hop EBGP redistribution of VPLS information between
752	        ASes.

754	   In this method, there is a multi-hop E-BGP peering between the PEs
755	   (or preferably, a Route Reflector) in AS1 and the PEs (or Route
756	   Reflector) in AS2.  PE1 sends a VPLS NLRI with labels and nexthop
757	   self to PE2; if this is via route reflectors, the BGP nexthop is not
758	   changed.  This requires that there be a tunnel LSP from PE1 to PE2.
759	   This tunnel LSP can be created exactly as in [10], Section 10 (c),
760	   for example using E-BGP to exchange labeled IPv4 routes for the PE
761	   loopbacks.

763	   When PE1 wants to send a VPLS packet to PE2, it pushes the VPLS label
764	   corresponding to its own VE ID onto the packet.  It then pushes the
765	   tunnel label(s) to reach PE2.

767	   This method requires no VPLS information (in either the control or
768	   the data plane) on the ASBRs.  The ASBRs only need to set up PE-to-PE
769	   tunnel LSPs in the control plane, and do label operations in the data
770	   plane.  Again, as in the case of method (b), the construction of
771	   loop-free VPLS topologies is done by routing decisions, i.e., BGP
772	   path and nexthop selection, so there is no need to run the Spanning
773	   Tree Protocol on a per-VPLS basis.  This option is likely to be the
774	   most scalable of the three methods presented here.

776	3.4.4.  Allocation of VE IDs Across Multiple ASes

778	   In order to ease the allocation of VE IDs for a VPLS that spans
779	   multiple ASes, one can allocate ranges for each AS.  For example, AS1
780	   uses VE IDs in the range 1 to 100, AS2 from 101 to 200, etc.  If
781	   there are 10 sites attached to AS1 and 20 to AS2, the allocated VE
782	   IDs could be 1-10 and 101 to 120.  This minimizes the number of VPLS
783	   NLRIs that are exchanged while ensuring that VE IDs are kept unique.

785	   In the above example, if AS1 needed more than 100 sites, then another
786	   range can be allocated to AS1.  The only caveat is that there be no
787	   overlap between VE ID ranges among ASes.  The exception to this rule
788	   is multi-homing, which is dealt with below.
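
   A provisioning tool could check the no-overlap caveat along the
   following lines.  This is a sketch only: the function name and the
   representation of ranges as inclusive (first, last) pairs are
   assumptions, and the multi-homing exception is not modeled.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # inclusive (first, last) VE IDs


def overlapping_ases(ranges: Dict[str, List[Range]]) -> List[Tuple[str, str]]:
    """Return pairs of ASes that have overlapping VE ID ranges."""
    flat = [(lo, hi, asn) for asn, rs in ranges.items() for lo, hi in rs]
    clashes = []
    for i, (lo1, hi1, as1) in enumerate(flat):
        for lo2, hi2, as2 in flat[i + 1:]:
            # Two inclusive ranges overlap if each starts before the
            # other ends.
            if as1 != as2 and lo1 <= hi2 and lo2 <= hi1:
                clashes.append((as1, as2))
    return clashes
```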

790	3.5.  Multi-homing and Path Selection

792	   It is often desired to multi-home a VPLS site, i.e., to connect it to
793	   multiple PEs, perhaps even in different ASes.  In such a case, the
794	   PEs connected to the same site can either be configured with the same
795	   VE ID or with different VE IDs.  In the latter case, it is mandatory
796	   to run STP on the CE device, and possibly on the PEs, to construct a
797	   loop-free VPLS topology.  How this can be accomplished is outside the
798	   scope of this document; however, the rest of this section will
799	   describe in some detail the former case.

801	   In the case where the PEs connected to the same site are assigned the
802	   same VE ID, a loop-free topology is constructed by routing
803	   mechanisms, in particular, by BGP path selection.  When a BGP speaker
804	   receives two equivalent NLRIs (see below for the definition), it
805	   applies standard path selection criteria such as Local Preference and
806	   AS Path Length to determine which NLRI to choose; it MUST pick only
807	   one.  If the chosen NLRI is subsequently withdrawn, the BGP speaker
808	   applies path selection to the remaining equivalent VPLS NLRIs to pick
809	   another; if none remain, the forwarding information associated with
810	   that NLRI is removed.

812	   Two VPLS NLRIs are considered equivalent from a path selection point
813	   of view if the Route Distinguisher, the VE ID and the VE Block Offset
814	   are the same.  If two PEs are assigned the same VE ID in a given
815	   VPLS, they MUST use the same Route Distinguisher, and they SHOULD
816	   announce the same VE Block Size for a given VE Block Offset.
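
   The equivalence rule and the "pick only one" requirement can be
   sketched as below.  Only Local Preference and AS Path Length are
   compared, and ties keep the path seen first; a real BGP decision
   process has further steps that are omitted here.  The names Path,
   equivalence_key and select are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Path(NamedTuple):
    rd: str
    ve_id: int
    vbo: int
    local_pref: int
    as_path_len: int
    next_hop: str


def equivalence_key(p: Path) -> Tuple[str, int, int]:
    """NLRIs are equivalent if RD, VE ID and VE Block Offset match."""
    return (p.rd, p.ve_id, p.vbo)


def select(paths: Iterable[Path]) -> Dict[Tuple[str, int, int], Path]:
    """Choose exactly one path per equivalence class."""
    best: Dict[Tuple[str, int, int], Path] = {}
    for p in paths:
        key = equivalence_key(p)
        cur = best.get(key)
        # Higher Local Preference wins, then the shorter AS path.
        if cur is None or (-p.local_pref, p.as_path_len) < (
                -cur.local_pref, cur.as_path_len):
            best[key] = p
    return best
```

   When the chosen NLRI is withdrawn, calling select again on the
   remaining paths yields the replacement, or no entry if none remain.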

818	3.6.  Hierarchical BGP VPLS

820	   This section discusses how one can scale the VPLS control plane when
821	   using BGP.  There are at least three aspects of scaling the control
822	   plane:

824	   1.  alleviating the full mesh connectivity requirement among VPLS BGP
825	       speakers;

827	   2.  limiting BGP VPLS message passing to just the interested speakers
828	       rather than all BGP speakers; and

830	   3.  simplifying the addition and deletion of BGP speakers, whether
831	       for VPLS or other applications.

833	   Fortunately, the use of BGP for Internet routing as well as for IP
834	   VPNs has yielded several good solutions for all these problems.  The
835	   basic technique is hierarchy, using BGP Route Reflectors (RRs) ([6]).
836	   The idea is to designate a small set of Route Reflectors which are
837	   themselves fully meshed, and then establish a BGP session between
838	   each BGP speaker and one or more RRs.  In this way, there is no need
839	   of direct full mesh connectivity among all the BGP speakers.  If the
840	   particular scaling needs of a provider require a large number of
841	   RRs, then this technique can be applied recursively: the full mesh
842	   connectivity among the RRs can be brokered by yet another level of
843	   RRs.  The use of RRs solves problems 1 and 3 above.

845	   It is important to note that RRs, as used for VPLS and VPNs, are
846	   purely a control plane technique.  The use of RRs introduces no data
847	   plane state and no data plane forwarding requirements on the RRs, and
848	   does not in any way change the forwarding path of VPLS traffic.  This
849	   is in contrast to the technique of Hierarchical VPLS defined in [8].

851	   Another consequence of this approach is that it is not required that
852	   one set of RRs handle all BGP messages, or that a particular RR
853	   handle all messages from a given PE.  One can define several sets of
854	   RRs, for example a set to handle VPLS, another to handle IP VPNs and
855	   another for Internet routing.  Another partitioning could be to have
856	   some subset of VPLSs and IP VPNs handled by one set of RRs, and
857	   another subset of VPLSs and IP VPNs handled by another set of RRs;
858	   the use of Route Target Filtering (RTF), described in [11], can make
859	   this simpler and more effective.

861	   Finally, problem 2 (that of limiting BGP VPLS message passing to just
862	   the interested BGP speakers) is addressed by the use of RTF.  This
863	   technique is orthogonal to the use of RRs, but works well in
864	   conjunction with RRs.  RTF is also very effective in inter-AS VPLS;
865	   more details on how RTF works and its benefits are provided in [11].

867	   It is worth mentioning an aspect of the control plane that is often a
868	   source of confusion.  No MAC addresses are exchanged via BGP.  All
869	   MAC address learning and aging is done in the data plane individually
870	   by each PE.  The only task of BGP VPLS message exchange is
871	   autodiscovery and label exchange.

873	   Thus, BGP processing for VPLS occurs when

875	   1.  a PE joins or leaves a VPLS; or

877	   2.  a failure occurs in the network, bringing down a PE-PE tunnel or
878	       a PE-CE link.

880	   These events are relatively rare, and typically, each such event
881	   causes one BGP update to be generated.  Coupled with BGP's messaging
882	   efficiency when used for signaling VPLS, these observations lead to
883	   the conclusion that BGP as a control plane for VPLS will scale quite
884	   well both in terms of processing and memory requirements.

886	4.  Data Plane

888	   This section discusses two aspects of the data plane for PEs and
889	   u-PEs implementing VPLS: encapsulation and forwarding.

891	4.1.  Encapsulation

893	   Ethernet frames received from CE devices are encapsulated for
894	   transmission over the packet switched network connecting the PEs.
895	   The encapsulation is as in [5], with one change: a PE that sets the P
896	   bit in the Control Flags strips the outermost VLAN from an Ethernet
897	   frame received from a CE before encapsulating it, and pushes a VLAN
898	   onto a decapsulated frame before sending it to a CE.

900	4.2.  Forwarding

902	   VPLS packets are classified as belonging to a given service instance
903	   and associated forwarding table based on the interface over which the
904	   packet is received.  Packets are forwarded in the context of the
905	   service instance based on the destination MAC address.  The former
906	   mapping is determined by configuration.  The latter is the focus of
907	   this section.

909	4.2.1.  MAC address learning

911	   As was mentioned earlier, the key distinguishing feature of VPLS is
912	   that it is a multipoint service.  This means that the entire Service
913	   Provider network should appear as a single logical learning bridge
914	   for each VPLS that the SP network supports.  The logical ports for
915	   the SP "bridge" are the customer ports as well as the pseudowires on
916	   a VE.  Just as a learning bridge learns MAC addresses on its ports,
917	   the SP bridge must learn MAC addresses at its VEs.

919	   Learning consists of associating source MAC addresses of packets with
920	   the (logical) ports on which they arrive; this association is the
921	   Forwarding Information Base (FIB).  The FIB is used for forwarding
922	   packets.  For example, suppose the bridge receives a packet with
923	   source MAC address S on (logical) port P.  If the bridge
924	   subsequently receives a packet with destination MAC address S, it
925	   knows that it should send the packet out on port P.

927	   If a VE learns a source MAC address S on logical port P, then later
928	   sees S on a different port P', then the VE MUST update its FIB to
929	   reflect the new port P'.  A VE MAY implement a mechanism to damp
930	   flapping of source ports for a given MAC address.
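
   Learning and the update on a MAC move can be illustrated with a
   very small table.  The class Fib and its methods are names chosen
   for this sketch; the optional damping of port flaps is not modeled.

```python
from typing import Dict, Optional


class Fib:
    """Per-VPLS table mapping source MAC addresses to logical ports."""

    def __init__(self) -> None:
        self._port_for: Dict[str, str] = {}

    def learn(self, src_mac: str, port: str) -> None:
        # Seeing S on a different port P' simply overwrites the entry.
        self._port_for[src_mac] = port

    def lookup(self, dst_mac: str) -> Optional[str]:
        # None means the destination is unknown.
        return self._port_for.get(dst_mac)
```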

932	4.2.2.  Aging

934	   VPLS PEs SHOULD have an aging mechanism to remove a MAC address
935	   associated with a logical port, much the same as learning bridges do.
936	   This is required so that a MAC address can be relearned if it "moves"
937	   from a logical port to another logical port, either because the
938	   station to which that MAC address belongs really has moved, or
939	   because of a topology change in the LAN that causes this MAC address
940	   to arrive on a new port.  In addition, aging reduces the size of a
941	   VPLS MAC table to just the active MAC addresses, rather than all MAC
942	   addresses in that VPLS.

944	   The "age" of a source MAC address S on a logical port P is the time
945	   since it was last seen as a source MAC on port P.  If the age exceeds
946	   the aging time T, S MUST be flushed from the FIB.  This of course
947	   means that every time S is seen as a source MAC address on port P,
948	   S's age is reset.

950	   An implementation SHOULD provide a configurable knob to set the aging
951	   time T on a per-VPLS basis.  In addition, an implementation MAY
952	   accelerate aging of all MAC addresses in a VPLS if it detects certain
953	   situations, such as a Spanning Tree topology change in that VPLS.
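
   The aging rule can be sketched as follows.  Time is passed in
   explicitly so that the example is deterministic; the class name
   AgingFib and the periodic expire call are assumptions of the sketch,
   not a required implementation structure.

```python
from typing import Dict, List, Optional, Tuple


class AgingFib:
    """FIB whose entries are flushed once their age exceeds T."""

    def __init__(self, aging_time: float) -> None:
        self.aging_time = aging_time  # T, configurable per VPLS
        self._entries: Dict[str, Tuple[str, float]] = {}

    def learn(self, mac: str, port: str, now: float) -> None:
        # Each sighting of S as a source MAC resets its age.
        self._entries[mac] = (port, now)

    def expire(self, now: float) -> List[str]:
        """Flush and return every MAC whose age exceeds the aging time."""
        stale = [m for m, (_port, seen) in self._entries.items()
                 if now - seen > self.aging_time]
        for m in stale:
            del self._entries[m]
        return stale

    def lookup(self, mac: str) -> Optional[str]:
        entry = self._entries.get(mac)
        return entry[0] if entry else None
```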

955	4.2.3.  Flooding

957	   When a bridge receives a packet to a destination that is not in its
958	   FIB, it floods the packet on all the other ports.  Similarly, a VE
959	   will flood packets to an unknown destination to all other VEs in the
960	   VPLS.

962	   In Figure 1 above, if CE2 sent an Ethernet frame to PE2, and the
963	   destination MAC address on the frame was not in PE2's FIB (for that
964	   VPLS), then PE2 would be responsible for flooding that frame to every
965	   other PE in the same VPLS.  On receiving that frame, PE1 would be
966	   responsible for further flooding the frame to CE1 and CE5 (unless PE1
967	   knew which CE "owned" that MAC address).

969	   On the other hand, if PE3 received the frame, it could delegate
970	   further flooding of the frame to its u-PE.  If PE3 was connected to 2
971	   u-PEs, it would announce that it has two u-PEs.  PE3 could either
972	   announce that it is incapable of flooding, in which case it would
973	   receive two frames, one for each u-PE, or it could announce that it
974	   is capable of flooding, in which case it would receive one copy of
975	   the frame, which it would then send to both u-PEs.

977	4.2.4.  Broadcast and Multicast

979	   There is a well-known broadcast MAC address.  An Ethernet frame whose
980	   destination MAC address is the broadcast MAC address must be sent to
981	   all stations in that VPLS.  This can be accomplished by the same
982	   means that is used for flooding.

984	   There is also an easily recognized set of "multicast" MAC addresses.
985	   Ethernet frames with a destination multicast MAC address MAY be
986	   broadcast to all stations; a VE MAY also use certain techniques to
987	   restrict transmission of multicast frames to a smaller set of
988	   receivers, those that have indicated interest in the corresponding
989	   multicast group.  Discussion of this is outside the scope of this
990	   document.

992	4.2.5.  "Split Horizon" Forwarding

994	   When a PE capable of flooding (say PEx) receives a broadcast Ethernet
995	   frame, or one with an unknown destination MAC address, it must flood
996	   the frame.  If the frame arrived from an attached CE, PEx must send a
997	   copy of the frame to every other attached CE, as well as to all other
998	   PEs participating in the VPLS.  If, on the other hand, the frame
999	   arrived from another PE (say PEy), PEx must send a copy of the packet
1000	   only to attached CEs.  PEx MUST NOT send the frame to other PEs,
1001	   since PEy would have already done so.  This notion has been termed
1002	   "split horizon" forwarding, and is a consequence of the PEs being
1003	   logically fully meshed for VPLS.

1005	   Split horizon forwarding rules apply to broadcast and multicast
1006	   packets, as well as packets to an unknown MAC address.
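
   The split horizon rule reduces to a choice of output ports based on
   where the frame arrived.  In this sketch, ports are identified by
   plain strings and the function name flood_targets is hypothetical.

```python
from typing import Set


def flood_targets(ingress: str, attached_ces: Set[str],
                  remote_pes: Set[str]) -> Set[str]:
    """Ports to which a flooded frame is copied at PEx."""
    if ingress in attached_ces:
        # From a CE: every other attached CE, plus all other PEs.
        return (attached_ces - {ingress}) | remote_pes
    if ingress in remote_pes:
        # From another PE: attached CEs only, never other PEs.
        return set(attached_ces)
    raise ValueError(f"unknown ingress port {ingress!r}")
```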

1008	4.2.6.  Qualified and Unqualified Learning

1010	   The key for normal Ethernet MAC learning is usually just the
1011	   (6-octet) MAC address.  This is called "unqualified learning".
1012	   However, it is also possible that the key for learning includes the
1013	   VLAN tag when present; this is called "qualified learning".

1015	   In the case of VPLS, learning is done in the context of a VPLS
1016	   instance, which typically corresponds to a customer.  If the customer
1017	   uses VLAN tags, one can make the same distinctions of qualified and
1018	   unqualified learning.  If the key for learning within a VPLS is just
1019	   the MAC address, then this VPLS is operating under unqualified
1020	   learning.  If the key for learning is (customer VLAN tag + MAC
1021	   address), then this VPLS is operating under qualified learning.

1023	   Choosing between qualified and unqualified learning involves several
1024	   factors, the most important of which is whether one wants a single
1025	   global broadcast domain (unqualified), or a broadcast domain per VLAN
1026	   (qualified).  The latter makes flooding and broadcasting more
1027	   efficient, but requires larger MAC tables.  These considerations
1028	   apply equally to normal Ethernet forwarding and to VPLS.
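
   The difference between the two modes is only the lookup key, as the
   following sketch shows.  The function name and the tuple form of the
   key are assumptions made for illustration.

```python
from typing import Optional, Tuple


def learning_key(mac: str, vlan: Optional[int],
                 qualified: bool) -> Tuple[Optional[int], str]:
    """Key used for MAC learning within one VPLS instance."""
    if qualified and vlan is not None:
        # Qualified learning: (customer VLAN tag + MAC address).
        return (vlan, mac)
    # Unqualified learning: the MAC address alone.
    return (None, mac)
```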

1030	4.2.7.  Class of Service

1032	   In order to offer different Classes of Service within a VPLS, an
1033	   implementation MAY choose to map 802.1p bits in a customer Ethernet
1034	   frame with a VLAN tag to an appropriate setting of EXP bits in the
1035	   pseudowire and/or tunnel label, allowing for differential treatment
1036	   of VPLS frames in the packet-switched network.

1038	   To be useful, an implementation SHOULD allow this mapping function to
1039	   be different for each VPLS, as each VPLS customer may have their own
1040	   view of the required behavior for a given setting of 802.1p bits.
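
   A per-VPLS mapping from 802.1p bits to EXP bits might be held as a
   simple table per service instance.  The identity default mapping and
   the names used below are assumptions of this sketch; both fields are
   three bits wide, so values range from 0 to 7.

```python
from typing import Dict

# Assumed default: copy the 802.1p value into the EXP bits unchanged.
DEFAULT_MAP: Dict[int, int] = {pcp: pcp for pcp in range(8)}


def exp_for(vpls: str, pcp: int,
            per_vpls_maps: Dict[str, Dict[int, int]]) -> int:
    """EXP setting for a frame with the given 802.1p bits in a VPLS."""
    mapping = per_vpls_maps.get(vpls, DEFAULT_MAP)
    return mapping[pcp]
```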

1042	5.  Deployment Options

1044	   In deploying a network that supports VPLS, the SP must decide what
1045	   functions the VPLS-aware device closest to the customer (the VE)
1046	   supports.  The default case described in this document is that the VE
1047	   is a PE.  However, there are a number of reasons that the VE might be
1048	   a device that does all the Layer 2 functions (such as MAC address
1049	   learning and flooding), and a limited set of Layer 3 functions (such
1050	   as communicating to its PE), but, for example, doesn't do full-
1051	   fledged discovery and PE-to-PE signaling.  Such a device is called a
1052	   "u-PE".

1054	   As both of these cases have benefits, one would like to be able to
1055	   "mix and match" these scenarios.  The signaling mechanism presented
1056	   here allows this.  For example, in a given provider network, one PE
1057	   may be directly connected to CE devices; another may be connected to
1058	   u-PEs that are connected to CEs; and a third may be connected
1059	   directly to a customer over some interfaces and to u-PEs over others.
1060	   All these PEs perform discovery and signaling in the same manner.
1061	   How they do learning and forwarding depends on whether or not there
1062	   is a u-PE; however, this is a local matter, and is not signaled.
1063	   The details of the operation of a u-PE and its interactions with PEs
1064	   and other u-PEs are beyond the scope of this document.

1066	6.  Security Considerations

1068	   The focus in Virtual Private LAN Service is the privacy of data,
1069	   i.e., that data in a VPLS is only distributed to other nodes in that
1070	   VPLS and not to any external agent or other VPLS.  Note that VPLS
1071	   does not offer security or authentication: VPLS packets are sent in
1072	   the clear in the packet-switched network, and a man-in-the-middle can
1073	   eavesdrop, and may be able to inject packets into the data stream.
1074	   If security is desired, the PE-to-PE tunnels can be IPsec tunnels.
1075	   For more security, the end systems in the VPLS sites can use
1076	   appropriate means of encryption to secure their data even before it
1077	   enters the Service Provider network.

1079	   There are two aspects to achieving data privacy in a VPLS: securing
1080	   the control plane, and protecting the forwarding path.  Compromise of
1081	   the control plane could result in a PE sending data belonging to some
1082	   VPLS to another VPLS, or blackholing VPLS data, or even sending it to
1083	   an eavesdropper, none of which are acceptable from a data privacy
1084	   point of view.  Since all control plane exchanges are via BGP,
1085	   techniques such as in [2] help authenticate BGP messages, making it
1086	   harder to spoof updates (which can be used to divert VPLS traffic to
1087	   the wrong VPLS), or withdraws (denial of service attacks).  In the
1088	   multi-AS options (b) and (c), this also means protecting the inter-AS
1089	   BGP sessions, between the ASBRs, the PEs or the Route Reflectors.
1090	   Note that [2] will not help in keeping VPLS labels private -- knowing
1091	   the labels, one can eavesdrop on VPLS traffic.  However, this
1092	   requires access to the data path within a Service Provider network.

1094	   Protecting the data plane requires ensuring that PE-to-PE tunnels are
1095	   well-behaved (this is outside the scope of this document), and that
1096	   VPLS labels are accepted only from valid interfaces.  For a PE, valid
1097	   interfaces comprise links from P routers.  For an ASBR, a valid
1098	   interface is a link from an ASBR in an AS that is part of a given
1099	   VPLS.  It is especially important in the case of multi-AS VPLSs that
1100	   one accept VPLS packets only from valid interfaces.

1102	7.  IANA Considerations

1104	   IANA is asked to allocate an AFI for L2VPN information (suggested
1105	   value: 25).  [NOTE to IANA: This should be the same as the AFI
1106	   requested by [9].]

1108	8.  References

1110	8.1.  Normative References

1112	   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
1113	        Levels", BCP 14, RFC 2119, March 1997.

1115	   [2]  Heffernan, A., "Protection of BGP Sessions via the TCP MD5
1116	        Signature Option", RFC 2385, August 1998.

1118	   [3]  Bates, T., "Multiprotocol Extensions for BGP-4",
1119	        draft-ietf-idr-rfc2858bis-07 (work in progress), August 2005.

1121	   [4]  Rekhter, Y., "BGP Extended Communities Attribute",
1122	        draft-ietf-idr-bgp-ext-communities-09 (work in progress),
1123	        July 2005.

1125	   [5]  Martini, L., "Encapsulation Methods for Transport of Ethernet
1126	        Over MPLS Networks", draft-ietf-pwe3-ethernet-encap-11 (work in
1127	        progress), December 2005.

1129	8.2.  Informative References

1131	   [6]   Bates, T., Chandra, R., and E. Chen, "BGP Route Reflection - An
1132	         Alternative to Full Mesh IBGP", RFC 2796, April 2000.

1134	   [7]   Andersson, L. and E. Rosen, "Framework for Layer 2 Virtual
1135	         Private Networks (L2VPNs)", draft-ietf-l2vpn-l2-framework-05
1136	         (work in progress), June 2004.

1138	   [8]   Lasserre, M. and V. Kompella, "Virtual Private LAN Services
1139	         over MPLS", draft-ietf-l2vpn-vpls-ldp-08 (work in progress),
1140	         November 2005.

1142	   [9]   Ould-Brahim, H., "Using BGP as an Auto-Discovery Mechanism for
1143	         Layer-3 and Layer-2 VPNs", draft-ietf-l3vpn-bgpvpn-auto-06
1144	         (work in progress), June 2005.

1146	   [10]  Rosen, E., "BGP/MPLS IP VPNs", draft-ietf-l3vpn-rfc2547bis-03
1147	         (work in progress), October 2004.

1149	   [11]  Marques, P., "Constrained VPN Route Distribution",
1150	         draft-ietf-l3vpn-rt-constrain-02 (work in progress), June 2005.

1152	   [12]  Martini, L., "Pseudowire Setup and Maintenance using the Label
1153	         Distribution Protocol", draft-ietf-pwe3-control-protocol-17
1154	         (work in progress), June 2005.

1156	   [13]  Kompella, K., "Layer 2 VPNs Over Tunnels",
1157	         draft-kompella-l2vpn-l2vpn-00 (work in progress), January 2004.

1159	   [14]  Institute of Electrical and Electronics Engineers, "Information
1160	         technology - Telecommunications and information exchange
1161	         between systems - Local and metropolitan area networks - Common
1162	         specifications - Part 3: Media Access Control (MAC) Bridges",
1163	         ISO/IEC 15802-3:1998, IEEE Standard 802.1D, July 1998.
1164	         (Revision of ISO/IEC 10038:1993, 802.1j-1992, and
1165	         802.6k-1992; incorporates P802.11c, P802.1p, and
1166	         P802.12e.)

1168	Appendix A.  Contributors

1170	   The following contributed to this document:

1172	           Javier Achirica, Telefonica
1173	           Loa Andersson, Acreo
1174	           Chaitanya Kodeboyina, Juniper
1175	           Giles Heron, Tellabs
1176	           Sunil Khandekar, Alcatel
1177	           Vach Kompella, Alcatel
1178	           Marc Lasserre, Riverstone
1179	           Pierre Lin
1180	           Pascal Menezes
1181	           Ashwin Moranganti, Appian
1182	           Hamid Ould-Brahim, Nortel
1183	           Seo Yeong-il, Korea Tel

1185	Appendix B.  Acknowledgements

1187	   Thanks to Joe Regan and Alfred Nothaft for their contributions.  Many
1188	   thanks too to Eric Ji, Chaitanya Kodeboyina, Mike Loomis and Elwyn
1189	   Davies for their detailed reviews.

1191	Authors' Addresses

1193	   Kireeti Kompella (editor)
1194	   Juniper Networks
1195	   1194 N. Mathilda Ave.
1196	   Sunnyvale, CA  94089
1197	   US

1199	   Email: kireeti@juniper.net

1201	   Yakov Rekhter (editor)
1202	   Juniper Networks
1203	   1194 N. Mathilda Ave.
1204	   Sunnyvale, CA  94089
1205	   US

1207	   Email: yakov@juniper.net

1209	Intellectual Property Statement

1211	   The IETF takes no position regarding the validity or scope of any
1212	   Intellectual Property Rights or other rights that might be claimed to
1213	   pertain to the implementation or use of the technology described in
1214	   this document or the extent to which any license under such rights
1215	   might or might not be available; nor does it represent that it has
1216	   made any independent effort to identify any such rights.  Information
1217	   on the procedures with respect to rights in RFC documents can be
1218	   found in BCP 78 and BCP 79.

1220	   Copies of IPR disclosures made to the IETF Secretariat and any
1221	   assurances of licenses to be made available, or the result of an
1222	   attempt made to obtain a general license or permission for the use of
1223	   such proprietary rights by implementers or users of this
1224	   specification can be obtained from the IETF on-line IPR repository at
1225	   http://www.ietf.org/ipr.

1227	   The IETF invites any interested party to bring to its attention any
1228	   copyrights, patents or patent applications, or other proprietary
1229	   rights that may cover technology that may be required to implement
1230	   this standard.  Please address the information to the IETF at
1231	   ietf-ipr@ietf.org.

1233	Disclaimer of Validity

1235	   This document and the information contained herein are provided on an
1236	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1237	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1238	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1239	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1240	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1241	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1243	Copyright Statement

1245	   Copyright (C) The Internet Society (2005).  This document is subject
1246	   to the rights, licenses and restrictions contained in BCP 78, and
1247	   except as set forth therein, the authors retain all their rights.

1249	Acknowledgment

1251	   Funding for the RFC Editor function is currently provided by the
1252	   Internet Society.