Network Working Group                                   K. Kompella, Ed.
Internet-Draft                                           Y. Rekhter, Ed.
Expires: July 1, 2006                                   Juniper Networks
                                                       December 28, 2005


                      Virtual Private LAN Service
                      draft-ietf-l2vpn-vpls-bgp-06

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 1, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   Virtual Private LAN (Local Area Network) Service (VPLS), also known
   as Transparent LAN Service, and Virtual Private Switched Network
   service, is a useful Service Provider offering. The service offers a
   Layer 2 Virtual Private Network (VPN); however, in the case of VPLS,
   the customers in the VPN are connected by a multipoint Ethernet LAN,
   in contrast to the usual Layer 2 VPNs, which are point-to-point in
   nature.

   This document describes the functions required to offer VPLS, a
   mechanism for signaling a VPLS, and rules for forwarding VPLS frames
   across a packet switched network.

Table of Contents

   1.  Introduction
     1.1.  Scope of this Document
     1.2.  Conventions used in this document
     1.3.  Changes from version 05 to 06
     1.4.  Changes from version 04 to 05
     1.5.  Changes from version 03 to 04
   2.  Functional Model
     2.1.  Terminology
     2.2.  Assumptions
     2.3.  Interactions
   3.  Control Plane
     3.1.  Autodiscovery
       3.1.1.  Functions
       3.1.2.  Protocol Specification
     3.2.  Signaling
       3.2.1.  Label Blocks
       3.2.2.  VPLS BGP NLRI
       3.2.3.  PW Setup and Teardown
       3.2.4.  Signaling PE Capabilities
     3.3.  BGP VPLS Operation
     3.4.  Multi-AS VPLS
       3.4.1.  a) VPLS-to-VPLS connections at the ASBRs.
       3.4.2.  b) EBGP redistribution of VPLS information between ASBRs.
       3.4.3.  c) Multi-hop EBGP redistribution of VPLS information
               between ASes.
       3.4.4.  Allocation of VE IDs Across Multiple ASes
     3.5.  Multi-homing and Path Selection
     3.6.  Hierarchical BGP VPLS
   4.  Data Plane
     4.1.  Encapsulation
     4.2.  Forwarding
       4.2.1.  MAC address learning
       4.2.2.  Aging
       4.2.3.  Flooding
       4.2.4.  Broadcast and Multicast
       4.2.5.  "Split Horizon" Forwarding
       4.2.6.  Qualified and Unqualified Learning
       4.2.7.  Class of Service
   5.  Deployment Options
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Contributors
   Appendix B.  Acknowledgements
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   Virtual Private LAN Service (VPLS), also known as Transparent LAN
   Service, and Virtual Private Switched Network service, is a useful
   service offering. A Virtual Private LAN appears in (almost) all
   respects as an Ethernet LAN to customers of a Service Provider.
   However, in a VPLS, the customers are not all connected to a single
   LAN; the customers may be spread across a metro or wide area. In
   essence, a VPLS glues together several individual LANs across a
   packet-switched network to appear and function as a single LAN ([7]).
   This is accomplished by incorporating MAC address learning, flooding
   and forwarding functions in the context of pseudowires that connect
   these individual LANs across the packet-switched network.

   This document details the functions needed to offer VPLS, and then
   goes on to describe a mechanism for the autodiscovery of the
   endpoints of a VPLS as well as for signaling a VPLS. It also
   describes how VPLS frames are transported over tunnels across a
   packet switched network. The autodiscovery and signaling mechanism
   uses BGP as the control plane protocol.
   This document also briefly discusses deployment options, in
   particular, the notion of decoupling functions across devices.

   Alternative approaches include: [13], which allows one to build a
   Layer 2 VPN with Ethernet as the interconnect; and [12], which
   allows one to set up an Ethernet connection across a packet-switched
   network. Both of these, however, offer point-to-point Ethernet
   services. What distinguishes VPLS from the above two is that a VPLS
   offers a multipoint service. A mechanism for setting up pseudowires
   for VPLS using the Label Distribution Protocol (LDP) is defined in
   [8].

1.1.  Scope of this Document

   This document has four major parts: defining a VPLS functional model;
   defining a control plane for setting up VPLS; defining the data plane
   for VPLS (encapsulation and forwarding of data); and defining various
   deployment options.

   The functional model underlying VPLS is laid out in Section 2. This
   describes the service being offered, the network components that
   interact to provide the service, and at a high level their
   interactions.

   The control plane described in this document uses Multiprotocol BGP
   [3] to establish VPLS service, i.e., for the autodiscovery of VPLS
   members and for the setup and teardown of the pseudowires that
   constitute a given VPLS instance. Section 3 focuses on this, and
   also describes how a VPLS that spans Autonomous System boundaries is
   set up, as well as how multi-homing is handled. Using BGP as the
   control plane for VPNs is not new (see [13], [10] and [9]): what is
   described here is based on the mechanisms proposed in [10].

   The forwarding plane and the actions that a participating Provider
   Edge (PE) router offering the VPLS service must take are described in
   Section 4.

   In Section 5, the notion of 'decoupled' operation is defined, and the
   interaction of decoupled and non-decoupled PEs is described.
   Decoupling allows for more flexible deployment of VPLS.

1.2.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 ([1]).

1.3.  Changes from version 05 to 06

   [NOTE to RFC Editor: this section is to be removed before
   publication.]

   Changes in response to GenART review.

   Updated Abstract and Introduction to make it clear that VPLS is an
   Ethernet-based service.

   Added sections on Aging, Broadcast and Multicast, Qualified and
   Unqualified learning and CoS. Also added a section on scaling the
   BGP control plane. These were requested for consistency between the
   BGP and LDP VPLS documents.

   Added a section clarifying the concepts of label blocks, why they are
   necessary and how they are used.

   For multi-AS operation, added a short introduction to the three
   options, comparing their usage.

   Lots of clean-up: consistent usage of terms, expansion of acronyms
   before use, references.

1.4.  Changes from version 04 to 05

   [NOTE to RFC Editor: this section is to be removed before
   publication.]

   Updated IANA section to reflect agreement with authors of [9] that
   the two docs should use the same AFI for L2VPN information.

   Addressed comments received from Alex Zinin. No technical changes,
   but a more complete description to cover the issues that Alex raised:

   1.  encoding of BGP NEXT_HOP for the new AFI/SAFI is not described

   2.  VE ID, Block offset, Block size, Label base are not described
       anywhere

   3.  no information on how the receiving PE chooses the PW label

   4.  section 3.2.2 talks about PE capabilities all of a sudden and
       introduces an L2 Info Community, whose fields and use are not
       described

   Changes to address these:

   1.  Broke up section 3.2.1 into "Concepts" and "PW Setup".

   2.  Expanded section on "Signaling PE Capabilities".

   3.  Added a new section 3.3 "BGP VPLS Operation".

   4.  Minor tweaking, e.g. to fix section number references.

1.5.  Changes from version 03 to 04

   [NOTE to RFC Editor: this section is to be removed before
   publication.]

   Incorporated IDR review comments from Eric Ji, Chaitanya Kodeboyina,
   and Mike Loomis. Most changes are clarifications and rewording for
   better readability. The substantive changes are to remove several
   flags from the control field.

2.  Functional Model

   This will be described with reference to the following figure.

                                                        -----
                                                       /  A1 \
         ----                                     ____CE1     |
        /    \          --------       --------  /    |       |
       |  A2 CE2-      /        \     /        PE1     \     /
        \    /   \    /          \___/          | \     -----
         ----     ---PE2                        |  \
                     |                          |   \   -----
                     | Service Provider Network |    \ /     \
                     |                          |     CE5  A5 |
                     |            ___           |   /  \     /
              |----|  \          /   \         PE4_/    -----
              |u-PE|--PE3       /     \       /
              |----|    --------       -------
       ----  /   |    ----
      /    \/    \   /    \               CE = Customer Edge Device
     |  A3 CE3    --CE4 A4 |              PE = Provider Edge Router
      \    /         \    /               u-PE = Layer 2 Aggregation
       ----           ----                A = Customer site n

                      Figure 1: Example of a VPLS

2.1.  Terminology

   Terminology similar to that in [10] is used: a Service Provider (SP)
   network with P (Provider-only) and PE (Provider Edge) routers, and
   customers with CE (Customer Edge) devices. Here, however, there is
   an additional concept, that of a "u-PE", a Layer 2 PE device used for
   Layer 2 aggregation. The notion of u-PE is described further in
   Section 5. PE and u-PE devices are "VPLS-aware", which means that
   they know that a VPLS service is being offered. We will call these
   VPLS edge devices, which could be either a PE or a u-PE, a VE.

   In contrast, the CE device (which may be owned and operated by either
   the SP or the customer) is VPLS-unaware; as far as the CE is
   concerned, it is connected to the other CEs in the VPLS via a Layer 2
   switched network.
   This means that there should be no changes to a CE
   device, either to the hardware or the software, in order to offer
   VPLS.

   A CE device may be connected to a PE or a u-PE via Layer 2 switches
   that are VPLS-unaware. From a VPLS point of view, such Layer 2
   switches are invisible, and hence will not be discussed further.
   Furthermore, a u-PE may be connected to a PE via Layer 2 and Layer 3
   devices; this will be discussed further in a later section.

   The term "demultiplexor" refers to an identifier in a data packet
   that identifies both the VPLS to which the packet belongs as well as
   the ingress PE. In this document, the demultiplexor is an MPLS
   label.

   The term "VPLS" will refer to the service as well as a particular
   instantiation of the service (i.e., an emulated LAN); it should be
   clear from the context which usage is intended.

2.2.  Assumptions

   The Service Provider Network is a packet switched network. The PEs
   are assumed to be (logically) fully meshed with tunnels over which
   packets that belong to a service (such as VPLS) are encapsulated and
   forwarded. These tunnels can be IP tunnels, such as GRE, or MPLS
   tunnels, established by RSVP-TE or LDP. These tunnels are
   established independently of the services offered over them; the
   signaling and establishment of these tunnels are not discussed in
   this document.

   "Flooding" and MAC address "learning" (see Section 4) are an integral
   part of VPLS. However, these activities are private to an SP device,
   i.e., in the VPLS described below, no SP device requests another SP
   device to flood packets or learn MAC addresses on its behalf.

   All the PEs participating in a VPLS are assumed to be fully meshed in
   the data plane, i.e., there is a bidirectional pseudowire between
   every pair of PEs participating in that VPLS, and thus every
   (ingress) PE can send a VPLS packet to the egress PE(s) directly,
   without the need for an intermediate PE (see Section 4.2.5). This
   requires that VPLS PEs are logically fully meshed in the control
   plane so that a PE can send a message to another PE to set up the
   necessary pseudowires. See Section 3.6 for a discussion on
   alternatives to achieve a logical full mesh in the control plane.

2.3.  Interactions

   VPLS is a "LAN Service" in that CE devices that belong to VPLS V can
   interact through the SP network as if they were connected by a LAN.
   VPLS is "private" in that CE devices that belong to different VPLSs
   cannot interact. VPLS is "virtual" in that multiple VPLSs can be
   offered over a common packet switched network.

   PE devices interact to "discover" all the other PEs participating in
   the same VPLS, and to exchange demultiplexors. These interactions
   are control-driven, not data-driven.

   u-PEs interact with PEs to establish connections with remote PEs or
   u-PEs in the same VPLS. This interaction is control-driven.

   PE devices can participate simultaneously in both VPLS and IP VPNs
   ([10]). These are independent services, and the information
   exchanged for each type of service is kept separate as the Network
   Layer Reachability Information (NLRI) used for this exchange have
   different Address Family Identifiers (AFI) and Subsequent Address
   Family Identifiers (SAFI). Consequently, an implementation MUST
   maintain a separate routing storage for each service. However,
   multiple services can use the same underlying tunnels; the VPLS or
   VPN label is used to demultiplex the packets belonging to different
   services.

3.  Control Plane

   There are two primary functions of the VPLS control plane:
   autodiscovery, and setup and teardown of the pseudowires that
   constitute the VPLS, often called signaling. Section 3.1 and
   Section 3.2 describe these functions. Both of these functions are
   accomplished with a single BGP Update advertisement; Section 3.3
   describes how this is done by detailing BGP protocol operation for
   VPLS. Section 3.4 describes the setting up of pseudowires that span
   Autonomous Systems. Section 3.5 describes how multi-homing is
   handled.

3.1.  Autodiscovery

   Discovery refers to the process of finding all the PEs that
   participate in a given VPLS instance. A PE can either be configured
   with the identities of all the other PEs in a given VPLS, or the PE
   can use some protocol to discover the other PEs. The latter is
   called autodiscovery.

   The former approach is fairly configuration-intensive, especially
   since it is required that the PEs participating in a given VPLS are
   fully meshed (i.e., that every PE in a given VPLS establish
   pseudowires to every other PE in that VPLS). Furthermore, when the
   topology of a VPLS changes (i.e., a PE is added to, or removed from
   the VPLS), the VPLS configuration on all PEs in that VPLS must be
   changed.

   In the autodiscovery approach, each PE "discovers" which other PEs
   are part of a given VPLS by means of some protocol, in this case BGP.
   This allows each PE's configuration to consist only of the identity
   of the VPLS instance established on this PE, not the identity of
   every other PE in that VPLS instance -- that is auto-discovered.
   Moreover, when the topology of a VPLS changes, only the affected PE's
   configuration changes; other PEs automatically find out about the
   change and adapt.

3.1.1.  Functions

   A PE that participates in a given VPLS instance V must be able to
   tell all other PEs in VPLS V that it is also a member of V. A PE must
   also have a means of declaring that it no longer participates in a
   VPLS. To do both of these, the PE must have a means of identifying a
   VPLS and a means by which to communicate to all other PEs.

   u-PE devices also need to know what constitutes a given VPLS;
   however, they don't need the same level of detail. The PE (or PEs)
   to which a u-PE is connected gives the u-PE an abstraction of the
   VPLS; this is described in Section 5.

3.1.2.  Protocol Specification

   The specific mechanism for autodiscovery described here is based on
   [13] and [10]; it uses BGP extended communities [4] to identify
   members of a VPLS, in particular, the Route Target community, whose
   format is described in [4]. The semantics of the use of Route
   Targets is described in [10]; their use in VPLS is identical.

   As it has been assumed that VPLSs are fully meshed, a single Route
   Target RT suffices for a given VPLS V, and in effect that RT is the
   identifier for VPLS V.

   A PE announces (typically via I-BGP) that it belongs to VPLS V by
   annotating its NLRIs for V (see next subsection) with Route Target
   RT, and acts on this by accepting NLRIs from other PEs that have
   Route Target RT. A PE announces that it no longer participates in V
   by withdrawing all NLRIs that it had advertised with Route Target RT.

3.2.  Signaling

   Once discovery is done, each pair of PEs in a VPLS must be able to
   establish (and tear down) pseudowires to each other, i.e., exchange
   (and withdraw) demultiplexors. This process is known as signaling.
   Signaling is also used to transmit certain characteristics of the
   pseudowires that a PE sets up for a given VPLS.

   Recall that a demultiplexor is used to distinguish among several
   different streams of traffic carried over a tunnel, each stream
   possibly representing a different service. In the case of VPLS, the
   demultiplexor not only says to which specific VPLS a packet belongs,
   but also identifies the ingress PE. The former information is used
   for forwarding the packet; the latter information is used for
   learning MAC addresses. The demultiplexor described here is an MPLS
   label. However, note that the PE-to-PE tunnels need not be MPLS
   tunnels.

   Using a distinct BGP Update message to send a demultiplexor to each
   remote PE would require the originating PE to send N such messages
   for N remote PEs. The solution described in this document allows a
   PE to send a single (common) Update message that contains
   demultiplexors for all the remote PEs, instead of N individual
   messages. Doing this reduces the control plane load both on the
   originating PE as well as on the BGP Route Reflectors that may be
   involved in distributing this Update to other PEs.

3.2.1.  Label Blocks

   To accomplish this, we introduce the notion of "label blocks". A
   label block, defined by a label base LB and a VE block size VBS, is a
   contiguous set of labels {LB, LB+1, ..., LB+VBS-1}. Here's how label
   blocks work. All PEs within a given VPLS are assigned unique VE IDs
   as part of their configuration. A PE X wishing to send a VPLS update
   sends the same label block information to all other PEs. Each
   receiving PE infers the label intended for PE X by adding its
   (unique) VE ID to the label base. In this manner, each receiving PE
   gets a unique demultiplexor for PE X for that VPLS.

   This simple notion is enhanced with the concept of a VE block offset
   VBO. A label block defined by <LB, VBO, VBS> is the set {LB+VBO,
   LB+VBO+1, ..., LB+VBO+VBS-1}. Thus, instead of a single large label
   block to cover all VE IDs in a VPLS, one can have several label
   blocks, each with a different label base. This makes label block
   management easier, and also allows PE X to cater gracefully to a PE
   joining a VPLS with a VE ID that is not covered by the set of label
   blocks that PE X has already advertised.

   When a PE starts up, or is configured with a new VPLS instance, the
   BGP process may wish to wait to receive several advertisements for
   that VPLS instance from other PEs to improve the efficiency of label
   block allocation.

3.2.2.  VPLS BGP NLRI

   The VPLS BGP NLRI described below, with a new AFI and SAFI (see [3]),
   is used to exchange VPLS membership and demultiplexors.

   A VPLS BGP NLRI has the following information elements: a VE ID, a VE
   Block Offset, a VE Block Size and a label base. The format of the
   VPLS NLRI is given below. The AFI is the L2VPN AFI (to be assigned
   by IANA), and the SAFI is the VPLS SAFI (65). The Length field is in
   octets.

   +------------------------------------+
   |  Length (2 octets)                 |
   +------------------------------------+
   |  Route Distinguisher  (8 octets)   |
   +------------------------------------+
   |  VE ID (2 octets)                  |
   +------------------------------------+
   |  VE Block Offset (2 octets)        |
   +------------------------------------+
   |  VE Block Size (2 octets)          |
   +------------------------------------+
   |  Label Base (3 octets)             |
   +------------------------------------+

               Figure 2: BGP NLRI for VPLS Information

   A PE participating in a VPLS must have at least one VE ID. If the PE
   is the VE, it typically has one VE ID. If the PE is connected to
   several u-PEs, it has a distinct VE ID for each u-PE. It may
   additionally have a VE ID for itself, if it itself acts as a VE for
   that VPLS.
   In what follows, we will call the PE announcing the VPLS
   NLRI PE-a, and we will assume that PE-a owns VE ID V (either
   belonging to PE-a itself, or to a u-PE connected to PE-a).

   VE IDs are typically assigned by the network administrator. Their
   scope is local to a VPLS. A given VE ID should belong to only one
   PE, unless a CE is multi-homed (see Section 3.5).

   A label block is a set of demultiplexor labels used to reach a given
   VE ID. A VPLS BGP NLRI with VE ID V, VE Block Offset VBO, VE Block
   Size VBS and label base LB communicates to its peers the following:

      label block for V: labels from LB to (LB + VBS - 1), and

      remote VE set for V: from VBO to (VBO + VBS - 1).

   There is a one-to-one correspondence between the remote VE set and
   the label block: VE ID (VBO + n) corresponds to label (LB + n).

3.2.3.  PW Setup and Teardown

   Suppose PE-a is part of VPLS foo, and makes an announcement with VE
   ID V, VE Block Offset VBO, VE Block Size VBS and label base LB. If
   PE-b is also part of VPLS foo, and has VE ID W, PE-b does the
   following:

   1.  checks if W is part of PE-a's 'remote VE set': if VBO <= W < VBO
       + VBS, then W is part of PE-a's remote VE set. If not, PE-b
       ignores this message, and skips the rest of this procedure.

   2.  sets up a PW to PE-a: the demultiplexor label to send traffic
       from PE-b to PE-a is computed as (LB + W - VBO).

   3.  checks if V is part of any 'remote VE set' that PE-b announced,
       i.e., PE-b checks if V belongs to some remote VE set that PE-b
       announced, say with VE Block Offset VBO', VE Block Size VBS' and
       label base LB'. If not, PE-b MUST make a new announcement as
       described in Section 3.3.

   4.  sets up a PW from PE-a: the demultiplexor label over which PE-b
       should expect traffic from PE-a is computed as: (LB' + V - VBO').

   If Y withdraws an NLRI for V that X was using, then X MUST tear down
   its ends of the pseudowire between X and Y.
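
   The four steps above can be sketched in a few lines of Python. This
   is an illustrative sketch only, not part of the protocol
   specification: the names VplsNlri, covers, label_for and pw_labels
   are hypothetical, and only the NLRI fields that matter for label
   selection are modeled (the Route Distinguisher, Route Target and
   BGP machinery are omitted).

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class VplsNlri:
    """Hypothetical holder for the label-related fields of a VPLS NLRI."""
    ve_id: int       # VE ID of the announcing VE (V)
    vbo: int         # VE Block Offset
    vbs: int         # VE Block Size
    label_base: int  # LB


def covers(nlri: VplsNlri, ve_id: int) -> bool:
    """True if ve_id is in the remote VE set: VBO <= ve_id < VBO + VBS."""
    return nlri.vbo <= ve_id < nlri.vbo + nlri.vbs


def label_for(nlri: VplsNlri, ve_id: int) -> int:
    """Label matching ve_id in the announced block: LB + ve_id - VBO."""
    if not covers(nlri, ve_id):
        raise ValueError("VE ID is outside the remote VE set")
    return nlri.label_base + ve_id - nlri.vbo


def pw_labels(
    remote: VplsNlri,
    local_ve_id: int,
    local_announcements: Sequence[VplsNlri],
) -> Optional[Tuple[int, Optional[int]]]:
    """Run steps 1-4 on PE-b for one announcement received from PE-a.

    Returns None if step 1 fails (the announcement is ignored).
    Otherwise returns (tx_label, rx_label); rx_label is None when no
    block announced by PE-b covers PE-a's VE ID, in which case PE-b
    has to make a new announcement before it can expect traffic.
    """
    # Step 1: is W in PE-a's remote VE set?
    if not covers(remote, local_ve_id):
        return None
    # Step 2: label for traffic from PE-b to PE-a.
    tx_label = label_for(remote, local_ve_id)
    # Step 3: does any block announced by PE-b cover V?
    for mine in local_announcements:
        if covers(mine, remote.ve_id):
            # Step 4: label on which PE-b expects traffic from PE-a.
            return tx_label, label_for(mine, remote.ve_id)
    return tx_label, None
```

   For example, under these assumptions, if PE-a announces V = 1,
   VBO = 1, VBS = 8, LB = 1000 and PE-b (W = 3) has announced VBO' = 1,
   VBS' = 8, LB' = 2000, then PE-b sends to PE-a with label
   1000 + 3 - 1 = 1002 and expects traffic from PE-a on label
   2000 + 1 - 1 = 2000.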

3.2.4.  Signaling PE Capabilities

   The following extended attribute, the "Layer2 Info Extended
   Community", is used to signal control information about the
   pseudowires to be set up for a given VPLS. This information includes
   the Encaps Type (type of encapsulation on the pseudowires), Control
   Flags (control information regarding the pseudowires) and the Maximum
   Transmission Unit (MTU) to be used on the pseudowires.

   The Encaps Type for VPLS is 19.

   +------------------------------------+
   | Extended community type (2 octets) |
   +------------------------------------+
   |  Encaps Type (1 octet)             |
   +------------------------------------+
   |  Control Flags (1 octet)           |
   +------------------------------------+
   |  Layer-2 MTU (2 octets)            |
   +------------------------------------+
   |  Reserved (2 octets)               |
   +------------------------------------+

               Figure 3: Layer2 Info Extended Community

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |   MBZ     |C|S|      (MBZ = MUST Be Zero)
   +-+-+-+-+-+-+-+-+

               Figure 4: Control Flags Bit Vector

   With reference to Figure 4, the following bits in the Control Flags
   are defined; the remaining bits, designated MBZ, MUST be set to zero
   when sending and MUST be ignored when receiving this community.

   Name   Meaning
   C      A Control word ([5]) MUST or MUST NOT be present when
          sending VPLS packets to this PE, depending on whether C
          is 1 or 0, respectively
   S      Sequenced delivery of frames MUST or MUST NOT be used
          when sending VPLS packets to this PE, depending on
          whether S is 1 or 0, respectively

3.3.  BGP VPLS Operation

   To create a new VPLS, say VPLS foo, a network administrator must pick
   a RT for VPLS foo, say RT-foo. This will be used by all PEs that
   serve VPLS foo. To configure a given PE, say PE-a, to be part of
   VPLS foo, the network administrator only has to choose a VE ID V for
   PE-a.
   (If PE-a is connected to u-PEs, PE-a may be configured with
   more than one VE ID; in that case, the following is done for each VE
   ID). The PE may also be configured with a Route Distinguisher (RD);
   if not, it generates a unique RD for VPLS foo. Say the RD is
   RD-foo-a. PE-a then generates an initial label block and a remote VE
   set for V, defined by VE Block Offset VBO, VE Block Size VBS and
   label base LB. These may be empty.

   PE-a then creates a VPLS BGP NLRI with RD RD-foo-a, VE ID V, VE Block
   Offset VBO, VE Block Size VBS and label base LB. To this, it
   attaches a Layer2 Info Extended Community and a RT, RT-foo. It sets
   the BGP Next Hop for this NLRI as itself, and announces this NLRI to
   its peers. The Network Layer protocol associated with the Network
   Address of the Next Hop for the combination <AFI=L2VPN AFI,
   SAFI=VPLS SAFI> is IP; this association is required by [3],
   Section 5. If the value of the Length of the Next Hop field is 4,
   then the Next Hop contains an IPv4 address. If this value is 16,
   then the Next Hop contains an IPv6 address.

   If PE-a hears from another PE, say PE-b, a VPLS BGP announcement with
   RT-foo and VE ID W, then PE-a knows that PE-b is a member of the same
   VPLS (autodiscovery). PE-a then has to set up its part of a VPLS
   pseudowire between PE-a and PE-b, using the mechanisms in
   Section 3.2. Similarly, PE-b will have discovered that PE-a is in
   the same VPLS, and PE-b must set up its part of the VPLS pseudowire.
   Thus, signaling and pseudowire setup are also achieved with the same
   Update message.

   If W is not in any remote VE set that PE-a announced for VE ID V in
   VPLS foo, PE-b will not be able to set up its part of the pseudowire
   to PE-a. To address this, PE-a can choose to withdraw the old
   announcement(s) it made for VPLS foo, and announce a new Update with
   a larger remote VE set and corresponding label block that covers all
   VE IDs that are in VPLS foo.
   This, however, may cause some service disruption. An alternative for
   PE-a is to create a new remote VE set and corresponding label block,
   and announce them in a new Update, without withdrawing previous
   announcements.

   If PE-a's configuration is changed to remove VE ID V from VPLS foo,
   then PE-a MUST withdraw all its announcements for VPLS foo that
   contain VE ID V. If all of PE-a's links to its CEs in VPLS foo go
   down, then PE-a SHOULD either withdraw all its NLRIs for VPLS foo, or
   let other PEs in VPLS foo know in some way that PE-a is no longer
   connected to its CEs.

3.4. Multi-AS VPLS

   As in [13] and [10], the above autodiscovery and signaling functions
   are typically announced via I-BGP. This assumes that all the sites
   in a VPLS are connected to PEs in a single Autonomous System (AS).

   However, sites in a VPLS may connect to PEs in different ASes. This
   leads to two issues: 1) there would not be an I-BGP connection
   between those PEs, so some means of signaling across ASes is needed;
   and 2) there may not be PE-to-PE tunnels between the ASes.

   A similar problem is solved in [10], Section 10. Three methods are
   suggested to address issue (1); all these methods have analogs in
   multi-AS VPLS.

   Here is a diagram for reference:

     __________       ____________       ____________       __________
    /          \     /            \     /            \     /          \
                \___/        AS 1  \   /  AS 2        \___/
                                    \ /
      +-----+           +-------+    |    +-------+           +-----+
      | PE1 | ---...--- | ASBR1 | ======= | ASBR2 | ---...--- | PE2 |
      +-----+           +-------+    |    +-------+           +-----+
                 ___                / \                ___
                /   \              /   \              /   \
    \__________/     \____________/     \____________/     \__________/

                       Figure 6: Inter-AS VPLS

   As in the above reference, three methods for signaling inter-provider
   VPLS are given; these are presented in order of increasing
   scalability.
   Method (a) is the easiest to understand conceptually, and the
   easiest to deploy; however, it requires an Ethernet interconnect
   between the ASes, and both VPLS control and data plane state on the
   AS border routers (ASBRs). Method (b) requires VPLS control plane
   state on the ASBRs and MPLS on the AS-AS interconnect (which need not
   be Ethernet). Method (c) requires MPLS on the AS-AS interconnect,
   but no VPLS state of any kind on the ASBRs.

3.4.1. a) VPLS-to-VPLS connections at the ASBRs.

   In this method, an AS Border Router (ASBR1) acts as a PE for all
   VPLSs that span AS1 and an AS to which ASBR1 is connected, such as
   AS2 here. The ASBR on the neighboring AS (ASBR2) is viewed by ASBR1
   as a CE for the VPLSs that span AS1 and AS2; similarly, ASBR2 acts as
   a PE for this VPLS from AS2's point of view, and views ASBR1 as a CE.

   This method does not require MPLS on the ASBR1-ASBR2 link, but does
   require that this link carry Ethernet traffic, and that there be a
   separate VLAN sub-interface for each VPLS traversing this link. It
   further requires that ASBR1 does the PE operations (discovery,
   signaling, MAC address learning, flooding, encapsulation, etc.) for
   all VPLSs that traverse ASBR1. This imposes a significant burden on
   ASBR1, both on the control plane and the data plane, which limits the
   number of multi-AS VPLSs.

   Note that in general, there will be multiple connections between a
   pair of ASes, for redundancy. In this case, the Spanning Tree
   Protocol (STP) ([14]), or some other means of loop detection and
   prevention, must be run on each VPLS that spans these ASes, so that a
   loop-free topology can be constructed in each VPLS. This imposes a
   further burden on the ASBRs and PEs participating in those VPLSs, as
   these devices would need to run a loop detection algorithm for each
   such VPLS. How this may be achieved is outside the scope of this
   document.

3.4.2.
b) EBGP redistribution of VPLS information between ASBRs.

   This method requires I-BGP peerings between the PEs in AS1 and ASBR1
   in AS1 (perhaps via route reflectors), an E-BGP peering between ASBR1
   and ASBR2 in AS2, and I-BGP peerings between ASBR2 and the PEs in
   AS2. In the above example, PE1 sends a VPLS NLRI to ASBR1 with a
   label block and itself as the BGP nexthop; ASBR1 sends the NLRI to
   ASBR2 with new labels and itself as the BGP nexthop; and ASBR2 sends
   the NLRI to PE2 with new labels and itself as the nexthop.

   The VPLS NLRI that ASBR1 sends to ASBR2 (and the NLRI that ASBR2
   sends to PE2) is identical to the VPLS NLRI that PE1 sends to ASBR1,
   except for the label block. To be precise, the Length, the Route
   Distinguisher, the VE ID, the VE Block Offset, and the VE Block Size
   MUST be the same; the Label Base may be different. Furthermore,
   ASBR1 must also update its forwarding path as follows: if the Label
   Base sent by PE1 is L1, the Label-block Size is N, the Label Base
   sent by ASBR1 is L2, and the tunnel label from ASBR1 to PE1 is T,
   then ASBR1 must install the following in the forwarding path:

      swap L2 with L1 and push T,

      swap L2+1 with L1+1 and push T, ...

      swap L2+N-1 with L1+N-1 and push T.

   ASBR2 must act similarly, except that it may not need a tunnel label
   if it is directly connected with ASBR1.

   When PE2 wants to send a VPLS packet to PE1, PE2 uses its VE ID to
   get the right VPLS label from ASBR2's label block for PE1, and uses a
   tunnel label to reach ASBR2. ASBR2 swaps the VPLS label with the
   label from ASBR1; ASBR1 then swaps the VPLS label with the label from
   PE1, and pushes a tunnel label to reach PE1.

   In this method, one needs MPLS on the ASBR1-ASBR2 interface, but
   there is no requirement that the link layer be Ethernet.
   Furthermore, the ASBRs take part in distributing VPLS information.
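
   As an illustration only, the swap entries that an ASBR installs in
   method (b) can be sketched in Python as follows. The function name
   asbr_swap_entries and the dictionary used to represent the
   forwarding path are assumptions made for this sketch; they are not
   defined by this document. An ASBR that is directly connected to its
   peer, and so needs no tunnel label, is modeled by passing None.

```python
from typing import Dict, Optional, Tuple


def asbr_swap_entries(
    l1: int, l2: int, n: int, tunnel_label: Optional[int]
) -> Dict[int, Tuple[int, Optional[int]]]:
    """Build the swap entries for one re-advertised label block.

    l1           -- Label Base received from the upstream speaker (L1)
    l2           -- Label Base this ASBR advertises onward (L2)
    n            -- size of the label block (N)
    tunnel_label -- tunnel label T to push, or None if none is needed

    Returns a mapping: incoming label L2+i -> (outgoing label L1+i, T).
    """
    if n < 0:
        raise ValueError("label block size must not be negative")
    return {l2 + i: (l1 + i, tunnel_label) for i in range(n)}


if __name__ == "__main__":
    # Hypothetical values chosen only for this example.
    entries = asbr_swap_entries(l1=1000, l2=5000, n=4, tunnel_label=17)
    for incoming, (outgoing, tunnel) in entries.items():
        print(f"swap {incoming} with {outgoing} and push {tunnel}")
```

   The mapping is keyed by the label that the ASBR itself advertised,
   since that is the label carried by packets arriving at the ASBR.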
   However, the data plane requirements of the ASBRs are much simpler
   than in method (a), being limited to label operations. Finally, the
   construction of loop-free VPLS topologies is done by routing
   decisions, viz. BGP path and nexthop selection, so there is no need
   to run the Spanning Tree Protocol on a per-VPLS basis. Thus, this
   method is considerably more scalable than method (a).

3.4.3. c) Multi-hop EBGP redistribution of VPLS information between
       ASes.

   In this method, there is a multi-hop E-BGP peering between the PEs
   (or preferably, a Route Reflector) in AS1 and the PEs (or Route
   Reflector) in AS2. PE1 sends a VPLS NLRI with labels and nexthop
   self to PE2; if this is via route reflectors, the BGP nexthop is not
   changed. This requires that there be a tunnel LSP from PE1 to PE2.
   This tunnel LSP can be created exactly as in [10], section 10 (c),
   for example using E-BGP to exchange labeled IPv4 routes for the PE
   loopbacks.

   When PE1 wants to send a VPLS packet to PE2, it pushes the VPLS label
   corresponding to its own VE ID onto the packet. It then pushes the
   tunnel label(s) to reach PE2.

   This method requires no VPLS information (in either the control or
   the data plane) on the ASBRs. The ASBRs only need to set up PE-to-PE
   tunnel LSPs in the control plane, and do label operations in the data
   plane. Again, as in the case of method (b), the construction of
   loop-free VPLS topologies is done by routing decisions, i.e., BGP
   path and nexthop selection, so there is no need to run the Spanning
   Tree Protocol on a per-VPLS basis. This option is likely to be the
   most scalable of the three methods presented here.

3.4.4. Allocation of VE IDs Across Multiple ASes

   In order to ease the allocation of VE IDs for a VPLS that spans
   multiple ASes, one can allocate ranges for each AS. For example, AS1
   uses VE IDs in the range 1 to 100, AS2 from 101 to 200, etc.
   If there are 10 sites attached to AS1 and 20 to AS2, the allocated VE
   IDs could be 1 to 10 and 101 to 120. This minimizes the number of
   VPLS NLRIs that are exchanged while ensuring that VE IDs are kept
   unique.

   In the above example, if AS1 needed more than 100 sites, then another
   range can be allocated to AS1. The only caveat is that there be no
   overlap between VE ID ranges among ASes. The exception to this rule
   is multi-homing, which is dealt with below.

3.5. Multi-homing and Path Selection

   It is often desired to multi-home a VPLS site, i.e., to connect it to
   multiple PEs, perhaps even in different ASes. In such a case, the
   PEs connected to the same site can either be configured with the same
   VE ID or with different VE IDs. In the latter case, it is mandatory
   to run STP on the CE device, and possibly on the PEs, to construct a
   loop-free VPLS topology. How this can be accomplished is outside the
   scope of this document; however, the rest of this section will
   describe in some detail the former case.

   In the case where the PEs connected to the same site are assigned the
   same VE ID, a loop-free topology is constructed by routing
   mechanisms, in particular, by BGP path selection. When a BGP speaker
   receives two equivalent NLRIs (see below for the definition), it
   applies standard path selection criteria such as Local Preference and
   AS Path Length to determine which NLRI to choose; it MUST pick only
   one. If the chosen NLRI is subsequently withdrawn, the BGP speaker
   applies path selection to the remaining equivalent VPLS NLRIs to pick
   another; if none remain, the forwarding information associated with
   that NLRI is removed.

   Two VPLS NLRIs are considered equivalent from a path selection point
   of view if the Route Distinguisher, the VE ID and the VE Block Offset
   are the same.
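
   The equivalence rule and the "pick only one" requirement can be
   sketched in Python as follows. This is a minimal sketch under
   stated assumptions, not the full BGP decision process: the class
   VplsNlri, the functions equivalence_key, rank and select_best, and
   the simplified comparison (highest Local Preference, then shortest
   AS path, then lowest next hop string as an arbitrary tie-breaker)
   are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class VplsNlri:
    rd: str                  # Route Distinguisher
    ve_id: int
    ve_block_offset: int
    ve_block_size: int
    label_base: int
    # Path attributes below are simplified stand-ins for illustration.
    local_pref: int = 100
    as_path_len: int = 0
    next_hop: str = ""


def equivalence_key(n: VplsNlri) -> Tuple[str, int, int]:
    """NLRIs with the same RD, VE ID and VE Block Offset are equivalent."""
    return (n.rd, n.ve_id, n.ve_block_offset)


def rank(n: VplsNlri) -> Tuple[int, int, str]:
    """Lower tuples win: higher Local Preference, then shorter AS path."""
    return (-n.local_pref, n.as_path_len, n.next_hop)


def select_best(nlris: Iterable[VplsNlri]) -> Dict[Tuple[str, int, int], VplsNlri]:
    """Keep exactly one NLRI per equivalence class."""
    best: Dict[Tuple[str, int, int], VplsNlri] = {}
    for n in nlris:
        key = equivalence_key(n)
        current = best.get(key)
        if current is None or rank(n) < rank(current):
            best[key] = n
    return best
```

   Handling a withdrawal amounts to calling select_best again on the
   NLRIs that remain; an equivalence class with no remaining NLRI
   simply disappears from the result, which corresponds to removing
   the associated forwarding information.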
   If two PEs are assigned the same VE ID in a given VPLS, they MUST
   use the same Route Distinguisher, and they SHOULD announce the same
   VE Block Size for a given VE Block Offset.

3.6. Hierarchical BGP VPLS

   This section discusses how one can scale the VPLS control plane when
   using BGP. There are at least three aspects of scaling the control
   plane:

   1. alleviating the full mesh connectivity requirement among VPLS BGP
      speakers;

   2. limiting BGP VPLS message passing to just the interested speakers
      rather than all BGP speakers; and

   3. simplifying the addition and deletion of BGP speakers, whether
      for VPLS or other applications.

   Fortunately, the use of BGP for Internet routing as well as for IP
   VPNs has yielded several good solutions for all these problems. The
   basic technique is hierarchy, using BGP Route Reflectors (RRs) ([6]).
   The idea is to designate a small set of Route Reflectors which are
   themselves fully meshed, and then establish a BGP session between
   each BGP speaker and one or more RRs. In this way, there is no need
   of direct full mesh connectivity among all the BGP speakers. If the
   particular scaling needs of a provider require a large number of
   RRs, then this technique can be applied recursively: the full mesh
   connectivity among the RRs can be brokered by yet another level of
   RRs. The use of RRs solves problems 1 and 3 above.

   It is important to note that RRs, as used for VPLS and VPNs, are
   purely a control plane technique. The use of RRs introduces no data
   plane state and no data plane forwarding requirements on the RRs, and
   does not in any way change the forwarding path of VPLS traffic. This
   is in contrast to the technique of Hierarchical VPLS defined in [8].

   Another consequence of this approach is that it is not required that
   one set of RRs handles all BGP messages, or that a particular RR
   handle all messages from a given PE.
   One can define several sets of RRs, for example a set to handle
   VPLS, another to handle IP VPNs and another for Internet routing.
   Another partitioning could be to have some subset of VPLSs and IP
   VPNs handled by one set of RRs, and another subset of VPLSs and IP
   VPNs handled by another set of RRs; the use of Route Target Filtering
   (RTF), described in [11], can make this simpler and more effective.

   Finally, problem 2 (that of limiting BGP VPLS message passing to just
   the interested BGP speakers) is addressed by the use of RTF. This
   technique is orthogonal to the use of RRs, but works well in
   conjunction with RRs. RTF is also very effective in inter-AS VPLS;
   more details on how RTF works and its benefits are provided in [11].

   It is worth mentioning an aspect of the control plane that is often a
   source of confusion. No MAC addresses are exchanged via BGP. All
   MAC address learning and aging are done in the data plane
   individually by each PE. The only task of BGP VPLS message exchange
   is autodiscovery and label exchange.

   Thus, BGP processing for VPLS occurs when

   1. a PE joins or leaves a VPLS; or

   2. a failure occurs in the network, bringing down a PE-PE tunnel or
      a PE-CE link.

   These events are relatively rare, and typically, each such event
   causes one BGP update to be generated. Coupled with BGP's messaging
   efficiency when used for signaling VPLS, these observations lead to
   the conclusion that BGP as a control plane for VPLS will scale quite
   well both in terms of processing and memory requirements.

4. Data Plane

   This section discusses two aspects of the data plane for PEs and
   u-PEs implementing VPLS: encapsulation and forwarding.

4.1. Encapsulation

   Ethernet frames received from CE devices are encapsulated for
   transmission over the packet switched network connecting the PEs.
   The encapsulation is as in [5], with one change: a PE that sets the P
   bit in the Control Flags strips the outermost VLAN from an Ethernet
   frame received from a CE before encapsulating it, and pushes a VLAN
   onto a decapsulated frame before sending it to a CE.

4.2. Forwarding

   VPLS packets are classified as belonging to a given service instance
   and associated forwarding table based on the interface over which the
   packet is received. Packets are forwarded in the context of the
   service instance based on the destination MAC address. The former
   mapping is determined by configuration. The latter is the focus of
   this section.

4.2.1. MAC address learning

   As was mentioned earlier, the key distinguishing feature of VPLS is
   that it is a multipoint service. This means that the entire Service
   Provider network should appear as a single logical learning bridge
   for each VPLS that the SP network supports. The logical ports for
   the SP "bridge" are the customer ports as well as the pseudowires on
   a VE. Just as a learning bridge learns MAC addresses on its ports,
   the SP bridge must learn MAC addresses at its VEs.

   Learning consists of associating source MAC addresses of packets with
   the (logical) ports on which they arrive; this association is the
   Forwarding Information Base (FIB). The FIB is used for forwarding
   packets. For example, suppose the bridge receives a packet with
   source MAC address S on (logical) port P. If subsequently, the bridge
   receives a packet with destination MAC address S, it knows that it
   should send the packet out on port P.

   If a VE learns a source MAC address S on logical port P, then later
   sees S on a different port P', then the VE MUST update its FIB to
   reflect the new port P'. A VE MAY implement a mechanism to damp
   flapping of source ports for a given MAC address.

4.2.2.
Aging

   VPLS PEs SHOULD have an aging mechanism to remove a MAC address
   associated with a logical port, much the same as learning bridges do.
   This is required so that a MAC address can be relearned if it "moves"
   from a logical port to another logical port, either because the
   station to which that MAC address belongs really has moved, or
   because of a topology change in the LAN that causes this MAC address
   to arrive on a new port. In addition, aging reduces the size of a
   VPLS MAC table to just the active MAC addresses, rather than all MAC
   addresses in that VPLS.

   The "age" of a source MAC address S on a logical port P is the time
   since it was last seen as a source MAC on port P. If the age exceeds
   the aging time T, S MUST be flushed from the FIB. This of course
   means that every time S is seen as a source MAC address on port P,
   S's age is reset.

   An implementation SHOULD provide a configurable knob to set the aging
   time T on a per-VPLS basis. In addition, an implementation MAY
   accelerate aging of all MAC addresses in a VPLS if it detects certain
   situations, such as a Spanning Tree topology change in that VPLS.

4.2.3. Flooding

   When a bridge receives a packet to a destination that is not in its
   FIB, it floods the packet on all the other ports. Similarly, a VE
   will flood packets to an unknown destination to all other VEs in the
   VPLS.

   In Figure 1 above, if CE2 sent an Ethernet frame to PE2, and the
   destination MAC address on the frame was not in PE2's FIB (for that
   VPLS), then PE2 would be responsible for flooding that frame to every
   other PE in the same VPLS. On receiving that frame, PE1 would be
   responsible for further flooding the frame to CE1 and CE5 (unless PE1
   knew which CE "owned" that MAC address).

   On the other hand, if PE3 received the frame, it could delegate
   further flooding of the frame to its u-PE.
   If PE3 were connected to two u-PEs, it would announce that it has
   two u-PEs. PE3 could either announce that it is incapable of
   flooding, in which case it would receive two frames, one for each
   u-PE, or it could announce that it is capable of flooding, in which
   case it would receive one copy of the frame, which it would then send
   to both u-PEs.

4.2.4. Broadcast and Multicast

   There is a well-known broadcast MAC address. An Ethernet frame whose
   destination MAC address is the broadcast MAC address must be sent to
   all stations in that VPLS. This can be accomplished by the same
   means that is used for flooding.

   There is also an easily recognized set of "multicast" MAC addresses.
   Ethernet frames with a destination multicast MAC address MAY be
   broadcast to all stations; a VE MAY also use certain techniques to
   restrict transmission of multicast frames to a smaller set of
   receivers, those that have indicated interest in the corresponding
   multicast group. Discussion of this is outside the scope of this
   document.

4.2.5. "Split Horizon" Forwarding

   When a PE capable of flooding (say PEx) receives a broadcast Ethernet
   frame, or one with an unknown destination MAC address, it must flood
   the frame. If the frame arrived from an attached CE, PEx must send a
   copy of the frame to every other attached CE, as well as to all other
   PEs participating in the VPLS. If, on the other hand, the frame
   arrived from another PE (say PEy), PEx must send a copy of the packet
   only to attached CEs. PEx MUST NOT send the frame to other PEs,
   since PEy would have already done so. This notion has been termed
   "split horizon" forwarding, and is a consequence of the PEs being
   logically fully meshed for VPLS.

   Split horizon forwarding rules apply to broadcast and multicast
   packets, as well as packets to an unknown MAC address.

4.2.6.
Qualified and Unqualified Learning

   The key for normal Ethernet MAC learning is usually just the
   (6-octet) MAC address. This is called "unqualified learning".
   However, it is also possible that the key for learning includes the
   VLAN tag when present; this is called "qualified learning".

   In the case of VPLS, learning is done in the context of a VPLS
   instance, which typically corresponds to a customer. If the customer
   uses VLAN tags, one can make the same distinctions of qualified and
   unqualified learning. If the key for learning within a VPLS is just
   the MAC address, then this VPLS is operating under unqualified
   learning. If the key for learning is (customer VLAN tag + MAC
   address), then this VPLS is operating under qualified learning.

   Choosing between qualified and unqualified learning involves several
   factors, the most important of which is whether one wants a single
   global broadcast domain (unqualified), or a broadcast domain per VLAN
   (qualified). The latter makes flooding and broadcasting more
   efficient, but requires larger MAC tables. These considerations
   apply equally to normal Ethernet forwarding and to VPLS.

4.2.7. Class of Service

   In order to offer different Classes of Service within a VPLS, an
   implementation MAY choose to map 802.1p bits in a customer Ethernet
   frame with a VLAN tag to an appropriate setting of EXP bits in the
   pseudowire and/or tunnel label, allowing for differential treatment
   of VPLS frames in the packet-switched network.

   To be useful, an implementation SHOULD allow this mapping function to
   be different for each VPLS, as each VPLS customer may have their own
   view of the required behavior for a given setting of 802.1p bits.

5. Deployment Options

   In deploying a network that supports VPLS, the SP must decide what
   functions the VPLS-aware device closest to the customer (the VE)
   supports.
   The default case described in this document is that the VE is a PE.
   However, there are a number of reasons that the VE might be a device
   that does all the Layer 2 functions (such as MAC address learning and
   flooding), and a limited set of Layer 3 functions (such as
   communicating to its PE), but, for example, doesn't do full-fledged
   discovery and PE-to-PE signaling. Such a device is called a "u-PE".

   As both of these cases have benefits, one would like to be able to
   "mix and match" these scenarios. The signaling mechanism presented
   here allows this. For example, in a given provider network, one PE
   may be directly connected to CE devices; another may be connected to
   u-PEs that are connected to CEs; and a third may be connected
   directly to a customer over some interfaces and to u-PEs over others.
   All these PEs perform discovery and signaling in the same manner.
   How they do learning and forwarding depends on whether or not there
   is a u-PE; however, this is a local matter, and is not signaled.
   The details of the operation of a u-PE and its interactions with PEs
   and other u-PEs are beyond the scope of this document.

6. Security Considerations

   The focus in Virtual Private LAN Service is the privacy of data,
   i.e., that data in a VPLS is only distributed to other nodes in that
   VPLS and not to any external agent or other VPLS. Note that VPLS
   does not offer security or authentication: VPLS packets are sent in
   the clear in the packet-switched network, and a man-in-the-middle can
   eavesdrop, and may be able to inject packets into the data stream.
   If security is desired, the PE-to-PE tunnels can be IPsec tunnels.
   For more security, the end systems in the VPLS sites can use
   appropriate means of encryption to secure their data even before it
   enters the Service Provider network.

   There are two aspects to achieving data privacy in a VPLS: securing
   the control plane, and protecting the forwarding path. Compromise of
   the control plane could result in a PE sending data belonging to some
   VPLS to another VPLS, or blackholing VPLS data, or even sending it to
   an eavesdropper, none of which are acceptable from a data privacy
   point of view. Since all control plane exchanges are via BGP,
   techniques such as in [2] help authenticate BGP messages, making it
   harder to spoof updates (which can be used to divert VPLS traffic to
   the wrong VPLS), or withdraws (denial of service attacks). In the
   multi-AS options (b) and (c), this also means protecting the inter-AS
   BGP sessions, between the ASBRs, the PEs or the Route Reflectors.
   Note that [2] will not help in keeping VPLS labels private -- knowing
   the labels, one can eavesdrop on VPLS traffic. However, this
   requires access to the data path within a Service Provider network.

   Protecting the data plane requires ensuring that PE-to-PE tunnels are
   well-behaved (this is outside the scope of this document), and that
   VPLS labels are accepted only from valid interfaces. For a PE, valid
   interfaces comprise links from P routers. For an ASBR, a valid
   interface is a link from an ASBR in an AS that is part of a given
   VPLS. It is especially important in the case of multi-AS VPLSs that
   one accept VPLS packets only from valid interfaces.

7. IANA Considerations

   IANA is asked to allocate an AFI for L2VPN information (suggested
   value: 25). [NOTE to IANA: This should be the same as the AFI
   requested by [9].]

8. References

8.1. Normative References

   [1]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
         Levels", BCP 14, RFC 2119, March 1997.

   [2]   Heffernan, A., "Protection of BGP Sessions via the TCP MD5
         Signature Option", RFC 2385, August 1998.

   [3]   Bates, T., "Multiprotocol Extensions for BGP-4",
         draft-ietf-idr-rfc2858bis-07 (work in progress), August 2005.

   [4]   Rekhter, Y., "BGP Extended Communities Attribute",
         draft-ietf-idr-bgp-ext-communities-09 (work in progress),
         July 2005.

   [5]   Martini, L., "Encapsulation Methods for Transport of Ethernet
         Over MPLS Networks", draft-ietf-pwe3-ethernet-encap-11 (work in
         progress), December 2005.

8.2. Informative References

   [6]   Bates, T., Chandra, R., and E. Chen, "BGP Route Reflection - An
         Alternative to Full Mesh IBGP", RFC 2796, April 2000.

   [7]   Andersson, L. and E. Rosen, "Framework for Layer 2 Virtual
         Private Networks (L2VPNs)", draft-ietf-l2vpn-l2-framework-05
         (work in progress), June 2004.

   [8]   Lasserre, M. and V. Kompella, "Virtual Private LAN Services
         over MPLS", draft-ietf-l2vpn-vpls-ldp-08 (work in progress),
         November 2005.

   [9]   Ould-Brahim, H., "Using BGP as an Auto-Discovery Mechanism for
         Layer-3 and Layer-2 VPNs", draft-ietf-l3vpn-bgpvpn-auto-06
         (work in progress), June 2005.

   [10]  Rosen, E., "BGP/MPLS IP VPNs", draft-ietf-l3vpn-rfc2547bis-03
         (work in progress), October 2004.

   [11]  Marques, P., "Constrained VPN Route Distribution",
         draft-ietf-l3vpn-rt-constrain-02 (work in progress), June 2005.

   [12]  Martini, L., "Pseudowire Setup and Maintenance using the Label
         Distribution Protocol", draft-ietf-pwe3-control-protocol-17
         (work in progress), June 2005.

   [13]  Kompella, K., "Layer 2 VPNs Over Tunnels",
         draft-kompella-l2vpn-l2vpn-00 (work in progress), January 2004.

   [14]  Institute of Electrical and Electronics Engineers, "Information
         technology - Telecommunications and information exchange
         between systems - Local and metropolitan area networks - Common
         specifications - Part 3: Media Access Control (MAC) Bridges:
         Revision. This is a revision of ISO/IEC 10038: 1993,
         802.1j-1992 and 802.6k-1992.
         It incorporates P802.11c, P802.1p and P802.12e. ISO/IEC
         15802-3: 1998.", IEEE Standard 802.1D, July 1998.

Appendix A. Contributors

   The following contributed to this document:

      Javier Achirica, Telefonica
      Loa Andersson, Acreo
      Chaitanya Kodeboyina, Juniper
      Giles Heron, Tellabs
      Sunil Khandekar, Alcatel
      Vach Kompella, Alcatel
      Marc Lasserre, Riverstone
      Pierre Lin
      Pascal Menezes
      Ashwin Moranganti, Appian
      Hamid Ould-Brahim, Nortel
      Seo Yeong-il, Korea Tel

Appendix B. Acknowledgements

   Thanks to Joe Regan and Alfred Nothaft for their contributions. Many
   thanks too to Eric Ji, Chaitanya Kodeboyina, Mike Loomis and Elwyn
   Davies for their detailed reviews.

Authors' Addresses

   Kireeti Kompella (editor)
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA 94089
   US

   Email: kireeti@juniper.net

   Yakov Rekhter (editor)
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA 94089
   US

   Email: yakov@juniper.net

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.