idnits 2.17.1 draft-ietf-l2vpn-vpls-bgp-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3667, Section 5.1 on line 15. -- Found old boilerplate from RFC 3978, Section 5.5 on line 817. ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 809), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 38. ** The document claims conformance with section 10 of RFC 2026, but uses some RFC 3978/3979 boilerplate. As RFC 3978/3979 replaces section 10 of RFC 2026, you should not claim conformance with it if you have changed to using RFC 3978/3979 boilerplate. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? ( - It does however have an RFC 2026 Section 10.4(A) Disclaimer.) ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure Acknowledgement. ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR Disclosure Invitation -- however, there's a paragraph with a matching beginning. Boilerplate error? 
( - It does however have an RFC 2026 Section 10.4(B) IPR Disclosure Invitation.) ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, or will be disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document is more than 15 pages and seems to lack a Table of Contents. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (January 2005) is 7033 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: '1' is mentioned on line 58, but not defined == Missing Reference: '2' is mentioned on line 68, but not defined == Missing Reference: '3' is mentioned on line 399, but not defined == Missing Reference: '4' is mentioned on line 79, but not defined == Missing Reference: '5' is mentioned on line 85, but not defined == Missing Reference: '6' is mentioned on line 295, but not defined == Missing Reference: '7' is mentioned on line 502, but not defined == Missing Reference: '8' is mentioned on line 262, but not defined == Missing Reference: '9' is mentioned on line 263, but not defined == Outdated reference: A later version (-11) exists of draft-ietf-pwe3-ethernet-encap-06 ** Obsolete normative reference: RFC 2385 (ref. '11') (Obsoleted by RFC 5925) Summary: 12 errors (**), 0 flaws (~~), 11 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group K. Kompella (Editor) 3 Internet Draft Y. Rekhter (Editor) 4 Category: Standards Track Juniper Networks 5 Expires: July 2005 January 2005 6 draft-ietf-l2vpn-vpls-bgp-03.txt 8 Virtual Private LAN Service 10 Status of this Memo 12 By submitting this Internet-Draft, I certify that any applicable 13 patent or other IPR claims of which I am aware have been disclosed, 14 or will be disclosed, and any of which I become aware will be 15 disclosed, in accordance with RFC 3668. 17 This document is an Internet-Draft and is in full conformance with 18 all provisions of Section 10 of RFC2026. 
20 Internet-Drafts are working documents of the Internet Engineering 21 Task Force (IETF), its areas, and its working groups. Note that 22 other groups may also distribute working documents as Internet- 23 Drafts. 25 Internet-Drafts are draft documents valid for a maximum of six months 26 and may be updated, replaced, or obsoleted by other documents at any 27 time. It is inappropriate to use Internet-Drafts as reference 28 material or to cite them other than as "work in progress." 30 The list of current Internet-Drafts can be accessed at 31 http://www.ietf.org/ietf/1id-abstracts.txt 33 The list of Internet-Draft Shadow Directories can be accessed at 34 http://www.ietf.org/shadow.html. 36 Copyright Notice 38 Copyright (C) The Internet Society (2005). All Rights Reserved. 40 Abstract 42 Virtual Private LAN Service (VPLS), also known as Transparent LAN 43 Service, and Virtual Private Switched Network service, is a useful 44 Service Provider offering. The service offered is a Layer 2 Virtual 45 Private Network (VPN); however, in the case of VPLS, the customers in 46 the VPN are connected by a multipoint network, in contrast to the 47 usual Layer 2 VPNs, which are point-to-point in nature. 49 This document describes the functions required to offer VPLS, and 50 describes a mechanism for signaling a VPLS, as well as for forwarding 51 VPLS frames across a packet switched network. 53 Conventions used in this document 55 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 56 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 57 document are to be interpreted as described in RFC 2119 [1]. 59 1. Introduction 61 Virtual Private LAN Service (VPLS), also known as Transparent LAN 62 Service, and Virtual Private Switched Network service, is a useful 63 service offering. A Virtual Private LAN appears in (almost) all 64 respects as a LAN to customers of a Service Provider. 
However, in a 65 VPLS, the customers are not all connected to a single LAN; the 66 customers may be spread across a metro or wide area. In essence, a 67 VPLS glues several individual LANs across a packet-switched network 68 to appear and function as a single LAN [2]. 70 This document describes the functions needed to offer VPLS, and goes 71 on to describe a mechanism for signaling a VPLS, as well as a 72 mechanism for transport of VPLS frames over tunnels across a packet 73 switched network. The signaling mechanism uses BGP as the control 74 plane protocol. This document also briefly discusses deployment 75 options, in particular, the notion of decoupling functions across 76 devices. 78 Alternative approaches include: [3], which allows one to build a 79 Layer 2 VPN with Ethernet as the interconnect; and [4], which allows 80 one to set up an Ethernet connection across a packet-switched 81 network. Both of these, however, offer point-to-point Ethernet 82 services. What distinguishes VPLS from the above two is that a VPLS 83 offers a multipoint service. A mechanism for setting up pseudowires 84 for VPLS using the Label Distribution Protocol (LDP) is defined in 85 [5]. 87 1.1. Scope of this Document 89 This document has four major parts: defining a VPLS functional model; 90 defining a control plane for setting up VPLS; defining the data plane 91 for VPLS (encapsulation and forwarding of data); and defining various 92 deployment options. 94 The functional model underlying VPLS is laid out in section 2. This 95 describes the service being offered, the network components that 96 interact to provide the service, and at a high level their 97 interactions. 99 The control plane described in this document uses Multiprotocol BGP 100 [6] to establish VPLS service, i.e., for the autodiscovery of VPLS 101 members and for the setup and teardown of the pseudowires that 102 constitute a given VPLS. 
Section 3 also describes how a VPLS that
103 spans Autonomous System boundaries is set up, as well as how
104 multi-homing is handled. Using BGP as the control plane for VPNs is
105 not new (see [3], [7] and [8]): what is described here is based on
106 the mechanisms proposed in [7].

108 The forwarding plane and the actions that a participating PE must
109 take are described in section 4.

111 In section 5, the notion of 'decoupled' operation is defined, and the
112 interaction of decoupled and non-decoupled PEs is described.
113 Decoupling allows for more flexible deployment of VPLS.

115 2. Functional Model

117 The functional model is described with reference to Figure 1.

119 Figure 1: Example of a VPLS
120                                                     -----
121                                                    /  A1 \
122      ----                                     ____CE1     |
123     /    \          --------       --------  /    |      |
124    |  A2 CE2-      /        \     /        PE1     \     /
125     \    /   \    /          \___/          | \     -----
126      ----     ---PE2                        |  \
127                  |                          |   \   -----
128                  | Service Provider Network |    \ /     \
129                  |                          |    CE5  A5  |
130                  |            ___           |   /  \     /
131           |----|  \          /   \         PE4_/    -----
132           |u-PE|--PE3       /     \       /
133           |----|    --------       -------

135   ----  /   |    ----
136  /    \/    \   /    \               CE = Customer Edge Device
137 |  A3 CE3    --CE4 A4 |              PE = Provider Edge Router
138  \    /         \    /               u-PE = Layer 2 Aggregation
139   ----           ----                A<n> = Customer site n

141 2.1. Terminology

143 Terminology similar to that in [7] is used, with the addition of "u-PE",
144 a Layer 2 PE device used for Layer 2 aggregation. A u-PE is
145 owned and operated by the Service Provider (as is the PE). PE and u-PE
146 devices are "VPLS-aware", which means that they know that a VPLS
147 service is being offered. We will call these VPLS edge devices,
148 which could be either a PE or a u-PE, a VE.

150 In contrast, the CE device (which may be owned and operated by either
151 the SP or the customer) is VPLS-unaware; as far as the CE is
152 concerned, it is connected to the other CEs in the VPLS via a Layer 2
153 switched network. This means that there should be no changes to a CE
154 device, either to the hardware or the software, in order to offer
155 VPLS.
157 A CE device may be connected to a PE or a u-PE via Layer 2 switches
158 that are VPLS-unaware. From a VPLS point of view, such Layer 2
159 switches are invisible, and hence will not be discussed further.
160 Furthermore, a u-PE may be connected to a PE via Layer 2 and Layer 3
161 devices; this will be discussed further in a later section.

163 The term "demultiplexor" refers to an identifier in a data packet
164 that identifies both the VPLS to which the packet belongs and
165 the ingress PE. In this document, the demultiplexor is an MPLS
166 label.

168 The term "VPLS" will refer to the service as well as a particular
169 instantiation of the service (i.e., an emulated LAN); it should be
170 clear from the context which usage is intended.

172 2.2. Assumptions

174 The Service Provider Network is a packet switched network. The PEs
175 are assumed to be (logically) fully meshed with tunnels over which
176 packets that belong to a service (such as VPLS) are encapsulated and
177 forwarded. These tunnels can be IP tunnels, such as GRE, or MPLS
178 tunnels, established by RSVP-TE or LDP. These tunnels are
179 established independently of the services offered over them; the
180 signaling and establishment of these tunnels are not discussed in
181 this document.

183 "Flooding" and MAC address "learning" (see section 4) are an integral
184 part of VPLS. However, these activities are private to an SP device,
185 i.e., in the VPLS described below, no SP device requests another SP
186 device to flood packets or learn MAC addresses on its behalf.

188 All the PEs participating in a VPLS are assumed to be fully meshed,
189 i.e., every (ingress) PE can send a VPLS packet to the egress PE(s)
190 directly, without the need for an intermediate PE (see the section
191 below on "Split Horizon" Flooding). This assumption reduces (but
192 does not eliminate) the need to run Spanning Tree Protocol among the
193 PEs.

195 2.3. Interactions

197 VPLS is a successful "LAN Service" if CE devices that belong to VPLS
198 V can interact through the SP network as if they were connected by a
199 LAN. VPLS is "private" if CE devices that belong to different VPLSs
200 cannot interact. VPLS is "virtual" if multiple VPLSs can be offered
201 over a common packet switched network.

203 PE devices interact to "discover" all the other PEs participating in
204 the same VPLS (i.e., that are attached to CE devices that belong to
205 the same VPLS), and to exchange demultiplexors. These interactions
206 are control-driven, not data-driven.

208 U-PEs interact with PEs to establish connections with remote PEs or
209 u-PEs in the same VPLS. Again, this interaction is control-driven.

211 3. Control Plane

213 There are two primary functions of the VPLS control plane:
214 autodiscovery, and setup and teardown of the pseudowires that
215 constitute the VPLS, often called signaling. The first two
216 subsections describe these functions. The next subsection describes
217 the setting up of pseudowires that span Autonomous Systems. The last
218 subsection details how multi-homing is handled.

220 3.1. Autodiscovery

222 Discovery refers to the process of finding all the PEs that
223 participate in a given VPLS. A PE can either be configured with the
224 identities of all the other PEs in a given VPLS, or the PE can use
225 some protocol to discover the other PEs. The latter is called
226 autodiscovery.

228 The former approach is fairly configuration-intensive, especially
229 since it is required (in this and other VPLS approaches) that the PEs
230 participating in a given VPLS are fully meshed (i.e., every pair of
231 PEs in a given VPLS establishes pseudowires to each other).
232 Furthermore, when the topology of a VPLS changes (i.e., a PE is added
233 to, or removed from the VPLS), the VPLS configuration on all PEs in
234 that VPLS must be changed.
236 In the autodiscovery approach, each PE "discovers" which other PEs 237 are part of a given VPLS by means of some protocol, in this case BGP. 238 This allows each PE's configuration to consist only of the identity 239 of the VPLS that each customer belongs to, not the identity of every 240 other PE in that VPLS. Moreover, when the topology of a VPLS 241 changes, only the affected PE's configuration changes; other PEs 242 automatically find out about the change and adapt. 244 3.1.1. Functions 246 A PE that participates in a given VPLS V must be able to tell all 247 other PEs in VPLS V that it is also a member of V. A PE must also 248 have a means of declaring that it no longer participates in a VPLS. 249 To do both of these, the PE must have a means of identifying a VPLS 250 and a means by which to communicate to all other PEs. 252 U-PE devices also need to know what constitutes a given VPLS; 253 however, they don't need the same level of detail. The PE (or PEs) 254 to which a u-PE is connected gives the u-PE an abstraction of the 255 VPLS; this is described in section 5. 257 3.1.2. Protocol Specification 259 The specific mechanism for autodiscovery described here is based on 260 [3] and [7]; it uses BGP extended communities [9] to identify members 261 of a VPLS. A more generic autodiscovery mechanism is described in 262 [8]. The specific extended community used is the Route Target, whose 263 format is described in [9]. The semantics of the use of Route 264 Targets is described in [7]; their use in VPLS is identical. 266 As it has been assumed that VPLSs are fully meshed, a single Route 267 Target RT suffices for a given VPLS V, and in effect that RT is the 268 identifier for VPLS V. 270 A PE announces (typically via I-BGP) that it belongs to VPLS V by 271 annotating its NLRIs for V (see next subsection) with Route Target 272 RT, and acts on this by accepting NLRIs from other PEs that have 273 Route Target RT. 
A PE announces that it no longer participates in V 274 by withdrawing all NLRIs that it had advertised with Route Target RT. 276 3.2. Signaling 278 Once discovery is done, each pair of PEs in a VPLS must be able to 279 establish (and tear down) pseudowires to each other, i.e., exchange 280 (and withdraw) demultiplexors. This process is known as signaling. 281 Signaling is also used to initiate "relearning", and to transmit 282 certain characteristics of the PE regarding a given VPLS. 284 Recall that a demultiplexor is used to distinguish among several 285 different streams of traffic carried over a tunnel, each stream 286 possibly representing a different service. In the case of VPLS, the 287 demultiplexor not only says to which specific VPLS a packet belongs, 288 but also identifies the ingress PE. The former information is used 289 for forwarding the packet; the latter information is used for 290 learning MAC addresses. The demultiplexor described here is an MPLS 291 label, even though the PE-to-PE tunnels may not be MPLS tunnels. 293 3.2.1. Setup and Teardown 295 The VPLS BGP NLRI described below, with a new AFI and SAFI (see [6]) 296 is used to exchange demultiplexors. 298 A PE advertises a VPLS NLRI for each VPLS that it participates in. 299 If the PE is doing learning and flooding, i.e., it is the VE, it 300 announces a single set of VPLS NLRIs for each VPLS that it is in. If 301 the PE is connected to several u-PEs, it announces one set of VPLS 302 NLRIs for each u-PE. A hybrid scheme is also possible, where the PE 303 learns MAC addresses on some interfaces (over which it is directly 304 connected to CEs) and delegates learning on other interfaces (over 305 which it is connected to u-PEs). In this case, the PE would announce 306 one set of VPLS NLRIs for each u-PE that has customer ports in a 307 given VPLS, and one set for itself, if it has customer ports in that 308 VPLS. 
310 Each set of NLRIs defines the demultiplexors for a range of other PEs 311 in the VPLS. Ideally, a single NLRI suffices to cover all PEs in a 312 VPLS; however, there are cases (such as a newly added PE) where the 313 pre-existing NLRI does not have enough labels. In such cases, 314 advertising an additional NLRI for the same VPLS serves to add labels 315 for the new PEs without disrupting service to the pre-existing PEs. 316 If service disruption is acceptable (or when the PE restarts its BGP 317 process), a PE MAY consider coalescing all NLRIs for a VPLS into a 318 single NLRI. 320 If a PE X is part of VPLS V, and X receives a VPLS NLRI for V from PE 321 Y that includes a demultiplexor that X can use, X sets up its ends of 322 a pair of pseudowires between X and Y. X may also have to advertise 323 a new NLRI for V that includes a demultiplexor that Y can use, if its 324 pre-existing NLRI for V did not include a demultiplexor for Y. 326 If Y's configuration is changed to remove it from VPLS V, then Y MUST 327 withdraw all its NLRIs for V. If all Y's links to CEs in V go down, 328 then Y SHOULD either withdraw all its NLRIs for V, or let other PEs 329 in the VPLS V know in some way that Y is no longer connected to its 330 CEs. 332 If Y withdraws an NLRI for V that X was using, then X MUST tear down 333 its ends of the pseudowires between X and Y. 335 The format of the VPLS NLRI is given below. The AFI and SAFI are the 336 same as for the L2 VPN NLRI [3]. 
338 Figure 2: BGP NLRI for VPLS Information

340 +------------------------------------+
341 |  Length (2 octets)                 |
342 +------------------------------------+
343 |  Route Distinguisher (8 octets)    |
344 +------------------------------------+
345 |  VE ID (2 octets)                  |
346 +------------------------------------+
347 |  VE Block Offset (2 octets)        |
348 +------------------------------------+
349 |  VE Block Size (2 octets)          |
350 +------------------------------------+
351 |  Label Base (3 octets)             |
352 +------------------------------------+

354 3.2.2. Signaling PE Capabilities

356 The Encaps Type and Control Flags are encoded in an extended
357 attribute. The community type is also used in L2 VPNs [3].

359 The Encaps Type for VPLS is 19.

361 Figure 4: Control Flags Bit Vector

363  0 1 2 3 4 5 6 7
364 +-+-+-+-+-+-+-+-+
365 | MBZ |P|Q|F|C|S|   (MBZ = MUST Be Zero)
366 +-+-+-+-+-+-+-+-+

368 Figure 3: layer2-info extended community

370 +------------------------------------+
371 | Extended community type (2 octets) |
372 +------------------------------------+
373 |  Encaps Type (1 octet)             |
374 +------------------------------------+
375 |  Control Flags (1 octet)           |
376 +------------------------------------+
377 |  Layer-2 MTU (2 octets)            |
378 +------------------------------------+
379 |  Reserved (2 octets)               |
380 +------------------------------------+

382 With reference to Figure 4, the following bits are defined; the MBZ
383 bits MUST be set to zero.

385 Name   Meaning
386  P     If set to 1, then the PE will strip the outermost VLAN
387        tag from the customer frame on ingress, and push a
388        VLAN tag on egress. If set to 0, the customer frame
389        is left unchanged.
390  Q     Reserved.
391  F     If set to 1 (0), the PE is (not) capable of flooding.
392  C     If set to 1 (0), Control word is (not) required when
393        encapsulating Layer 2 frames [10].
394  S     If set to 1 (0), Sequenced delivery of frames is (not)
395        required.

397 3.3. Multi-AS VPLS

399 As in [3] and [7], the above autodiscovery and signaling functions
400 are typically performed via I-BGP. This assumes that all the sites
401 in a VPLS are connected to PEs in a single Autonomous System (AS).

403 However, sites in a VPLS may connect to PEs in different ASes. This
404 leads to two issues: 1) there would not be an I-BGP connection
405 between those PEs, so some means of signaling across ASes may be
406 needed; and 2) there may not be PE-to-PE tunnels between the ASes.

408 A similar problem is solved in [7], Section 10. Three methods are
409 suggested to address issue (1); all these methods have analogs in
410 multi-AS VPLS.

412 Here is a diagram for reference:

414      __________       ____________       ____________       __________
415     /          \     /            \     /            \     /          \
416                 \___/         AS 1 \   / AS 2         \___/
417                                     \ /
418       +-----+           +-------+    |    +-------+           +-----+
419       | PE1 | ---...--- | ASBR1 | ======= | ASBR2 | ---...--- | PE2 |
420       +-----+           +-------+    |    +-------+           +-----+
421                  ___                / \                ___
422                 /   \              /   \              /   \
423     \__________/     \____________/     \____________/     \__________/

424 a) VPLS-to-VPLS connections at the AS border routers.

426 In this method, an AS Border Router (ASBR1) acts as a PE for all
427 VPLSs that span AS1 and an AS to which ASBR1 is connected, such as
428 AS2 here. The ASBR on the neighboring AS (ASBR2) is viewed by
429 ASBR1 as a CE for the VPLSs that span AS1 and AS2; similarly,
430 ASBR2 acts as a PE for this VPLS from AS2's point of view, and
431 views ASBR1 as a CE.

433 This method does not require MPLS on the ASBR1-ASBR2 link, but
434 does require that this link carry Ethernet traffic, and that there
435 be a separate VLAN sub-interface for each VPLS traversing this
436 link. It further requires that ASBR1 perform the PE operations
437 (discovery, signaling, MAC address learning, flooding,
438 encapsulation, etc.) for all VPLSs that traverse ASBR1.
439 This imposes a significant burden on ASBR1, both on the control plane
440 and the data plane, which limits the number of multi-AS VPLSs.

442 Note that in general, there will be multiple connections between a
443 pair of ASes, for redundancy. In this case, the Spanning Tree
444 Protocol must be run on each VPLS that spans these ASes, so that a
445 loop-free topology can be constructed in each VPLS. This imposes
446 a further burden on the ASBRs and PEs participating in those
447 VPLSs, as these devices would need to run the Spanning Tree
448 Protocol for each such VPLS.

450 b) E-BGP redistribution of VPLS information between ASBRs.

452 This method requires I-BGP peerings between the PEs in AS1 and
453 ASBR1 in AS1 (perhaps via route reflectors), an E-BGP peering
454 between ASBR1 and ASBR2 in AS2, and I-BGP peerings between ASBR2
455 and the PEs in AS2. In the above example, PE1 sends a VPLS NLRI
456 to ASBR1 with a label block and itself as the BGP nexthop; ASBR1
457 sends the NLRI to ASBR2 with new labels and itself as the BGP
458 nexthop; and ASBR2 sends the NLRI to PE2 with new labels and
459 itself as the nexthop.

461 The VPLS NLRI that ASBR1 sends to ASBR2 (and the NLRI that ASBR2
462 sends to PE2) is identical to the VPLS NLRI that PE1 sends to
463 ASBR1, except for the label block. To be precise, the Length, the
464 Route Distinguisher, the VE ID, the VE Block Offset, and the VE
465 Block Size MUST be the same; the Label Base may be different.
466 Furthermore, ASBR1 must also update its forwarding path as
467 follows: if the Label Base sent by PE1 is L1, the VE Block Size
468 is N, the Label Base sent by ASBR1 is L2, and the tunnel label
469 from ASBR1 to PE1 is T, then ASBR1 must install the following in
470 the forwarding path:
471    swap L2 with L1 and push T,
472    swap L2+1 with L1+1 and push T,
473    ...
474    swap L2+N-1 with L1+N-1 and push T.
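
The forwarding-path rule above can be sketched as follows. This is a minimal illustration, not part of the protocol specification: the function name and the label values in the usage note are hypothetical, and a real ASBR would program these entries into its MPLS forwarding table rather than return a list.

```python
def asbr_swap_entries(l1, l2, n, tunnel_label):
    """Build the swap entries ASBR1 installs for one label block.

    l1           -- Label Base advertised by PE1
    l2           -- Label Base re-advertised by ASBR1
    n            -- number of labels in the block (VE Block Size)
    tunnel_label -- tunnel label T from ASBR1 to PE1

    Each entry is (incoming label, outgoing label, pushed tunnel label),
    i.e. "swap L2+i with L1+i and push T" for i = 0 .. N-1.
    """
    if n <= 0:
        raise ValueError("block size must be positive")
    return [(l2 + i, l1 + i, tunnel_label) for i in range(n)]
```

For example, with hypothetical values L1 = 1000, L2 = 5000, N = 3 and T = 77, the call asbr_swap_entries(1000, 5000, 3, 77) yields one entry per label in the block, the first swapping 5000 with 1000 and the last swapping 5002 with 1002, each followed by a push of 77.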
476 ASBR2 must act similarly, except that it may not need a tunnel
477 label if it is directly connected with ASBR1.

479 When PE2 wants to send a VPLS packet to PE1, PE2 uses its VE ID to
480 get the right VPLS label from ASBR2's label block for PE1, and
481 uses a tunnel label to reach ASBR2. ASBR2 swaps the VPLS label
482 with the label from ASBR1; ASBR1 then swaps the VPLS label with
483 the label from PE1, and pushes a tunnel label to reach PE1.

485 In this method, one needs MPLS on the ASBR1-ASBR2 interface, but
486 there is no requirement that the link layer be Ethernet.
487 Furthermore, the ASBRs take part in distributing VPLS information.
488 However, the data plane requirements of the ASBRs are much simpler
489 than in method (a), being limited to label operations. Finally,
490 the construction of loop-free VPLS topologies is done by routing
491 decisions, viz. BGP path and nexthop selection, so there is no
492 need to run the Spanning Tree Protocol on a per-VPLS basis. Thus,
493 this method is considerably more scalable than method (a).

495 c) Multi-hop E-BGP redistribution of VPLS information between ASes.

497 In this method, there is a multi-hop E-BGP peering between the PEs
498 (or preferably, a Route Reflector) in AS1 and the PEs (or Route
499 Reflector) in AS2. PE1 sends a VPLS NLRI with labels and nexthop
500 self to PE2; if this is via route reflectors, the BGP nexthop is
501 not changed. This requires that there be a tunnel LSP from PE1 to
502 PE2. This tunnel LSP can be created exactly as in [7], section 10
503 (c), for example using E-BGP to exchange labeled IPv4 routes for
504 the PE loopbacks.

506 When PE1 wants to send a VPLS packet to PE2, it pushes the VPLS
507 label corresponding to its own VE ID onto the packet. It then
508 pushes the tunnel label(s) to reach PE2.

510 This method requires no VPLS information (in either the control or
511 the data plane) on the ASBRs.
The ASBRs only need to set up
512 PE-to-PE tunnel LSPs in the control plane, and do label operations
513 in the data plane. Again, as in the case of method (b), the
514 construction of loop-free VPLS topologies is done by routing
515 decisions, i.e., BGP path and nexthop selection, so there is no
516 need to run the Spanning Tree Protocol on a per-VPLS basis. This
517 option is likely to be the most scalable of the three methods
518 presented here.

520 In order to ease the allocation of VE IDs for a VPLS that spans
521 multiple ASes, one can allocate ranges for each AS. For example, AS1
522 uses VE IDs in the range 1 to 100, AS2 from 101 to 200, etc. If
523 there are 10 sites attached to AS1 and 20 to AS2, the allocated VE
524 IDs could be 1 to 10 and 101 to 120. This minimizes the number of VPLS
525 NLRIs that are exchanged while ensuring that VE IDs are kept unique.

527 In the above example, if AS1 needs more than 100 sites, then another
528 range can be allocated to AS1. The only caveat is that there must be no
529 overlap between VE ID ranges among ASes. The exception to this rule
530 is multi-homing, which is dealt with below.

532 3.4. Multi-homing and Path Selection

534 It is often desired to multi-home a VPLS site, i.e., to connect it to
535 multiple PEs, perhaps even in different ASes. In such a case, the
536 PEs connected to the same site can either be configured with the same
537 VE ID or with different VE IDs. In the latter case, it is mandatory
538 to run STP on the CE device, and possibly on the PEs, to construct a
539 loop-free VPLS topology.

541 In the case where the PEs connected to the same site are assigned the
542 same VE ID, a loop-free topology is constructed by routing
543 mechanisms, in particular, by BGP path selection.
When a BGP speaker
544 receives two equivalent NLRIs (see below for the definition), it
545 applies standard path selection criteria such as Local Preference and
546 AS Path Length to determine which NLRI to choose; it MUST pick only
547 one. If the chosen NLRI is subsequently withdrawn, the BGP speaker
548 applies path selection to the remaining equivalent VPLS NLRIs to pick
549 another; if none remain, the forwarding information associated with
550 that NLRI is removed.

552 Two VPLS NLRIs are considered equivalent from a path selection point
553 of view if the Route Distinguisher, the VE ID and the VE Block Offset
554 are the same. If two PEs are assigned the same VE ID in a given
555 VPLS, they MUST use the same Route Distinguisher, and they MUST
556 announce the same VE Block Size for a given VE Block Offset.

558 4. Data Plane

560 This section discusses two aspects of the data plane for PEs and u-PEs
561 implementing VPLS: encapsulation and forwarding.

563 4.1. Encapsulation

565 Ethernet frames received from CE devices are encapsulated for
566 transmission over the packet switched network connecting the PEs.
567 The encapsulation is as in [10], with one change: a PE that sets the
568 P bit in the Control Flags strips the outermost VLAN tag from an Ethernet
569 frame received from a CE before encapsulating it, and pushes a VLAN tag
570 onto a decapsulated frame before sending it to a CE.

572 4.2. Forwarding

574 Forwarding of VPLS packets is based on the interface over which the
575 packet is received, which determines which VPLS the packet belongs
576 to, and the destination MAC address. The former mapping is
577 determined by configuration. The latter is the focus of this
578 section.

580 4.2.1. MAC address learning

582 As was mentioned earlier, the key distinguishing feature of VPLS is
583 that it is a multipoint service. This means that the entire Service
584 Provider network should appear as a single logical learning bridge
585 for each VPLS that the SP network supports.
   The logical ports for the SP "bridge" are the connections from the SP
   edge, be it a PE or a u-PE, to the CE.  Just as a learning bridge
   learns MAC addresses on its ports, the SP bridge must learn MAC
   addresses at its VEs.

   Learning consists of associating source MAC addresses of packets with
   the (logical) ports on which they arrive; this association is the
   Forwarding Information Base (FIB).  The FIB is used for forwarding
   packets.  For example, suppose the bridge receives a packet with
   source MAC address S on (logical) port P.  If the bridge subsequently
   receives a packet with destination MAC address S, it knows that it
   should send the packet out on port P.

   There are two modes of learning: qualified and unqualified learning.

   In qualified learning, the learning decisions at the VE are based on
   the customer Ethernet packet's MAC address and VLAN tag, if one
   exists.  This VLAN is often called the "service delimiting VLAN".
   Each VLAN on a given port is mapped to a different service (VPLS, IP
   VPN, point-to-point Layer 2 VPN, etc.); each VLAN that is mapped to a
   VPLS service has its own VPLS FIB.

   In unqualified learning, learning is based on a customer Ethernet
   packet's MAC address only.  This is also called "port-mode VPLS".

4.2.2.  Flooding

   When a bridge receives a packet to a destination that is not in its
   FIB, it floods the packet on all the other ports.  Similarly, a VE
   will flood packets to an unknown destination to all other VEs in the
   VPLS.

   In Figure 1 above, if CE2 sent an Ethernet frame to PE2, and the
   destination MAC address on the frame was not in PE2's FIB (for that
   VPLS), then PE2 would be responsible for flooding that frame to every
   other PE in the same VPLS.  On receiving that frame, PE1 would be
   responsible for further flooding the frame to CE1 and CE5 (unless PE1
   knew which CE "owned" that MAC address).
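
   As an illustration only, the learning and flooding behavior described
   above can be sketched in Python.  The names used below
   (LearningBridge, receive) are hypothetical and are not defined by
   this document.  Modeling qualified learning by keying the FIB on a
   (VLAN, MAC) pair is an assumption, as is dropping a frame whose
   destination was learned on the port it arrived on.

```python
from typing import Dict, Hashable, List, Optional, Tuple


class LearningBridge:
    """Toy model of a learning bridge (hypothetical, for illustration)."""

    def __init__(self, ports: List[Hashable], qualified: bool = False) -> None:
        self.ports = list(ports)      # logical ports of the bridge
        self.qualified = qualified    # True: learn per VLAN; False: MAC only
        # FIB: (VLAN or None, MAC) -> port on which that MAC was learned
        self.fib: Dict[Tuple[Optional[int], str], Hashable] = {}

    def _key(self, mac: str, vlan: Optional[int]) -> Tuple[Optional[int], str]:
        # Assumption: a separate FIB per VLAN is modeled as one table
        # keyed on (VLAN, MAC); unqualified learning ignores the VLAN.
        return (vlan, mac) if self.qualified else (None, mac)

    def receive(self, in_port: Hashable, src: str, dst: str,
                vlan: Optional[int] = None) -> List[Hashable]:
        """Learn src on in_port and return the ports to send the frame on."""
        # Learning: associate the source MAC with the arrival port.
        self.fib[self._key(src, vlan)] = in_port

        out_port = self.fib.get(self._key(dst, vlan))
        if out_port is None:
            # Unknown destination: flood on all the other ports.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Assumption: never send a frame back out its arrival port.
            return []
        return [out_port]


bridge = LearningBridge(["P", "Q", "R"])
print(bridge.receive("P", src="S", dst="D"))  # D is unknown: flooded
print(bridge.receive("Q", src="D", dst="S"))  # S was learned on P
```

   With ports P, Q and R, the first frame (from S, arriving on P, to an
   unknown destination) is flooded to Q and R; the second frame,
   addressed to S, is sent only on P.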

   On the other hand, if PE3 received the flooded frame, it could
   delegate further flooding of the frame to its u-PE.  If PE3 were
   connected to two u-PEs, it would announce that it has two u-PEs.  PE3
   could either announce that it is incapable of flooding, in which case
   it would receive two frames, one for each u-PE, or it could announce
   that it is capable of flooding, in which case it would receive one
   copy of the frame, which it would then send to both u-PEs.

4.2.3.  "Split Horizon" Flooding

   When a PE capable of flooding receives a broadcast Ethernet frame, or
   one with an unknown destination MAC address, it must flood the frame.
   If the frame arrived from an attached CE, the PE must send a copy of
   the frame to every other attached CE, as well as to all PEs
   participating in the VPLS.  If the frame arrived from another PE,
   however, the PE must send a copy of the frame only to attached CEs.
   The PE MUST NOT send the frame to other PEs.  This notion has been
   termed "split horizon" flooding, and is a consequence of the PEs
   being logically full-meshed -- if a broadcast frame is received from
   PEx, then PEx would have sent a copy to all other PEs.

5.  Deployment Options

   In deploying a network that supports VPLS, the SP must decide whether
   the VPLS-aware device closest to the customer (the VE) is a u-PE or a
   PE.  The default case described in this document is that the VE is a
   PE.  However, there are a number of reasons that the VE might be a
   u-PE, i.e., a device that does layer 2 functions such as MAC address
   learning and flooding, and some limited layer 3 functions such as
   communicating with its PE, but does not do full-fledged discovery and
   PE-to-PE signaling.

   As both of these cases have benefits, one would like to be able to
   "mix and match" these scenarios.  The signaling mechanism presented
   here allows this.
   PE1 may be directly connected to CE devices; PE2 may be connected to
   u-PEs that are connected to CEs; and PE3 may be connected directly to
   a customer over some interfaces and to u-PEs over others.  All these
   PEs do discovery and signaling in the same manner.  How they do
   learning and forwarding depends on whether or not there is a u-PE;
   however, this is a local matter, and is not signaled.

6.  Normative References

   [ 1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997

   [ 6] Bates, T., Rekhter, Y., Chandra, R., and Katz, D.,
        "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000

   [ 9] Sangli, S., Tappan, D., and Rekhter, Y., "BGP Extended
        Communities Attribute",
        draft-ietf-idr-bgp-ext-communities-07.txt (work in progress)

   [10] Martini, L., et al, "Encapsulation Methods for Transport of
        Ethernet Frames Over IP/MPLS Networks",
        draft-ietf-pwe3-ethernet-encap-06.txt (work in progress)

   [11] Heffernan, A., "Protection of BGP Sessions via the TCP MD5
        Signature Option", RFC 2385, August 1998

7.  Informative References

   [ 2] Andersson, L., and Rosen, E., "Framework for Layer 2 Virtual
        Private Networks (L2VPNs)",
        draft-ietf-l2vpn-l2-framework-04.txt (work in progress)

   [ 3] Kompella, K., (Editor), "Layer 2 VPNs Over Tunnels",
        draft-kompella-l2vpn-l2vpn-00.txt (work in progress)

   [ 4] Martini, L., et al, "Pseudowire Setup and Maintenance using
        LDP", draft-ietf-pwe3-control-protocol-06.txt (work in
        progress)

   [ 5] Kompella, V., et al, "Virtual Private LAN Services over MPLS",
        draft-ietf-ppvpn-vpls-ldp-03.txt (work in progress)

   [ 7] Rosen, E., and Rekhter, Y., Editors, "BGP/MPLS VPNs",
        draft-ietf-l3vpn-rfc2547bis-01.txt (work in progress)

   [ 8] Ould-Brahim, H., Rosen, E., and Rekhter, Y., "Using BGP as an
        Auto-Discovery Mechanism for Layer-3 and Layer-2 VPNs",
        draft-ietf-l3vpn-bgpvpn-auto-04.txt (work in progress)

Security Considerations

   The focus in Virtual Private LAN Service is the privacy of data,
   i.e., that data in a VPLS is only distributed to other nodes in that
   VPLS and not to any external agent or other VPLS.  Note that VPLS
   does not offer security or authentication: VPLS packets are sent in
   the clear in the packet-switched network, and a man-in-the-middle can
   eavesdrop, and may be able to inject packets into the data stream.
   If security is desired, the PE-to-PE tunnels can be IPsec tunnels.
   For more security, the end systems in the VPLS sites can use
   appropriate means of encryption to secure their data even before it
   enters the Service Provider network.

   There are two aspects to achieving data privacy in a VPLS: securing
   the control plane, and protecting the forwarding path.  Compromise of
   the control plane could result in a PE sending data belonging to some
   VPLS to another VPLS, or blackholing VPLS data, or even sending it to
   an eavesdropper, none of which are acceptable from a data privacy
   point of view.
   Since all control plane exchanges are via BGP, techniques such as the
   one in [11] help authenticate BGP messages, making it harder to spoof
   updates (which can be used to divert VPLS traffic to the wrong VPLS)
   or withdrawals (denial-of-service attacks).  In the multi-AS options
   (b) and (c), this also means protecting the inter-AS BGP sessions
   between the ASBRs, the PEs or the Route Reflectors.  Note that [11]
   will not help in keeping VPLS labels private -- knowing the labels,
   one can eavesdrop on VPLS traffic.  However, this requires access to
   the data path within a Service Provider network.

   Protecting the data plane requires ensuring that PE-to-PE tunnels are
   well-behaved (this is outside the scope of this document), and that
   VPLS labels are accepted only from valid interfaces.  For a PE, valid
   interfaces comprise links from P routers.  For an ASBR, a valid
   interface is a link from an ASBR in an AS that is part of a given
   VPLS.  It is especially important in the case of multi-AS VPLSs that
   one accept VPLS packets only from valid interfaces.

IANA Considerations

   IANA is asked to allocate an AFI for Layer 2 information (suggested
   value: 25).

Contributors

   The following contributed to this document:

      Javier Achirica, Telefonica
      Loa Andersson, TLA
      Chaitanya Kodeboyina, Juniper
      Giles Heron, Consultant
      Sunil Khandekar, Alcatel
      Vach Kompella, Alcatel
      Marc Lasserre, Riverstone
      Pierre Lin, Yipes
      Pascal Menezes, Terabeam
      Ashwin Moranganti, Appian
      Hamid Ould-Brahim, Nortel
      Seo Yeong-il, Korea Tel

Acknowledgments

   Thanks to Joe Regan and Alfred Nothaft for their contributions.

Authors' Addresses

   Kireeti Kompella
   Juniper Networks
   1194 N. Mathilda Ave
   Sunnyvale, CA 94089
   kireeti@juniper.net

   Yakov Rekhter
   Juniper Networks
   1194 N. Mathilda Ave
   Sunnyvale, CA 94089
   yakov@juniper.net

IPR Notice

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights.  Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11.  Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard.  Please address the information to the IETF Executive
   Director.

Full Copyright Notice

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.