Network Working Group                              K. Kompella (Juniper)
Internet Draft                                   M. Leelanivas (Juniper)
Expiration Date: October 2003                         Q. Vohra (Juniper)
                                                J. Achirica (Telefonica)
draft-kompella-ppvpn-l2vpn-03.txt                   R. Bonica (WorldCom)
                                             D. Cooper (Global Crossing)
                                                 C. Liljenstolpe (C & W)
                                             E. Metz (KPN Dutch Telecom)
                                                 H. Ould-Brahim (Nortel)
                                                      C. Sargor (CoSine)
                                                         H. Shah (Tenor)
                                                  V. Srinivasan (CoSine)
                                                    Z. Zhang (Unisphere)

                        Layer 2 VPNs Over Tunnels

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

Abstract

   Virtual Private Networks (VPNs) based on Frame Relay or ATM circuits
   have been around a long time.  While these VPNs work well, the costs
   of maintaining separate networks for Internet traffic and VPNs and
   the administrative burden of provisioning these VPNs have led Service
   Providers to look for alternative solutions.  In this document, we
   present a VPN solution where, from the customer's point of view, the
   VPN is based on Layer 2 circuits, but the Service Provider maintains
   and manages a single network for IP, IP VPNs, and Layer 2 VPNs.

0.1. ID Summary

   SUMMARY

   This ID describes an approach to provisioning Layer 2 VPNs in a
   Service Provider network.  From the VPN customers' point of view, the
   VPNs look like the traditional Layer 2 connections (Frame Relay, ATM,
   ...); the benefits here are to the SP, whose job of provisioning and
   managing the connections within its network is simplified.

   RELATED DOCUMENTS

   draft-rosen-ppvpn-l2-signaling-02.txt

   WHERE DOES IT FIT IN THE PICTURE OF THE SUB-IP WORK

   Belongs in PPVPN.

   WHY IS IT TARGETED AT THIS WG

   This document describes a mechanism for Provider-Provisioned Layer 2
   VPNs.

   JUSTIFICATION

   "Traditional" Layer 2 VPNs are very common, widely deployed and
   incontrovertibly useful.  The techniques described here show how the
   work that a provider must do within its network to provision Layer 2
   VPNs can be made much simpler and more automated.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [KEYWORDS].

1. Introduction

   The first corporate networks were based on dedicated leased lines
   interconnecting the various offices of the corporation.  Such
   networks offered connectivity and little else: they didn't scale
   well, they were expensive for the service providers (and hence for
   their customers), and provisioning them was a slow and arduous task.

   The first Virtual Private Networks (VPNs) were based on Layer 2
   circuits: X.25, Frame Relay and ATM (see [VPN]).  Layer 2 VPNs were
   easier to provision, and virtual circuits allowed the service
   provider to share a common infrastructure for all the VPNs.  These
   features were passed on to the customers in terms of cost savings.
   However, while Layer 2 VPNs were a significant step forward from
   dedicated lines, they still had their drawbacks.  First, they tied
   the service provider VPN infrastructure to a single medium (e.g.,
   ATM).  This became even more of a burden if the Internet
   infrastructure was to share the same physical links.  Second, the
   Internet infrastructure and the VPN infrastructure, even if they
   shared the same physical network, needed separate administration and
   maintenance.
   Third, while provisioning was much easier than for
   dedicated lines, it was still complex.  This was especially evident
   in the effort to add a site to an existing VPN.

   This document offers a solution that preserves the advantages of a
   Layer 2 VPN while allowing the Service Provider to maintain and
   manage a single network for IP, IP VPNs ([IPVPN]) and Layer 2 VPNs,
   and reducing the provisioning problem significantly.  In particular,
   adding a site to an existing VPN in most cases requires configuring
   just the Provider Edge router connected to the new site.

   To ease the restriction that all sites within a single VPN connect
   via the same layer 2 technology, this document proposes a limited
   form of layer 2 interworking, restricted to IP only as the layer 3
   protocol.

   The solution we propose scales well because the amount of forwarding
   state maintained in the core routers of the Service Provider Network
   is independent of the number of layer 2 VPNs provisioned over the SP
   network.  This is achieved by using tunnels to carry the data, with a
   "demultiplexing field" that identifies individual VCs.  These tunnels
   could be MPLS, GRE, or any other tunnel technology that offers a
   demultiplexing field; the signaling of these tunnels is outside the
   scope of this document.  The specific approach taken here is to use a
   32-bit demultiplexing field formatted as an MPLS label; other sizes
   and formats are clearly possible, and will be defined as needed.

   This approach combines auto-discovery of VPN sites with the
   signalling of the demultiplexing fields for L2VPN PVCs.  This is
   possible because the mechanism used for auto-discovery (BGP) is also
   capable of distributing Layer 2 information as well as the
   demultiplexing field.

   The rest of this section discusses the relative merits of Layer 2 and
   Layer 3 VPNs.  Section 4 describes the operation of a Layer 2 VPN.
   Section 5 describes IP-only layer 2 interworking.  Section 6
   describes how the L2 packets are transported across the SP network.
   Section 7 discusses BGP as a mechanism for auto-discovery and
   signalling of Layer 2 VPNs.

1.1. Terminology

   We assume that the reader is familiar with Multi-Protocol Label
   Switching (MPLS [MPLS]) and the Border Gateway Protocol version 4
   (BGP [BGP]).

   The terminology we use follows.  A "customer" is a customer of a
   Service Provider seeking to interconnect the various "sites"
   (independently connected networks) through the Service Provider's
   network, while maintaining privacy of communication and address
   space.  The device in a customer site that connects to a Service
   Provider router is termed the CE (customer edge device); this device
   may be a router or a switch.  The Service Provider router to which a
   CE connects is termed a PE.  A router in the Service Provider's
   network which doesn't connect directly to any CE is termed P.  These
   definitions follow those given in [IPVPN].

   We also introduce three new terms:

   VPN Label - the demultiplexing field which identifies an L2VPN PVC to
   the edge of the SP network, i.e., the PE.

   Tunnel - a PE-to-PE tunnel that is used to carry multiple types of
   data.  P routers in the SP core forward this data based on the tunnel
   header and not on the data within, thus limiting the Layer 2 state to
   the PE routers that host the Layer 2 circuit.

   CE ID - a number that uniquely identifies a CE within an L2 VPN.
   More accurately, the CE ID identifies a physical connection from the
   CE device to the PE.  Say a CE is connected to a PE over a DS-3 for
   Frame Relay access to a VPN; this DS-3 would need a CE ID.  The CE
   would also have N DLCIs over this DS-3 to speak to N other sites in
   the VPN.

   A CE may be connected to multiple PEs (or multiply connected to a
   PE), in which case it would have a CE ID for each connection.  If
   these connections are in the same VPN, the CE IDs would have to be
   different.  A CE may also be part of many L2 VPNs; it would need one
   (or more) CE ID(s) for each L2 VPN of which it is a member.

   For the case of inter-Provider L2 VPNs, there needs to be some
   coordination of allocation of CE IDs.  One solution is to allocate
   ranges for each SP.  Other solutions may be forthcoming.

1.2. Advantages of Layer 2 VPNs

   We define a Layer 2 VPN as one where a Service Provider provides a
   layer 2 network to the customer.  As far as the customer is
   concerned, they have (say) Frame Relay circuits connecting the
   various sites; each CE is configured with a DLCI with which to talk
   to other CEs.  Within the Service Provider's network, though, the
   layer 2 packets are transported within tunnels, which could be MPLS
   Label-Switched Paths (LSPs) or GRE tunnels, as examples.

   The Service Provider does not participate in the customer's layer 3
   network, in particular, in the routing, resulting in several
   advantages to the SP as a whole and to PE routers in particular.

1.2.1. Separation of Administrative Responsibilities

   In a Layer 2 VPN, the Service Provider is responsible for Layer 2
   connectivity; the customer is responsible for Layer 3 connectivity,
   which includes routing.  If the customer says that host x in site A
   cannot reach host y in site B, the Service Provider need only
   demonstrate that site A is connected to site B.  The details of how
   routes for host y reach host x are the customer's responsibility.

   Another very important factor is that once a PE provides Layer 2
   connectivity to its connected CE, its job is done.  A misbehaving CE
   can at worst flap its interface.
   On the other hand, a misbehaving CE
   in a Layer 3 VPN can flap its routes, leading to instability of the
   PE router or even the entire SP network.  This means that the Service
   Provider must aggressively damp route flaps from a CE; this is common
   enough with external BGP peers, but in the case of VPNs, the scale of
   the problem is much larger; also, the CE-PE routing protocol may not
   be BGP, and thus not have BGP's flap damping control.

1.2.2. Migrating from Traditional Layer 2 VPNs

   Since "traditional" Layer 2 VPNs (i.e., real Frame Relay circuits
   connecting sites) are indistinguishable from tunnel-based VPNs from
   the customer's point-of-view, migrating from one to the other raises
   few issues.  With Layer 3 VPNs, special care has to be taken that
   routes within the traditional VPN are not preferred over the Layer 3
   VPN routes (the so-called "backdoor routing" problem, whose solution
   requires protocol changes that are somewhat ad hoc).

1.2.3. Privacy of Routing

   In a Layer 2 VPN, the privacy of customer routing is a natural
   fallout of the fact that the Service Provider does not participate in
   routing.  The SP routers need not do anything special to keep
   customer routes separate from other customers or from the Internet;
   there is no need for per-VPN routing tables, and the additional
   complexity this imposes on PE routers.

1.2.4. Layer 3 Independence

   Since the Service Provider simply provides Layer 2 connectivity, the
   customer can run any Layer 3 protocols they choose.  If the SP were
   participating in customer routing, it would be vital that the
   customer and SP both use the same layer 3 protocol(s) and routing
   protocols.

   Note that IP-only layer 2 interworking doesn't have this benefit as
   it restricts the layer 3 to IP only.

1.2.5. PE Scaling

   In the Layer 2 VPN scheme described below, each PE transmits to
   every other PE a single small chunk of information about every CE to
   which it is connected.  That means that each PE need only maintain a
   single chunk of information from each CE in each VPN, and keep a
   single "route" to every site in every VPN.  This means that both the
   Forwarding Information Base and the Routing Information Base scale
   well with the number of sites and number of VPNs.  Furthermore, the
   scaling properties are independent of the customer: the only germane
   quantity is the total number of VPN sites.

   This is to be contrasted with Layer 3 VPNs, where each CE in a VPN
   may have an arbitrary number of routes that need to be carried by the
   SP.  This leads to two issues.  First, both the information stored at
   each PE and the number of routes installed by the PE for a CE in a
   VPN can be (in principle) unbounded, which means in practice that a
   PE must restrict itself to installing routes associated with the VPNs
   that it is currently a member of.  Second, a CE can send a large
   number of routes to its PE, which means that the PE must protect
   itself against such a condition.  Thus, the SP must enforce limits on
   the number of routes accepted from a CE; this in turn requires the PE
   router to offer such control.

   The scaling issues of Layer 3 VPNs come into sharp focus at a BGP
   route reflector (RR).  An RR cannot keep all the advertised routes in
   every VPN since the number of routes will be too large.  The
   following solutions/extensions are needed to address this issue:

   1) RRs could be partitioned so that each RR services a subset of
      VPNs so that no single RR has to carry all the routes.
   2) An RR could use a preconfigured list of Route-Targets for its
      inbound route filtering.
      The RR may also need to install
      Outbound Route Filters [BGP-ORF] which contain the above list
      of Route-Targets on each of its peers so that they do not send
      unnecessary VPN routes.  This method requires significant
      extensions, in addition to requiring multiple RRs to service
      different sets of VPNs.

1.2.6. Ease of Configuration

   Configuring traditional Layer 2 VPNs was a burden primarily because
   of the O(n*n) nature of the task.  If there are n CEs in a Frame
   Relay VPN, say full-mesh connected, n*(n-1)/2 DLCI PVCs must be
   provisioned across the SP network.  At each CE, (n-1) DLCIs must be
   configured to reach each of the other CEs.  Furthermore, when a new
   CE is added, n new DLCI PVCs must be provisioned; also, each existing
   CE must be updated with a new DLCI to reach the new CE.

   In our proposal, PVCs are tunnelled across the SP network.  The
   tunnels used are provisioned independent of the L2VPNs, using
   signalling protocols (in case of MPLS, LDP or RSVP-TE can be used),
   or set up by configuration; and the number of tunnels is independent
   of the number of L2VPNs.  This reduces a large part of the
   provisioning burden.

   Furthermore, we assume that DLCIs at the CE edge are relatively
   cheap; and VPN labels in the SP network are cheap.  This allows the
   SP to "over-provision" VPNs: for example, allocate 50 CEs to a VPN
   when only 20 are needed.  With this over-provisioning, adding a new
   CE to a VPN requires configuring just the new CE and its associated
   PE; existing CEs and their PEs need not be re-configured.  Note that
   if DLCIs at the CE edge are expensive, e.g., if these DLCIs are
   provisioned across a switched network, one could provision them as
   and when needed, at the expense of extra configuration.  This still
   need not result in extra state in the SP network, i.e., an
   intelligent implementation can allow over-provisioning of the pool
   of VPN labels.
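
   As a rough illustration of the provisioning arithmetic in section
   1.2.6, the following Python sketch counts the PVCs in a full mesh of
   n CEs and the PVCs that must be added when one more CE joins a
   traditional Layer 2 VPN.  The function names are ours and are not
   part of this proposal.

```python
def full_mesh_pvcs(n: int) -> int:
    """PVCs needed to fully mesh n CEs: n*(n-1)/2."""
    return n * (n - 1) // 2


def pvcs_added_by_new_ce(n: int) -> int:
    """PVCs to provision when one CE joins a full mesh of n existing CEs."""
    return full_mesh_pvcs(n + 1) - full_mesh_pvcs(n)


if __name__ == "__main__":
    # Columns: existing CEs, PVCs in the mesh, PVCs added by one more CE.
    for n in (4, 20, 50):
        print(n, full_mesh_pvcs(n), pvcs_added_by_new_ce(n))
```

   In the traditional case every existing CE must also be reconfigured
   with one new DLCI, which is the per-site work that over-provisioning
   is meant to avoid.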

1.3. Advantages of Layer 3 VPNs

   Layer 3 VPNs ([IPVPN] in particular) offer a good solution when the
   customer traffic is wholly IP, customer routing is reasonably simple,
   and the customer sites connect to the SP with a variety of Layer 2
   technologies.

1.3.1. Layer 2 Independence

   One major restriction in a Layer 2 VPN is that the Layer 2 medium
   with which the various sites of a single VPN connect to the SP must
   be uniform.  On the other hand, the various sites of a Layer 3 VPN
   can connect to the SP with any supported media; for example, some
   sites may connect with Frame Relay circuits, and others with
   Ethernet.

   This restriction of layer 2 VPN is alleviated by the IP-only layer 2
   interworking proposed in this document.  This comes at the cost of
   losing the layer 3 independence.

   A corollary to this is that the number of sites that can be in a
   Layer 2 VPN is determined by the number of Layer 2 circuits that the
   Layer 2 technology provides.  For example, if the Layer 2 technology
   is Frame Relay with 2-octet DLCIs, a CE can connect to at most about
   a thousand other CEs in a VPN.

1.3.2. SP Routing as Added Value

   Another problem with Layer 2 VPNs is that the CE router in a VPN must
   be able to deal with having N routing peers, where N is the number of
   sites in the VPN.  This can be alleviated by manipulating the
   topology of the VPN.  For example, a hub-and-spoke VPN architecture
   means that only one CE router (the hub) needs to deal with N
   neighbors.  However, in a Layer 3 VPN, a CE router need only deal
   with one neighbor, the PE router.  Thus, the SP can offer Layer 3
   VPNs as a value-added service to its customers.

   Moreover, with layer 2 VPNs it is up to a customer to build and
   operate the whole network.
   With Layer 3 VPNs, a customer is just
   responsible for building and operating routing within each site,
   which is likely to be much simpler than building and operating
   routing for the whole VPN.  That, in turn, makes Layer 3 VPNs more
   suitable for customers who don't have sufficient routing expertise,
   again allowing the SP to provide added value.

   As mentioned later, multicast routing and forwarding is another
   value-added service that an SP can offer.

1.3.3. Class-of-Service

   Class-of-Service issues have been addressed for Layer 3 VPNs.  Since
   the PE router has visibility into the network layer (IP), the PE
   router can take on the tasks of CoS classification and routing.  This
   restriction on layer 2 VPNs is again eased in the case of IP-only
   layer 2 interworking, as the PE router has visibility into the
   network layer (IP).

1.4. Multicast Routing

   There are two aspects to multicast routing that we will consider.  On
   the protocol front, supporting IP multicast in a Layer 3 VPN requires
   PE routers to participate in the multicast routing instance of the
   customer, and thus keep some related state information.

   In the Layer 2 VPN case, the CE routers run native multicast routing
   directly.  The SP network just provides pipes to connect the CE
   routers; PEs are unaware whether the CEs run multicast or not, and
   thus do not have to participate in multicast protocols or keep
   multicast state information.

   On the forwarding front, in a Layer 3 VPN, CE routers do not
   replicate multicast packets; thus, the CE-PE link carries only one
   copy of a multicast packet.  Whether replication occurs at the
   ingress PE, or somewhere within the SP network depends on the
   sophistication of the Layer 3 VPN multicast solution.  The simple
   solution where a PE replicates packets for each of its CEs may place
   considerable burden on the PE.
   More complex solutions may require
   VPN multicast state in the SP network, but may significantly reduce
   the traffic in the SP network by delaying packet replication until
   needed.

   In a Layer 2 VPN, packet replication occurs at the CE.  This has the
   advantage of distributing the burden of replication among the CEs
   rather than focusing it on the PE to which they are attached, and
   thus will scale better.  However, the CE-PE link will need to carry
   the multiple copies of multicast packets.

   Thus, just as in the case of unicast routing, the SP has the choice
   to offer a value-added service (multicast routing and forwarding) at
   some cost (multicast state and packet replication) using a Layer 3
   VPN, or to keep it simple and use a Layer 2 VPN.

2. Operation of a Layer 2 VPN

   The following simple example of a customer with 4 sites connected to
   3 PE routers in a Service Provider network will hopefully illustrate
   the various aspects of the operation of a Layer 2 VPN.  For
   simplicity, we assume that a full-mesh topology is desired.

   In what follows, Frame Relay serves as the Layer 2 medium, and each
   CE has multiple DLCIs to its PE, each to connect to another CE in the
   VPN.  If the Layer 2 medium were ATM, then each CE would have
   multiple VPI/VCIs to connect to other CEs.  For PPP and Cisco HDLC,
   each CE would have multiple physical interfaces to connect to other
   CEs.  In the case of IP-only layer 2 interworking, each CE could have
   a mix of one or more of the above layer 2 mediums to connect to other
   CEs.

2.1. Network Topology

   Consider a Service Provider network with edge routers PE0, PE1, and
   PE2.  Assume that PE0 and PE1 are IGP neighbors, and PE2 is more than
   one hop away from PE0.

   Suppose that a customer C has 4 sites S0, S1, S2 and S3 that C wants
   to connect via the Service Provider's network using Frame Relay.
   Site S0 has CE0 and CE1 both connected to PE0.
   Site S1 has CE2
   connected to PE0.  Site S2 has CE3 connected to PE1 and CE4 connected
   to PE2.  Site S3 has CE5 connected to PE2.  (See Figure 1 below.)
   Suppose further that C wants to "over-provision" each current site,
   in expectation that the number of sites will grow to at least 10 in
   the near future.  However, CE4 is only provisioned with 9 DLCIs.
   (Note that the signalling mechanism discussed in section 7 will allow
   a site to grow in terms of connectivity to other sites at a later
   point of time at the cost of additional signalling, i.e.,
   over-provisioning is not a must but a recommendation).

   Suppose finally that CE0 and CE2 have DLCIs 100 through 109 free; CE1
   and CE3 have DLCIs 200 through 209 free; CE4 has DLCIs 107, 209, 265,
   301, 414, 555, 654, 777 and 888 free; and CE5 has DLCIs 417 through
   426 free.

                   Figure 1: Example Network Topology

         S0                                                  S3
   ..............                                      ..............
   .            .                                      .            .
   .   +-----+  .                                      .            .
   .   | CE0 |-----------+                             .   +-----+  .
   .   +-----+  .        |                             .   | CE5 |  .
   .            .        |                             .   +--+--+  .
   .   +-----+  .        |                             .      |     .
   .   | CE1 |-------+   |                             .......|......
   .   +-----+  .    |   |                                   /
   .            .    |   |                                  /
   ..............    |   |                                 /
                     |   |       SP Network               /
                .....|...|.............................../.....
                .    |   |                              /     .
                .  +-+---+-+       +-------+           /      .
                .  |  PE0  |-------|   P   |--        |       .
                .  +-+---+-+       +-------+  \       |       .
                .   /     \                    \  +---+---+   .
                .  |      -----+                --|  PE2  |   .
                .  |           |                  +---+---+   .
                .  |       +---+---+                 /        .
                .  |       |  PE1  |                /         .
                .  |       +---+---+               /          .
                .  |            \                 /           .
                ...|.............|.............../.............
                   |             |              /
                   |             |             /
                   |             |            /
         S1        |             |    S2     /
   ..............  |     ........|........../......
   .            .  |     .       |         |      .
   .   +-----+  .  |     .    +--+--+   +--+--+   .
   .   | CE2 |-----+     .    | CE3 |   | CE4 |   .
   .   +-----+  .        .    +-----+   +-----+   .
   .            .        .                        .
   ..............        ..........................

2.2. Configuration

   The following sub-sections detail the configuration that is needed to
   provision the above VPN.  For the purpose of exposition, we assume
   that the customer will connect to the SP with Frame Relay circuits,
   and that the customer's IGP of choice is OSPF.

   While we focus primarily on the configuration that an SP has to do,
   we touch upon the configuration requirements of CEs as well.  The
   main point of contact in CE-PE configuration is that both must agree
   on the DLCIs that will be used on the interface connecting them.

   If the PE-CE connection is Frame Relay, it is recommended to run LMI
   between the PE and CE with the PE as DCE and the CE as DTE.  For the
   case of ATM VCs, OAM cells may be used; for PPP and Cisco HDLC,
   keepalives may be used.  The PPP and Cisco HDLC keepalives could be
   between local and remote CE if both CEs connect via the same layer 2
   medium.

   In case of IP-only layer 2 interworking, if CE1, attached to PE0,
   connects to CE3, attached to PE1, via an L2VPN circuit, the layer 2
   medium between CE1 and PE0 is independent of the layer 2 medium
   between CE3 and PE1.  Each side will run its own layer 2 specific
   link management protocol, e.g., LMI, LCP, etc.  PE0 will inform PE1
   about the status of its local circuit to CE1 via the circuit status
   vector TLV defined in section 7.  Similarly PE1 will inform PE0 about
   the status of its local circuit to CE3.

2.2.1. CE Configuration

   Each CE that belongs to a VPN is given a "CE ID".  CE IDs must be
   unique in the context of a VPN.  We assume that the CE ID for CE-k is
   k.

   Each CE is configured to communicate with its corresponding PE with
   the set of DLCIs given above; for example, CE0 is configured with
   DLCIs 100 through 109.  OSPF is configured to run over each DLCI.  In
   general, a CE is configured with a list of circuits, all with the
   same layer 2 encapsulation type, e.g., DLCIs, VCIs, physical PPP
   interface etc.  (IP-only layer 2 interworking allows a mix of layer 2
   encapsulation types).
   The size of this list/set determines the
   number of remote CEs a given CE can communicate with.  Denote the
   size of this list/set as the CE's range.

   Each CE also "knows" which DLCI connects it to each other CE.  A
   simple algorithm is to use the CE ID of the other CE as an index into
   the DLCI list this CE has (with zero-based indexing, i.e., 0 is the
   first index).  For example, CE0 is connected to CE3 through its
   fourth DLCI, 103; CE4 is connected to CE2 by the third DLCI in its
   list, namely 265.  This is the methodology used in the examples
   below; the actual methodology used to pick the DLCI to be used is a
   local matter; the key factor is that CE-k may communicate with CE-m
   using a different DLCI from the DLCI that CE-m uses to communicate to
   CE-k, i.e., the SP network effectively acts as a giant Frame Relay
   switch.  This is very important, as it decouples the DLCIs used at
   each CE site, making for much simpler provisioning.

2.2.2. PE Configuration

   Each PE is configured with the VPNs in which it participates.  Each
   VPN is configured with a Route Target community [IPVPN] which
   uniquely identifies the VPN within the SP network.  For each VPN, the
   PE has a list of CEs, which are members of that VPN.  For each CE,
   the PE knows the CE ID, its range and which DLCIs to expect from the
   CE.

2.2.3. Adding a New Site

   The first step in adding a new site to a VPN is to pick a new CE ID.
   If all current members of the VPN are over-provisioned, i.e., their
   range includes the new CE ID, adding the new site is a purely local
   task.  Otherwise, the sites whose range doesn't include the new CE ID
   and wish to communicate directly with the new CE must have their
   ranges increased by allocating additional local circuits to
   incorporate the new CE ID.

   The next step is ensuring that the new site has the required
   connectivity (see below).
   This may require tweaking the connectivity
   mechanism; however, in several common cases, the only configuration
   needed is local to the PE to which the CE is attached.

   The rest of the configuration is a local matter between the new CE
   and the PE to which it is attached.

   It bears repeating that the key to making additions easy is
   over-provisioning and the algorithm for mapping a CE ID to a DLCI
   which is used for connecting to the corresponding CE.  However, what
   is being over-provisioned is the number of DLCIs/VCIs that connect
   the CE to the PE.  This is a local matter, and generally is not an
   issue.

2.3. PE Information Exchange

   When a PE is configured with all the needed information for a CE, it
   first of all chooses a contiguous set of n labels, where n is the
   CE's initial range.  Denote a contiguous set of labels by a
   label-block.  Call the smallest label in this label-block the
   label-base and the number of labels in the label-block the
   label-range.

   To allow a CE to grow its connectivity at a later point of time
   additional DLCIs might be added between the CE and its PE.  To
   advertise the additional capacity of a CE without disrupting existing
   connectivity to the site, a new label-block is picked with k labels,
   where k is the number of additional circuits.  This process might
   be repeated several times as and when a CE's range needs growing.

   The PE then advertises for this CE all its label-blocks.  Each
   label-block is propagated in a separate BGP NLRI (see figure 3).
   This is the basic Layer 2 VPN advertisement.  This same advertisement
   is sent to all other PEs.  Note that PEs that may not be part of the
   VPN can receive and keep this information, in case at some future
   point, a CE connected to the PE joins the VPN.

   So as to be able to distinguish between the multiple label-blocks of
   a given CE, the notion of a block offset is introduced.
The block offset 611 identifies the position of a given label-block in the set of label 612 blocks of a given CE. A remote site with CE ID m will connect to 613 this CE using a label selected from one of the label blocks such that 614 the following condition holds true for that label-block : 616 block offset <= m < block offset + label-range 618 If the PE-CE physical link goes down, or the CE configuration is 619 removed, all its advertised label-blocks are withdrawn. 621 Note that an implementation can easily allow allocation of a label- 622 block which is larger than the actual number of DLCIs provisioned. 623 This allows DLCIs to be provisioned as and when needed without 624 increasing the state in the network, at the cost of extra signalling 625 and configuration. 627 2.3.1. PE Advertisement Processing 629 When a PE receives a Layer 2 VPN advertisement, it checks if the 630 received Route Target community matches any VPN that it is a member 631 of. If not, the PE may store the advertisement for future use, or 632 may discard it. Since we use BGP as the auto-discovery and 633 signalling protocol, a PE can use the BGP Route Refresh capability to 634 learn all the discarded advertisements pertaining to a VPN at a later 635 time, when the VPN is configured on the PE. 637 Otherwise, suppose the advertisement is from PE A for VPN X, CE m, 638 and a label-block Lm. Add this label-block to the existing label- 639 blocks for CE m in VPN X. For the purpose of further discussion we 640 denote a label-block from CE m as Lm. Denote Lm's block offset as 641 LOm, label-base as LBm, and label-range as LRm. 643 For each CE that the receiving PE B is connected to that is a member 644 of VPN X, PE B does the following. 646 0) Look up the configuration information associated with the CE. 647 If the encapsulation type for VPN X in the advertisement does 648 not match the configured encapsulation type for VPN X, stop. 
649 (Note that for IP-only layer 2 interworking a separate 650 encapsulation type is defined). 651 1) Say the configured CE ID is k, and the DLCI list is Dk[]. 652 A label-block of k is denoted by Lk. Denote Lk's block offset 653 as LOk, label-base as LBk, and label-range as LRk. 654 2) Check if k = m. If so, issue an error: "CE ID k has been 655 allocated to two CEs in VPN X (check CE at PE A)". Stop. 656 3) Search among all the label-blocks from m for one which 657 satisfies LOm <= k < LOm + LRm. If none found, issue a 658 warning: "Cannot communicate with CE m (PE A) of VPN X: 659 outside range" and stop. Otherwise let Lm be the label-block 660 found. 661 4) Search among all the label-blocks of k for one which 662 satisfies LOk <= m < LOk + LRk. If none found, issue a 663 warning: "Cannot communicate with CE m (PE A) of VPN X: 664 outside range" and stop. Otherwise let Lk be the label-block 665 found. 666 5) Look in the appropriate table to see which label-stack will 667 get to PE A. This is the "tunnel" label-stack, Z. 668 6) The DLCI that CE-k will use to talk to CE-m is Dk[m]. The 669 "VPN" label for sending packets to CE-m is (LBm + k - LOm). 670 The "VPN" label on which to expect packets from CE-m is 671 (LBk + m - LOk). 672 7) Install a "route" such that packets from CE-k with DLCI Dk[m] 673 will be sent with tunnel label-stack Z, VPN label 674 (LBm + k - LOm). Also, install a route such that packets 675 received with label (LBk + m - LOk) will be mapped to DLCI 676 Dk[m] and be sent to CE k. 677 8) Activate DLCI Dk[m] to the CE. This can be done using LMI. 679 If an advertisement is withdrawn, the appropriate DLCIs must be de- 680 activated, and the corresponding routes must be removed from the 681 forwarding table. 683 2.3.2. Example of PE Advertisement Processing 685 Consider the example network of Figure 1. Let S0, S1, S2 and S3 686 belong to the same VPN, say VPN1.
Suppose PE2 receives an 687 advertisement from PE0 for VPN1, CE ID 0 and a label block L0 with 688 block offset LO0 = 0, label-range LR0 = 10 and label base LB0 = 1000. 689 Since PE2 is connected to CE4 which is also in VPN1, PE2 does the 690 following: 692 0) Look up the configuration information associated with CE4. 693 The advertised encapsulation type matches the configured 694 encapsulation type (both are Frame Relay), so proceed. 695 1) CE4 is configured with DLCI list D4[] = [ 107, 209, 265, 696 301, 414, 555, 654, 777, 888]. A label-block L4 is allocated 697 to CE4 with block offset LO4 = 0, label-range LR4 = 9 and 698 a label-base LB4 = 4000. 699 2) CE0 and CE4 have ids 0 and 4 respectively, so step 2 of 2.3.1 700 is skipped. 701 3) Since CE4's id falls in the label-block L0 from CE0, i.e., 702 LO0 <= 4 < LO0 + LR0, L0 is the label-block selected in step 3 703 of 2.3.1. 704 4) Since CE0's id falls in the label-block L4 of CE4, i.e., 705 LO4 <= 0 < LO4 + LR4, L4 is the label-block selected in step 4 706 of 2.3.1. 707 5) Look in the appropriate table on PE2 to see which tunnel 708 label-stack will get to PE0. Let the label-stack be a single 709 label, 10001. 710 6) The DLCI that CE4 will use to talk to CE0 is D4[0], i.e., 107. 711 The VPN label for sending packets to CE0 is (LB0 + 4 - LO0), 712 i.e., 1004. The VPN label on which to expect packets from CE0 713 is (LB4 + 0 - LO4), i.e., 4000. 714 7) Install a "route" such that packets from CE4 with DLCI 107 715 will be sent with top label 10001, VPN label 1004. Also, 716 install a route such that packets received with label 4000 will 717 be mapped to DLCI 107 and be sent to CE4. 718 8) Activate DLCI 107 to CE4. 720 Since CE5 is also attached to PE2, PE2 needs to do processing similar 721 to the above for CE5. 723 Similarly, suppose PE0 receives an advertisement from PE2 for VPN1, CE4, 724 with a label block L4 with block offset LO4 = 0, label-range LR4 725 = 9 and label base LB4 = 4000.
PE0 processes the advertisement for 726 CE0 (and CE1, which is also in VPN1). 728 0) Look up the configuration information associated with CE0. 729 The advertised encapsulation type matches the configured 730 encapsulation type (both are Frame Relay), so proceed. 731 1) CE0 is configured with a DLCI list D0[] = [100 - 109]. 732 Label-block L0 is allocated to CE0 with block offset LO0 = 0, 733 label-range LR0 = 10 and a label-base LB0 = 1000 (which 734 was advertised to PE2). 735 2) CE0 and CE4 have ids 0 and 4 respectively, so step 2 of 2.3.1 736 is skipped. 737 3) Since CE0's id falls in the label-block L4 from CE4, i.e., 738 LO4 <= 0 < LO4 + LR4, L4 is the label-block selected in step 3 739 of 2.3.1. 740 4) Since CE4's id falls in the label-block L0 of CE0, i.e., 741 LO0 <= 4 < LO0 + LR0, L0 is the label-block selected in step 4 742 of 2.3.1. 743 5) Let the tunnel label-stack to reach PE2 be a single label, 744 9999. 745 6) The DLCI which CE0 will use to talk to CE4 is D0[4], i.e., 104. 747 The VPN label for sending packets to CE4 is (LB4 + 0 - LO4), 748 i.e., 4000. The VPN label on which to expect packets from CE4 749 is (LB0 + 4 - LO0), i.e., 1004. 750 7) Install a "route" such that packets from CE0 with DLCI 104 751 will be sent with top label 9999, VPN label 4000. Also, 752 install a route such that packets received with label 1004 will be 753 mapped to DLCI 104 and be sent to CE0. 754 8) Activate DLCI 104 to CE0. 756 Note that the VPN label of 4000, computed by PE0, for sending packets 757 from CE0 to CE4 is the same as what PE2 computed as the incoming 758 label for receiving packets originated at CE0 and destined to CE4. 759 Similarly, the VPN label of 1004, computed by PE0, for receiving 760 packets from CE4 to CE0 is the same as what PE2 computed as the outgoing 761 label for sending packets originated at CE4 and destined to CE0. 763 2.3.3. Generalizing the VPN Topology 765 In the above, we assumed for simplicity that the VPN was a full mesh.
766 To allow for more general VPN topologies, a mechanism based on 767 filtering on BGP extended communities can be used (see section 5). 769 3. Layer 2 Interworking 771 As defined so far in this document, all CE-PE connections for a given 772 Layer 2 VPN must use the same layer 2 encapsulation, e.g., they must 773 all be Frame Relay. This is often a burdensome restriction. One 774 answer is to use an existing Layer 2 interworking mechanism, for 775 example, Frame Relay-ATM interworking. 777 In this document, we take a different approach: we postulate that the 778 network layer is IP, and base Layer 2 interworking on that. Thus, 779 one can choose between pure Layer 2 VPNs, with a stringent Layer 2 780 restriction but with Layer 3 independence, or Layer 2 interworking 781 VPNs, where there is no restriction on Layer 2, but Layer 3 must be 782 IP. Of course, a PE may choose to implement Frame Relay-ATM 783 interworking. For example, an ATM Layer 2 VPN could have some CEs 784 connect via Frame Relay links, if their PE could translate Frame 785 Relay to ATM transparently to the rest of the VPN. This would be 786 private to the CE-PE connection, and such a course is outside the 787 scope of this document. 789 For Layer 2 interworking as defined here, when an IP packet arrives 790 at a PE, its Layer 2 address is noted, then all Layer 2 overhead is 791 stripped, leaving just the IP packet. Then, a VPN label is added, 792 and the packet is encapsulated in the PE-PE tunnel (as required by 793 the tunnel technology). Finally, the packet is forwarded. Note that 794 the forwarding decision is made on the basis of the Layer 2 795 information, not the IP header. At the egress, the VPN label 796 determines to which CE the packet must be sent, and over which 797 virtual circuit; from this, the egress PE can also determine the 798 Layer 2 encapsulation to place on the packet once the VPN label is 799 stripped.
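The ingress and egress handling just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definitive implementation: the class and function names (L2Frame, TunnelledPacket, ingress, egress) and the two lookup tables are hypothetical, the label stack is modelled as plain integers rather than an on-the-wire encoding, and the ATM circuit number is an invented value.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class L2Frame:
    encap: str        # Layer 2 type on the CE-PE link (hypothetical names)
    l2_address: int   # DLCI/VCI chosen by the sending CE
    ip_packet: bytes  # what remains once all Layer 2 overhead is stripped


@dataclass(frozen=True)
class TunnelledPacket:
    tunnel_labels: Tuple[int, ...]  # PE-PE tunnel label-stack
    vpn_label: int                  # demultiplexing field at the egress PE
    ip_packet: bytes


def ingress(frame: L2Frame,
            routes: Dict[int, Tuple[Tuple[int, ...], int]]) -> TunnelledPacket:
    """Forward on the Layer 2 address only; the IP header is not consulted."""
    tunnel_labels, vpn_label = routes[frame.l2_address]
    return TunnelledPacket(tunnel_labels, vpn_label, frame.ip_packet)


def egress(packet: TunnelledPacket,
           circuits: Dict[int, Tuple[str, int]]) -> L2Frame:
    """The VPN label selects the CE circuit and its Layer 2 encapsulation."""
    encap, l2_address = circuits[packet.vpn_label]
    return L2Frame(encap, l2_address, packet.ip_packet)


# Hypothetical tables: DLCI 107 maps to tunnel label 10001 and VPN label 1004;
# at the far end, VPN label 1004 maps to an (assumed) ATM circuit 104.
ingress_routes = {107: ((10001,), 1004)}
egress_circuits = {1004: ("atm", 104)}

frame_in = L2Frame("frame-relay", 107, b"<ip packet>")
frame_out = egress(ingress(frame_in, ingress_routes), egress_circuits)
```

The point of the sketch is that the two access links may use different Layer 2 technologies, because only the IP packet and the VPN label cross the SP network.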
801 An added benefit of restricting interworking to IP only as the layer 802 3 technology is that the provider's network can provide IP Diffserv 803 or any other IP based QOS mechanism to the L2VPN customer. The 804 ingress PE can set up IP/TCP/UDP based classifiers to do DiffServ 805 marking, and other functions like policing and shaping on the L2 806 circuits of the VPN customer. Note the division of labor: the CE 807 determines the destination CE, and encodes that in the Layer 2 808 address. The ingress PE thus determines the egress PE and VPN label 809 based on the Layer 2 address supplied by the CE, but the ingress PE 810 can choose the tunnel to reach the egress PE (in the case that there 811 are different tunnels for each CoS/DiffServ code point), or the CoS 812 bits to place in the tunnel (in the case where a single tunnel 813 carries multiple CoS/DiffServ code points) based on its own 814 classification of the packet. 816 4. Packet Transport 818 When a packet arrives at a PE from a CE in a Layer 2 VPN, the layer 2 819 address of the packet identifies to which other CE the packet is 820 destined. The procedure outlined above installs a route that maps 821 the layer 2 address to a tunnel (which identifies the PE to which the 822 destination CE is attached) and a VPN label (which identifies the 823 destination CE). If the egress PE is the same as the ingress PE, no 824 tunnel or VPN label is needed. 826 The packet may then be modified (depending on the layer 2 827 encapsulation). In case of IP-only layer 2 interworking, the layer 2 828 header is completely stripped off till the IP header. Then, a VPN 829 label and tunnel encapsulation are added as specified by the route 830 described above, and the packet is sent to the egress PE. 832 If the egress PE is the same as the ingress, the packet "arrives" 833 with no labels. Otherwise, the packet arrives with the VPN label, 834 which is used to determine which CE is the destination CE. 
The 835 packet is restored to a fully-formed layer 2 packet, and then sent to 836 the CE. 838 4.1. Layer 2 MTU 840 This document requires that the Layer 2 MTU configured on all the 841 access circuits connecting CEs to PEs in an L2VPN be the same. This 842 can be ensured by passing the configured layer 2 MTU in the 843 Layer2-info extended community when advertising L2VPN label-blocks. 844 On receiving an L2VPN label-block from a remote PE in a VPN, the MTU 845 value carried in the layer2-info extended community should be 846 compared against the configured value for the VPN. If they don't 847 match, then the label-block should be ignored. 849 The MTU on the Layer 2 access links MUST be chosen such that the size 850 of the L2 frames plus the L2VPN header does not exceed the MTU of the 851 SP network. Layer 2 frames that exceed the MTU after encapsulation 852 MUST be dropped. For the case of IP-only layer 2 interworking the IP 853 MTU on the layer 2 access link must be chosen such that the size of 854 the IP packet and the L2VPN header does not exceed the MTU of the SP 855 network. 857 4.2. Layer 2 Frame Format 859 The modification to the Layer 2 frame depends on the Layer 2 type. 860 This document requires that the encapsulation methods used in 861 transporting layer 2 frames over tunnels be the same as described 862 in [L2-ENCAP], except in the case of IP-only Layer 2 Interworking 863 which is described in section 4.3. 865 4.3. IP-only Layer 2 Interworking 867 Figure 2: Format of IP-only layer 2 interworking packet 869 +----------------------------+ 870 | Tunnel | VPN | IP | VPN label is the 871 | Encap | Label | Packet | demultiplexing field 872 +----------------------------+ 874 At the ingress PE, an L2 frame's L2 header is completely stripped off 875 and the remainder is carried as an IP packet within the SP network (Figure 2). 876 The forwarding decision is still based on the L2 address of the 877 incoming L2 frame.
At the egress PE, the IP packet is encapsulated 878 back in an L2 frame and transported over to the destination CE. The 879 forwarding decision at the egress PE is based on the VPN label as 880 before. The L2 technology between egress PE and CE is independent of 881 the L2 technology between ingress PE and CE. 883 5. Auto-discovery and Signalling of Layer 2 VPNs 885 BGP version 4 ([BGP]) is used as the auto-discovery and signalling 886 protocol for Layer 2 VPNs described in this document. 888 In BGP, the Multiprotocol Extensions [BGP-MP] are used to carry 889 L2-VPN signalling information. [BGP-MP] defines the format of two 890 BGP attributes (MP_REACH_NLRI and MP_UNREACH_NLRI) that can be used 891 to announce and withdraw the announcement of reachability 892 information. We introduce a new address family identifier (AFI) for 893 L2-VPN [to be assigned by IANA], a new subsequent address family 894 identifier (SAFI) [to be assigned by IANA], and also a new NLRI 895 format for carrying the individual L2-VPN label-block information. 896 One or more NLRIs will be carried in the above-mentioned BGP 897 attributes. L2VPN NLRIs MUST be accompanied by one or more extended 898 communities. This document proposes the reuse of the ROUTE TARGET 899 extended community defined in [EXT-COMM]. Its usage is exactly the 900 same as in the case of [IPVPN]. 902 PEs receiving VPN information may filter advertisements based on the 903 extended communities, thus controlling CE-to-CE connectivity. 905 The format of the Layer 2 VPN NLRI is as shown in Figure 3 below. 906 One or more such NLRIs can be carried in a single MP_REACH_NLRI or 907 MP_UNREACH_NLRI attribute. An L2VPN NLRI is uniquely identified by the 908 RD, CE ID and the Label-block Offset. So an L2VPN NLRI carried in an 909 MP_UNREACH_NLRI attribute must contain only these 3 fields other than 910 the length field.
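As a rough illustration of the field layout of the L2 VPN NLRI (Figure 3) and of the shorter form used in MP_UNREACH_NLRI, the Python sketch below packs both variants. It rests on assumptions that this document does not settle: the function name encode_l2vpn_nlri is hypothetical, the Length field is taken to count the octets that follow it, and the label base is written as a plain 24-bit big-endian integer.

```python
import struct
from typing import Optional


def encode_l2vpn_nlri(rd: bytes, ce_id: int, block_offset: int,
                      label_base: Optional[int] = None,
                      tlvs: bytes = b"") -> bytes:
    """Pack one L2 VPN NLRI.

    With label_base=None only the identifying fields (RD, CE ID and
    Label-block Offset) are emitted, as for a withdrawal.
    """
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    # RD (8 octets) + CE ID (2 octets) + Label-block Offset (2 octets)
    body = rd + struct.pack("!HH", ce_id, block_offset)
    if label_base is not None:
        # Assumption: 3-octet label base as a 24-bit big-endian integer,
        # followed by any already-encoded TLVs.
        body += label_base.to_bytes(3, "big") + tlvs
    # Assumption: Length counts the octets after the Length field itself.
    return struct.pack("!H", len(body)) + body


# Values borrowed from the CE0 example: offset 0, label base 1000.
rd = bytes(8)  # placeholder Route Distinguisher
reach = encode_l2vpn_nlri(rd, ce_id=0, block_offset=0, label_base=1000)
withdraw = encode_l2vpn_nlri(rd, ce_id=0, block_offset=0)
```

Keeping the withdrawal form to the three identifying fields mirrors the statement that an NLRI is uniquely identified by the RD, CE ID and Label-block Offset.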
912 Figure 3: BGP NLRI for L2 VPN Information 914 +------------------------------------+ 915 | Length (2 octets) | 916 +------------------------------------+ 917 | Route Distinguisher (8 octets) | 918 +------------------------------------+ 919 | CE ID (2 octets) | 920 +------------------------------------+ 921 | Label-block Offset (2 octets) | 922 +------------------------------------+ 923 | Label Base (3 octets) | 924 +------------------------------------+ 925 | Variable TLVs (0 to N octets) | 926 | ... | 927 +------------------------------------+ 929 5.1. L2VPN NLRI Format 931 5.1.1. Length 933 The Length field indicates the length in octets of the L2-VPN address 934 information. 936 5.1.2. Route Distinguisher 938 Has the same meaning as in [IPVPN]. 940 5.1.3. CE ID 942 A 16 bit number which uniquely identifies a CE in a VPN. 944 5.1.4. Label-Block Offset 946 A 16 bit number which identifies the position of a label-block within 947 a set of label-blocks of a given CE. This enables a remote CE to 948 select a label block when picking the VPN label for sending traffic 949 destined to the CE this label-block corresponds to, such that : 951 label-block offset <= remote CE id. 953 5.1.5. Label base 955 The label-base which is to be used for determining the VPN label for 956 forwarding packets to the CE identified by CE ID. 958 5.1.6. Sub-TLVs 960 New sub-TLVs can be introduced as needed. 962 L2VPN TLVs can be added to extend the information carried in the L2 963 VPN NLRI. In L2VPN TLVs, type is 1 octet, length is 2 octets and 964 represents the size of the value field in bits. 966 5.1.7. Circuit Status Vector 968 A new sub-TLV is introduced to carry the status of an L2VPN PVC 969 between a pair of PEs. This sub-TLV is a mandatory part of 970 MP_REACH_NLRI. 972 Note that an L2VPN PVC is bidirectional, composed of two simplex 973 connections going in opposite directions.
A simplex connection 974 consists of 3 segments: 1) the local access circuit between the 975 source CE and the ingress PE, 2) the tunnel LSP between the ingress 976 and egress PEs, and 3) the access circuit between the egress PE and 977 the destination CE. 979 To monitor the status of a PVC, a PE needs to monitor the status of 980 both simplex connections. Since it knows the status of its access 981 circuit, and the status of the tunnel towards the remote PE, it can 982 inform the remote PE of these two. Similarly, the remote PE can 983 report the status of its own access circuit to its local CE and the 984 status of its tunnel towards the first PE. Combining the local and the 985 remote information, a PE can determine the status of a PVC. 987 The basic unit of advertisement in L2VPN for a given CE is a label- 988 block. Each label within a label-block corresponds to a PVC on the 989 CE. So it is natural to advertise the local status information for all 990 PVCs corresponding to a label-block along with the label-block's 991 NLRI. This is done by introducing the circuit status vector TLV. 992 The value field of this TLV is a bit-vector, each bit of which 993 indicates the status of the PVC associated with the corresponding 994 label in the label-block. Bit value 0 indicates that the local 995 circuit and the tunnel LSP to the remote PE are up, while a value of 1 996 indicates that either or both of them are down. 998 PE A, while selecting a label from a label-block (advertised by PE B, 999 for remote CE m, and VPN X) for one of its local CEs, CE n (in VPN X), can 1000 also determine the status of the corresponding PVC (between CE n and 1001 CE m) by looking at the appropriate bit in the circuit status vector. 1003 Type field for the circuit status vector TLV is TBD. 1005 The length field of the TLV specifies the length of the value field 1006 in bits. The value field is padded to the nearest octet boundary.
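The lookup performed by PE A can be sketched as follows. This is a minimal Python illustration, not a normative procedure: the dictionary layout and the function names find_block and pvc_is_up are hypothetical, and the bit ordering is an assumption (the first label of the block is taken to be the most significant bit of the first octet), since this document does not specify it.

```python
from typing import Dict, List, Optional


def find_block(blocks: List[Dict], local_ce_id: int) -> Optional[Dict]:
    """Return the received label-block whose offset/range covers local_ce_id."""
    for block in blocks:
        if block["offset"] <= local_ce_id < block["offset"] + block["range"]:
            return block
    return None


def pvc_is_up(block: Dict, local_ce_id: int) -> bool:
    """Read the status bit that matches the label chosen for local_ce_id.

    Assumption: bits are numbered from the most significant bit of the
    first octet. A bit value of 0 means up, 1 means down.
    """
    index = local_ce_id - block["offset"]   # position of the label in the block
    octet = block["status"][index // 8]
    bit = (octet >> (7 - index % 8)) & 1
    return bit == 0


# Hypothetical block advertised for a remote CE: 10 labels starting at 1000,
# status vector padded to 2 octets, with the PVC for CE ID 4 marked down.
received = {"offset": 0, "range": 10, "base": 1000,
            "status": bytes([0b00001000, 0b00000000])}

block = find_block([received], 4)
vpn_label = block["base"] + (4 - block["offset"])
circuit_up = pvc_is_up(block, 4)
```

The same index, the local CE ID minus the block offset, selects both the VPN label and the status bit, which is why the vector can be carried alongside the label-block's NLRI.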
1008 Note that the length field corresponds to the number of labels in the 1009 label-block, i.e., the label-block range. Label-block range enables 1010 a CE to select a label block (among several label-blocks advertised 1011 by a CE) when picking the VPN label for sending traffic destined to 1012 the CE this label-block corresponds to, such that : 1014 received label-block offset <= local CE id < received label-block offset + received label-block range. 1016 5.2. Layer2-Info Extended Community 1018 This document introduces a new extended community, Layer2-Info, to 1019 allow carrying layer 2 specific information in a VPN. This extended 1020 community MUST be carried as part of the path attributes in all BGP update 1021 messages carrying L2VPN NLRIs. The encoding of this community is 1022 shown in figure 4. 1024 Figure 4: layer2-info extended community 1026 +------------------------------------+ 1027 | Extended community type (2 octets) | 1028 +------------------------------------+ 1029 | Encaps Type (1 octet) | 1030 +------------------------------------+ 1031 | Cntrl Flags (1 octet) | 1032 +------------------------------------+ 1033 | Layer-2 MTU (2 octets) | 1034 +------------------------------------+ 1035 | Reserved (2 octets) | 1036 +------------------------------------+ 1038 5.2.1. Extended Community Type 1040 TBD. 1042 5.2.2. Encapsulation Type 1044 Identifies the layer 2 encapsulation, e.g., ATM, Frame Relay etc. 1045 The following encapsulation types are defined: 1047 Value Encapsulation 1048 0 Reserved 1049 1 Frame Relay 1050 2 ATM AAL5 VCC transport 1051 3 ATM transparent cell transport 1052 4 Ethernet VLAN 1053 5 Ethernet 1054 6 Cisco-HDLC 1055 7 PPP 1056 8 CEM [8] 1057 9 ATM VCC cell transport 1058 10 ATM VPC cell transport 1059 11 MPLS 1060 12 VPLS 1061 64 IP-interworking 1063 5.2.3. Control Flags 1065 This is a bit vector, defined as in Figure 5. 1067 The following bits are defined; the MBZ bits MUST be set to zero.
1069 Name Meaning 1071 Figure 5: Control Flags Bit Vector 1073 0 1 2 3 4 5 6 7 1074 +-+-+-+-+-+-+-+-+ 1075 | MBZ |Q|F|C|S| (MBZ = MUST Be Zero) 1076 +-+-+-+-+-+-+-+-+ 1078 C If set to 1(0), Control word is (not) required when 1079 encapsulating Layer 2 frames [L2-ENCAP]. 1080 S If set to 1(0), Sequenced delivery of frames is (not) 1081 required. 1083 The Q and F flags are reserved for other use. 1085 5.2.4. Layer-2 MTU 1087 Specifies the layer-2 specific MTU of all the circuits in all the 1088 label-blocks advertised with this extended community. This allows 1089 checking that the layer 2 MTU is the same for all the circuits 1090 across all the sites in a VPN. 1092 5.3. BGP L2 VPN capability 1094 The BGP Multiprotocol capability extension [BGP-CAP] is used to 1095 indicate that the BGP speaker wants to negotiate L2 VPN capability 1096 with its peers. The capability code is 1, the capability length is 1097 4, and the AFI and SAFI values will be set to the L2 VPN AFI and L2 1098 VPN SAFI (discussed in section 5) respectively. 1100 5.4. Advantages of Using BGP 1102 PE routers in an SP network typically run BGP v4. This means that 1103 SPs are familiar with using BGP, and have already configured BGP on 1104 their PEs, so configuring and using BGP to signal Layer 2 VPNs is not 1105 much of an additional burden to the SP operators. 1107 Another advantage of using BGP is that with BGP it is easier to build 1108 inter-provider VPNs. Mechanisms for this are similar to those 1109 described in [IPVPN]. Options a) and b) described there could be 1110 adapted with slight modification for the l2vpn case but have adverse 1111 scaling issues in the l2vpn context. So we recommend using option c) 1112 which in the l2vpn context would require an ASBR to maintain labeled IPv4 1113 /32 routes to PEs within its AS and use EBGP to distribute these 1114 routes to other ASes. This results in creation of an LSP from a PE in 1115 one AS to another PE in another AS.
Now these PEs can run multihop 1116 EBGP to exchange L2VPN information. The L2VPN traffic will be 1117 tunnelled through the inter-AS LSP established between PEs as described 1118 above. 1120 6. Acknowledgments 1122 The authors would like to thank Chaitanya Kodeboyina, Dennis Ferguson, 1123 Der-Hwa Gan, Dave Katz, Nischal Sheth, John Stewart, and Paul Traina 1124 for the enlightening discussions that helped shape the ideas 1125 presented here, and Ross Callon for his valuable comments. 1127 The idea of using extended communities for more general connectivity 1128 of a Layer 2 VPN was a contribution by Yakov Rekhter, who also gave 1129 many useful comments on the text; many thanks to him. 1131 7. Security Considerations 1133 The security aspects of this solution will be discussed at a later 1134 time. 1136 8. IANA Considerations 1138 (To be filled in in a later revision.) 1140 9. Normative References 1142 [BGP] Rekhter, Y., and Li, T., "A Border Gateway Protocol 4 (BGP-4)", 1143 RFC 1771, March 1995. 1145 [BGP-CAP] Chandra, R., and Scudder, J., "Capabilities Advertisement 1146 with BGP-4", RFC 2842, May 2000. 1148 [BGP-MP] Bates, T., Rekhter, Y., Chandra, R., and Katz, D., 1149 "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000. 1151 [BGP-ORF] Chen, E., and Rekhter, Y., "Cooperative Route Filtering 1152 Capability for BGP-4", March 2000 (work in progress). 1154 [BGP-RFSH] Chen, E., "Route Refresh Capability for BGP-4", RFC 2918, 1155 September 2000. 1157 [EXT-COMM] Ramachandra, S., Tappan, D., Rekhter, Y., "BGP Extended 1158 Communities Attribute" (work in progress). 1160 [L2-ENCAP] Martini, et al., "Encapsulation Methods for Transport of 1161 Layer 2 Frames Over MPLS", November 2001 (work in progress). 1163 10. Informative References 1165 [IPVPN] Rosen, E., and Rekhter, Y., "BGP/MPLS VPNs", RFC 2547, March 1166 1999. 1168 [IPVPN-MCAST] Rosen, et al., "Multicast in MPLS/BGP VPNs", November 1169 2000 (work in progress).
1171 [MPLS] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol 1172 Label Switching Architecture", RFC 3031, January 2001. 1174 [VPN] Kosiur, Dave, "Building and Managing Virtual Private Networks", 1175 Wiley Computer Publishing, 1998. 1177 Authors' Addresses 1179 Kireeti Kompella 1180 Juniper Networks 1181 1194 N. Mathilda Ave 1182 Sunnyvale, CA 94089 1183 kireeti@juniper.net 1185 Manoj Leelanivas 1186 Juniper Networks 1187 1194 N. Mathilda Ave 1188 Sunnyvale, CA 94089 1189 manoj@juniper.net 1191 Quaizar Vohra 1192 Juniper Networks 1193 1194 N. Mathilda Ave 1194 Sunnyvale, CA 94089 1195 qv@juniper.net 1197 Javier Achirica 1198 Telefonica Data 1199 javier.achirica@telefonica-data.com 1201 Ronald P. Bonica 1202 WorldCom 1203 22001 Loudoun County Pkwy 1204 Ashburn, Virginia, 20147 1205 rbonica@mci.net 1207 Dave Cooper 1208 Global Crossing 1209 960 Hamlin Court 1210 Sunnyvale, CA 94089 1211 email: dcooper@gblx.net 1213 Chris Liljenstolpe 1214 Cable & Wireless 1215 11700 Plaza America Drive 1216 Reston, VA 20190 1217 chris@cw.net 1219 Eduard Metz 1220 KPNQwest 1221 Scorpius 60 1222 2130 GE Hoofddorp, The Netherlands 1223 email: eduard.metz@kpnqwest.com 1225 Chandramouli Sargor 1226 CoSine Communications 1227 1200 Bridge Parkway 1228 Redwood City, CA 94065 1229 csargor@cosinecom.com 1231 Himanshu Shah 1232 Tenor Networks 1233 100 Nanog Park 1234 Acton, MA 01720 1235 hshah@tenornetworks.com 1237 Vijay Srinivasan 1238 CoSine Communications 1239 1200 Bridge Parkway 1240 Redwood City, CA 94065 1241 vijay@cosinecom.com 1243 Hamid Ould-Brahim 1244 Nortel Networks 1245 P O Box 3511 Station C 1246 Ottawa ON K1Y 4H7 Canada 1247 Phone: +1 (613) 765 3418 1248 Email: hbrahim@nortelnetworks.com 1250 Zhaohui Zhang 1251 Juniper Networks 1252 10 Technology Park Drive 1253 Westford, MA 01886 1254 zzhang@unispherenetworks.com 1256 Intellectual Property Considerations 1258 Juniper Networks may seek patent or other intellectual property 1259 protection for some or all of the
technologies disclosed in this 1260 document. If any standards arising from this document are or become 1261 protected by one or more patents assigned to Juniper Networks, 1262 Juniper intends to disclose those patents and license them on 1263 reasonable and non-discriminatory terms. 1265 CoSine Communications may seek patent or other intellectual property 1266 protection for some or all of the technologies disclosed in this 1267 document. If any standards arising from this document are or become 1268 protected by one or more patents assigned to CoSine Communications, 1269 CoSine intends to disclose those patents and license them on 1270 reasonable and non-discriminatory terms. 1272 Full Copyright Statement 1274 Copyright (C) The Internet Society (2003). All Rights Reserved. 1276 This document and translations of it may be copied and furnished to 1277 others, and derivative works that comment on or otherwise explain it 1278 or assist in its implementation may be prepared, copied, published 1279 and distributed, in whole or in part, without restriction of any 1280 kind, provided that the above copyright notice and this paragraph are 1281 included on all such copies and derivative works. However, this 1282 document itself may not be modified in any way, such as by removing 1283 the copyright notice or references to the Internet Society or other 1284 Internet organizations, except as needed for the purpose of 1285 developing Internet standards in which case the procedures for 1286 copyrights defined in the Internet Standards process must be 1287 followed, or as required to translate it into languages other than 1288 English. 1290 The limited permissions granted above are perpetual and will not be 1291 revoked by the Internet Society or its successors or assigns.
1293 This document and the information contained herein is provided on an 1294 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 1295 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 1296 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 1297 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 1298 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.