Network Working Group                                         B. Kothari
Internet-Draft                                                 Gainspeed
Updates: 4761 (if approved)                                  K. Kompella
Intended status: Standards Track                        Juniper Networks
Expires: January 16, 2014                                  W. Henderickx
                                                                F. Balus
                                                          Alcatel-Lucent
                                                               J. Uttaro
                                                                    AT&T
                                                         S. Palislamovic
                                                          Alcatel-Lucent
                                                                  W. Lin
                                                        Juniper Networks
                                                           July 15, 2013


         BGP based Multi-homing in Virtual Private LAN Service
                draft-ietf-l2vpn-vpls-multihoming-06.txt

Abstract

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN). It is often
   required for the Service Provider (SP) to give the customer redundant
   connectivity to some sites, often called "multi-homing". This memo
   shows how BGP-based multi-homing can be offered in the context of LDP
   and BGP VPLS solutions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 16, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  General Terminology
     1.2.  Conventions
   2.  Background
     2.1.  Scenarios
     2.2.  VPLS Multi-homing Considerations
   3.  Multi-homing Operation
     3.1.  Customer Edge (CE) NLRI
     3.2.  Provisioning Model
     3.3.  Designated Forwarder Election
       3.3.1.  Attributes
       3.3.2.  Variables Used
       3.3.3.  Election Procedures
     3.4.  DF Election on PEs
   4.  Multi-AS VPLS
     4.1.  Route Origin Extended Community
     4.2.  VPLS Preference
     4.3.  Use of BGP attributes in Inter-AS Methods
       4.3.1.  Inter-AS Method (b): EBGP Redistribution of VPLS
               Information between ASBRs
       4.3.2.  Inter-AS Method (c): Multi-Hop EBGP Redistribution of
               VPLS Information between ASes
   5.  MAC Flush Operations
     5.1.  MAC Flush Indicators
     5.2.  Minimizing the effects of fast link transitions
   6.  Backwards Compatibility
     6.1.  BGP based VPLS
     6.2.  LDP VPLS with BGP Auto-discovery
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgments
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1. Introduction

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN). It is often
   required for a Service Provider (SP) to give the customer redundant
   connectivity to one or more sites, often called "multi-homing".
   [RFC4761] explains how VPLS can be offered using BGP for
   auto-discovery and signaling; section 3.5 of that document describes
   how multi-homing can be achieved in this context.
   [RFC6074] explains how VPLS can be offered using BGP for
   auto-discovery (BGP-AD) and [RFC4762] explains how VPLS can be
   offered using LDP for signaling. This document provides a BGP-based
   multi-homing solution applicable to both BGP and LDP VPLS
   technologies. Note that BGP multi-homing (MH) can be used for LDP
   VPLS without the use of the BGP-AD solution.

   Section 2 lays out some of the scenarios for multi-homing, other ways
   that this can be achieved, and some of the expectations of BGP-based
   multi-homing. Section 3 defines the components of BGP-based
   multi-homing, and the procedures required to achieve this. Section 7
   may someday discuss security considerations.

1.1. General Terminology

   Some general terminology is defined here; most is from [RFC4761],
   [RFC4762] or [RFC4364]. Terminology specific to this memo is
   introduced as needed in later sections.

   A "Customer Edge" (CE) device, typically located on customer
   premises, connects to a "Provider Edge" (PE) device, which is owned
   and operated by the SP. A "Provider" (P) device is also owned and
   operated by the SP, but has no direct customer connections. A "VPLS
   Edge" (VE) device is a PE that offers VPLS services.

   A VPLS domain represents a bridging domain per customer. A Route
   Target community as described in [RFC4360] is typically used to
   identify all the PE routers participating in a particular VPLS
   domain. A VPLS site is a grouping of ports on a PE that belong to
   the same VPLS domain. The terms "VPLS instance" and "VPLS domain"
   are used interchangeably in this document.

   A VPLS site connected to only one PE is called a single-homed VPLS
   site. The terms "VPLS site" and "CE site" are used interchangeably
   in this document.

   A VPLS site connected to multiple PEs is called a multi-homed site.

   A BGP VPLS NLRI for the base VPLS instance that has non-zero VE block
   offset, VE block size and label base is called a VE NLRI in this
   document. Each VPLS instance is uniquely identified by a VE-ID. The
   VE-ID is carried in the BGP VPLS NLRI as specified in Section 3.2.2
   of [RFC4761].

   A VPLS NLRI with value zero for the VE block offset, VE block size
   and label base is called a CE NLRI in this document. Section 3.1
   defines the CE NLRI and provides more detail.

   A multi-homed (MH) site is uniquely identified by a CE-ID. Sites are
   referred to as local or remote depending on whether they are
   configured on the PE router in context or on one of the remote PE
   routers (network peers). A single-homed site can also be assigned a
   CE-ID, but it is not mandatory to configure a CE-ID for single-homed
   sites. Section 3.1 provides detail on the CE-ID.

1.2. Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Background

   This section describes various scenarios where multi-homing may be
   required, and the implications thereof. It also describes some of
   the singular properties of VPLS multi-homing, and what that means
   from both an operational point of view and an implementation point of
   view. Other approaches to multi-homing exist, such as the Spanning
   Tree Protocol; this document specifies the use of BGP for
   multi-homing. A comprehensive comparison among the approaches is
   outside the scope of this document.

2.1. Scenarios

   CE1 is a VPLS CE that is dual-homed to both PE1 and PE2 for redundant
   connectivity.

                 ...............
                .               .    ___ CE2
        ___ PE1                 .   /
       /        :               PE3
    __/         :    Service    :
   CE1 __       :    Provider   PE4
         \      :               :   \___ CE3
          \___ PE2              .
                .               .
                 ...............

                          Figure 1: Scenario 1

   CE1 is a VPLS CE that is dual-homed to both PE1 and PE2 for redundant
   connectivity. However, CE4, which is also in the same VPLS domain,
   is single-homed to just PE1.

   CE4 -------   ...............
              \ .               .    ___ CE2
        ___ PE1                 .   /
       /        :               PE3
    __/         :    Service    :
   CE1 __       :    Provider   PE4
         \      :               :   \___ CE3
          \___ PE2              .
                .               .
                 ...............

                          Figure 2: Scenario 2

2.2. VPLS Multi-homing Considerations

   The first (perhaps obvious) fact about a multi-homed VPLS CE, such as
   CE1 in Figure 1, is that if CE1 is an Ethernet switch or bridge, a
   loop has been created in the customer VPLS. This is a dangerous
   situation for an Ethernet network, and the loop must be broken. Even
   if CE1 is a router, it will get duplicates every time a packet is
   flooded, which is clearly undesirable.

   The next is that (unlike the case of IP-based multi-homing) only one
   of PE1 and PE2 can be actively sending traffic, either towards CE1 or
   into the SP cloud. That is to say, load balancing techniques will
   not work. Call the PE that is chosen to send traffic to/from CE1 the
   "designated forwarder". All other PEs MUST choose the same
   designated forwarder for a multi-homed site.

   In Figure 2, CE1 and CE4 must be dealt with independently, since CE1
   is dual-homed, but CE4 is not.

3. Multi-homing Operation

   This section describes procedures for electing a designated forwarder
   among the set of PEs that are multi-homed to a customer site. The
   procedures described in this section are applicable to BGP based
   VPLS, LDP based VPLS with BGP-AD or a VPLS that contains a mix of
   both BGP and LDP signaled PWs.

3.1. Customer Edge (CE) NLRI

   Section 3.2.2 in [RFC4761] specifies an NLRI to be used for BGP based
   VPLS (BGP VPLS NLRI). The format of the BGP VPLS NLRI is shown
   below.

   +------------------------------------+
   |  Length (2 octets)                 |
   +------------------------------------+
   |  Route Distinguisher (8 octets)    |
   +------------------------------------+
   |  VE ID (2 octets)                  |
   +------------------------------------+
   |  VE Block Offset (2 octets)        |
   +------------------------------------+
   |  VE Block Size (2 octets)          |
   +------------------------------------+
   |  Label Base (3 octets)             |
   +------------------------------------+

                              BGP VPLS NLRI

   For multi-homing operation, a customer-edge NLRI (CE NLRI) is
   proposed that uses the BGP VPLS NLRI with the following fields set to
   zero: VE Block Offset, VE Block Size and Label Base. In addition,
   the VE-ID field of the NLRI is set to the CE-ID. Thus, the CE NLRI
   contains 2 octets indicating the length, 8 octets for the Route
   Distinguisher, 2 octets for the CE-ID and 7 octets with value zero.

   It is valid to have non-zero VE block offset, VE block size and label
   base in the VPLS NLRI for a multi-homed site. VPLS operations,
   including multi-homing, in such a case are outside the scope of this
   document. However, for interoperability with existing deployments
   that use non-zero VE block offset, VE block size and label base for
   multi-homing operation, Section 6.1 provides more detail.

   Wherever VPLS NLRI is used in this document, context must be used to
   infer if it is applicable for CE NLRI, VE NLRI or for both.

3.2. Provisioning Model

   Each instance within a VPLS domain MUST be provisioned with a unique
   Route Distinguisher value. A unique Route Distinguisher allows VPLS
   advertisements from different VPLS PEs to be distinct even if the
   advertisements have the same VE-ID, which can occur in case of
   multi-homing. This allows standard BGP path selection rules to be
   applied to VPLS advertisements.
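
   The CE NLRI layout described in Section 3.1 can be sketched as
   follows. This is a minimal illustration, not a normative encoder.
   The function names are hypothetical, the Route Distinguisher is taken
   as 8 opaque octets, and the value 17 written into the Length field
   assumes that Length counts the octets that follow it. The check on
   CE-ID reflects the statement in Section 3.2 that CE-ID=0 is invalid.

```python
import struct

# Assumption: the Length field counts the 17 octets that follow it
# (8 RD + 2 CE-ID + 2 VE Block Offset + 2 VE Block Size + 3 Label Base).
NLRI_BODY_LEN = 17
# VE Block Offset, VE Block Size and Label Base, all zero in a CE NLRI.
ZERO_TAIL = bytes(7)


def encode_ce_nlri(rd: bytes, ce_id: int) -> bytes:
    """Build a 19-octet CE NLRI (hypothetical helper)."""
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be exactly 8 octets")
    if not 1 <= ce_id <= 0xFFFF:
        raise ValueError("CE-ID must be in 1..65535 (CE-ID 0 is invalid)")
    # Network byte order: Length (2), RD (8), CE-ID in the VE ID field (2).
    return struct.pack("!H8sH", NLRI_BODY_LEN, rd, ce_id) + ZERO_TAIL


def is_ce_nlri(nlri: bytes) -> bool:
    """True if the last 7 octets (offset, size, label base) are all zero."""
    return len(nlri) == 19 and nlri[12:] == ZERO_TAIL


if __name__ == "__main__":
    example_rd = bytes.fromhex("0000fde800000001")  # illustrative value only
    print(encode_ce_nlri(example_rd, ce_id=1).hex())
```

   A receiver could use a check like is_ce_nlri() to tell a CE NLRI from
   a VE NLRI before deciding which procedures apply.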

   Each VPLS PE must advertise a unique VE-ID with non-zero VE Block
   Offset, VE Block Size and Label Base values in the BGP NLRI. The
   VE-ID is associated with the base VPLS instance and the NLRI
   associated with it must be used for creating PWs among VPLS PEs.
   Single-homed customer sites connected to the VPLS instance do not
   require any special addressing. However, an administrator (SP
   operator) can choose to have a CE-ID for a single-homed site as well.
   Multi-homed customer sites connected to the VPLS instance require
   special addressing, which is achieved by use of the CE-ID. A set of
   customer sites are distinguished as multi-homed if they all have the
   same CE-ID. The following examples illustrate the use of VE-ID and
   CE-ID.

   Figure 1 shows a customer site, CE1, multi-homed to two VPLS PEs, PE1
   and PE2. In order for all VPLS PEs to set up PWs to each other, each
   VPLS PE must be configured with a unique VE-ID for its base VPLS
   instance. In addition, in order for all VPLS PEs within the same
   VPLS domain to elect one of the multi-homed PEs as the designated
   forwarder, an indicator that the PEs are multi-homed to the same
   customer site is required. This is achieved by assigning the same
   VPLS site ID (CE-ID) on PE1 and PE2 for CE1. When remote VPLS PEs
   receive NLRI advertisements from PE1 and PE2 for CE1, the two NLRI
   advertisements for CE1 are identified as candidates for designated
   forwarder selection due to the same CE-ID. Thus, the same CE-ID MUST
   be assigned on all VPLS PEs that are multi-homed to the same customer
   site.

   Figure 2 shows two customer sites, CE1 and CE4, connected to PE1 with
   CE1 multi-homed to PE1 and PE2. Similar to the Figure 1 provisioning
   model, each VPLS PE must be configured with a unique VE-ID for its
   base VPLS instance.
   CE1, which is multi-homed to PE1 and PE2, requires configuration of a
   CE-ID, and both PE1 and PE2 MUST be provisioned with the same CE-ID
   for CE1. CE2, CE3 and CE4 are single-homed sites and do not require
   special addressing. However, an operator can choose to configure a
   CE-ID for CE4 on PE1. By doing so, remote PEs can determine that PE1
   has two VPLS sites, CE1 and CE4. If both CE1 and CE4 connectivity to
   PE1 is down, remote PEs can choose not to send multicast traffic to
   PE1 as there are no VPLS sites reachable via PE1. If CE4 were not
   assigned a unique CE-ID, remote PEs would have no way to know if
   there are other VPLS sites attached and hence would always send
   multicast traffic to PE1. While CE2 and CE3 can also be configured
   with unique CE-IDs, there is no advantage in doing so as both PE3 and
   PE4 have exactly one VPLS site.

   Note that a CE-ID=0 is invalid and a PE should discard such an
   advertisement.

   Use of multiple VE-IDs per VPLS instance for either multi-homing
   operation or for any other purpose is outside the scope of this
   document. However, for interoperability with existing deployments
   that use multiple VE-IDs, Section 6.1 provides more detail.

3.3. Designated Forwarder Election

   BGP-based multi-homing for VPLS relies on standard BGP path selection
   and VPLS DF election. The net result of doing both BGP path
   selection and VPLS DF election is that of electing a single
   designated forwarder (DF) among the set of PEs to which a customer
   site is multi-homed. All the PEs that are elected as non-designated
   forwarders MUST keep their attachment circuit to the multi-homed CE
   in blocked status (no forwarding).

   These election algorithms operate on VPLS advertisements, which
   include both the NLRI and attached BGP attributes. These election
   algorithms are applicable to all VPLS NLRIs, and not just to CE
   NLRIs.
   In order to simplify the explanation of these algorithms, we will use
   a number of variables derived from fields in the VPLS advertisement.
   These variables are: RD, SITE-ID, VBO, DOM, ACS, PREF and PE-ID. The
   notation

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   means that from a received VPLS advertisement ADV, the respective
   variables were derived. The following sections describe two
   attributes needed for DF election, then describe the variables and
   how they are derived from fields in VPLS advertisement ADV, and
   finally describe how DF election is done.

3.3.1. Attributes

   The procedures below refer to two attributes: the Route Origin
   community (see Section 4.1) and the L2-info community (see
   Section 4.2). These attributes are required for inter-AS operation;
   for generality, the procedures below show how they are to be used.
   The procedures also outline how to handle the case that either or
   both are not present.

   For BGP-based multi-homing, ADV MUST contain an L2-info extended
   community as specified in [RFC4761]. Within this community are
   various control flags. Two new control flags are proposed in this
   document. Figure 3 shows the position of the new 'D' and 'F' flags.

   Control Flags Bit Vector

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |D|Z|F|Z|Z|Z|C|S| (Z = MUST Be Zero)
   +-+-+-+-+-+-+-+-+

                                Figure 3

   1.  'D' (Down): Indicates connectivity status. In case of CE NLRI,
       the connectivity status is between a CE site and a VPLS PE. In
       case of VE NLRI, the connectivity status is for the VPLS
       instance. In case of CE NLRI, the bit MUST be set to one if all
       the attachment circuits connecting a CE site to a VPLS PE are
       down. In case of VE NLRI, the bit must be set to one if the VPLS
       instance is operationally down. Note that a VPLS instance that
       has no connectivity to any of its sites must be considered as
       operationally down.

   2.  'F' (Flush): Indicates when to flush MAC state.
       A designated forwarder must set the F bit and a non-designated
       forwarder must clear the F bit when sending BGP CE NLRIs for
       multi-homed sites. A state transition from one to zero for the F
       bit can be used by a remote PE to flush all the MACs learned from
       the PE that is transitioning from designated forwarder to
       non-designated forwarder. Refer to Section 5 for more details on
       the use case. Note that the F bit is only applicable to VE NLRI
       and is not applicable to CE NLRI.

3.3.2. Variables Used

3.3.2.1. RD

   RD is simply set to the Route Distinguisher field in the NLRI part of
   ADV.

3.3.2.2. SITE-ID

   SITE-ID is simply set to the VE-ID field in the NLRI part of ADV.

   Note that no distinction is made whether the VE-ID is for a
   multi-homed site or not.

3.3.2.3. VBO

   VBO is simply set to the VE Block Offset field in the NLRI part of
   ADV.

3.3.2.4. DOM

   This variable, indicating the VPLS domain to which ADV belongs, is
   derived by applying BGP policy to the Route Target extended
   communities in ADV. The details of how this is done are outside the
   scope of this document.

3.3.2.5. ACS

   ACS is the status of the attachment circuits for a given site of a
   VPLS. ACS = 1 if all attachment circuits for the site are down, and
   0 otherwise.

   ACS is set to the value of the 'D' bit in ADV that belongs to CE
   NLRI. If ADV belongs to the base VPLS instance (VE NLRI) with
   non-zero label block values, no change must be made to ACS.

3.3.2.6. PREF

   PREF is derived from the Local Preference (LP) attribute in ADV as
   well as the VPLS Preference field (VP) in the L2-info extended
   community. If the Local Preference attribute is missing, LP is set
   to 0; if the L2-info community is missing, VP is set to 0. The
   following table shows how PREF is computed from LP and VP.

   +------------+--------------+------------+--------------------------+
   | VP Value   | LP Value     | PREF Value | Comment                  |
   +------------+--------------+------------+--------------------------+
   | 0          | 0            | 0          | malformed advertisement, |
   |            |              |            | unless ACS=1             |
   |            |              |            |                          |
   | 0          | 1 to         | LP         | backwards compatibility  |
   |            | (2^16-1)     |            |                          |
   |            |              |            |                          |
   | 0          | 2^16 to      | (2^16-1)   | backwards compatibility  |
   |            | (2^32-1)     |            |                          |
   |            |              |            |                          |
   | >0         | LP same as   | VP         | Implementation supports  |
   |            | VP           |            | VP                       |
   |            |              |            |                          |
   | >0         | LP != VP     | 0          | malformed advertisement  |
   +------------+--------------+------------+--------------------------+

                                 Table 1

3.3.2.7. PE-ID

   If ADV contains a Route Origin (RO) community (see Section 4.1) with
   type 0x01, then PE-ID is set to the Global Administrator sub-field of
   the RO. Otherwise, if ADV has an ORIGINATOR_ID attribute, then PE-ID
   is set to the ORIGINATOR_ID. Otherwise, PE-ID is set to the BGP
   Identifier.

3.3.3. Election Procedures

   The election procedures described in this section apply equally to
   BGP VPLS and LDP VPLS. A distinction MUST NOT be made on whether the
   NLRI is a multi-homing NLRI or not. The subset of these procedures
   covered by standard BGP best path selection deals with general IP
   prefix BGP route selection processing as defined in [RFC4271]. A
   separate part of the algorithm, defined under VPLS DF election, is
   specific to the designated forwarder election procedures performed on
   VPLS advertisements. A concept of bucketization is introduced to
   define route selection rules for VPLS advertisements. Note that this
   is a conceptual description of the process; an implementation MAY
   choose to realize this differently as long as the semantics are
   preserved.

3.3.3.1. Bucketization for standard BGP path selection

   An advertisement

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   is put into the bucket for <RD, SITE-ID, VBO>.
   In other words, the information in BGP path selection consists of
   <RD, SITE-ID, VBO>, and only advertisements with exactly the same
   <RD, SITE-ID, VBO> are candidates for the BGP path selection
   procedure as defined in [RFC4271].

3.3.3.2. Bucketization for VPLS DF Election

   An advertisement

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   is discarded if DOM is not of interest to the VPLS PE. Otherwise,
   ADV is put into the bucket for <DOM, SITE-ID>. In other words, all
   advertisements for a particular VPLS domain that have the same
   SITE-ID are candidates for VPLS DF election.

3.3.3.3. Tie-breaking Rules

   This section describes the tie-breaking rules for VPLS DF election.
   Tie-breaking rules for VPLS DF election are applied to candidate
   advertisements by all VPLS PEs and the actions taken by VPLS PEs
   based on the VPLS DF election result are described in Section 3.4.

   Given two advertisements ADV1 and ADV2 from a given bucket, first
   compute the variables needed for DF election:

      ADV1 -> <RD1, SITE-ID1, VBO1, DOM1, ACS1, PREF1, PE-ID1>
      ADV2 -> <RD2, SITE-ID2, VBO2, DOM2, ACS2, PREF2, PE-ID2>

   Note that SITE-ID1 = SITE-ID2 and DOM1 = DOM2, since ADV1 and ADV2
   came from the same bucket. Then the following tie-breaking rules
   MUST be applied in the given order.

   1.  if (ACS1 != 1) AND (ACS2 == 1) ADV1 wins; stop;
       else if (ACS1 == 1) AND (ACS2 != 1) ADV2 wins; stop;
       else continue

   2.  if (PREF1 > PREF2) ADV1 wins; stop;
       else if (PREF1 < PREF2) ADV2 wins; stop;
       else continue

   3.  if (PE-ID1 < PE-ID2) ADV1 wins; stop;
       else if (PE-ID1 > PE-ID2) ADV2 wins; stop;
       else ADV1 and ADV2 are from the same VPLS PE

   If there is no winner and ADV1 and ADV2 are from the same PE, a VPLS
   PE MUST retain both ADV1 and ADV2.

3.4. DF Election on PEs

   The DF election algorithm MUST be run by all multi-homed VPLS PEs.
   In addition, all other PEs SHOULD also run the DF election algorithm.
   As a result of the DF election, multi-homed PEs that lose the DF
   election for a SITE-ID MUST put the ACs associated with the SITE-ID
   in non-forwarding state.
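
   The PREF derivation of Table 1 and the tie-breaking rules of
   Section 3.3.3.3 can be sketched as follows. This is an illustrative
   reading of the rules, not a reference implementation. The names
   Adv, compute_pref and df_winner are hypothetical, DOM is modelled as
   a plain string, and PE-ID is assumed to be comparable as an unsigned
   integer (for example an IPv4 address converted to an integer).
   Handling of advertisements that Table 1 marks as malformed is left to
   the caller.

```python
from dataclasses import dataclass
from typing import Optional

MAX_VP = 2**16 - 1  # largest value that fits the 2-octet VPLS Preference


def compute_pref(vp: int, lp: int) -> int:
    """Derive PREF from VPLS Preference (VP) and Local Preference (LP)."""
    if vp == 0:
        # Backwards compatibility rows: PREF follows LP, capped at 2^16-1.
        # LP == 0 yields PREF 0 (malformed unless ACS=1, per Table 1).
        return min(lp, MAX_VP)
    # VP present: LP must mirror it, otherwise PREF is 0 (malformed).
    return vp if lp == vp else 0


@dataclass(frozen=True)
class Adv:
    dom: str      # VPLS domain (assumed representation)
    site_id: int  # VE-ID field of the NLRI
    acs: int      # 1 if all attachment circuits for the site are down
    pref: int     # result of compute_pref()
    pe_id: int    # assumed to be an integer form of the PE identifier


def df_winner(adv1: Adv, adv2: Adv) -> Optional[Adv]:
    """Apply rules 1-3 in order; None means both are from the same PE."""
    if (adv1.dom, adv1.site_id) != (adv2.dom, adv2.site_id):
        raise ValueError("advertisements are not from the same bucket")
    # Rule 1: an advertisement whose site is not down beats one that is.
    if adv1.acs != 1 and adv2.acs == 1:
        return adv1
    if adv1.acs == 1 and adv2.acs != 1:
        return adv2
    # Rule 2: higher PREF wins.
    if adv1.pref != adv2.pref:
        return adv1 if adv1.pref > adv2.pref else adv2
    # Rule 3: lower PE-ID wins.
    if adv1.pe_id != adv2.pe_id:
        return adv1 if adv1.pe_id < adv2.pe_id else adv2
    return None  # same PE: the caller retains both advertisements


if __name__ == "__main__":
    pe1 = Adv("vpls-a", 1, acs=0, pref=compute_pref(0, 200), pe_id=0x0A000001)
    pe2 = Adv("vpls-a", 1, acs=0, pref=compute_pref(100, 100), pe_id=0x0A000002)
    print(df_winner(pe1, pe2) is pe1)
```

   Because every PE applies the same ordered comparison to the same
   bucket, all PEs arrive at the same designated forwarder without any
   extra signaling.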

   The DF election result on the egress PEs can be used in traffic
   forwarding decisions. Figure 2 shows two customer sites, CE1 and
   CE4, connected to PE1 with CE1 multi-homed to PE1 and PE2. If PE1 is
   the designated forwarder for CE1, based on the DF election result,
   PE3 can choose to not send unknown unicast and multicast traffic to
   PE2 as PE2 is not the designated forwarder for any customer site and
   it has no other single-homed sites connected to it.

4. Multi-AS VPLS

   This section describes multi-homing in an inter-AS context.

4.1. Route Origin Extended Community

   Due to lack of information about the PEs that originate the VPLS
   NLRIs in inter-AS operations, the Route Origin Extended Community
   [RFC4360] is used to carry the source PE's IP address.

   To use the Route Origin Extended Community for carrying the
   originator VPLS PE's loopback address, the type field of the
   community MUST be set to 0x01 and the Global Administrator sub-field
   MUST be set to the PE's loopback IP address.

4.2. VPLS Preference

   When multiple PEs are assigned the same site ID for multi-homing, it
   is often desired to be able to control the selection of a particular
   PE as the designated forwarder. Section 3.5 in [RFC4761] describes
   the use of BGP Local Preference in path selection to choose a
   particular NLRI, where Local Preference indicates the degree of
   preference for a particular VE. The use of Local Preference is
   inadequate when VPLS PEs are spread across multiple ASes as Local
   Preference is not carried across AS boundaries. A new field, VPLS
   preference (VP), is introduced in this document that can be used to
   accomplish this. VPLS preference indicates a degree of preference
   for a particular customer site. VPLS preference is not mandatory for
   intra-AS operation; the algorithm explained in Section 3.3 will work
   with or without the presence of VPLS preference.

   Section 3.2.4 in [RFC4761] describes the Layer2 Info Extended
   Community that carries control information about the pseudowires.
   The last two octets, which were reserved, now carry the VPLS
   preference as shown in Figure 4.

   +------------------------------------+
   | Extended community type (2 octets) |
   +------------------------------------+
   | Encaps Type (1 octet)              |
   +------------------------------------+
   | Control Flags (1 octet)            |
   +------------------------------------+
   | Layer-2 MTU (2 octets)             |
   +------------------------------------+
   | VPLS Preference (2 octets)         |
   +------------------------------------+

                Figure 4: Layer2 Info Extended Community

   A VPLS preference is a 2-octet unsigned integer. A value of zero
   indicates absence of a VP and is not a valid preference value. This
   interpretation is required for backwards compatibility.
   Implementations using the Layer2 Info Extended Community as described
   in Section 3.2.4 of [RFC4761] MUST set the last two octets to zero
   since it was a reserved field.

   For backwards compatibility, if VPLS preference is used, then BGP
   Local Preference MUST be set to the value of VPLS preference. Note
   that a Local Preference value of zero for a CE-ID is not valid unless
   the 'D' bit in the control flags is set (see
   [I-D.kothari-l2vpn-auto-site-id]). In addition, a Local Preference
   value greater than or equal to 2^16 for VPLS advertisements is not
   valid.

4.3. Use of BGP attributes in Inter-AS Methods

   Section 3.4 in [RFC4761] and section 4 in [RFC6074] describe three
   methods (a, b and c) to connect sites in a VPLS to PEs that are
   across multiple ASes. Since VPLS advertisements in method (a) do not
   cross AS boundaries, multi-homing operations for method (a) remain
   exactly the same as they are within an AS. However, for methods (b)
   and (c), VPLS advertisements do cross AS boundaries.
   This section describes the VPLS operations for method (b) and method
   (c). Consider Figure 5 for inter-AS VPLS with multi-homed customer
   sites.

4.3.1. Inter-AS Method (b): EBGP Redistribution of VPLS Information
       between ASBRs

                    AS1                      AS2
                  ........                 ........
   CE2 _______   .        .               .        .
        ___ PE1           .               .         PE3 --- CE3
       /        :         .               .         :
    __/         :         :               :         :
   CE1 __       :         ASBR1 --- ASBR2           :
         \      :         :               :         :
          \___ PE2        .               .         PE4 ---- CE4
                 .        .               .        .
                  ........                 ........

                         Figure 5: Inter-AS VPLS

   A customer has four sites, CE1, CE2, CE3 and CE4. CE1 is multi-homed
   to PE1 and PE2 in AS1. CE2 is single-homed to PE1. CE3 and CE4 are
   also single-homed to PE3 and PE4 respectively in AS2. Assume that in
   addition to the base LDP/BGP VPLS addressing (VSI-IDs/VE-IDs), CE-ID
   1 is assigned for CE1. After running the DF election algorithm, all
   four VPLS PEs must elect the same designated forwarder for the CE1
   site. Since BGP Local Preference is not carried across AS
   boundaries, VPLS preference as described in Section 4.2 MUST be used
   for carrying site preference in inter-AS VPLS operations.

   For Inter-AS method (b), ASBR1 will send a VPLS NLRI received from
   PE1 to ASBR2 with itself as the BGP nexthop. ASBR2 will send the
   received NLRI from ASBR1 to PE3 and PE4 with itself as the BGP
   nexthop. Since VPLS PEs use BGP Local Preference in DF election, for
   backwards compatibility, ASBR2 MUST set the Local Preference value in
   the VPLS advertisements it sends to PE3 and PE4 to the VPLS
   preference value contained in the VPLS advertisement it receives from
   ASBR1. ASBR1 MUST do the same for the NLRIs it sends to PE1 and PE2.
   If ASBR1 receives a VPLS advertisement without a valid VPLS
   preference from a PE within its AS, then ASBR1 MUST set the VPLS
   preference in the advertisement to the Local Preference value before
   sending it to ASBR2.
Similarly, ASBR2 must do the same for 679 advertisements without VPLS Preference it receives from PEs within 680 its AS. Thus, in method (b), ASBRs MUST update the VPLS and Local 681 Preference based on the advertisements they receive either from an 682 ASBR or a PE within their AS. 684 In Figure 5, PE1 will send the VPLS advertisements with Route Origin 685 Extended Community containing its loopback address. PE2 will do the 686 same. Even though PE3 receives the VPLS advertisements for VE-ID 1 687 and 2 from the same BGP nexthop, ASBR2, the source PE address 688 contained in the Route Origin Extended Community is different for the 689 CE1 and CE2 advertisements, and thus, PE3 creates two PWs, one for 690 CE1 (for VE-ID 1) and another one for CE2 (for VE-ID 2). 692 4.3.2. Inter-AS Method (c): Multi-Hop EBGP Redistribution of VPLS 693 Information between ASes 695 In this method, there is a multi-hop E-BGP peering between the PEs or 696 Route Reflectors in AS1 and the PEs or Route Reflectors in AS2. 697 There is no VPLS state in either control or data plane on the ASBRs. 698 The multi-homing operations on the PEs in this method are exactly the 699 same as they are in intra-AS scenario. However, since Local 700 Preference is not carried across AS boundary, the translation of LP 701 to VP and vice versa MUST be done by RR, if RR is used to reflect 702 VPLS advertisements to other ASes. This is exactly the same as what 703 a ASBR does in case of method (b). A RR must set the VP to the LP 704 value in an advertisement before sending it to other ASes and must 705 set the LP to the VP value in an advertisement that it receives from 706 other ASes before sending to the PEs within the AS. 708 5. MAC Flush Operations 710 In a service provider VPLS network, customer MAC learning is confined 711 to PE devices and any intermediate nodes, such as a Route Reflector, 712 do not have any state for MAC addresses. 
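
As an illustration only (this memo does not define any data structure),
the MAC state that a PE holds can be pictured as a table recording the
remote PE from which each customer MAC address was learned. The sketch
below is a minimal Python model under that assumption; the class name
MacTable and its methods are hypothetical and are not taken from this
document or from any implementation. Keeping a reverse index per remote
PE makes "flush all the MAC addresses learned from a PE" a single
operation.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class MacTable:
    """Hypothetical per-VPLS MAC table on a PE (illustrative model only)."""

    def __init__(self) -> None:
        # MAC address -> remote PE it was learned from.
        self._mac_to_pe: Dict[str, str] = {}
        # Reverse index: remote PE -> set of MACs learned from it.
        self._pe_to_macs: Dict[str, Set[str]] = defaultdict(set)

    def learn(self, mac: str, remote_pe: str) -> None:
        """Record (or move) a MAC address as learned from remote_pe."""
        old_pe = self._mac_to_pe.get(mac)
        if old_pe is not None and old_pe != remote_pe:
            # The MAC moved: drop it from the old PE's reverse index.
            self._pe_to_macs[old_pe].discard(mac)
        self._mac_to_pe[mac] = remote_pe
        self._pe_to_macs[remote_pe].add(mac)

    def lookup(self, mac: str) -> Optional[str]:
        """Return the remote PE for a MAC, or None if it is unknown."""
        return self._mac_to_pe.get(mac)

    def flush_pe(self, remote_pe: str) -> int:
        """Remove every MAC learned from remote_pe; return how many."""
        macs = self._pe_to_macs.pop(remote_pe, set())
        for mac in macs:
            del self._mac_to_pe[mac]
        return len(macs)


# Usage: two MACs learned from PE2, one from PE4, then PE2 is flushed.
table = MacTable()
table.learn("00:aa:bb:cc:dd:01", "PE2")
table.learn("00:aa:bb:cc:dd:02", "PE2")
table.learn("00:aa:bb:cc:dd:03", "PE4")
removed = table.flush_pe("PE2")
```

After the flush, only the address learned from PE4 remains. Traffic to
the flushed addresses would then be treated as unknown unicast until
the addresses are learned again, rather than following stale state.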

714     Topology changes either in the service provider's network or in the
715     customer's network can result in the movement of MAC addresses from
716     one PE device to another. Such events can result in traffic being
717     dropped due to stale state of MAC addresses on the PE devices.
718     Age-out timers that clear the stale state will resume traffic
719     forwarding, but age-out timers are typically in minutes, and
720     convergence on the order of minutes can severely impact the
721     customer's service. To handle such events and expedite convergence
722     of traffic, flushing of affected MAC addresses is highly desirable.

724  5.1. MAC Flush Indicators

725     If the 'D' bit in the control flags is set in a received VE NLRI, the
726     receiving PE must flush all the MAC addresses learned from the PE
727     advertising the failure.

729     Anytime a designated forwarder change occurs, a remote PE must flush
730     all the MAC addresses it learned from the PE that lost the DF
731     election (old designated forwarder). If multiple customer sites are
732     connected to the same PE, such as PE1 in Figure 2, and redundancy
733     per site is desired when multi-homing procedures described in this
734     document are in effect, then it is desirable to flush just the
735     relevant MAC addresses from a particular site when the site
736     connectivity is lost. However, procedures for flushing a limited set
737     of MAC addresses are beyond the scope of this document. Use of
738     either the 'D' or the 'F' bit in the control flags only allows
739     flushing all MAC addresses associated with a PE.

741     A designated forwarder change can occur in the absence of failures,
742     such as when an attachment circuit comes up. Consider the case in
743     Figure 2 where the PE1-CE1 link is non-operational and PE2 is the
744     designated forwarder for CE1. Also assume that the Local Preference
745     of PE1 is higher than that of PE2. When the PE1-CE1 link becomes
746     operational, PE1 will send a BGP CE advertisement for CE1 to all its
747     peers. If PE3 performs the DF election before PE2, there is a chance
748     that PE3 might learn MAC addresses from PE2 after it was done
749     electing PE1. This can happen since PE2 has not yet processed the
750     BGP CE advertisement from PE1 and as a result continues to send
751     traffic to PE3. This can cause traffic from PE3 to CE1 to black-hole
752     until those MAC addresses are deleted due to age-out timers.
753     Therefore, to avoid such race conditions, a designated forwarder must
754     set the 'F' bit and a non-designated forwarder must clear the 'F' bit
755     when sending BGP CE advertisements. A state transition from one to
756     zero for the 'F' bit can be used by a remote PE to flush all the MACs
757     learned from the PE that is transitioning from designated forwarder
758     to non-designated forwarder.

760  5.2. Minimizing the effects of fast link transitions

762     Certain failure scenarios may result in fast transitions of the link
763     towards the multi-homed CE, which in turn will generate fast status
764     transitions of one or multiple multi-homed sites, reflected through
765     multiple BGP CE advertisements and LDP MAC Flush messages.

767     It is recommended that a timer to damp the link flaps be used for the
768     port towards the multi-homed CE to minimize the number of MAC Flush
769     events in the remote PEs and the occurrences of BGP state compression
770     for 'F' bit transitions. A timer value greater than the time it
771     takes BGP to converge in the network is recommended.

773  6. Backwards Compatibility

775     No forwarding loops are formed when PEs or Route Reflectors that do
776     not support procedures defined in this section coexist in the
777     network with PEs or Route Reflectors that do support them.

779  6.1. BGP based VPLS

781     As explained in this section, PEs multi-homed to the same customer
782     site MUST assign the same CE-ID, and the related NLRI SHOULD contain
783     the block offset, block size and label base as zero. Remote PEs that
784     lack support for the multi-homing operations specified in this
785     document will fail to create any PWs for the multi-homed CE-IDs due
786     to the label value of zero, and thus the multi-homing NLRI should
787     have no impact on the operation of remote PEs that lack such
788     support.

790     For compatibility with PEs that use multiple VE-IDs with non-zero
791     label block values for multi-homing operation, it is a requirement
792     that a PE receiving such advertisements must use the labels in the
793     NLRIs associated with the lowest VE-ID for PW creation. It is
794     possible that maintaining the PW association with the lowest VE-ID
795     can result in PW flap and thus traffic loss. However, it is
796     necessary to maintain the association of the PW with the lowest VE-ID
797     as it provides deterministic DF election among all the VPLS PEs.

799  6.2. LDP VPLS with BGP Auto-discovery

801     The BGP-AD NLRI has a prefix length of 12, containing only an 8-byte
802     RD and a 4-byte VSI-ID. If an LDP VPLS PE running BGP-AD lacks
803     support for the multi-homing operations specified in this document,
804     it SHOULD ignore a CE NLRI with a length field of 17. As a result,
805     it will not ask LDP to create any PWs for the multi-homed Site-ID,
806     and thus the multi-homing NLRI should have no impact on LDP VPLS
807     operation. MH PEs may use existing LDP MAC Flush to flush the remote
808     LDP VPLS PEs or may use the MAC Flush procedures as described in
809     Section 5.

811  7. Security Considerations

813     No new security issues are introduced beyond those that are described
814     in [RFC4761] and [RFC4762].

816  8. IANA Considerations

818     At this time, this memo includes no request to IANA.

820  9. Acknowledgments

822     The authors would like to thank Yakov Rekhter, Nischal Sheth, Mitali
823     Singh and Ian Cowburn for their insightful comments and probing
824     questions.

826  10. References

828  10.1. Normative References

830     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
831                Requirement Levels", BCP 14, RFC 2119, March 1997.

833     [RFC4761]  Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
834                (VPLS) Using BGP for Auto-Discovery and Signaling", RFC
835                4761, January 2007.

837     [RFC6074]  Rosen, E., Davie, B., Radoaca, V., and W. Luo,
838                "Provisioning, Auto-Discovery, and Signaling in Layer 2
839                Virtual Private Networks (L2VPNs)", RFC 6074, January
840                2011.

842  10.2. Informative References

844     [I-D.kothari-l2vpn-vpls-flush]
845                Kothari, B. and R. Fernando, "VPLS Flush in BGP-based
846                Virtual Private LAN Service", draft-kothari-l2vpn-vpls-
847                flush-00 (work in progress), October 2008.

849     [I-D.kothari-l2vpn-auto-site-id]
850                Kothari, B., Kompella, K., and T. IV, "Automatic
851                Generation of Site IDs for Virtual Private LAN Service",
852                draft-kothari-l2vpn-auto-site-id-01 (work in progress),
853                October 2008.

855     [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
856                Communities Attribute", RFC 4360, February 2006.

858     [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
859                Networks (VPNs)", RFC 4364, February 2006.

861     [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
862                Reflection: An Alternative to Full Mesh Internal BGP
863                (IBGP)", RFC 4456, April 2006.

865     [RFC4762]  Lasserre, M. and V. Kompella, "Virtual Private LAN Service
866                (VPLS) Using Label Distribution Protocol (LDP) Signaling",
867                RFC 4762, January 2007.

869     [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
870                Protocol 4 (BGP-4)", RFC 4271, January 2006.

872  Authors' Addresses

874     Bhupesh Kothari
875     Gainspeed
876     295 Santa Ana Court
877     Sunnyvale, CA 94085
878     US

880     Email: bhupesh@gainspeed.com

882     Kireeti Kompella
883     Juniper Networks
884     1194 N. Mathilda Ave.
885     Sunnyvale, CA 94089
886     US

888     Email: kireeti.kompella@gmail.com

890     Wim Henderickx
891     Alcatel-Lucent

893     Email: wim.henderickx@alcatel-lucent.be

895     Florin Balus
896     Alcatel-Lucent

898     Email: florin.balus@alcatel-lucent.com

900     James Uttaro
901     AT&T
902     200 S. Laurel Avenue
903     Middletown, NJ 07748
904     US

906     Email: uttaro@att.com

907     Senad Palislamovic
908     Alcatel-Lucent

910     Email: senad.palislamovic@alcatel-lucent.com

912     Wen Lin
913     Juniper Networks

915     Email: wlin@juniper.net