Internet Engineering Task Force                               B. Kothari
Internet Draft                                           Cohere Networks
Updates: 4761 (if approved)
Intended status: Standards Track                             K. Kompella
Expires: April 2013                                     Contrail Systems

                                                           W. Henderickx
                                                                F. Balus
                                                         S. Palislamovic
                                                          Alcatel-Lucent

                                                               J. Uttaro
                                                                    AT&T

                                                                  W. Lin
                                                        Juniper Networks

                                                        October 21, 2012

          BGP based Multi-homing in Virtual Private LAN Service
                draft-ietf-l2vpn-vpls-multihoming-04.txt

Abstract

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN). It is often
   required for the Service Provider (SP) to give the customer
   redundant connectivity to some sites, often called "multi-homing".
   This memo shows how BGP-based multi-homing can be offered in the
   context of LDP and BGP VPLS solutions.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on April 21, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction
      1.1. General Terminology
      1.2. Conventions used in this document
   2. Background
      2.1. Scenarios
      2.2. VPLS Multi-homing Considerations
   3. Multi-homing Operations
      3.1. Provisioning Model
      3.2. Multi-homing NLRI
      3.3. Designated Forwarder Election
         3.3.1. Attributes
         3.3.2. Variables Used
            3.3.2.1. RD
            3.3.2.2. MH-ID
            3.3.2.3. VBO
            3.3.2.4. DOM
            3.3.2.5. ACS
            3.3.2.6. PREF
            3.3.2.7. PE-ID
      3.4. VPLS DF Election on PEs
      3.5. Pseudowire binding and traffic forwarding rules
         3.5.1. Site-ID Binding Properties
         3.5.2. Standby Pseudowire Properties
   4. Multi-AS VPLS
      4.1. Route Origin Extended Community
      4.2. Preference
      4.3. Use of BGP-MH attributes in Inter-AS Methods
         4.3.1. Inter-AS Method (b): EBGP Redistribution of VPLS
                Information between ASBRs
         4.3.2. Inter-AS Method (c): Multi-Hop EBGP Redistribution of
                VPLS Information between ASes
   5. MAC Flush Operations
      5.1. MAC List Flush
      5.2. Implicit MAC Flush
      5.3. Minimizing the effects of fast link transitions
   6. Backwards Compatibility
      6.1. BGP based VPLS
      6.2. LDP VPLS with BGP Auto-discovery
   7. Security Considerations
   8. IANA Considerations
   9. Acknowledgments
   10. References
      10.1. Normative References
      10.2. Informative References

1. Introduction

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN). It is often
   required for a Service Provider (SP) to give the customer redundant
   connectivity to one or more sites, often called "multi-homing".
   [RFC4761] explains how VPLS can be offered using BGP for
   auto-discovery and signaling; section 3.5 of that document describes
   how multi-homing can be achieved in this context. [RFC6074] explains
   how VPLS can be offered using BGP for auto-discovery (BGP-AD), and
   [RFC4762] explains how VPLS can be offered using LDP for signaling.
   This document provides a BGP-based multi-homing solution applicable
   to both BGP and LDP VPLS technologies. Note that BGP MH can be used
   for LDP VPLS without the use of the BGP-AD solution.

   Section 2 lays out some of the scenarios for multi-homing, other
   ways that this can be achieved, and some of the expectations of
   BGP-based multi-homing. Section 3 defines the components of
   BGP-based multi-homing, and the procedures required to achieve
   this. Section 7 may someday discuss security considerations.

1.1. General Terminology

   Some general terminology is defined here; most is from [RFC4761],
   [RFC4762] or [RFC4364]. Terminology specific to this memo is
   introduced as needed in later sections.

   A "Customer Edge" (CE) device, typically located on customer
   premises, connects to a "Provider Edge" (PE) device, which is owned
   and operated by the SP. A "Provider" (P) device is also owned and
   operated by the SP, but has no direct customer connections. A "VPLS
   Edge" (VE) device is a PE that offers VPLS services.

   A VPLS domain represents a bridging domain per customer. A Route
   Target community as described in [RFC4360] is typically used to
   identify all the PE routers participating in a particular VPLS
   domain. A VPLS site is a grouping of ports on a PE that belong to
   the same VPLS domain. A Multi-homed (MH) site is uniquely
   identified by a MH site ID (MH-ID). Sites are referred to as local
   or remote depending on whether they are configured on the PE router
   in context or on one of the remote PE routers (network peers).

1.2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

2. Background

   This section describes various scenarios where multi-homing may be
   required, and the implications thereof. It also describes some of
   the singular properties of VPLS multi-homing, and what that means
   from both an operational point of view and an implementation point
   of view.
   There are other approaches for providing multi-homing, such as
   Spanning Tree Protocol; this document specifies the use of BGP for
   multi-homing. A comprehensive comparison among the approaches is
   outside the scope of this document.

2.1. Scenarios

   CE1 is a VPLS CE that is dual-homed to both PE1 and PE2 for
   redundant connectivity.

                     ...............
                    .               .   ___ CE2
            ___ PE1                  . /
           /       :                  PE3
        __/        :     Service       :
    CE1 __         :     Provider     PE4
          \        :                   : \___ CE3
           \___ PE2                  .
                    .               .
                     ...............

                        Figure 1: Scenario 1

   CE1 is a VPLS CE that is dual-homed to both PE1 and PE2 for
   redundant connectivity. However, CE4, which is also in the same
   VPLS domain, is single-homed to just PE1.

    CE4 -------      ...............
               \    .               .   ___ CE2
            ___ PE1                  . /
           /       :                  PE3
        __/        :     Service       :
    CE1 __         :     Provider     PE4
          \        :                   : \___ CE3
           \___ PE2                  .
                    .               .
                     ...............

                        Figure 2: Scenario 2

2.2. VPLS Multi-homing Considerations

   The first (perhaps obvious) fact about a multi-homed VPLS CE, such
   as CE1 in Figure 1, is that if CE1 is an Ethernet switch or bridge,
   a loop has been created in the customer VPLS. This is a dangerous
   situation for an Ethernet network, and the loop must be broken.
   Even if CE1 is a router, it will get duplicates every time a packet
   is flooded, which is clearly undesirable.

   The next is that (unlike the case of IP-based multi-homing) only one
   of PE1 and PE2 can be actively sending traffic, either towards CE1
   or into the SP cloud. That is to say, load balancing techniques
   will not work. Call the PE that is chosen to send traffic to/from
   CE1 the "designated forwarder". All other PEs MUST choose the same
   designated forwarder for a multi-homed site.

   In Figure 2, CE1 and CE4 must be dealt with independently, since CE1
   is dual-homed, but CE4 is not.

3. Multi-homing Operations

   This section describes procedures for electing a designated
   forwarder among the set of PEs that are multi-homed to a customer
   site. The procedures described in this section are applicable to
   BGP based VPLS, LDP based VPLS with BGP-AD, or a VPLS that contains
   a mix of both BGP and LDP signaled PWs.

3.1. Provisioning Model

   Figure 1 shows a customer site, CE1, multi-homed to two VPLS PEs,
   PE1 and PE2. In order for all VPLS PEs within the same VPLS domain
   to elect one of the multi-homed PEs as the designated forwarder, an
   indicator that the PEs are multi-homed to the same customer site is
   required. This is achieved by assigning the same multi-homed site
   ID (MH-ID) on PE1 and PE2 for CE1. When remote VPLS PEs receive
   NLRI advertisements from PE1 and PE2 for CE1, the two NLRI
   advertisements for CE1 are identified as candidates for designated
   forwarder selection because they carry the same MH-ID. Thus, the
   same MH-ID SHOULD be assigned on all VPLS PEs that are multi-homed
   to the same customer site. Note that MH-ID=0 is invalid, and a PE
   should discard such an advertisement.

3.2. Multi-homing NLRI

   Section 3.2.2 in [RFC4761] describes the encoding of the BGP VPLS
   NLRI with the following fields: VE-ID, VE block offset, VE block
   size and label base. While this NLRI MAY be used for multi-homing,
   a modified version of it, as detailed in this paragraph, is used for
   identifying the multi-homed customer sites. The VE-ID field in the
   NLRI is set to MH-ID; the VE block offset, VE block size and label
   base are set to zero. Thus, the NLRI contains 2 octets indicating
   the length, 8 octets for Route Distinguisher, 2 octets for MH-ID and
   7 octets with value zero.

   Figure 2 shows two customer sites, CE1 and CE4, connected to PE1
   with CE1 multi-homed to PE1 and PE2.
   CE4 does not require special addressing, being associated with the
   base VPLS instance identified by the VSI-ID for LDP VPLS and VE-ID
   for BGP VPLS. However, CE1, which is multi-homed to PE1 and PE2,
   requires configuration of MH-ID, and both PE1 and PE2 MUST be
   provisioned with the same MH-ID for CE1. As stated above, to ensure
   backward compatibility, it is valid to use BGP VPLS NLRI for
   multi-homing operations. As such, it is valid to have non-zero VE
   block offset, VE block size and label base in the VPLS NLRI for a
   multi-homed site.

3.3. Designated Forwarder Election

   BGP-based multi-homing for VPLS relies on standard BGP best path
   selection and VPLS DF election. The net result of doing both
   elections is that of electing a single designated forwarder (DF)
   among the set of PEs to which a customer site is multi-homed. All
   the PEs that are elected as non-designated forwarders MUST keep
   their attachment circuit to the multi-homed CE in blocked status (no
   forwarding).

   These election algorithms operate on VPLS advertisements, which
   include both the NLRI and attached BGP attributes. Given that the
   semantics of the BGP VPLS NLRI do not necessarily follow a standard
   IP Prefix form, a construct of advertisement (ADV) with the
   variables of interest is defined. The variables of interest are RD,
   MH-ID, VBO, DOM, ACS, PREF and PE-ID, giving the notation:

      ADV = <RD, MH-ID, VBO, DOM, ACS, PREF, PE-ID>

   The following sections describe two attributes needed for standard
   BGP best path selection and VPLS DF election, the variables derived
   from fields in VPLS advertisement ADV, and finally elaborate on the
   selection processes.

3.3.1. Attributes

   The procedures below refer to two attributes: the Route Origin
   community (see Section 4.1) and the L2-info community (see Section
   4.2). These attributes are required for inter-AS operation; for
   generality, the procedures below show how they are to be used.
   The procedures also outline how to handle the case that either or
   both are not present.

3.3.2. Variables Used

3.3.2.1. RD

   RD is simply set to the Route Distinguisher field in the NLRI part
   of ADV. The process of assigning Route Distinguisher values must
   guarantee their uniqueness per PE node. Therefore, two multi-homed
   PEs offering the same VPLS service to a common set of CEs MUST each
   allocate a different RD value for this site.

3.3.2.2. MH-ID

   MH-ID is simply set to the VE-ID field in the NLRI part of ADV. The
   same MH-ID MUST be assigned to all PEs that are connected to the
   same multi-homed site.

3.3.2.3. VBO

   VBO is simply set to the VE Block Offset field in the NLRI part of
   ADV. This field will typically be zero.

3.3.2.4. DOM

   This variable, indicating the VPLS domain to which ADV belongs, is
   derived by applying BGP policy to the Route Target extended
   communities in ADV. The details of how this is done are outside the
   scope of this document.

3.3.2.5. ACS

   ACS is the status of the attachment circuits for a given site of a
   VPLS. ACS = 1 if all attachment circuits for the site are down, and
   0 otherwise.

   For BGP-based Multi-homing, ADV MUST contain an L2-info extended
   community; within this community are control flags. One of these
   flags is the 'D' bit, described in [I-D.kothari-l2vpn-auto-site-id].
   ACS is set to the value of the 'D' bit in ADV.

3.3.2.6. PREF

   PREF is derived from the Local Preference (LP) attribute in ADV as
   well as the VPLS Preference field (VP) in the L2-info extended
   community. If the Local Preference attribute is missing, LP is set
   to 0; if the L2-info community is missing, VP is set to 0. The
   following table shows how PREF is computed from LP and VP.

   +---------+---------------+----------+------------------------------+
   | VP      | LP Value      | PREF     | Comment                      |
   | Value   |               | Value    |                              |
   +---------+---------------+----------+------------------------------+
   | 0       | 0             | 0        | malformed advertisement,     |
   |         |               |          | unless ACS=1                 |
   |         |               |          |                              |
   | 0       | 1 to (2^16-1) | LP       | backwards compatibility      |
   |         |               |          |                              |
   | 0       | 2^16 to       | (2^16-1) | backwards compatibility      |
   |         | (2^32-1)      |          |                              |
   |         |               |          |                              |
   | >0      | LP same as VP | VP       | Implementation supports VP   |
   |         |               |          |                              |
   | >0      | LP != VP      | 0        | malformed advertisement      |
   +---------+---------------+----------+------------------------------+

                                  Table 1

3.3.2.7. PE-ID

   If ADV contains a Route Origin (RO) community (see Section 4.1) with
   type 0x01, then PE-ID is set to the Global Administrator sub-field
   of the RO. Otherwise, if ADV has an ORIGINATOR_ID attribute, then
   PE-ID is set to the ORIGINATOR_ID. Otherwise, PE-ID is set to the
   BGP Identifier.

3.3.3. Election Procedures

   The election procedures described in this section apply equally to
   BGP VPLS and LDP VPLS. The subset of these procedures documented
   under standard BGP best path selection deals with general IP Prefix
   BGP route selection processing as defined in RFC 4271 and RFC 4364.
   A separate part of the algorithm, defined under VPLS DF election, is
   specific to designated forwarder election procedures performed on a
   per VPLS instance basis. Given that a VPLS advertisement is not a
   commonly used destination IP Prefix, a concept of bucketization is
   introduced. By bucketizing advertisements and running them through
   two different sets of procedures based on variables of interest, we
   are effectively adapting common sets of route selection rules to the
   VPLS environment. A distinction MUST NOT be made on whether the
   NLRI is a multi-homing NLRI or not.
   Note that this is a conceptual description of the process; an
   implementation MAY choose to realize this differently as long as the
   semantics are preserved.

3.3.3.1. Bucketization and BGP Best Path Selection

   From the advertisement

      ADV -> <RD, MH-ID, VBO, DOM, ACS, PREF, PE-ID>

   we select the variables of interest that satisfy the notion of "the
   same route" as it applies to BGP election. As such, advertisements
   with the exact same RD, MH-ID and VBO are candidates for BGP
   selection and are put into the same BGP election bucket.

      ADV -> <RD, MH-ID, VBO>

   A standard set of BGP path selection rules, as defined in RFC 4271
   and RFC 4364, is applied as the tie-breaking mechanism. These
   tie-breaking rules are applied to candidate advertisements by a BGP
   speaker responsible for processing and redistribution of BGP VPLS
   and MH NLRI. If there is no winner and both ADVs are from the same
   PE, BGP path selection should simply consider this an update.

3.3.3.2. Bucketization and VPLS DF Election

   An advertisement

      ADV -> <RD, MH-ID, VBO, DOM, ACS, PREF, PE-ID>

   is discarded if DOM is not of interest to the VPLS PE. Otherwise,
   ADV is put into the bucket for <MH-ID, DOM>. In other words, all
   advertisements for a particular VPLS domain that have the same MH-ID
   and common DOM are candidates for VPLS DF election. Tie-breaking
   rules for VPLS DF election are different from standard BGP best path
   selection. As outlined in Section 3.3, the main reason is that only
   a single VPLS PE can be the designated forwarder for a given site.
   Tie-breaking rules for VPLS DF election are applied to candidate
   advertisements by all VPLS PEs, and the actions taken by VPLS PEs
   based on the VPLS DF election result are described in Section 3.4.

   Given two advertisements ADV1 and ADV2 from a given bucket, first
   compute the variables needed for DF election:

      ADV1 -> <RD1, MH-ID1, VBO1, DOM1, ACS1, PREF1, PE-ID1>
      ADV2 -> <RD2, MH-ID2, VBO2, DOM2, ACS2, PREF2, PE-ID2>

   Note that MH-ID1 = MH-ID2 and DOM1 = DOM2, since ADV1 and ADV2 came
   from the same bucket.
   Then the following tie-breaking rules MUST be applied in the given
   order.

   1. if (ACS1 != 1) AND (ACS2 == 1) ADV1 wins; stop
      if (ACS1 == 1) AND (ACS2 != 1) ADV2 wins; stop
      else continue

   2. if (PREF1 > PREF2) ADV1 wins; stop;
      else if (PREF1 < PREF2) ADV2 wins; stop;
      else continue

   3. if (PE-ID1 < PE-ID2) ADV1 wins; stop;
      else if (PE-ID1 > PE-ID2) ADV2 wins; stop;
      else ADV1 and ADV2 are from the same VPLS PE

   If there is no winner and ADV1 and ADV2 are from the same PE, a VPLS
   PE MUST retain both ADV1 and ADV2.

3.4. VPLS DF Election on PEs

   The DF election algorithm MUST be run by all multi-homed VPLS PEs.
   In addition, all other PEs SHOULD also run the DF election
   algorithm. As a result of the DF election, multi-homed PEs that
   lose the DF election for a MH-ID MUST put the ACs associated with
   the MH-ID in non-forwarding state. The DF election result on the
   egress PEs can be used in traffic forwarding decisions.

   Figure 2 shows two customer sites, CE1 and CE4, connected to PE1
   with CE1 multi-homed to PE1 and PE2. If PE1 is the designated
   forwarder for CE1, based on the DF election result, PE3 can choose
   not to send unknown unicast and multicast traffic to PE2, as PE2 is
   not the designated forwarder for any customer site and it has no
   other single-homed sites connected to it.

3.5. Pseudowire binding and traffic forwarding rules

3.5.1. Site-ID Binding Properties

   For the use case where a single PE provides connectivity to a set of
   CEs of which some are multi-homed and others are not, only a single
   pseudowire MAY be established. For example, if PE1 provides VPLS
   service to CE1 and CE4, which are both part of the same VPLS domain
   but different sites, and CE1 is multi-homed but CE4 is not (as
   described in Figure 2), PE3 would establish only a single pseudowire
   toward PE1.
   A design needs to ensure that, regardless of whether PE1 is DF or
   non-DF for multi-homed CE1, PE3's access to CE4 is established.
   Since label allocation and pseudowire establishment are tied to
   site-ID, we need to ensure that proper pseudowire bindings are
   established.

   For a set of given advertisements with a common DOM but with
   different Site-ID values, a VPLS PE speaker SHOULD instantiate and
   bind the pseudowire based on the advertisement with the lowest
   Site-ID value. Otherwise, binding would be completely random, and
   during DF changes for a multi-homed site, a non-multi-homed CE might
   suffer traffic loss.

3.5.2. Standby Pseudowire Properties

   As convergence is addressed in the transport plane by use of RSVP
   FRR and LDP LFA, a similar solution is required at the service
   level as well. This is most evident in large-scale deployments,
   where convergence takes a long time: the ingress PE usually has to
   handle multiple relatively large tasks and re-signal all pseudowires
   affected by an egress PE or AC failure. Therefore, an
   implementation MAY choose to optimize convergence by pre-signaling
   a second, standby pseudowire toward the non-DF end point for every
   active VPLS in 1:1 fashion. This greatly improves convergence
   times. However, details of such an implementation are still under
   research.

4. Multi-AS VPLS

   This section describes multi-homing in an inter-AS context.

4.1. Route Origin Extended Community

   Due to the lack of information about the PEs that originate the VPLS
   NLRIs in inter-AS operations, the Route Origin Extended Community
   [RFC4360] is used to carry the source PE's IP address.
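
   As an illustration only, the following Python sketch packs a PE
   loopback into an IPv4-address-specific Route Origin community and
   then derives PE-ID using the fallback order of Section 3.3.2.7.
   The sub-type value 0x03 (taken here to be the Route Origin sub-type
   of [RFC4360]), the zero Local Administrator value, and all function
   and parameter names are assumptions made for the example; they are
   not defined by this document.

```python
import ipaddress
import struct

TYPE_IPV4_SPECIFIC = 0x01    # type value required by this document
SUBTYPE_ROUTE_ORIGIN = 0x03  # assumption: Route Origin sub-type (RFC 4360)


def pack_route_origin(loopback: str, local_admin: int = 0) -> bytes:
    """Encode an 8-octet Route Origin community carrying a PE loopback."""
    global_admin = int(ipaddress.IPv4Address(loopback))
    return struct.pack("!BBIH", TYPE_IPV4_SPECIFIC, SUBTYPE_ROUTE_ORIGIN,
                       global_admin, local_admin)


def derive_pe_id(route_origin, originator_id, bgp_identifier):
    """Pick PE-ID: RO Global Administrator, else ORIGINATOR_ID,
    else the BGP Identifier (hypothetical helper)."""
    if route_origin is not None:
        type_hi, subtype, global_admin, _local = struct.unpack(
            "!BBIH", route_origin)
        if (type_hi == TYPE_IPV4_SPECIFIC
                and subtype == SUBTYPE_ROUTE_ORIGIN):
            return str(ipaddress.IPv4Address(global_admin))
    if originator_id is not None:
        return originator_id
    return bgp_identifier
```

   With this sketch, an advertisement that carries a Route Origin
   community yields the originating PE's loopback as PE-ID even when
   an ASBR or Route Reflector has rewritten the BGP next hop.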

   To use the Route Origin Extended Community for carrying the
   originating VPLS PE's loopback address, the type field of the
   community MUST be set to 0x01 and the Global Administrator sub-field
   MUST be set to the PE's loopback IP address.

4.2. Preference

   When multiple PEs are assigned the same site ID for multi-homing, it
   is often desired to be able to control the selection of a particular
   PE as the designated forwarder. Section 3.5 in [RFC4761] describes
   the use of BGP Local Preference in path selection to choose a
   particular NLRI, where Local Preference indicates the degree of
   preference for a particular VE. The use of Local Preference is
   inadequate when VPLS PEs are spread across multiple ASes, as Local
   Preference is not carried across an AS boundary. A new field, VPLS
   preference (VP), is introduced in this document that can be used to
   accomplish this. VPLS preference indicates a degree of preference
   for a particular customer site. VPLS preference is not mandatory
   for intra-AS operation; the algorithm explained in Section 3.3 will
   work with or without the presence of VPLS preference.

   Section 3.2.4 in [RFC4761] describes the Layer2 Info Extended
   Community that carries control information about the pseudowires.
   The last two octets, which were reserved, now carry the VPLS
   preference as shown in Figure 3.

      +------------------------------------+
      | Extended community type (2 octets) |
      +------------------------------------+
      |  Encaps Type (1 octet)             |
      +------------------------------------+
      |  Control Flags (1 octet)           |
      +------------------------------------+
      |  Layer-2 MTU (2 octets)            |
      +------------------------------------+
      |  VPLS Preference (2 octets)        |
      +------------------------------------+

              Figure 3: Layer2 Info Extended Community

   A VPLS preference is a 2-octet unsigned integer.
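
   The layout in Figure 3 and the PREF derivation of Table 1 can be
   sketched in Python as follows. This is a minimal illustration, not
   a normative encoding: the community type 0x800A and encapsulation
   type 19 are assumed from [RFC4761], and the function names are
   hypothetical.

```python
import struct

L2_INFO_TYPE = 0x800A  # assumption: Layer2 Info community type (RFC 4761)
ENCAPS_VPLS = 19       # assumption: VPLS encapsulation type (RFC 4761)


def pack_l2_info(control_flags: int, mtu: int, vpls_pref: int) -> bytes:
    """Encode the 8-octet community in the field order of Figure 3."""
    if not 0 <= vpls_pref <= 0xFFFF:
        raise ValueError("VPLS preference is a 2-octet unsigned integer")
    return struct.pack("!HBBHH", L2_INFO_TYPE, ENCAPS_VPLS,
                       control_flags, mtu, vpls_pref)


def unpack_vpls_pref(community: bytes) -> int:
    """Return VP from the last two octets; 0 means VP is absent."""
    _type, _encaps, _flags, _mtu, vp = struct.unpack("!HBBHH", community)
    return vp


def compute_pref(lp: int, vp: int) -> int:
    """Derive PREF from LP and VP as laid out in Table 1."""
    if vp == 0:
        # No VP: fall back to LP, capped at 2^16-1.
        return min(lp, 0xFFFF)
    # VP present: LP must mirror it, otherwise the ADV is malformed.
    return vp if lp == vp else 0
```

   For example, compute_pref(lp=70000, vp=0) gives 65535, matching the
   third row of Table 1.
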
   A value of zero indicates absence of a VP and is not a valid
   preference value. This interpretation is required for backwards
   compatibility. Implementations using the Layer2 Info Extended
   Community as described in Section 3.2.4 of [RFC4761] MUST set the
   last two octets to zero, since it was a reserved field.

   For backwards compatibility, if VPLS preference is used, then BGP
   Local Preference MUST be set to the value of VPLS preference. Note
   that a Local Preference value of zero for a MH-ID is not valid
   unless the 'D' bit in the control flags is set (see
   [I-D.kothari-l2vpn-auto-site-id]). In addition, a Local Preference
   value greater than or equal to 2^16 for VPLS advertisements is not
   valid.

4.3. Use of BGP-MH attributes in Inter-AS Methods

   Section 3.4 in [RFC4761] and section 4 in [RFC6074] describe three
   methods (a, b and c) to connect sites in a VPLS to PEs that are
   spread across multiple ASes. Since VPLS advertisements in method
   (a) do not cross AS boundaries, multi-homing operations for method
   (a) remain exactly the same as they are within an AS. However, for
   methods (b) and (c), VPLS advertisements do cross AS boundaries.
   This section describes the VPLS operations for method (b) and
   method (c). Consider Figure 4 for inter-AS VPLS with multi-homed
   customer sites.

4.3.1. Inter-AS Method (b): EBGP Redistribution of VPLS Information
       between ASBRs

                       AS1                   AS2
                     ........              ........
      CE2 _______   .        .            .        .
             ___ PE1          .          .          PE3 --- CE3
            /       :          .        .           :
         __/        :           :      :            :
     CE1 __         :        ASBR1 --- ASBR2        :
           \        :           :      :            :
            \___ PE2          .          .          PE4 ---- CE4
                    .        .            .        .
                     ........              ........

                        Figure 4: Inter-AS VPLS

   A customer has four sites, CE1, CE2, CE3 and CE4. CE1 is
   multi-homed to PE1 and PE2 in AS1. CE2 is single-homed to PE1. CE3
   and CE4 are also single-homed, to PE3 and PE4 respectively, in AS2.
626 Assume that in addition to the base LDP/BGP VPLS addressing (VSI- 627 IDs/VE-IDs), MH ID 1 is assigned for CE1. After running DF election 628 algorithm, all four VPLS PEs must elect the same designated 629 forwarder for CE1 site. Since BGP Local Preference is not carried 630 across AS boundary, VPLS preference as described in Section 4.2 MUST 631 be used for carrying site preference in inter-AS VPLS operations. 633 For Inter-AS method (b) ASBR1 will send a VPLS NLRI received from 634 PE1 to ASBR2 with itself as the BGP nexthop. ASBR2 will send the 635 received NLRI from ASBR1 to PE3 and PE4 with itself as the BGP 636 nexthop. Since VPLS PEs use BGP Local Preference in DF election, 637 for backwards compatibility, ASBR2 MUST set the Local Preference 638 value in the VPLS advertisements it sends to PE3 and PE4 to the VPLS 639 preference value contained in the VPLS advertisement it receives 640 from ASBR1. ASBR1 MUST do the same for the NLRIs it sends to PE1 641 and PE2. If ASBR1 receives a VPLS advertisement without a valid 642 VPLS preference from a PE within its AS, then ASBR1 MUST set the 643 VPLS preference in the advertisements to the Local Preference value 644 before sending it to ASBR2. Similarly, ASBR2 must do the same for 645 advertisements without VPLS Preference it receives from PEs within 646 its AS. Thus, in method (b), ASBRs MUST update the VPLS and Local 647 Preference based on the advertisements they receive either from an 648 ASBR or a PE within their AS. 650 In Figure 4, PE1 will send the VPLS advertisements with Route Origin 651 Extended Community containing its loopback address. PE2 will do the 652 same. Even though PE3 receives the VPLS advertisements for VE-ID 1 653 and 2 from the same BGP nexthop, ASBR2, the source PE address 654 contained in the Route Origin Extended Community is different for 655 the CE1 and CE2 advertisements, and thus, PE3 creates two PWs, one 656 for CE1 (for VE-ID 1) and another one for CE2 (for VE-ID 2). 658 4.3.2. 
Inter-AS Method (c): Multi-Hop EBGP Redistribution of VPLS 659 Information between ASes 661 In this method, there is a multi-hop EBGP peering between the PEs 662 or Route Reflectors in AS1 and the PEs or Route Reflectors in AS2. 663 There is no VPLS state in either control or data plane on the ASBRs. 664 The multi-homing operations on the PEs in this method are exactly 665 the same as they are in the intra-AS scenario. However, since Local 666 Preference is not carried across AS boundaries, the translation of LP 667 to VP and vice versa MUST be done by the RR, if an RR is used to reflect 668 VPLS advertisements to other ASes. This is exactly the same as what 669 an ASBR does in case of method (b). An RR must set the VP to the LP 670 value in an advertisement before sending it to other ASes and must 671 set the LP to the VP value in an advertisement that it receives from 672 other ASes before sending it to the PEs within the AS. 674 5. MAC Flush Operations 676 In a service provider VPLS network, customer MAC learning is 677 confined to PE devices and any intermediate nodes, such as a Route 678 Reflector, do not have any state for MAC addresses. 680 Topology changes either in the service provider's network or in 681 the customer's network can result in the movement of MAC addresses from 682 one PE device to another. Such events can result in traffic being 683 dropped due to stale state of MAC addresses on the PE devices. Age 684 out timers that clear the stale state will resume the traffic 685 forwarding, but age out timers are typically on the order of minutes, and 686 convergence on the order of minutes can severely impact the customer's 687 service. To handle such events and expedite convergence of traffic, 688 flushing of affected MAC addresses is highly desirable. 690 This section describes the scenarios where VPLS flush is desirable 691 and the specific VPLS Flush TLVs that provide the capability to flush 692 the affected MAC addresses on the PE devices.
All operations 693 described in this section are in the context of a particular VPLS domain 694 and not across multiple VPLS domains. Mechanisms for MAC flush are 695 described in [I-D.kothari-l2vpn-vpls-flush] for BGP based VPLS and 696 in [RFC4762] for LDP based VPLS. 698 5.1. MAC List Flush 700 If multiple customer sites are connected to the same PE, PE1 as 701 shown in Figure 2, and redundancy per site is desired when multi- 702 homing procedures described in this document are in effect, then it 703 is desirable to flush just the relevant MAC addresses from a 704 particular site when the site connectivity is lost. 706 To flush a particular set of MAC addresses, a PE SHOULD originate a 707 flush message with a MAC list that contains the MAC addresses 708 that need to be flushed. In Figure 2, if connectivity between CE1 709 and PE1 goes down and if PE1 was the designated forwarder for CE1, 710 PE1 MAY send a list of MAC addresses that belong to CE1 to all its 711 BGP peers. 713 It is RECOMMENDED that in case of excessive link flap of a customer 714 attachment circuit in a short duration, a PE should have a means to 715 throttle advertisements of flush messages so that excessive flooding 716 of such advertisements does not occur. 718 5.2. Implicit MAC Flush 720 Implicit MAC Flush refers to the use of BGP MH advertisements by the 721 PEs to flush the MAC addresses learned from the previous designated 722 forwarder. 724 In case of a failure, when connectivity to a customer site is lost, 725 remote PEs learn that a particular site is no longer reachable. The 726 local PE either withdraws the VPLS NLRI that it previously 727 advertised for the site or it sends a BGP update message for the 728 site's VPLS NLRI with the 'D' bit set. In such cases, the remote 729 PEs can flush all the MACs that were learned from the PE that 730 reported the failure.
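The failure case described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the specification: the MacTable class, the on_mh_update function and the way an update is represented (a withdrawn flag and a d_bit flag) are hypothetical names introduced here, and a real PE would tie this logic into its BGP and forwarding planes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MacTable:
    """Hypothetical per-VPLS MAC table on a remote PE (MAC -> PE it was learned from)."""
    entries: Dict[str, str] = field(default_factory=dict)

    def learn(self, mac: str, pe: str) -> None:
        # Record (or move) a MAC address behind the given PE.
        self.entries[mac] = pe

    def flush_from_pe(self, pe: str) -> int:
        """Remove every MAC learned from `pe`; return the number removed."""
        stale = [mac for mac, src in self.entries.items() if src == pe]
        for mac in stale:
            del self.entries[mac]
        return len(stale)


def on_mh_update(table: MacTable, pe: str, withdrawn: bool, d_bit: bool) -> int:
    """Flush MACs learned from `pe` if it withdrew the site's NLRI or set the 'D' bit.

    Assumption: the caller has already matched the update to a multi-homed site.
    """
    if withdrawn or d_bit:
        return table.flush_from_pe(pe)
    return 0
```

For example, after learning one MAC address from PE1 and one from PE2, a call to on_mh_update for PE1 with d_bit set removes only the entry learned from PE1 and leaves the PE2 entry in place.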
732 However, in cases when a designated forwarder change occurs in the 733 absence of failures, such as when an attachment circuit comes up, 734 the BGP MH advertisement from the PE reporting the change is not 735 sufficient for MAC flush procedures. Consider the case in Figure 2 736 where the PE1-CE1 link is non-operational and PE2 is the designated 737 forwarder for CE1. Also assume that the Local Preference of PE1 is 738 higher than that of PE2. When the PE1-CE1 link becomes operational, PE1 will 739 send a BGP MH advertisement to all its peers. If PE3 elects PE1 as 740 the new designated forwarder for CE1 and as a result flushes all the 741 MACs learned from PE1 before PE2 elects itself as the non-designated 742 forwarder, there is a chance that PE3 might learn MAC addresses from 743 PE2 and as a result may black-hole traffic until those MAC addresses 744 are deleted due to age out timers. 746 A new flag 'F' is introduced in the Control Flags Bit Vector as a 747 deterministic way to indicate when to flush. 749 Control Flags Bit Vector 751 0 1 2 3 4 5 6 7 752 +-+-+-+-+-+-+-+-+ 753 |D|A|F|Z|Z|Z|C|S| (Z = MUST Be Zero) 754 +-+-+-+-+-+-+-+-+ 756 Figure 5 758 A designated forwarder must set the F bit and a non-designated 759 forwarder must clear the F bit when sending BGP MH advertisements. 760 A state transition from one to zero for the F bit can be used by a 761 remote PE to flush all the MACs learned from the PE that is 762 transitioning from designated forwarder to non-designated forwarder. 764 5.3. Minimizing the effects of fast link transitions 766 Certain failure scenarios may result in fast transitions of the link 767 towards the multi-homing CE which in turn will generate fast status 768 transitions of one or multiple multi-homed sites reflected through 769 multiple BGP MH advertisements and LDP MAC Flush messages.
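As an illustration of the Control Flags Bit Vector in Figure 5 and the F bit rule in Section 5.2, the following Python sketch decodes the octet and detects the one-to-zero F bit transition that a remote PE can use as a flush trigger. It is a sketch under stated assumptions: bit 0 in Figure 5 is taken to be the most significant bit of the octet (the usual convention in IETF bit diagrams), and the function names decode_control_flags and should_flush are hypothetical.

```python
# Bit masks, assuming bit 0 of Figure 5 is the most significant bit of the octet.
D_BIT = 0x80
A_BIT = 0x40
F_BIT = 0x20
C_BIT = 0x02
S_BIT = 0x01
MBZ_MASK = 0x1C  # the three 'Z' positions (bits 3, 4 and 5), which must be zero


def decode_control_flags(octet: int) -> dict:
    """Split a Control Flags octet into named booleans (hypothetical helper)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("control flags must fit in one octet")
    return {
        "D": bool(octet & D_BIT),
        "A": bool(octet & A_BIT),
        "F": bool(octet & F_BIT),
        "C": bool(octet & C_BIT),
        "S": bool(octet & S_BIT),
        "mbz_ok": (octet & MBZ_MASK) == 0,
    }


def should_flush(prev_octet: int, new_octet: int) -> bool:
    """True when the advertising PE's F bit goes from one to zero.

    That transition signals a move from designated forwarder to
    non-designated forwarder, so MACs learned from that PE can be flushed.
    """
    return bool(prev_octet & F_BIT) and not (new_octet & F_BIT)
```

Keying the flush on the F bit transition of the old designated forwarder, rather than on the advertisement of the new one, matches the deterministic behavior that Section 5.2 motivates.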
771 It is recommended that a timer to damp the link flaps be used for 772 the port towards the multi-homed CE to minimize the number of MAC 773 Flush events in the remote PEs and the occurrences of BGP state 774 compressions for F bit transitions. A timer value greater than the 775 time it takes BGP to converge in the network is recommended. 777 6. Backwards Compatibility 779 No forwarding loops are formed when PEs or Route Reflectors that do 780 not support the procedures defined in this section coexist in the 781 network with PEs or Route Reflectors that do support them. 783 6.1. BGP based VPLS 785 As explained in this section, multi-homed PEs to the same customer 786 site MUST assign the same MH-ID and the related NLRI SHOULD contain the 787 block offset, block size and label base as zero. Remote PEs that 788 lack support of multi-homing operations specified in this document 789 will fail to create any PWs for the multi-homed MH-IDs due to the 790 label value of zero and thus, the multi-homing NLRI should have no 791 impact on the operation of Remote PEs that lack support of multi- 792 homing operations specified in this document. 794 6.2. LDP VPLS with BGP Auto-discovery 796 The BGP-AD NLRI has a prefix length of 12 containing only an 8-byte 797 RD and a 4-byte VSI-ID. If an LDP VPLS PE running BGP AD lacks 798 support of multi-homing operations specified in this document, it 799 SHOULD ignore a MH NLRI with the length field of 17. As a result it 800 will not ask LDP to create any PWs for the multi-homed Site-ID and 801 thus, the multi-homing NLRI should have no impact on LDP VPLS 802 operation. MH PEs may use existing LDP MAC Flush to flush the 803 remote LDP VPLS PEs or may use the implicit MAC Flush procedure. 805 7. Security Considerations 807 No new security issues are introduced beyond those that are 808 described in [RFC4761] and [RFC4762]. 810 8. IANA Considerations 812 At this time, this memo includes no request to IANA. 814 9.
Acknowledgments 816 The authors would like to thank Ian Cowburn, Yakov Rekhter, Nischal 817 Sheth, and Mitali Singh for their insightful comments and probing 818 questions. 820 This document was prepared using 2-Word-v2.0.template.dot. 822 10. References 824 10.1. Normative References 826 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 827 Requirement Levels", BCP 14, RFC 2119, March 1997. 829 [RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN 830 Service (VPLS) Using BGP for Auto-Discovery and 831 Signaling", RFC 4761, January 2007. 833 [RFC6074] Rosen, E., "Provisioning, Autodiscovery, and Signaling 834 in L2VPNs", RFC 6074, January 2011. 836 10.2. Informative References 838 [I-D.kothari-l2vpn-vpls-flush] 839 Kothari, B. and R. Fernando, "VPLS Flush in BGP-based 840 Virtual Private LAN Service", 841 draft-kothari-l2vpn-vpls-flush-00 (work in progress), 842 October 2008. 844 [I-D.kothari-l2vpn-auto-site-id] 845 Kothari, B., Kompella, K., and T. IV, "Automatic 846 Generation of Site IDs for Virtual Private LAN Service", 847 draft-kothari-l2vpn-auto-site-id-01 (work in progress), 848 October 2008. 850 [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended 851 Communities Attribute", RFC 4360, February 2006. 853 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 854 Networks (VPNs)", RFC 4364, February 2006. 856 [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route 857 Reflection: An Alternative to Full Mesh Internal BGP 858 (IBGP)", RFC 4456, April 2006. 860 [RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN 861 Service (VPLS) Using Label Distribution Protocol (LDP) 862 Signaling", RFC 4762, January 2007. 864 [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway 865 Protocol 4 (BGP-4)", RFC 4271, January 2006. 
867 Authors' Addresses 869 Bhupesh Kothari 870 Cohere Networks 871 Email: bhupesh@cohere.net 873 Kireeti Kompella 874 Contrail Systems 875 Email: kireeti.kompella@gmail.com 877 Wim Henderickx 878 Alcatel-Lucent 879 Email: wim.henderickx@alcatel-lucent.be 881 Florin Balus 882 Alcatel-Lucent 883 Email: florin.balus@alcatel-lucent.com 885 Senad Palislamovic 886 Alcatel-Lucent 887 Email: senad.palislamovic@alcatel-lucent.com 889 James Uttaro 890 AT&T 891 200 S. Laurel Avenue 892 Middletown, NJ 07748, US 893 Email: uttaro@att.com 895 Wen Lin 896 Juniper Networks 897 Email: wlin@juniper.net