Network Working Group                                         B. Kothari
Internet-Draft                                          Augtera Networks
Updates: 4761 (if approved)                                  K. Kompella
Intended status: Standards Track                        Juniper Networks
Expires: January 24, 2020                                  W. Henderickx
                                                                   Nokia
                                                                F. Balus
                                                                   Cisco
                                                               J. Uttaro
                                                                    AT&T
                                                           July 23, 2019

         BGP based Multi-homing in Virtual Private LAN Service
                draft-ietf-bess-vpls-multihoming-04.txt

Abstract

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN). It is often
   required for the Service Provider (SP) to give the customer redundant
   connectivity to some sites, often called "multi-homing". This memo
   shows how BGP-based multi-homing can be offered in the context of LDP
   and BGP VPLS solutions. This document updates [RFC4761] by defining
   new flags in the Control Flags field of the Layer2 Info Extended
   Community.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 24, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  General Terminology
     1.2.  Conventions
   2.  Background
     2.1.  Scenarios
     2.2.  VPLS Multi-homing Considerations
   3.  Multi-homing Operation
     3.1.  Customer Edge (CE) NLRI
     3.2.  Deployment Considerations
     3.3.  Designated Forwarder Election
       3.3.1.  Attributes
       3.3.2.  Variables Used
       3.3.3.  Election Procedures
     3.4.  DF Election on PEs
     3.5.  Pseudowire and Site-ID Binding Properties
   4.  Multi-AS VPLS
     4.1.  Route Origin Extended Community
     4.2.  VPLS Preference
     4.3.  Use of BGP attributes in Inter-AS Methods
       4.3.1.  Inter-AS Method (b): EBGP Redistribution of VPLS
               Information between ASBRs
       4.3.2.  Inter-AS Method (c): Multi-Hop EBGP Redistribution of
               VPLS Information between ASes
   5.  MAC Flush Operations
     5.1.  MAC Flush Indicators
     5.2.  Minimizing the effects of fast link transitions
   6.  Backwards Compatibility
     6.1.  BGP based VPLS
     6.2.  LDP VPLS with BGP Auto-discovery
   7.  Security Considerations
   8.  IANA Considerations
   9.  Contributing Authors
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN). It is often
   required for a Service Provider (SP) to give the customer redundant
   connectivity to one or more sites, often called "multi-homing".
   [RFC4761] explains how VPLS can be offered using BGP for
   auto-discovery and signaling; section 3.5 of that document describes
   how multi-homing can be achieved in this context. [RFC6074] explains
   how VPLS can be offered using BGP for auto-discovery (BGP-AD) and
   [RFC4762] explains how VPLS can be offered using LDP for signaling.
   This document provides a BGP-based multi-homing solution applicable
   to both BGP and LDP VPLS technologies.
   Note that BGP-based multi-homing can be used for LDP VPLS without the
   use of the BGP-AD solution.

   Section 2 lays out some of the scenarios for multi-homing, other ways
   that this can be achieved, and some of the expectations of BGP-based
   multi-homing. Section 3 defines the components of BGP-based
   multi-homing, and the procedures required to achieve this.

1.1.  General Terminology

   Some general terminology is defined here; most is from [RFC4761],
   [RFC4762] or [RFC4364]. Terminology specific to this memo is
   introduced as needed in later sections.

   A "Customer Edge" (CE) device, typically located on customer
   premises, connects to a "Provider Edge" (PE) device, which is owned
   and operated by the SP. A "Provider" (P) device is also owned and
   operated by the SP, but has no direct customer connections. A "VPLS
   Edge" (VE) device is a PE that offers VPLS services.

   A VPLS domain represents a bridging domain per customer. A Route
   Target community as described in [RFC4360] is typically used to
   identify all the PE routers participating in a particular VPLS
   domain. A VPLS site is a grouping of ports on a PE that belong to
   the same VPLS domain. The terms "VPLS instance" and "VPLS domain"
   are used interchangeably in this document.

   If the CE devices that connect to a VPLS site's ports have
   connectivity to any other PE device, then the VPLS site is called a
   multi-homed VPLS site. Otherwise, it is called a single-homed VPLS
   site. The ports are partitioned between VPLS sites such that each
   port is in no more than one VPLS site. The terms "VPLS site" and "CE
   site" are used interchangeably in this document.

   A BGP VPLS NLRI for the base VPLS instance that has a non-zero VE
   block offset, VE block size and label base is called a VE NLRI in
   this document. Each VPLS instance is uniquely identified by a VE-ID.
   The VE-ID is carried in the BGP VPLS NLRI as specified in section
   3.2.2 in [RFC4761].

   A VPLS NLRI with value zero for the VE block offset, VE block size
   and label base is called a CE NLRI in this document. Section 3.1
   defines the CE NLRI and provides more detail.

   A Multi-homed (MH) site is uniquely identified by a CE-ID. Sites are
   referred to as local or remote depending on whether they are
   configured on the PE router in context or on one of the remote PE
   routers (network peers). A single-homed site can also be assigned a
   CE-ID, but it is not mandatory to configure a CE-ID for single-homed
   sites. Section 3.1 provides detail on the CE-ID.

1.2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Background

   This section describes various scenarios where multi-homing may be
   required, and the implications thereof. It also describes some of
   the singular properties of VPLS multi-homing, and what that means
   from both an operational point of view and an implementation point of
   view. There are other approaches for providing multi-homing, such as
   Spanning Tree Protocol; this document specifies the use of BGP for
   multi-homing. A comprehensive comparison among the approaches is
   outside the scope of this document.

2.1.  Scenarios

   In Figure 1, CE1 is a VPLS CE that is dual-homed to both PE1 and PE2
   for redundant connectivity.

                  ...............
                 .               .    ___ CE2
          ___ PE1                 .  /
         /      :                  PE3
      __/      :       Service      :
   CE1 __      :       Provider     PE4
        \       :                   :   \___ CE3
         \___ PE2                  .
                 .               .
                  ...............

                          Figure 1: Scenario 1

   In Figure 2, CE1 is a VPLS CE that is dual-homed to both PE1 and PE2
   for redundant connectivity. However, CE4, which is also in the same
   VPLS domain, is single-homed to just PE1.

           CE4 -------    ...............
                      \  .               .    ___ CE2
                    ___ PE1               .  /
                   /      :                PE3
                __/      :     Service      :
             CE1 __      :     Provider     PE4
                  \       :                 :   \___ CE3
                   \___ PE2                .
                         .               .
                          ...............

                          Figure 2: Scenario 2

2.2.  VPLS Multi-homing Considerations

   The first (perhaps obvious) fact about a multi-homed VPLS CE, such as
   CE1 in Figure 1, is that if CE1 is an Ethernet switch or bridge, a
   loop has been created in the customer VPLS. This is a dangerous
   situation for an Ethernet network, and the loop must be broken. Even
   if CE1 is a router, it will get duplicates every time a packet is
   flooded, which is clearly undesirable.

   The next is that (unlike the case of IP-based multi-homing) only one
   of PE1 and PE2 can be actively sending traffic, either towards CE1 or
   into the SP cloud. That is to say, load balancing techniques will
   not work. Call the PE that is chosen to send traffic to/from CE1 the
   "designated forwarder". All other PEs MUST choose the same
   designated forwarder for a multi-homed site.

   In Figure 2, CE1 and CE4 must be dealt with independently, since CE1
   is dual-homed, but CE4 is not.

3.  Multi-homing Operation

   This section describes procedures for electing a designated forwarder
   among the set of PEs that are multi-homed to a customer site. The
   procedures described in this section are applicable to BGP based
   VPLS, LDP based VPLS with BGP-AD or a VPLS that contains a mix of
   both BGP and LDP signaled PWs.

3.1.  Customer Edge (CE) NLRI

   Section 3.2.2 in [RFC4761] specifies an NLRI to be used for BGP based
   VPLS (BGP VPLS NLRI). The format of the BGP VPLS NLRI is shown
   below.

                 +------------------------------------+
                 |  Length (2 octets)                 |
                 +------------------------------------+
                 |  Route Distinguisher (8 octets)    |
                 +------------------------------------+
                 |  VE ID (2 octets)                  |
                 +------------------------------------+
                 |  VE Block Offset (2 octets)        |
                 +------------------------------------+
                 |  VE Block Size (2 octets)          |
                 +------------------------------------+
                 |  Label Base (3 octets)             |
                 +------------------------------------+

                        Figure 3: BGP VPLS NLRI

   For multi-homing operation, a customer-edge NLRI (CE NLRI) is
   proposed that uses the BGP VPLS NLRI with the following fields set to
   zero: VE Block Offset, VE Block Size and Label Base. In addition,
   the VE-ID field of the NLRI is set to the CE-ID. Thus, the CE NLRI
   contains 2 octets indicating the length, 8 octets for the Route
   Distinguisher, 2 octets for the CE-ID and 7 octets with value zero.

   It is valid to have a non-zero VE block offset, VE block size and
   label base in the VPLS NLRI for a multi-homed site. VPLS operations,
   including multi-homing, in such a case are outside the scope of this
   document. However, for interoperability with existing deployments
   that use a non-zero VE block offset, VE block size and label base for
   multi-homing operation, Section 6.1 provides more detail.

   Wherever VPLS NLRI is used in this document, context must be used to
   infer if it is applicable for CE NLRI, VE NLRI or for both.

3.2.  Deployment Considerations

   Each instance within a VPLS domain MUST be provisioned with a unique
   Route Distinguisher value. A unique Route Distinguisher allows VPLS
   advertisements from different VPLS PEs to be distinct even if the
   advertisements have the same VE-ID, which can occur in case of
   multi-homing. This allows standard BGP path selection rules to be
   applied to VPLS advertisements.
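
   As an illustration of the CE NLRI layout from Section 3.1, the
   following Python sketch packs and recognizes a CE NLRI. It is not
   part of the specification. The function names and the example Route
   Distinguisher are hypothetical, and the sketch assumes that the
   Length field counts the 17 octets that follow it, as in the BGP VPLS
   NLRI of [RFC4761].

```python
import struct

# Octets after the Length field: RD (8) + CE-ID (2) + zero fields (7).
# Assumption: Length counts only the octets that follow it.
CE_NLRI_LENGTH = 17


def encode_ce_nlri(rd: bytes, ce_id: int) -> bytes:
    """Build a CE NLRI: Length, RD, CE-ID in the VE-ID field, 7 zero octets."""
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be exactly 8 octets")
    if not 1 <= ce_id <= 0xFFFF:
        raise ValueError("CE-ID must be in 1..65535; CE-ID 0 is invalid")
    # VE Block Offset (2), VE Block Size (2) and Label Base (3) are all zero.
    body = rd + struct.pack("!H", ce_id) + bytes(7)
    return struct.pack("!H", len(body)) + body


def is_ce_nlri(nlri: bytes) -> bool:
    """Return True if block offset, block size and label base are all zero."""
    return len(nlri) == 2 + CE_NLRI_LENGTH and nlri[12:] == bytes(7)


# Hypothetical type 1 RD: IPv4 address 192.0.2.1, assigned number 100.
example_rd = struct.pack("!H4sH", 1, bytes([192, 0, 2, 1]), 100)
example_nlri = encode_ce_nlri(example_rd, ce_id=1)
```

   A receiver could use a check like is_ce_nlri() to decide whether an
   advertisement is a CE NLRI or a VE NLRI before applying the
   procedures in Section 3.3.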

   Each VPLS PE must advertise a unique VE-ID with non-zero VE Block
   Offset, VE Block Size and Label Base values in the BGP NLRI. The
   VE-ID is associated with the base VPLS instance and the NLRI
   associated with it must be used for creating PWs among VPLS PEs. Any
   single-homed customer sites connected to the VPLS instance do not
   require any special addressing. However, an administrator (SP
   operator) can choose to have a CE-ID for a single-homed site as well.
   Any multi-homed customer sites connected to the VPLS instance require
   special addressing, which is achieved by use of a CE-ID. A set of
   customer sites are distinguished as multi-homed if they all have the
   same CE-ID. The following examples illustrate the use of VE-ID and
   CE-ID.

   Figure 1 shows a customer site, CE1, multi-homed to two VPLS PEs, PE1
   and PE2. In order for all VPLS PEs to set up PWs to each other, each
   VPLS PE must be configured with a unique VE-ID for its base VPLS
   instance. In addition, in order for all VPLS PEs within the same
   VPLS domain to elect one of the multi-homed PEs as the designated
   forwarder, an indicator that the PEs are multi-homed to the same
   customer site is required. This is achieved by assigning the same
   VPLS site ID (CE-ID) on PE1 and PE2 for CE1. When remote VPLS PEs
   receive NLRI advertisements from PE1 and PE2 for CE1, the two NLRI
   advertisements for CE1 are identified as candidates for designated
   forwarder selection due to the same CE-ID. Thus, the same CE-ID MUST
   be assigned on all VPLS PEs that are multi-homed to the same customer
   site.

   Figure 2 shows two customer sites, CE1 and CE4, connected to PE1 with
   CE1 multi-homed to PE1 and PE2. Similar to the provisioning model
   for Figure 1, each VPLS PE must be configured with a unique VE-ID for
   its base VPLS instance.
   CE1, which is multi-homed to PE1 and PE2, requires configuration of a
   CE-ID, and both PE1 and PE2 MUST be provisioned with the same CE-ID
   for CE1. CE2 and CE3 are single-homed sites and do not require
   special addressing. However, an operator must configure a CE-ID for
   CE4 on PE1. By doing so, remote PEs can determine that PE1 has two
   VPLS sites, CE1 and CE4. If both CE1 and CE4 connectivity to PE1 is
   down, remote PEs can choose, based on the D bit in the VE NLRI, not
   to send multicast traffic to PE1 as there are no VPLS sites reachable
   via PE1. If CE4 were not assigned a unique CE-ID, remote PEs would
   have no way to know if there are other VPLS sites attached and hence
   would always send multicast traffic to PE1. While CE2 and CE3 can
   also be configured with unique CE-IDs, there is no advantage in doing
   so as both PE3 and PE4 have exactly one VPLS site.

   Note that a CE-ID=0 is invalid and a PE should discard such an
   advertisement.

   Use of multiple VE-IDs per VPLS instance for either multi-homing
   operation or for any other purpose is outside the scope of this
   document. However, for interoperability with existing deployments
   that use multiple VE-IDs, Section 6.1 provides more detail.

3.3.  Designated Forwarder Election

   BGP-based multi-homing for VPLS relies on standard BGP path selection
   and VPLS DF election. The net result of doing both BGP path
   selection and VPLS DF election is that of electing a single
   designated forwarder (DF) among the set of PEs to which a customer
   site is multi-homed. All the PEs that are elected as non-designated
   forwarders MUST keep their attachment circuit to the multi-homed CE
   in blocked status (no forwarding).

   These election algorithms operate on VPLS advertisements, which
   include both the NLRI and attached BGP attributes. These election
   algorithms are applicable to all VPLS NLRIs, and not just to CE
   NLRIs.
   In order to simplify the explanation of these algorithms, we will use
   a number of variables derived from fields in the VPLS advertisement.
   These variables are: RD, SITE-ID, VBO, DOM, ACS, PREF and PE-ID. The
   notation ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID> means that
   from a received VPLS advertisement ADV, the respective variables were
   derived. The following sections describe two attributes needed for
   DF election, then describe the variables and how they are derived
   from fields in VPLS advertisement ADV, and finally describe how DF
   election is done.

3.3.1.  Attributes

   The procedures below refer to two attributes: the Route Origin
   community (see Section 4.1) and the L2-info community (see
   Section 4.2). These attributes are required for inter-AS operation;
   for generality, the procedures below show how they are to be used.

   The procedures also outline how to handle the case that either or
   both are not present.

   For BGP-based Multi-homing, ADV MUST contain an L2-info extended
   community as specified in [RFC4761]. Within this community are
   various control flags. Two new control flags are proposed in this
   document. Figure 4 shows the position of the new 'D' and 'F' flags.

                        Control Flags Bit Vector

                            0 1 2 3 4 5 6 7
                           +-+-+-+-+-+-+-+-+
                           |D|Z|F|Z|Z|Z|C|S|  (Z = MUST Be Zero)
                           +-+-+-+-+-+-+-+-+

                                Figure 4

   1.  'D' (Down): Indicates connectivity status. In case of CE NLRI,
       the connectivity status is between a CE site and a VPLS PE. In
       case of VE NLRI, the connectivity status is for the VPLS
       instance. In case of CE NLRI, the bit MUST be set to one if all
       the attachment circuits connecting a CE site to a VPLS PE are
       down. In case of VE NLRI, the bit must be set to one if the VPLS
       instance is operationally down. Note that a VPLS instance that
       has no connectivity to any of its sites must be considered as
       operationally down.

   2.  'F' (Flush): Indicates when to flush MAC state.
       A designated forwarder must set the F bit and a non-designated
       forwarder must clear the F bit when sending BGP CE NLRIs for
       multi-homed sites. A state transition from one to zero for the
       F bit can be used by a remote PE to flush all the MACs learned
       from the PE that is transitioning from designated forwarder to
       non-designated forwarder. Refer to Section 5 for more details
       on the use case.

3.3.2.  Variables Used

3.3.2.1.  RD

   RD is simply set to the Route Distinguisher field in the NLRI part of
   ADV. The actual process of assigning Route Distinguisher values must
   guarantee their uniqueness per PE node. Therefore, two multi-homed
   PEs offering the same VPLS service to a common set of CEs MUST each
   allocate a different RD value for this site.

3.3.2.2.  SITE-ID

   SITE-ID is simply set to the VE-ID field in the NLRI part of ADV.

   Note that no distinction is made whether the VE-ID is for a
   multi-homed site or not.

3.3.2.3.  VBO

   VBO is simply set to the VE Block Offset field in the NLRI part of
   ADV.

3.3.2.4.  DOM

   This variable, indicating the VPLS domain to which ADV belongs, is
   derived by applying BGP policy to the Route Target extended
   communities in ADV. The details of how this is done are outside the
   scope of this document.

3.3.2.5.  ACS

   ACS is the status of the attachment circuits for a given site of a
   VPLS. ACS = 1 if all attachment circuits for the site are down, and
   0 otherwise.

   ACS is set to the value of the 'D' bit in an ADV that belongs to a CE
   NLRI. If ADV belongs to the base VPLS instance (VE NLRI) with
   non-zero label block values, no change must be made to ACS.

3.3.2.6.  PREF

   PREF is derived from the Local Preference (LP) attribute in ADV as
   well as the VPLS Preference field (VP) in the L2-info extended
   community. If the Local Preference attribute is missing, LP is set
   to 0; if the L2-info community is missing, VP is set to 0.
   The following table shows how PREF is computed from LP and VP.

   +---------+---------------+----------+------------------------------+
   | VP      | LP Value      | PREF     | Comment                      |
   | Value   |               | Value    |                              |
   +---------+---------------+----------+------------------------------+
   | 0       | 0             | 0        | malformed advertisement,     |
   |         |               |          | unless ACS=1                 |
   |         |               |          |                              |
   | 0       | 1 to (2^16-1) | LP       | backwards compatibility      |
   |         |               |          |                              |
   | 0       | 2^16 to       | (2^16-1) | backwards compatibility      |
   |         | (2^32-1)      |          |                              |
   |         |               |          |                              |
   | >0      | LP same as VP | VP       | Implementation supports VP   |
   |         |               |          |                              |
   | >0      | LP != VP      | 0        | malformed advertisement      |
   +---------+---------------+----------+------------------------------+

                                Table 1

3.3.2.7.  PE-ID

   If ADV contains a Route Origin (RO) community (see Section 4.1) with
   type 0x01, then PE-ID is set to the Global Administrator sub-field of
   the RO. Otherwise, if ADV has an ORIGINATOR_ID attribute, then PE-ID
   is set to the ORIGINATOR_ID. Otherwise, PE-ID is set to the BGP
   Identifier.

3.3.3.  Election Procedures

   The election procedures described in this section apply equally to
   BGP VPLS and LDP VPLS. A distinction MUST NOT be made on whether the
   NLRI is a multi-homing NLRI or not. The subset of these procedures
   documented under standard BGP best path selection deals with general
   IP prefix BGP route selection processing as defined in [RFC4271]. A
   separate part of the algorithm, defined under VPLS DF election, is
   specific to designated forwarder election procedures performed on
   VPLS advertisements. A concept of bucketization is introduced to
   define route selection rules for VPLS advertisements. Note that this
   is a conceptual description of the process; an implementation MAY
   choose to realize this differently as long as the semantics are
   preserved.

3.3.3.1.  Bucketization for standard BGP path selection

   An advertisement

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   is put into the bucket for <RD, SITE-ID, VBO>. In other words, the
   information in BGP path selection consists of <RD, SITE-ID, VBO> and
   only advertisements with the exact same <RD, SITE-ID, VBO> are
   candidates for the BGP path selection procedure as defined in
   [RFC4271].

3.3.3.2.  Bucketization for VPLS DF Election

   An advertisement

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   is discarded if DOM is not of interest to the VPLS PE. Otherwise,
   ADV is put into the bucket for <DOM, SITE-ID>. In other words, all
   advertisements for a particular VPLS domain that have the same
   SITE-ID are candidates for VPLS DF election.

3.3.3.3.  Tie-breaking Rules

   This section describes the tie-breaking rules for VPLS DF election.
   Tie-breaking rules for VPLS DF election are applied to candidate
   advertisements by all VPLS PEs and the actions taken by VPLS PEs
   based on the VPLS DF election result are described in Section 3.4.

   Given two advertisements ADV1 and ADV2 from a given bucket, first
   compute the variables needed for DF election:

      ADV1 -> <RD1, SITE-ID1, VBO1, DOM1, ACS1, PREF1, PE-ID1>
      ADV2 -> <RD2, SITE-ID2, VBO2, DOM2, ACS2, PREF2, PE-ID2>

   Note that SITE-ID1 = SITE-ID2 and DOM1 = DOM2, since ADV1 and ADV2
   came from the same bucket. Then the following tie-breaking rules
   MUST be applied in the given order.

   1.  if (ACS1 != 1) AND (ACS2 == 1) ADV1 wins; stop;
       else if (ACS1 == 1) AND (ACS2 != 1) ADV2 wins; stop;
       else continue

   2.  if (PREF1 > PREF2) ADV1 wins; stop;
       else if (PREF1 < PREF2) ADV2 wins; stop;
       else continue

   3.  if (PE-ID1 < PE-ID2) ADV1 wins; stop;
       else if (PE-ID1 > PE-ID2) ADV2 wins; stop;
       else ADV1 and ADV2 are from the same VPLS PE

   If there is no winner and ADV1 and ADV2 are from the same PE, a VPLS
   PE MUST retain both ADV1 and ADV2.

3.4.  DF Election on PEs

   The DF election algorithm MUST be run by all multi-homed VPLS PEs.
   In addition, all other PEs SHOULD also run the DF election algorithm.
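
   As an illustration of the election that each PE runs, the PREF
   derivation of Table 1 and the tie-breaking rules of Section 3.3.3.3
   can be sketched in Python as follows. This is a conceptual sketch
   only, not a reference implementation. The class and function names
   are hypothetical, the bucket is assumed to already contain only
   advertisements with the same DOM and SITE-ID, and PE-ID is assumed
   to be held as an unsigned integer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Adv:
    """Variables derived from one VPLS advertisement (Section 3.3.2)."""
    dom: str      # VPLS domain
    site_id: int  # VE-ID field of the NLRI
    acs: int      # 1 if all attachment circuits are down, else 0
    pref: int     # result of compute_pref()
    pe_id: int    # assumption: an IPv4 address held as an unsigned integer


def compute_pref(vp: int, lp: int) -> int:
    """Derive PREF from VPLS Preference (vp) and Local Preference (lp)."""
    if vp == 0:
        # lp == 0 gives 0; 1..2^16-1 gives lp; larger values give 2^16-1.
        return min(lp, 0xFFFF)
    # vp > 0: lp must equal vp, otherwise the advertisement is malformed.
    return vp if lp == vp else 0


def better(adv1: Adv, adv2: Adv) -> Optional[Adv]:
    """Apply rules 1 to 3 in order; None means both are from the same PE."""
    if adv1.acs != 1 and adv2.acs == 1:      # rule 1
        return adv1
    if adv1.acs == 1 and adv2.acs != 1:
        return adv2
    if adv1.pref != adv2.pref:               # rule 2: higher PREF wins
        return adv1 if adv1.pref > adv2.pref else adv2
    if adv1.pe_id != adv2.pe_id:             # rule 3: lower PE-ID wins
        return adv1 if adv1.pe_id < adv2.pe_id else adv2
    return None


def elect_df(bucket: List[Adv]) -> Adv:
    """Pick the DF advertisement from a non-empty bucket."""
    winner = bucket[0]
    for adv in bucket[1:]:
        result = better(winner, adv)
        if result is not None:
            winner = result
    return winner
```

   Because the comparison depends only on values carried in the
   advertisements, every PE that sees the same bucket reaches the same
   result, which is what allows all PEs to agree on a single DF.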
   As a result of the DF election, multi-homed PEs that lose the DF
   election for a SITE-ID MUST put the ACs associated with the SITE-ID
   in non-forwarding state.

   The DF election result on the egress PEs can be used in traffic
   forwarding decisions. Figure 2 shows two customer sites, CE1 and
   CE4, connected to PE1 with CE1 multi-homed to PE1 and PE2. If PE1 is
   the designated forwarder for CE1, based on the DF election result,
   PE3 can choose to not send unknown unicast and multicast traffic to
   PE2 as PE2 is not the designated forwarder for any customer site and
   it has no other single-homed sites connected to it.

3.5.  Pseudowire and Site-ID Binding Properties

   For the use case where a single PE provides connectivity to a set of
   CEs of which some are multi-homed and others are not, only a single
   pseudowire MAY be established. For example, if PE1 provides VPLS
   service to CE1 and CE4, which are both part of the same VPLS domain
   but are different sites, and CE1 is multi-homed but CE4 is not (as
   described in Figure 2), PE3 would establish only a single pseudowire
   toward PE1. A design needs to ensure that regardless of PE1's
   forwarding state (DF or non-DF) for multi-homed CE1, PE3's access to
   CE4 is established. Since label allocation and pseudowire
   establishment are tied to the site-ID, we need to ensure that proper
   pseudowire bindings are established.

   For a set of given advertisements with a common DOM but with
   different Site-ID values, a VPLS PE speaker SHOULD instantiate and
   bind the pseudowire based on the advertisement with the lowest
   Site-ID value. Otherwise, binding would be completely random and
   during DF changes for a multi-homed site, a non-multi-homed CE might
   suffer traffic loss.

4.  Multi-AS VPLS

   This section describes multi-homing in an inter-AS context.

4.1.  Route Origin Extended Community

   Due to lack of information about the PEs that originate the VPLS
   NLRIs in inter-AS operations, the Route Origin Extended Community
   [RFC4360] is used to carry the source PE's IP address.

   To use the Route Origin Extended Community for carrying the
   originator VPLS PE's loopback address, the type field of the
   community MUST be set to 0x01 and the Global Administrator sub-field
   MUST be set to the PE's loopback IP address.

4.2.  VPLS Preference

   When multiple PEs are assigned the same site ID for multi-homing, it
   is often desired to be able to control the selection of a particular
   PE as the designated forwarder. Section 3.5 in [RFC4761] describes
   the use of BGP Local Preference in path selection to choose a
   particular NLRI, where Local Preference indicates the degree of
   preference for a particular VE. The use of Local Preference is
   inadequate when VPLS PEs are spread across multiple ASes as Local
   Preference is not carried across AS boundaries. A new field, VPLS
   preference (VP), is introduced in this document that can be used to
   accomplish this. VPLS preference indicates a degree of preference
   for a particular customer site. VPLS preference is not mandatory for
   intra-AS operation; the algorithm explained in Section 3.3 will work
   with or without the presence of VPLS preference.

   Section 3.2.4 in [RFC4761] describes the Layer2 Info Extended
   Community that carries control information about the pseudowires.
   The last two octets, which were reserved, now carry the VPLS
   preference as shown in Figure 5.
628 +------------------------------------+ 629 | Extended community type (2 octets) | 630 +------------------------------------+ 631 | Encaps Type (1 octet) | 632 +------------------------------------+ 633 | Control Flags (1 octet) | 634 +------------------------------------+ 635 | Layer-2 MTU (2 octet) | 636 +------------------------------------+ 637 | VPLS Preference (2 octets) | 638 +------------------------------------+ 640 Figure 5: Layer2 Info Extended Community 642 A VPLS preference is a 2-octets unsigned integer. A value of zero 643 indicates absence of a VP and is not a valid preference value. This 644 interpretation is required for backwards compatibility. 645 Implementations using Layer2 Info Extended Community as described in 646 (Section 3.2.4) [RFC4761] MUST set the last two octets as zero since 647 it was a reserved field. 649 For backwards compatibility, if VPLS preference is used, then BGP 650 Local Preference MUST be set to the value of VPLS preference. Note 651 that a Local Preference value of zero for a CE-ID is not valid unless 652 'D' bit in the control flags is set (see 653 [I-D.kothari-l2vpn-auto-site-id]). In addition, Local Preference 654 value greater than or equal to 2^16 for VPLS advertisements is not 655 valid. 657 4.3. Use of BGP attributes in Inter-AS Methods 659 Section 3.4 in [RFC4761] and section 4 in [RFC6074] describe three 660 methods (a, b and c) to connect sites in a VPLS to PEs that are 661 across multiple AS. Since VPLS advertisements in method (a) do not 662 cross AS boundaries, multi-homing operations for method (a) remain 663 exactly the same as they are within as AS. However, for method (b) 664 and (c), VPLS advertisements do cross AS boundary. This section 665 describes the VPLS operations for method (b) and method (c). 666 Consider Figure 6 for inter-AS VPLS with multi-homed customer sites. 668 4.3.1. Inter-AS Method (b): EBGP Redistribution of VPLS Information 669 between ASBRs 671 AS1 AS2 672 ........ ........ 
       CE2 _______    .        .               .        .
               ___ PE1          .             .          PE3 --- CE3
              /     :            .           .            :
           __/      :             :         :             :
       CE1 __       :           ASBR1 --- ASBR2           :
             \      :             :         :             :
              \___ PE2          .             .          PE4 ---- CE4
                      .        .               .        .
                       ........                 ........

                         Figure 6: Inter-AS VPLS

   A customer has four sites, CE1, CE2, CE3 and CE4.  CE1 is multi-homed
   to PE1 and PE2 in AS1.  CE2 is single-homed to PE1.  CE3 and CE4 are
   also single-homed to PE3 and PE4 respectively in AS2.  Assume that in
   addition to the base LDP/BGP VPLS addressing (VSI-IDs/VE-IDs), CE-ID
   1 is assigned for CE1.  After running the DF election algorithm, all
   four VPLS PEs must elect the same designated forwarder for the CE1
   site.  Since BGP Local Preference is not carried across AS
   boundaries, VPLS preference as described in Section 4.2 MUST be used
   for carrying site preference in inter-AS VPLS operations.

   For Inter-AS method (b), ASBR1 will send a VPLS NLRI received from
   PE1 to ASBR2 with itself as the BGP nexthop.  ASBR2 will send the
   NLRI received from ASBR1 to PE3 and PE4 with itself as the BGP
   nexthop.  Since VPLS PEs use BGP Local Preference in DF election, for
   backwards compatibility, ASBR2 MUST set the Local Preference value in
   the VPLS advertisements it sends to PE3 and PE4 to the VPLS
   preference value contained in the VPLS advertisement it receives from
   ASBR1.  ASBR1 MUST do the same for the NLRIs it sends to PE1 and PE2.
   If ASBR1 receives a VPLS advertisement without a valid VPLS
   preference from a PE within its AS, then ASBR1 MUST set the VPLS
   preference in the advertisement to the Local Preference value before
   sending it to ASBR2.  Similarly, ASBR2 must do the same for
   advertisements without VPLS Preference it receives from PEs within
   its AS.  Thus, in method (b), ASBRs MUST update the VPLS and Local
   Preference based on the advertisements they receive either from an
   ASBR or a PE within their AS.

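   The preference handling that method (b) requires of an ASBR can be
   sketched in Python as follows.  This is a minimal illustration, not
   a normative procedure: the VplsAdvert class and both function names
   are hypothetical, a VPLS preference of zero is treated as absent per
   Section 4.2, and leaving Local Preference untouched when no VPLS
   preference is received is an assumption made for this sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional

MAX_VP = 0xFFFF  # VPLS preference is a 2-octet unsigned integer


@dataclass(frozen=True)
class VplsAdvert:
    """Hypothetical, simplified view of a VPLS advertisement."""
    ce_id: int
    local_pref: Optional[int]  # None when the attribute is absent
    vpls_pref: int = 0         # 0 means "no VPLS preference present"


def export_to_peer_asbr(adv: VplsAdvert) -> VplsAdvert:
    """ASBR -> remote ASBR: carry the preference in VP."""
    if adv.vpls_pref != 0:
        return adv  # a valid VP is already present; pass it through
    if adv.local_pref is None or not 0 <= adv.local_pref <= MAX_VP:
        raise ValueError("Local Preference cannot be carried as a VP")
    # VP absent: copy LP into VP before the advertisement leaves the AS.
    return replace(adv, vpls_pref=adv.local_pref)


def import_from_peer_asbr(adv: VplsAdvert) -> VplsAdvert:
    """Remote ASBR -> local PEs: restore LP from the received VP."""
    if adv.vpls_pref == 0:
        return adv  # assumption: nothing to translate without a VP
    return replace(adv, local_pref=adv.vpls_pref)
```

   For example, an advertisement that PE1 originates with Local
   Preference 200 and no VP would leave ASBR1 with VP 200, and ASBR2
   would then set Local Preference 200 toward PE3 and PE4, so that all
   four PEs compare the same value in DF election.
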
   In Figure 6, PE1 will send the VPLS advertisements with Route Origin
   Extended Community containing its loopback address.  PE2 will do the
   same.  Even though PE3 receives the VPLS advertisements for VE-ID 1
   and 2 from the same BGP nexthop, ASBR2, the source PE address
   contained in the Route Origin Extended Community is different for the
   CE1 and CE2 advertisements, and thus, PE3 creates two PWs, one for
   CE1 (for VE-ID 1) and another one for CE2 (for VE-ID 2).

4.3.2.  Inter-AS Method (c): Multi-Hop EBGP Redistribution of VPLS
        Information between ASes

   In this method, there is a multi-hop E-BGP peering between the PEs or
   Route Reflectors in AS1 and the PEs or Route Reflectors in AS2.
   There is no VPLS state in either control or data plane on the ASBRs.
   The multi-homing operations on the PEs in this method are exactly the
   same as they are in the intra-AS scenario.  However, since Local
   Preference is not carried across AS boundaries, the translation of LP
   to VP and vice versa MUST be done by the RR, if an RR is used to
   reflect VPLS advertisements to other ASes.  This is exactly the same
   as what an ASBR does in the case of method (b).  An RR must set the
   VP to the LP value in an advertisement before sending it to other
   ASes and must set the LP to the VP value in an advertisement that it
   receives from other ASes before sending it to the PEs within the AS.

5.  MAC Flush Operations

   In a service provider VPLS network, customer MAC learning is confined
   to PE devices, and intermediate nodes, such as a Route Reflector, do
   not have any state for MAC addresses.

   Topology changes either in the service provider's network or in the
   customer's network can result in the movement of MAC addresses from
   one PE device to another.  Such events can result in traffic being
   dropped due to stale state of MAC addresses on the PE devices.
   Age out timers that clear the stale state will resume the traffic
   forwarding, but age out timers are typically in minutes, and
   convergence on the order of minutes can severely impact the
   customer's service.  To handle such events and expedite convergence
   of traffic, flushing of affected MAC addresses is highly desirable.

5.1.  MAC Flush Indicators

   If the 'D' bit in the control flags is set in a received VE NLRI, the
   receiving PE SHOULD flush all the MAC addresses learned from the PE
   advertising the failure.

   Anytime a designated forwarder change occurs, a remote PE SHOULD
   flush all the MAC addresses it learned from the PE that lost the DF
   election (old designated forwarder).  If multiple customer sites are
   connected to the same PE, PE1 as shown in Figure 2, and redundancy
   per site is desired when multi-homing procedures described in this
   document are in effect, then it is desirable to flush just the
   relevant MAC addresses from a particular site when the site
   connectivity is lost.  However, procedures for flushing a limited set
   of MAC addresses are beyond the scope of this document.  Use of
   either the 'D' or 'F' bit in the control flags only allows flushing
   all MAC addresses associated with a PE.

   Designated forwarder change can occur in absence of failures, such as
   when an attachment circuit comes up.  Consider the case in Figure 2
   where the PE1-CE1 link is non-operational and PE2 is the designated
   forwarder for CE1.  Also assume that the Local Preference of PE1 is
   higher than that of PE2.  When the PE1-CE1 link becomes operational,
   PE1 will send a BGP CE advertisement for CE1 to all its peers.  If
   PE3 performs the DF election before PE2, there is a chance that PE3
   might learn MAC addresses from PE2 after it was done electing PE1.
   This can happen since PE2 has not yet processed the BGP CE
   advertisement from PE1 and as a result continues to send traffic to
   PE3.
   This can cause traffic from PE3 to CE1 to black-hole until those MAC
   addresses are deleted due to age out timers.  Therefore, to avoid
   such race conditions, a designated forwarder must set the F bit and a
   non-designated forwarder must clear the F bit when sending BGP CE
   advertisements.  A state transition from one to zero for the 'F' bit
   can be used by a remote PE to flush all the MACs learned from the PE
   that is transitioning from designated forwarder to non-designated
   forwarder.

5.2.  Minimizing the effects of fast link transitions

   Certain failure scenarios may result in fast transitions of the link
   towards the multi-homing CE, which in turn will generate fast status
   transitions of one or multiple multi-homed sites reflected through
   multiple BGP CE advertisements and LDP MAC Flush messages.

   It is recommended that a timer to damp the link flaps be used for the
   port towards the multi-homed CE to minimize the number of MAC Flush
   events in the remote PEs and the occurrences of BGP state compression
   for F bit transitions.  A timer value greater than the time it takes
   BGP to converge in the network is recommended.

6.  Backwards Compatibility

   No forwarding loops are formed when PEs or Route Reflectors that do
   not support procedures defined in this section coexist in the
   network with PEs or Route Reflectors that do support them.

6.1.  BGP based VPLS

   As explained in this section, multi-homed PEs to the same customer
   site MUST assign the same CE-ID and related NLRI SHOULD contain the
   block offset, block size and label base as zero.  Remote PEs that
   lack support of multi-homing operations specified in this document
   will fail to create any PWs for the multi-homed CE-IDs due to the
   label value of zero and thus, the multi-homing NLRI should have no
   impact on the operation of remote PEs that lack support of
   multi-homing operations specified in this document.

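   Why a zeroed label block yields no PW on such a remote PE can be
   illustrated with the label-block selection rule of [RFC4761],
   sketched below in Python.  The class and function names are
   hypothetical, and modeling the remote PE purely by this rule is an
   assumption of the sketch; an implementation may reject the NLRI for
   other reasons, such as the label value of zero itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VplsNlri:
    """Hypothetical, simplified view of a BGP VPLS NLRI."""
    ve_id: int         # VE-ID, or CE-ID for a multi-homing NLRI
    block_offset: int  # VE block offset
    block_size: int    # VE block size
    label_base: int


def pw_label_for(nlri: VplsNlri, local_ve_id: int) -> Optional[int]:
    """Return the label a PE with local_ve_id would use, or None."""
    start = nlri.block_offset
    if start <= local_ve_id < start + nlri.block_size:
        return nlri.label_base + (local_ve_id - start)
    return None  # local VE-ID is outside the block: no PW is set up
```

   With block offset, block size and label base all zero, the range
   test is empty for every local VE-ID, so pw_label_for returns None
   and no PW is created for the CE-ID.
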
   For compatibility with PEs that use multiple VE-IDs with non-zero
   label block values for multi-homing operation, it is a requirement
   that a PE receiving such advertisements must use the labels in the
   NLRIs associated with the lowest VE-ID for PW creation.  It is
   possible that maintaining PW association with the lowest VE-ID can
   result in PW flap, and thus, traffic loss.  However, it is necessary
   to maintain the association of the PW with the lowest VE-ID as it
   provides deterministic DF election among all the VPLS PEs.

6.2.  LDP VPLS with BGP Auto-discovery

   The BGP-AD NLRI has a prefix length of 12 containing only an 8-byte
   RD and a 4-byte VSI-ID.  If an LDP VPLS PE running BGP-AD lacks
   support of multi-homing operations specified in this document, it
   SHOULD ignore a CE NLRI with the length field of 17.  As a result it
   will not ask LDP to create any PWs for the multi-homed Site-ID and
   thus, the multi-homing NLRI should have no impact on LDP VPLS
   operation.  MH PEs may use existing LDP MAC Flush to flush the remote
   LDP VPLS PEs or may use the MAC Flush procedures as described in
   Section 5.

7.  Security Considerations

   No new security issues are introduced beyond those that are described
   in [RFC4761] and [RFC4762].

8.  IANA Considerations

   IANA already has a registry for "Layer2 Info Extended Community
   Control Flags Bit Vector".

   This document requires two new bit flags to be assigned as follows:

      Value   Name                               Reference
      -----   --------------------------------   --------------
      D       Down connectivity status           This document
      F       MAC flush indicator                This document

9.  Contributing Authors

   The authors would also like to thank Senad Palislamovic and Wen Lin
   for their contribution to the development of this document.

   Senad Palislamovic
   Nokia
   Email: senad.palislamovic@nokia.com

   Wen Lin
   Juniper Networks
   Email: wlin@juniper.net

10.
Acknowledgments

   The authors would like to thank Yakov Rekhter, Nischal Sheth, Mitali
   Singh, Ian Cowburn and Jonathan Hardwick for their insightful
   comments and probing questions.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4761]  Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private
              LAN Service (VPLS) Using BGP for Auto-Discovery and
              Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007.

   [RFC6074]  Rosen, E., Davie, B., Radoaca, V., and W. Luo,
              "Provisioning, Auto-Discovery, and Signaling in Layer 2
              Virtual Private Networks (L2VPNs)", RFC 6074,
              DOI 10.17487/RFC6074, January 2011.

11.2.  Informative References

   [I-D.kothari-l2vpn-auto-site-id]
              Kothari, B., Kompella, K., and T. IV, "Automatic
              Generation of Site IDs for Virtual Private LAN Service",
              draft-kothari-l2vpn-auto-site-id-01 (work in progress),
              October 2008.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364,
              February 2006.

   [RFC4762]  Lasserre, M., Ed. and V. Kompella, Ed., "Virtual Private
              LAN Service (VPLS) Using Label Distribution Protocol (LDP)
              Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007.

Authors' Addresses

   Bhupesh Kothari
   Augtera Networks

   Email: bhupesh@anvaya.net

   Kireeti Kompella
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA  94089
   US

   Email: kireeti.kompella@gmail.com

   Wim Henderickx
   Nokia

   Email: wim.henderickx@nokia.com

   Florin Balus
   Cisco

   Email: fbalus@gmail.com

   James Uttaro
   AT&T
   200 S. Laurel Avenue
   Middletown, NJ  07748
   US

   Email: uttaro@att.com