Network Working Group                                         B. Kothari
Internet-Draft                                          Augtera Networks
Updates: 4761 (if approved)                                  K. Kompella
Intended status: Standards Track                        Juniper Networks
Expires: September 27, 2019                                W. Henderickx
                                                                   Nokia
                                                                F. Balus
                                                                   Cisco
                                                               J. Uttaro
                                                                    AT&T
                                                            Mar 26, 2019


         BGP based Multi-homing in Virtual Private LAN Service
                draft-ietf-bess-vpls-multihoming-03.txt

Abstract

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN). It is often
   required for the Service Provider (SP) to give the customer redundant
   connectivity to some sites, often called "multi-homing". This memo
   shows how BGP-based multi-homing can be offered in the context of LDP
   and BGP VPLS solutions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 27, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  General Terminology
     1.2.  Conventions
   2.  Background
     2.1.  Scenarios
     2.2.  VPLS Multi-homing Considerations
   3.  Multi-homing Operation
     3.1.  Customer Edge (CE) NLRI
     3.2.  Deployment Considerations
     3.3.  Designated Forwarder Election
       3.3.1.  Attributes
       3.3.2.  Variables Used
       3.3.3.  Election Procedures
     3.4.  DF Election on PEs
     3.5.  Pseudowire and Site-ID Binding Properties
   4.  Multi-AS VPLS
     4.1.  Route Origin Extended Community
     4.2.  VPLS Preference
     4.3.  Use of BGP attributes in Inter-AS Methods
       4.3.1.  Inter-AS Method (b): EBGP Redistribution of VPLS
               Information between ASBRs
       4.3.2.  Inter-AS Method (c): Multi-Hop EBGP Redistribution of
               VPLS Information between ASes
   5.  MAC Flush Operations
     5.1.  MAC Flush Indicators
     5.2.  Minimizing the effects of fast link transitions
   6.  Backwards Compatibility
     6.1.  BGP based VPLS
     6.2.  LDP VPLS with BGP Auto-discovery
   7.  Security Considerations
   8.  IANA Considerations
   9.  Contributing Authors
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN). It is often
   required for a Service Provider (SP) to give the customer redundant
   connectivity to one or more sites, often called "multi-homing".
   [RFC4761] explains how VPLS can be offered using BGP for
   auto-discovery and signaling; Section 3.5 of that document describes
   how multi-homing can be achieved in this context. [RFC6074] explains
   how VPLS can be offered using BGP for auto-discovery (BGP-AD) and
   [RFC4762] explains how VPLS can be offered using LDP for signaling.
   This document provides a BGP-based multi-homing solution applicable
   to both BGP and LDP VPLS technologies. Note that BGP multi-homing
   (BGP MH) can be used for LDP VPLS without the use of the BGP-AD
   solution.

   Section 2 lays out some of the scenarios for multi-homing, other ways
   that this can be achieved, and some of the expectations of BGP-based
   multi-homing. Section 3 defines the components of BGP-based
   multi-homing, and the procedures required to achieve this.

1.1.  General Terminology

   Some general terminology is defined here; most is from [RFC4761],
   [RFC4762] or [RFC4364]. Terminology specific to this memo is
   introduced as needed in later sections.

   A "Customer Edge" (CE) device, typically located on customer
   premises, connects to a "Provider Edge" (PE) device, which is owned
   and operated by the SP. A "Provider" (P) device is also owned and
   operated by the SP, but has no direct customer connections. A "VPLS
   Edge" (VE) device is a PE that offers VPLS services.

   A VPLS domain represents a bridging domain per customer. A Route
   Target community as described in [RFC4360] is typically used to
   identify all the PE routers participating in a particular VPLS
   domain. A VPLS site is a grouping of ports on a PE that belong to
   the same VPLS domain. The terms "VPLS instance" and "VPLS domain"
   are used interchangeably in this document.

   If the CE devices that connect to a VPLS site's ports have
   connectivity to any other PE device, then the VPLS site is called a
   multi-homed VPLS site. Otherwise, it is called a single-homed VPLS
   site. The ports are partitioned between VPLS sites such that each
   port is in no more than one VPLS site. The terms "VPLS site" and "CE
   site" are used interchangeably in this document.

   A BGP VPLS NLRI for the base VPLS instance that has non-zero VE block
   offset, VE block size and label base is called a VE NLRI in this
   document. Each VPLS instance is uniquely identified by a VE-ID. The
   VE-ID is carried in the BGP VPLS NLRI as specified in Section 3.2.2
   of [RFC4761].

   A VPLS NLRI with value zero for the VE block offset, VE block size
   and label base is called a CE NLRI in this document. Section 3.1
   defines CE NLRI and provides more detail.

   A Multi-homed (MH) site is uniquely identified by a CE-ID. Sites are
   referred to as local or remote depending on whether they are
   configured on the PE router in context or on one of the remote PE
   routers (network peers). A single-homed site can also be assigned a
   CE-ID, but it is not mandatory to configure a CE-ID for single-homed
   sites. Section 3.1 provides detail on CE-ID.

1.2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Background

   This section describes various scenarios where multi-homing may be
   required, and the implications thereof. It also describes some of
   the singular properties of VPLS multi-homing, and what that means
   from both an operational point of view and an implementation point of
   view. Other approaches to multi-homing exist, such as the Spanning
   Tree Protocol; this document specifies the use of BGP for
   multi-homing. A comprehensive comparison of the approaches is
   outside the scope of this document.

2.1.  Scenarios

   In Figure 1, CE1 is a VPLS CE that is dual-homed to both PE1 and PE2
   for redundant connectivity.

                       ...............
                      .               .    ___ CE2
               ___ PE1                 .  /
              /      :                 PE3
           __/       :     Service      :
       CE1 __        :     Provider    PE4
             \       :                  :  \___ CE3
              \___ PE2                 .
                      .               .
                       ...............

                          Figure 1: Scenario 1

   In Figure 2, CE1 is a VPLS CE that is dual-homed to both PE1 and PE2
   for redundant connectivity. However, CE4, which is also in the same
   VPLS domain, is single-homed to just PE1.

          CE4 -------  ...............
                     \ .              .    ___ CE2
               ___ PE1                 .  /
              /      :                 PE3
           __/       :     Service      :
       CE1 __        :     Provider    PE4
             \       :                  :  \___ CE3
              \___ PE2                 .
                      .               .
                       ...............

                          Figure 2: Scenario 2

2.2.  VPLS Multi-homing Considerations

   The first (perhaps obvious) fact about a multi-homed VPLS CE, such as
   CE1 in Figure 1, is that if CE1 is an Ethernet switch or bridge, a
   loop has been created in the customer VPLS. This is a dangerous
   situation for an Ethernet network, and the loop must be broken. Even
   if CE1 is a router, it will get duplicates every time a packet is
   flooded, which is clearly undesirable.

   The next is that (unlike the case of IP-based multi-homing) only one
   of PE1 and PE2 can be actively sending traffic, either towards CE1 or
   into the SP cloud. That is to say, load balancing techniques will
   not work. Call the PE that is chosen to send traffic to/from CE1 the
   "designated forwarder". All other PEs MUST choose the same
   designated forwarder for a multi-homed site.

   In Figure 2, CE1 and CE4 must be dealt with independently, since CE1
   is dual-homed, but CE4 is not.

3.  Multi-homing Operation

   This section describes procedures for electing a designated forwarder
   among the set of PEs that are multi-homed to a customer site. The
   procedures described in this section are applicable to BGP based
   VPLS, LDP based VPLS with BGP-AD, or a VPLS that contains a mix of
   both BGP and LDP signaled PWs.

3.1.  Customer Edge (CE) NLRI

   Section 3.2.2 in [RFC4761] specifies an NLRI to be used for BGP based
   VPLS (BGP VPLS NLRI). The format of the BGP VPLS NLRI is shown
   below.

                +------------------------------------+
                |  Length (2 octets)                 |
                +------------------------------------+
                |  Route Distinguisher (8 octets)    |
                +------------------------------------+
                |  VE ID (2 octets)                  |
                +------------------------------------+
                |  VE Block Offset (2 octets)        |
                +------------------------------------+
                |  VE Block Size (2 octets)          |
                +------------------------------------+
                |  Label Base (3 octets)             |
                +------------------------------------+

                        Figure 3: BGP VPLS NLRI

   For multi-homing operation, a customer-edge NLRI (CE NLRI) is
   proposed that uses BGP VPLS NLRI with the following fields set to
   zero: VE Block Offset, VE Block Size and Label Base. In addition,
   the VE-ID field of the NLRI is set to CE-ID. Thus, the CE NLRI
   contains 2 octets indicating the length, 8 octets for Route
   Distinguisher, 2 octets for CE-ID and 7 octets with value zero.

   It is valid to have non-zero VE block offset, VE block size and label
   base in the VPLS NLRI for a multi-homed site. VPLS operations,
   including multi-homing, in such a case are outside the scope of this
   document. However, for interoperability with existing deployments
   that use non-zero VE block offset, VE block size and label base for
   multi-homing operation, Section 6.1 provides more detail.

   Wherever VPLS NLRI is used in this document, context must be used to
   infer if it is applicable for CE NLRI, VE NLRI or for both.

3.2.  Deployment Considerations

   Each instance within a VPLS domain MUST be provisioned with a unique
   Route Distinguisher value. A unique Route Distinguisher allows VPLS
   advertisements from different VPLS PEs to be distinct even if the
   advertisements have the same VE-ID, which can occur in case of
   multi-homing. This allows standard BGP path selection rules to be
   applied to VPLS advertisements.
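
   The CE NLRI layout described in Section 3.1 can be sketched in a few
   lines of Python. This is only an illustration: the function names
   are hypothetical, and the sketch assumes that the Length field
   counts the 17 octets that follow it.

```python
import struct


def encode_ce_nlri(rd: bytes, ce_id: int) -> bytes:
    """Build a CE NLRI: Length, RD, CE-ID in the VE ID field, 7 zero octets."""
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    if not 1 <= ce_id <= 0xFFFF:
        raise ValueError("CE-ID must fit in 2 octets and be non-zero")
    # VE Block Offset (2), VE Block Size (2) and Label Base (3) are all zero.
    body = rd + struct.pack("!H", ce_id) + bytes(7)
    # Assumption: Length counts the octets that follow the Length field.
    return struct.pack("!H", len(body)) + body


def is_ce_nlri(nlri: bytes) -> bool:
    """True if the block offset, block size and label base are all zero."""
    return len(nlri) == 19 and nlri[12:19] == bytes(7)
```

   For example, encode_ce_nlri(rd, 5) yields a 19-octet NLRI whose
   last 7 octets are zero, which is how a receiver can tell a CE NLRI
   from a VE NLRI.
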

   Each VPLS PE must advertise a unique VE-ID with non-zero VE Block
   Offset, VE Block Size and Label Base values in the BGP NLRI. The
   VE-ID is associated with the base VPLS instance, and the NLRI
   associated with it must be used for creating PWs among VPLS PEs.
   Single-homed customer sites connected to the VPLS instance do not
   require any special addressing. However, an administrator (SP
   operator) can choose to have a CE-ID for a single-homed site as
   well. Multi-homed customer sites connected to the VPLS instance
   require special addressing, which is achieved by use of a CE-ID. A
   set of customer sites is identified as multi-homed if they all have
   the same CE-ID. The following examples illustrate the use of VE-ID
   and CE-ID.

   Figure 1 shows a customer site, CE1, multi-homed to two VPLS PEs, PE1
   and PE2. In order for all VPLS PEs to set up PWs to each other, each
   VPLS PE must be configured with a unique VE-ID for its base VPLS
   instance. In addition, in order for all VPLS PEs within the same
   VPLS domain to elect one of the multi-homed PEs as the designated
   forwarder, an indicator that the PEs are multi-homed to the same
   customer site is required. This is achieved by assigning the same
   VPLS site ID (CE-ID) on PE1 and PE2 for CE1. When remote VPLS PEs
   receive NLRI advertisements from PE1 and PE2 for CE1, the two NLRI
   advertisements for CE1 are identified as candidates for designated
   forwarder selection because they carry the same CE-ID. Thus, the
   same CE-ID MUST be assigned on all VPLS PEs that are multi-homed to
   the same customer site.

   Figure 2 shows two customer sites, CE1 and CE4, connected to PE1 with
   CE1 multi-homed to PE1 and PE2. As in the Figure 1 provisioning
   model, each VPLS PE must be configured with a unique VE-ID for its
   base VPLS instance. CE1, which is multi-homed to PE1 and PE2,
   requires configuration of a CE-ID, and both PE1 and PE2 MUST be
   provisioned with the same CE-ID for CE1. CE2 and CE3 are
   single-homed sites and do not require special addressing. However,
   an operator must configure a CE-ID for CE4 on PE1. By doing so,
   remote PEs can determine that PE1 has two VPLS sites, CE1 and CE4.
   If connectivity from both CE1 and CE4 to PE1 is down, remote PEs can
   choose, based on the D bit in the VE NLRI, not to send multicast
   traffic to PE1, as there are no VPLS sites reachable via PE1. If CE4
   were not assigned a unique CE-ID, remote PEs would have no way to
   know whether other VPLS sites are attached and hence would always
   send multicast traffic to PE1. While CE2 and CE3 can also be
   configured with unique CE-IDs, there is no advantage in doing so as
   both PE3 and PE4 have exactly one VPLS site.

   Note that a CE-ID of 0 is invalid and a PE should discard such an
   advertisement.

   Use of multiple VE-IDs per VPLS instance for either multi-homing
   operation or for any other purpose is outside the scope of this
   document. However, for interoperability with existing deployments
   that use multiple VE-IDs, Section 6.1 provides more detail.

3.3.  Designated Forwarder Election

   BGP-based multi-homing for VPLS relies on standard BGP path selection
   and VPLS DF election. The net result of doing both BGP path
   selection and VPLS DF election is that of electing a single
   designated forwarder (DF) among the set of PEs to which a customer
   site is multi-homed. All the PEs that are elected as non-designated
   forwarders MUST keep their attachment circuit to the multi-homed CE
   in blocked status (no forwarding).

   These election algorithms operate on VPLS advertisements, which
   include both the NLRI and attached BGP attributes. These election
   algorithms are applicable to all VPLS NLRIs, and not just to CE
   NLRIs.
   In order to simplify the explanation of these algorithms, we
   will use a number of variables derived from fields in the VPLS
   advertisement. These variables are: RD, SITE-ID, VBO, DOM, ACS, PREF
   and PE-ID. The notation

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   means that from a received VPLS advertisement ADV, the
   respective variables were derived. The following sections describe
   two attributes needed for DF election, then describe the variables
   and how they are derived from fields in VPLS advertisement ADV, and
   finally describe how DF election is done.

3.3.1.  Attributes

   The procedures below refer to two attributes: the Route Origin
   community (see Section 4.1) and the L2-info community (see
   Section 4.2). These attributes are required for inter-AS operation;
   for generality, the procedures below show how they are to be used.
   The procedures also outline how to handle the case that either or
   both are not present.

   For BGP-based Multi-homing, ADV MUST contain an L2-info extended
   community as specified in [RFC4761]. Within this community are
   various control flags. Two new control flags are proposed in this
   document. Figure 4 shows the position of the new 'D' and 'F' flags.

                    Control Flags Bit Vector

                        0 1 2 3 4 5 6 7
                       +-+-+-+-+-+-+-+-+
                       |D|Z|F|Z|Z|Z|C|S|  (Z = MUST Be Zero)
                       +-+-+-+-+-+-+-+-+

                                Figure 4

   1.  'D' (Down): Indicates connectivity status. In case of CE NLRI,
       the connectivity status is between a CE site and a VPLS PE. In
       case of VE NLRI, the connectivity status is for the VPLS
       instance. In case of CE NLRI, the bit MUST be set to one if all
       the attachment circuits connecting a CE site to a VPLS PE are
       down. In case of VE NLRI, the bit must be set to one if the VPLS
       instance is operationally down. Note that a VPLS instance that
       has no connectivity to any of its sites must be considered as
       operationally down.

   2.  'F' (Flush): Indicates when to flush MAC state. A designated
       forwarder must set the F bit and a non-designated forwarder must
       clear the F bit when sending BGP CE NLRIs for multi-homed sites.
       A state transition from one to zero for the F bit can be used by
       a remote PE to flush all the MACs learned from the PE that is
       transitioning from designated forwarder to non-designated
       forwarder. Refer to Section 5 for more details on the use case.

3.3.2.  Variables Used

3.3.2.1.  RD

   RD is simply set to the Route Distinguisher field in the NLRI part of
   ADV. The process of assigning Route Distinguisher values must
   guarantee uniqueness per PE node. Therefore, two multi-homed PEs
   offering the same VPLS service to a common set of CEs MUST each
   allocate a different RD value for this site.

3.3.2.2.  SITE-ID

   SITE-ID is simply set to the VE-ID field in the NLRI part of ADV.

   Note that no distinction is made whether the VE-ID is for a
   multi-homed site or not.

3.3.2.3.  VBO

   VBO is simply set to the VE Block Offset field in the NLRI part of
   ADV.

3.3.2.4.  DOM

   This variable, indicating the VPLS domain to which ADV belongs, is
   derived by applying BGP policy to the Route Target extended
   communities in ADV. The details of how this is done are outside the
   scope of this document.

3.3.2.5.  ACS

   ACS is the status of the attachment circuits for a given site of a
   VPLS. ACS = 1 if all attachment circuits for the site are down, and
   0 otherwise.

   ACS is set to the value of the 'D' bit in an ADV that carries a CE
   NLRI. If ADV belongs to the base VPLS instance (VE NLRI) with
   non-zero label block values, no change must be made to ACS.

3.3.2.6.  PREF

   PREF is derived from the Local Preference (LP) attribute in ADV as
   well as the VPLS Preference field (VP) in the L2-info extended
   community. If the Local Preference attribute is missing, LP is set
   to 0; if the L2-info community is missing, VP is set to 0.
   The following table shows how PREF is computed from LP and VP.

   +---------+---------------+----------+------------------------------+
   | VP      | LP Value      | PREF     | Comment                      |
   | Value   |               | Value    |                              |
   +---------+---------------+----------+------------------------------+
   | 0       | 0             | 0        | malformed advertisement,     |
   |         |               |          | unless ACS=1                 |
   |         |               |          |                              |
   | 0       | 1 to (2^16-1) | LP       | backwards compatibility      |
   |         |               |          |                              |
   | 0       | 2^16 to       | (2^16-1) | backwards compatibility      |
   |         | (2^32-1)      |          |                              |
   |         |               |          |                              |
   | >0      | LP same as VP | VP       | Implementation supports VP   |
   |         |               |          |                              |
   | >0      | LP != VP      | 0        | malformed advertisement      |
   +---------+---------------+----------+------------------------------+

                                 Table 1

3.3.2.7.  PE-ID

   If ADV contains a Route Origin (RO) community (see Section 4.1) with
   type 0x01, then PE-ID is set to the Global Administrator sub-field of
   the RO. Otherwise, if ADV has an ORIGINATOR_ID attribute, then PE-ID
   is set to the ORIGINATOR_ID. Otherwise, PE-ID is set to the BGP
   Identifier.

3.3.3.  Election Procedures

   The election procedures described in this section apply equally to
   BGP VPLS and LDP VPLS. A distinction MUST NOT be made on whether the
   NLRI is a multi-homing NLRI or not. One subset of these procedures,
   standard BGP best path selection, is the general BGP route selection
   processing defined in [RFC4271]. A separate part of the algorithm,
   VPLS DF election, is specific to designated forwarder election
   performed on VPLS advertisements. A concept of bucketization is
   introduced to define route selection rules for VPLS advertisements.
   Note that this is a conceptual description of the process; an
   implementation MAY choose to realize this differently as long as the
   semantics are preserved.

3.3.3.1.  Bucketization for standard BGP path selection

   An advertisement

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   is put into the bucket for <RD, SITE-ID, VBO>. In other words, the
   information in BGP path selection consists of <RD, SITE-ID, VBO> and
   only advertisements with exactly the same <RD, SITE-ID, VBO> are
   candidates for the BGP path selection procedure as defined in
   [RFC4271].

3.3.3.2.  Bucketization for VPLS DF Election

   An advertisement

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   is discarded if DOM is not of interest to the VPLS PE. Otherwise,
   ADV is put into the bucket for <DOM, SITE-ID>. In other words, all
   advertisements for a particular VPLS domain that have the same
   SITE-ID are candidates for VPLS DF election.

3.3.3.3.  Tie-breaking Rules

   This section describes the tie-breaking rules for VPLS DF election.
   Tie-breaking rules for VPLS DF election are applied to candidate
   advertisements by all VPLS PEs, and the actions taken by VPLS PEs
   based on the VPLS DF election result are described in Section 3.4.

   Given two advertisements ADV1 and ADV2 from a given bucket, first
   compute the variables needed for DF election:

      ADV1 -> <RD1, SITE-ID1, VBO1, DOM1, ACS1, PREF1, PE-ID1>
      ADV2 -> <RD2, SITE-ID2, VBO2, DOM2, ACS2, PREF2, PE-ID2>

   Note that SITE-ID1 = SITE-ID2 and DOM1 = DOM2, since ADV1 and ADV2
   came from the same bucket. Then the following tie-breaking rules
   MUST be applied in the given order.

   1.  if (ACS1 != 1) AND (ACS2 == 1) ADV1 wins; stop;
       else if (ACS1 == 1) AND (ACS2 != 1) ADV2 wins; stop;
       else continue

   2.  if (PREF1 > PREF2) ADV1 wins; stop;
       else if (PREF1 < PREF2) ADV2 wins; stop;
       else continue

   3.  if (PE-ID1 < PE-ID2) ADV1 wins; stop;
       else if (PE-ID1 > PE-ID2) ADV2 wins; stop;
       else ADV1 and ADV2 are from the same VPLS PE

   If there is no winner and ADV1 and ADV2 are from the same PE, a VPLS
   PE MUST retain both ADV1 and ADV2.

3.4.  DF Election on PEs

   The DF election algorithm MUST be run by all multi-homed VPLS PEs.
   In addition, all other PEs SHOULD also run the DF election algorithm.
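
   The PREF computation of Table 1, the <DOM, SITE-ID> bucketization
   and the three tie-breaking rules can be sketched together as follows.
   This is a conceptual illustration under stated assumptions, not a
   reference implementation: the Adv container and all function names
   are hypothetical, PE-ID is modelled as an integer, and advertisements
   whose DOM is not of interest are assumed to have been discarded
   already.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Adv:
    """Variables derived from one VPLS advertisement (hypothetical container)."""
    dom: str      # VPLS domain, derived from Route Targets by policy
    site_id: int  # VE-ID field of the NLRI
    acs: int      # 1 if all attachment circuits of the site are down, else 0
    lp: int       # Local Preference; 0 if the attribute is missing
    vp: int       # VPLS Preference; 0 if the L2-info community is missing
    pe_id: int    # assumption: originating PE address held as an integer


def compute_pref(vp: int, lp: int) -> int:
    """Compute PREF from VP and LP following Table 1."""
    if vp == 0:
        # LP of 0 yields 0; LP of 2^16 or more is capped at 2^16-1.
        return min(lp, 2**16 - 1)
    # VP > 0: LP must equal VP, otherwise the advertisement is malformed.
    return vp if lp == vp else 0


def tie_break(a: Adv, b: Adv) -> Optional[Adv]:
    """Apply rules 1 to 3 in order; None means both are from the same PE."""
    if a.acs != 1 and b.acs == 1:
        return a
    if a.acs == 1 and b.acs != 1:
        return b
    pref_a, pref_b = compute_pref(a.vp, a.lp), compute_pref(b.vp, b.lp)
    if pref_a != pref_b:
        return a if pref_a > pref_b else b
    if a.pe_id != b.pe_id:
        return a if a.pe_id < b.pe_id else b
    return None


def elect_df(advs: Iterable[Adv]) -> Dict[Tuple[str, int], Adv]:
    """Bucket advertisements by (DOM, SITE-ID); pick one winner per bucket."""
    buckets: Dict[Tuple[str, int], List[Adv]] = defaultdict(list)
    for adv in advs:
        buckets[(adv.dom, adv.site_id)].append(adv)
    winners: Dict[Tuple[str, int], Adv] = {}
    for key, bucket in buckets.items():
        best = bucket[0]
        for adv in bucket[1:]:
            # Keep the current best when both are from the same PE.
            best = tie_break(best, adv) or best
        winners[key] = best
    return winners
```

   Because the rules compare ACS, then PREF, then PE-ID, the pairwise
   comparison can be folded over a bucket in any order and still yields
   the same winner on every PE, which is the property the election
   depends on.
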
   As a result of the DF election, multi-homed PEs that lose the DF
   election for a SITE-ID MUST put the ACs associated with the SITE-ID
   in non-forwarding state.

   The DF election result on the egress PEs can be used in traffic
   forwarding decisions. Figure 2 shows two customer sites, CE1 and
   CE4, connected to PE1 with CE1 multi-homed to PE1 and PE2. If PE1 is
   the designated forwarder for CE1, based on the DF election result,
   PE3 can choose to not send unknown unicast and multicast traffic to
   PE2, as PE2 is not the designated forwarder for any customer site and
   it has no other single-homed sites connected to it.

3.5.  Pseudowire and Site-ID Binding Properties

   For the use case where a single PE provides connectivity to a set of
   CEs of which some are multi-homed and others are not, only a single
   pseudowire MAY be established. For example, if PE1 provides VPLS
   service to CE1 and CE4, which are both part of the same VPLS domain
   but are different sites, and CE1 is multi-homed but CE4 is not (as
   described in Figure 2), PE3 would establish only a single pseudowire
   toward PE1. A design needs to ensure that regardless of PE1's
   forwarding state (DF or non-DF) for multi-homed CE1, PE3's access to
   CE4 is established. Since label allocation and pseudowire
   establishment are tied to the site-ID, proper pseudowire bindings
   must be ensured.

   For a set of given advertisements with a common DOM but with
   different Site-ID values, a VPLS PE speaker SHOULD instantiate and
   bind the pseudowire based on the advertisement with the lowest
   Site-ID value. Otherwise, the binding would be arbitrary, and during
   DF changes for a multi-homed site, a non-multi-homed CE might suffer
   traffic loss.

4.  Multi-AS VPLS

   This section describes multi-homing in an inter-AS context.

4.1.  Route Origin Extended Community

   Due to lack of information about the PEs that originate the VPLS
   NLRIs in inter-AS operations, the Route Origin Extended Community
   [RFC4360] is used to carry the source PE's IP address.

   To use the Route Origin Extended Community for carrying the
   originator VPLS PE's loopback address, the type field of the
   community MUST be set to 0x01 and the Global Administrator sub-field
   MUST be set to the PE's loopback IP address.

4.2.  VPLS Preference

   When multiple PEs are assigned the same site ID for multi-homing, it
   is often desired to be able to control the selection of a particular
   PE as the designated forwarder. Section 3.5 in [RFC4761] describes
   the use of BGP Local Preference in path selection to choose a
   particular NLRI, where Local Preference indicates the degree of
   preference for a particular VE. The use of Local Preference is
   inadequate when VPLS PEs are spread across multiple ASes, as Local
   Preference is not carried across an AS boundary. A new field, VPLS
   preference (VP), is introduced in this document that can be used to
   accomplish this. VPLS preference indicates a degree of preference
   for a particular customer site. VPLS preference is not mandatory for
   intra-AS operation; the algorithm explained in Section 3.3 will work
   with or without the presence of VPLS preference.

   Section 3.2.4 in [RFC4761] describes the Layer2 Info Extended
   Community that carries control information about the pseudowires.
   The last two octets, which were reserved, now carry the VPLS
   preference as shown in Figure 5.
625 +------------------------------------+
626 | Extended community type (2 octets) |
627 +------------------------------------+
628 | Encaps Type (1 octet)              |
629 +------------------------------------+
630 | Control Flags (1 octet)            |
631 +------------------------------------+
632 | Layer-2 MTU (2 octets)             |
633 +------------------------------------+
634 | VPLS Preference (2 octets)         |
635 +------------------------------------+

637 Figure 5: Layer2 Info Extended Community

639 A VPLS preference is a 2-octet unsigned integer. A value of zero 640 indicates the absence of a VP and is not a valid preference value. This 641 interpretation is required for backwards compatibility. 642 Implementations using the Layer2 Info Extended Community as described in 643 Section 3.2.4 of [RFC4761] MUST set the last two octets to zero, since 644 they were a reserved field. 646 For backwards compatibility, if VPLS preference is used, then BGP 647 Local Preference MUST be set to the value of VPLS preference. Note 648 that a Local Preference value of zero for a CE-ID is not valid unless 649 the 'D' bit in the control flags is set (see 650 [I-D.kothari-l2vpn-auto-site-id]). In addition, a Local Preference 651 value greater than or equal to 2^16 for VPLS advertisements is not 652 valid. 654 4.3. Use of BGP attributes in Inter-AS Methods 656 Section 3.4 in [RFC4761] and section 4 in [RFC6074] describe three 657 methods (a, b and c) to connect sites in a VPLS to PEs that are 658 spread across multiple ASes. Since VPLS advertisements in method (a) do not 659 cross AS boundaries, multi-homing operations for method (a) remain 660 exactly the same as they are within an AS. However, for methods (b) 661 and (c), VPLS advertisements do cross AS boundaries. This section 662 describes the VPLS operations for method (b) and method (c). 663 Consider Figure 6 for inter-AS VPLS with multi-homed customer sites. 665 4.3.1. Inter-AS Method (b): EBGP Redistribution of VPLS Information 666 between ASBRs

668                           AS1                 AS2
669                         ........            ........
670         CE2 _______    .        .          .        .
671                 ___ PE1         .          .       PE3 --- CE3
672                /       :        .          .        :
673             __/        :        :          :        :
674       CE1 __           :      ASBR1 --- ASBR2       :
675              \         :        :          :        :
676               \___ PE2          .          .       PE4 ---- CE4
677                        .        .          .        .
678                         ........            ........

680 Figure 6: Inter-AS VPLS

682 A customer has four sites, CE1, CE2, CE3 and CE4. CE1 is multi-homed 683 to PE1 and PE2 in AS1. CE2 is single-homed to PE1. CE3 and CE4 are 684 also single-homed, to PE3 and PE4 respectively, in AS2. Assume that in 685 addition to the base LDP/BGP VPLS addressing (VSI-IDs/VE-IDs), CE-ID 686 1 is assigned for CE1. After running the DF election algorithm, all four 687 VPLS PEs must elect the same designated forwarder for the CE1 site. 688 Since BGP Local Preference is not carried across AS boundaries, VPLS 689 preference as described in Section 4.2 MUST be used for carrying site 690 preference in inter-AS VPLS operations. 692 For Inter-AS method (b), ASBR1 will send a VPLS NLRI received from PE1 693 to ASBR2 with itself as the BGP nexthop. ASBR2 will send the 694 NLRI received from ASBR1 to PE3 and PE4 with itself as the BGP 695 nexthop. Since VPLS PEs use BGP Local Preference in DF election, for 696 backwards compatibility, ASBR2 MUST set the Local Preference value in 697 the VPLS advertisements it sends to PE3 and PE4 to the VPLS 698 preference value contained in the VPLS advertisement it receives from 699 ASBR1. ASBR1 MUST do the same for the NLRIs it sends to PE1 and PE2. 700 If ASBR1 receives a VPLS advertisement without a valid VPLS 701 preference from a PE within its AS, then ASBR1 MUST set the VPLS 702 preference in the advertisement to the Local Preference value before 703 sending it to ASBR2. Similarly, ASBR2 must do the same for 704 advertisements without VPLS Preference it receives from PEs within 705 its AS. Thus, in method (b), ASBRs MUST update the VPLS preference and Local 706 Preference based on the advertisements they receive either from an 707 ASBR or a PE within their AS.
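The ASBR behaviour described above for method (b) can be sketched as follows. This is an illustrative sketch only, not part of the specification: the names VplsAdvert, to_remote_as and to_local_as are hypothetical, an advertisement is reduced to just its Local Preference (LP) and VPLS preference (VP), and the 'D' bit is ignored. Two points are assumptions that the text does not settle: an advertisement arriving from the remote AS with VP equal to zero is passed on with its LP untouched, and an LP of 2^16 or more (which the text calls invalid) is rejected with an error rather than copied into the 2-octet VP field.

```python
from dataclasses import dataclass, replace

MAX_VP = 0xFFFF  # VPLS preference is a 2-octet unsigned integer


@dataclass(frozen=True)
class VplsAdvert:
    """Hypothetical, reduced view of a VPLS advertisement."""
    local_pref: int  # BGP Local Preference (LP)
    vpls_pref: int   # VPLS preference (VP); zero means "no VP present"


def to_remote_as(adv: VplsAdvert) -> VplsAdvert:
    """Advertisement learned from a PE in the local AS, sent to the peer ASBR."""
    if adv.vpls_pref != 0:
        return adv  # a valid VP is already present; nothing to translate
    if adv.local_pref > MAX_VP:
        # Assumption: reject rather than truncate an LP that cannot fit in VP.
        raise ValueError("Local Preference >= 2^16 is not valid for VPLS")
    return replace(adv, vpls_pref=adv.local_pref)  # VP := LP


def to_local_as(adv: VplsAdvert) -> VplsAdvert:
    """Advertisement received from the peer ASBR, sent to PEs in the local AS."""
    if adv.vpls_pref == 0:
        return adv  # assumption: no VP to copy, so LP is left unchanged
    return replace(adv, local_pref=adv.vpls_pref)  # LP := VP
```

In terms of Figure 6, ASBR1 would apply to_remote_as to an advertisement from PE1 before sending it to ASBR2, and ASBR2 would apply to_local_as before sending it on to PE3 and PE4, so that all four PEs run the DF election on the same preference value.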
709 In Figure 6, PE1 will send the VPLS advertisements with a Route Origin 710 Extended Community containing its loopback address. PE2 will do the 711 same. Even though PE3 receives the VPLS advertisements for VE-ID 1 712 and 2 from the same BGP nexthop, ASBR2, the source PE address 713 contained in the Route Origin Extended Community is different for the 714 CE1 and CE2 advertisements, and thus, PE3 creates two PWs, one for 715 CE1 (for VE-ID 1) and another one for CE2 (for VE-ID 2). 717 4.3.2. Inter-AS Method (c): Multi-Hop EBGP Redistribution of VPLS 718 Information between ASes 720 In this method, there is a multi-hop E-BGP peering between the PEs or 721 Route Reflectors in AS1 and the PEs or Route Reflectors in AS2. 722 There is no VPLS state in either the control or data plane on the ASBRs. 723 The multi-homing operations on the PEs in this method are exactly the 724 same as they are in the intra-AS scenario. However, since Local 725 Preference is not carried across AS boundaries, the translation of LP 726 to VP and vice versa MUST be done by the RR, if an RR is used to reflect 727 VPLS advertisements to other ASes. This is exactly the same as what 728 an ASBR does in the case of method (b). An RR must set the VP to the LP 729 value in an advertisement before sending it to other ASes and must 730 set the LP to the VP value in an advertisement that it receives from 731 other ASes before sending it to the PEs within the AS. 733 5. MAC Flush Operations 735 In a service provider VPLS network, customer MAC learning is confined 736 to PE devices, and any intermediate nodes, such as a Route Reflector, 737 do not have any state for MAC addresses. 739 Topology changes either in the service provider's network or in 740 the customer's network can result in the movement of MAC addresses from 741 one PE device to another. Such events can result in traffic being 742 dropped due to stale state of MAC addresses on the PE devices.
Age 743 out timers that clear the stale state will eventually restore traffic 744 forwarding, but age out timers are typically on the order of minutes, and 745 convergence on the order of minutes can severely impact the customer's 746 service. To handle such events and expedite convergence of traffic, 747 flushing of affected MAC addresses is highly desirable. 749 5.1. MAC Flush Indicators 751 If the 'D' bit in the control flags is set in a received VE NLRI, the 752 receiving PE SHOULD flush all the MAC addresses learned from the PE 753 advertising the failure. 755 Any time a designated forwarder change occurs, a remote PE SHOULD 756 flush all the MAC addresses it learned from the PE that lost the DF 757 election (the old designated forwarder). If multiple customer sites are 758 connected to the same PE, such as PE1 in Figure 2, and redundancy 759 per site is desired when multi-homing procedures described in this 760 document are in effect, then it is desirable to flush just the 761 relevant MAC addresses from a particular site when the site 762 connectivity is lost. However, procedures for flushing a limited set 763 of MAC addresses are beyond the scope of this document. Use of 764 either the 'D' or the 'F' bit in the control flags only allows flushing all MAC 765 addresses associated with a PE. 767 A designated forwarder change can occur in the absence of failures, such as 768 when an attachment circuit comes up. Consider the case in Figure 2 769 where the PE1-CE1 link is non-operational and PE2 is the designated 770 forwarder for CE1. Also assume that the Local Preference of PE1 is 771 higher than that of PE2. When the PE1-CE1 link becomes operational, PE1 will 772 send a BGP CE advertisement for CE1 to all its peers. If PE3 773 performs the DF election before PE2, there is a chance that PE3 might 774 learn MAC addresses from PE2 after it was done electing PE1. This 775 can happen since PE2 has not yet processed the BGP CE advertisement 776 from PE1 and as a result continues to send traffic to PE3.
This can 777 cause traffic from PE3 to CE1 to black-hole until those MAC addresses 778 are deleted due to age out timers. Therefore, to avoid such race 779 conditions, a designated forwarder must set the 'F' bit and a non- 780 designated forwarder must clear the 'F' bit when sending BGP CE 781 advertisements. A state transition from one to zero for the 'F' bit 782 can be used by a remote PE to flush all the MACs learned from the PE 783 that is transitioning from designated forwarder to non-designated 784 forwarder. 786 5.2. Minimizing the effects of fast link transitions 788 Certain failure scenarios may result in fast transitions of the link 789 towards the multi-homing CE, which in turn will generate fast status 790 transitions of one or multiple multi-homed sites reflected through 791 multiple BGP CE advertisements and LDP MAC Flush messages. 793 It is recommended that a timer to damp the link flaps be used for the 794 port towards the multi-homed CE to minimize the number of MAC Flush 795 events in the remote PEs and the occurrences of BGP state compression 796 for 'F' bit transitions. A timer value greater than the time it takes BGP 797 to converge in the network is recommended. 799 6. Backwards Compatibility 801 No forwarding loops are formed when PEs or Route Reflectors that do 802 not support the procedures defined in this section coexist in the 803 network with PEs or Route Reflectors that do support them. 805 6.1. BGP based VPLS 807 As explained in this section, PEs multi-homed to the same customer 808 site MUST assign the same CE-ID, and the related NLRI SHOULD contain the 809 block offset, block size and label base as zero. Remote PEs that 810 lack support for the multi-homing operations specified in this document 811 will fail to create any PWs for the multi-homed CE-IDs due to the 812 label value of zero and thus, the multi-homing NLRI should have no 813 impact on the operation of Remote PEs that lack support for the multi- 814 homing operations specified in this document.
816 For compatibility with PEs that use multiple VE-IDs with non-zero 817 label block values for multi-homing operation, it is a requirement 818 that a PE receiving such advertisements must use the labels in the 819 NLRIs associated with the lowest VE-ID for PW creation. It is possible 820 that maintaining the PW association with the lowest VE-ID can result in PW 821 flap, and thus, traffic loss. However, it is necessary to maintain 822 the association of the PW with the lowest VE-ID as it provides 823 deterministic DF election among all the VPLS PEs. 825 6.2. LDP VPLS with BGP Auto-discovery 827 The BGP-AD NLRI has a prefix length of 12, containing only an 8-byte 828 RD and a 4-byte VSI-ID. If an LDP VPLS PE running BGP-AD lacks 829 support for the multi-homing operations specified in this document, it 830 SHOULD ignore a CE NLRI with the length field of 17. As a result it 831 will not ask LDP to create any PWs for the multi-homed Site-ID and 832 thus, the multi-homing NLRI should have no impact on LDP VPLS 833 operation. MH PEs may use existing LDP MAC Flush to flush the remote 834 LDP VPLS PEs or may use the MAC Flush procedures as described in 835 Section 5. 837 7. Security Considerations 839 No new security issues are introduced beyond those that are described 840 in [RFC4761] and [RFC4762]. 842 8. IANA Considerations 844 IANA already has a registry for "Layer2 Info Extended Community 845 Control Flags Bit Vector". 848 This document requires two new bit flags to be assigned as follows:

850 Value   Name                               Reference
851 -----   --------------------------------   --------------
852 D       Down connectivity status           This document
853 F       MAC flush indicator                This document

855 9. Contributing Authors 857 The authors would also like to thank Senad Palislamovic and Wen Lin 858 for their contribution to the development of this document. 860 Senad Palislamovic 861 Nokia 862 Email: senad.palislamovic@nokia.com 864 Wen Lin 865 Juniper Networks 866 Email: wlin@juniper.net 868 10.
Acknowledgments 870 The authors would like to thank Yakov Rekhter, Nischal Sheth, Mitali 871 Singh, Ian Cowburn and Jonathan Hardwick for their insightful 872 comments and probing questions. 874 11. References 876 11.1. Normative References 878 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 879 Requirement Levels", BCP 14, RFC 2119, 880 DOI 10.17487/RFC2119, March 1997, 881 . 883 [RFC4761] Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private 884 LAN Service (VPLS) Using BGP for Auto-Discovery and 885 Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007, 886 . 888 [RFC6074] Rosen, E., Davie, B., Radoaca, V., and W. Luo, 889 "Provisioning, Auto-Discovery, and Signaling in Layer 2 890 Virtual Private Networks (L2VPNs)", RFC 6074, 891 DOI 10.17487/RFC6074, January 2011, 892 . 894 11.2. Informative References 896 [I-D.kothari-l2vpn-auto-site-id] 897 Kothari, B., Kompella, K., and T. IV, "Automatic 898 Generation of Site IDs for Virtual Private LAN Service", 899 draft-kothari-l2vpn-auto-site-id-01 (work in progress), 900 October 2008. 902 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 903 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 904 DOI 10.17487/RFC4271, January 2006, 905 . 907 [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended 908 Communities Attribute", RFC 4360, DOI 10.17487/RFC4360, 909 February 2006, . 911 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 912 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 913 2006, . 915 [RFC4762] Lasserre, M., Ed. and V. Kompella, Ed., "Virtual Private 916 LAN Service (VPLS) Using Label Distribution Protocol (LDP) 917 Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007, 918 . 920 Authors' Addresses 922 Bhupesh Kothari 923 Augtera Networks 925 Email: bhupesh@anvaya.net 927 Kireeti Kompella 928 Juniper Networks 929 1194 N. Mathilda Ave. 
930 Sunnyvale, CA 94089 931 US 933 Email: kireeti.kompella@gmail.com 935 Wim Henderickx 936 Nokia 938 Email: wim.henderickx@nokia.com 939 Florin Balus 940 Cisco 942 Email: fbalus@gmail.com 944 James Uttaro 945 AT&T 946 200 S. Laurel Avenue 947 Middletown, NJ 07748 948 US 950 Email: uttaro@att.com