Network Working Group                                         B. Kothari
Internet-Draft                                                 Gainspeed
Updates: 4761 (if approved)                                  K. Kompella
Intended status: Standards Track                        Juniper Networks
Expires: July 9, 2016                                      W. Henderickx
                                                                F. Balus
                                                          Alcatel-Lucent
                                                               J. Uttaro
                                                                    AT&T
                                                         S. Palislamovic
                                                          Alcatel-Lucent
                                                                  W. Lin
                                                        Juniper Networks
                                                         January 6, 2016


         BGP based Multi-homing in Virtual Private LAN Service
                draft-ietf-bess-vpls-multihoming-01.txt

Abstract

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN).  It is often
   required for the Service Provider (SP) to give the customer redundant
   connectivity to some sites, often called "multi-homing".  This memo
   shows how BGP-based multi-homing can be offered in the context of LDP
   and BGP VPLS solutions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 9, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  General Terminology
     1.2.  Conventions
   2.  Background
     2.1.  Scenarios
     2.2.  VPLS Multi-homing Considerations
   3.  Multi-homing Operation
     3.1.  Customer Edge (CE) NLRI
     3.2.  Deployment Considerations
     3.3.  Designated Forwarder Election
       3.3.1.  Attributes
       3.3.2.  Variables Used
       3.3.3.  Election Procedures
     3.4.  DF Election on PEs
   4.  Multi-AS VPLS
     4.1.  Route Origin Extended Community
     4.2.  VPLS Preference
     4.3.  Use of BGP attributes in Inter-AS Methods
       4.3.1.  Inter-AS Method (b): EBGP Redistribution of VPLS
               Information between ASBRs
       4.3.2.  Inter-AS Method (c): Multi-Hop EBGP Redistribution of
               VPLS Information between ASes
   5.  MAC Flush Operations
     5.1.  MAC Flush Indicators
     5.2.  Minimizing the effects of fast link transitions
   6.  Backwards Compatibility
     6.1.  BGP based VPLS
     6.2.  LDP VPLS with BGP Auto-discovery
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgments
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private
   Network (VPN) that gives its customers the appearance that their
   sites are connected via a Local Area Network (LAN).  It is often
   required for a Service Provider (SP) to give the customer redundant
   connectivity to one or more sites, often called "multi-homing".
   [RFC4761] explains how VPLS can be offered using BGP for
   auto-discovery and signaling; section 3.5 of that document describes
   how multi-homing can be achieved in this context.  [RFC6074] explains
   how VPLS can be offered using BGP for auto-discovery (BGP-AD) and
   [RFC4762] explains how VPLS can be offered using LDP for signaling.
   This document provides a BGP-based multi-homing solution applicable
   to both BGP and LDP VPLS technologies.  Note that BGP MH can be used
   for LDP VPLS without the use of the BGP-AD solution.

   Section 2 lays out some of the scenarios for multi-homing, other ways
   that this can be achieved, and some of the expectations of BGP-based
   multi-homing.  Section 3 defines the components of BGP-based
   multi-homing, and the procedures required to achieve this.  Section 7
   may someday discuss security considerations.

1.1.  General Terminology

   Some general terminology is defined here; most is from [RFC4761],
   [RFC4762] or [RFC4364].  Terminology specific to this memo is
   introduced as needed in later sections.

   A "Customer Edge" (CE) device, typically located on customer
   premises, connects to a "Provider Edge" (PE) device, which is owned
   and operated by the SP.  A "Provider" (P) device is also owned and
   operated by the SP, but has no direct customer connections.  A "VPLS
   Edge" (VE) device is a PE that offers VPLS services.

   A VPLS domain represents a bridging domain per customer.  A Route
   Target community as described in [RFC4360] is typically used to
   identify all the PE routers participating in a particular VPLS
   domain.  A VPLS site is a grouping of ports on a PE that belong to
   the same VPLS domain.  The terms "VPLS instance" and "VPLS domain"
   are used interchangeably in this document.

   A VPLS site connected to only one PE is called a single-homed VPLS
   site.  The terms "VPLS site" and "CE site" are used interchangeably
   in this document.

   A VPLS site connected to multiple PEs is called a multi-homed site.

   A BGP VPLS NLRI for the base VPLS instance that has non-zero VE block
   offset, VE block size and label base is called a VE NLRI in this
   document.  Each VPLS instance is uniquely identified by a VE-ID.  The
   VE-ID is carried in the BGP VPLS NLRI as specified in section 3.2.2
   of [RFC4761].

   A VPLS NLRI with value zero for the VE block offset, VE block size
   and label base is called a CE NLRI in this document.
   Section 3.1 defines the CE NLRI and provides more detail.

   A Multi-homed (MH) site is uniquely identified by a CE-ID.  Sites are
   referred to as local or remote depending on whether they are
   configured on the PE router in context or on one of the remote PE
   routers (network peers).  A single-homed site can also be assigned a
   CE-ID, but it is not mandatory to configure a CE-ID for single-homed
   sites.  Section 3.1 provides detail on the CE-ID.

1.2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Background

   This section describes various scenarios where multi-homing may be
   required, and the implications thereof.  It also describes some of
   the singular properties of VPLS multi-homing, and what that means
   from both an operational point of view and an implementation point of
   view.  There are other approaches for providing multi-homing, such as
   Spanning Tree Protocol; this document specifies the use of BGP for
   multi-homing.  A comprehensive comparison among the approaches is
   outside the scope of this document.

2.1.  Scenarios

   CE1 is a VPLS CE that is dual-homed to both PE1 and PE2 for redundant
   connectivity.

                    ...............
                   .               .     ___ CE2
              ___ PE1               .   /
             /    :                  PE3
          __/    :       Service      :
      CE1 __     :       Provider    PE4
            \     :                   :   \___ CE3
             \___ PE2                .
                   .               .
                    ...............

                          Figure 1: Scenario 1

   CE1 is a VPLS CE that is dual-homed to both PE1 and PE2 for redundant
   connectivity.  However, CE4, which is also in the same VPLS domain,
   is single-homed to just PE1.

      CE4 -------   ...............
              \    .               .     ___ CE2
              ___ PE1               .   /
             /    :                  PE3
          __/    :       Service      :
      CE1 __     :       Provider    PE4
            \     :                   :   \___ CE3
             \___ PE2                .
                   .               .
                    ...............

                          Figure 2: Scenario 2

2.2.  VPLS Multi-homing Considerations

   The first (perhaps obvious) fact about a multi-homed VPLS CE, such as
   CE1 in Figure 1, is that if CE1 is an Ethernet switch or bridge, a
   loop has been created in the customer VPLS.  This is a dangerous
   situation for an Ethernet network, and the loop must be broken.  Even
   if CE1 is a router, it will get duplicates every time a packet is
   flooded, which is clearly undesirable.

   The next is that (unlike the case of IP-based multi-homing) only one
   of PE1 and PE2 can be actively sending traffic, either towards CE1 or
   into the SP cloud.  That is to say, load balancing techniques will
   not work.  Call the PE that is chosen to send traffic to/from CE1 the
   "designated forwarder".  All other PEs MUST choose the same
   designated forwarder for a multi-homed site.

   In Figure 2, CE1 and CE4 must be dealt with independently, since CE1
   is dual-homed, but CE4 is not.

3.  Multi-homing Operation

   This section describes procedures for electing a designated forwarder
   among the set of PEs that are multi-homed to a customer site.  The
   procedures described in this section are applicable to BGP based
   VPLS, LDP based VPLS with BGP-AD, or a VPLS that contains a mix of
   both BGP and LDP signaled PWs.

3.1.  Customer Edge (CE) NLRI

   Section 3.2.2 in [RFC4761] specifies an NLRI to be used for BGP based
   VPLS (BGP VPLS NLRI).  The format of the BGP VPLS NLRI is shown
   below.

   +------------------------------------+
   | Length (2 octets)                  |
   +------------------------------------+
   | Route Distinguisher (8 octets)     |
   +------------------------------------+
   | VE ID (2 octets)                   |
   +------------------------------------+
   | VE Block Offset (2 octets)         |
   +------------------------------------+
   | VE Block Size (2 octets)           |
   +------------------------------------+
   | Label Base (3 octets)              |
   +------------------------------------+

                             BGP VPLS NLRI

   For multi-homing operation, a customer-edge NLRI (CE NLRI) is
   proposed that uses the BGP VPLS NLRI with the following fields set to
   zero: VE Block Offset, VE Block Size and Label Base.  In addition,
   the VE-ID field of the NLRI is set to the CE-ID.  Thus, the CE NLRI
   contains 2 octets indicating the length, 8 octets for the Route
   Distinguisher, 2 octets for the CE-ID and 7 octets with value zero.

   It is valid to have non-zero VE block offset, VE block size and label
   base in the VPLS NLRI for a multi-homed site.  VPLS operations,
   including multi-homing, in such a case are outside the scope of this
   document.  However, for interoperability with existing deployments
   that use non-zero VE block offset, VE block size and label base for
   multi-homing operation, Section 6.1 provides more detail.

   Wherever VPLS NLRI is used in this document, context must be used to
   infer whether it applies to the CE NLRI, the VE NLRI or both.

3.2.  Deployment Considerations

   Each instance within a VPLS domain MUST be provisioned with a unique
   Route Distinguisher value.  A unique Route Distinguisher allows VPLS
   advertisements from different VPLS PEs to be distinct even if the
   advertisements have the same VE-ID, which can occur in case of
   multi-homing.  This allows standard BGP path selection rules to be
   applied to VPLS advertisements.
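
   As a rough illustration of the CE NLRI layout from Section 3.1, the
   Python sketch below packs and classifies such an NLRI.  The function
   names are hypothetical and not part of any specification.  The sketch
   assumes that the Length field counts the 17 octets that follow it,
   and it rejects CE-ID 0, which Section 3.2 treats as invalid.

```python
import struct

NLRI_BODY_LEN = 17  # assumption: RD(8) + VE-ID(2) + VBO(2) + VBS(2) + label base(3)


def encode_ce_nlri(rd: bytes, ce_id: int) -> bytes:
    """Pack a CE NLRI: Length | RD | CE-ID | 7 zero octets (hypothetical helper)."""
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    if not 1 <= ce_id <= 0xFFFF:
        raise ValueError("CE-ID must be in 1..65535")
    # VE Block Offset, VE Block Size and Label Base are all zero in a CE NLRI.
    body = rd + struct.pack("!H", ce_id) + bytes(7)
    return struct.pack("!H", len(body)) + body


def decode_vpls_nlri(data: bytes) -> dict:
    """Split a BGP VPLS NLRI into its fields and flag whether it is a CE NLRI."""
    if len(data) != 2 + NLRI_BODY_LEN:
        raise ValueError("unexpected NLRI size")
    (length,) = struct.unpack("!H", data[:2])
    if length != NLRI_BODY_LEN:
        raise ValueError("unexpected Length field")
    ve_id, vbo, vbs = struct.unpack("!HHH", data[10:16])
    label_base = int.from_bytes(data[16:19], "big")  # raw 3-octet field
    return {
        "rd": data[2:10],
        "ve_id": ve_id,  # carries the CE-ID when this is a CE NLRI
        "vbo": vbo,
        "vbs": vbs,
        "label_base": label_base,
        "is_ce_nlri": vbo == 0 and vbs == 0 and label_base == 0,
    }
```

   A receiver could use the is_ce_nlri flag to decide whether the VE-ID
   field should be read as a CE-ID.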

   Each VPLS PE must advertise a unique VE-ID with non-zero VE Block
   Offset, VE Block Size and Label Base values in the BGP NLRI.  The
   VE-ID is associated with the base VPLS instance, and the NLRI
   associated with it must be used for creating PWs among VPLS PEs.
   Single-homed customer sites connected to the VPLS instance do not
   require any special addressing.  However, an administrator (SP
   operator) can choose to have a CE-ID for a single-homed site as well.
   Multi-homed customer sites connected to the VPLS instance require
   special addressing, which is achieved by use of the CE-ID.  A set of
   customer sites are distinguished as multi-homed if they all have the
   same CE-ID.  The following examples illustrate the use of VE-ID and
   CE-ID.

   Figure 1 shows a customer site, CE1, multi-homed to two VPLS PEs, PE1
   and PE2.  In order for all VPLS PEs to set up PWs to each other, each
   VPLS PE must be configured with a unique VE-ID for its base VPLS
   instance.  In addition, in order for all VPLS PEs within the same
   VPLS domain to elect one of the multi-homed PEs as the designated
   forwarder, an indicator that the PEs are multi-homed to the same
   customer site is required.  This is achieved by assigning the same
   VPLS site ID (CE-ID) on PE1 and PE2 for CE1.  When remote VPLS PEs
   receive NLRI advertisements from PE1 and PE2 for CE1, the two NLRI
   advertisements for CE1 are identified as candidates for designated
   forwarder selection because they carry the same CE-ID.  Thus, the
   same CE-ID MUST be assigned on all VPLS PEs that are multi-homed to
   the same customer site.

   Figure 2 shows two customer sites, CE1 and CE4, connected to PE1 with
   CE1 multi-homed to PE1 and PE2.  As in the Figure 1 provisioning
   model, each VPLS PE must be configured with a unique VE-ID for its
   base VPLS instance.
   CE1, which is multi-homed to PE1 and PE2, requires configuration of a
   CE-ID, and both PE1 and PE2 MUST be provisioned with the same CE-ID
   for CE1.  CE2, CE3 and CE4 are single-homed sites and do not require
   special addressing.  However, an operator can choose to configure a
   CE-ID for CE4 on PE1.  By doing so, remote PEs can determine that PE1
   has two VPLS sites, CE1 and CE4.  If both CE1 and CE4 connectivity to
   PE1 is down, remote PEs can choose not to send multicast traffic to
   PE1 as there are no VPLS sites reachable via PE1.  If CE4 were not
   assigned a unique CE-ID, remote PEs would have no way to know whether
   there are other VPLS sites attached and hence would always send
   multicast traffic to PE1.  While CE2 and CE3 can also be configured
   with unique CE-IDs, there is no advantage in doing so as both PE3 and
   PE4 have exactly one VPLS site.

   Note that a CE-ID of 0 is invalid and a PE should discard such an
   advertisement.

   Use of multiple VE-IDs per VPLS instance for either multi-homing
   operation or for any other purpose is outside the scope of this
   document.  However, for interoperability with existing deployments
   that use multiple VE-IDs, Section 6.1 provides more detail.

3.3.  Designated Forwarder Election

   BGP-based multi-homing for VPLS relies on standard BGP path selection
   and VPLS DF election.  The net result of doing both BGP path
   selection and VPLS DF election is the election of a single designated
   forwarder (DF) among the set of PEs to which a customer site is
   multi-homed.  All the PEs that are elected as non-designated
   forwarders MUST keep their attachment circuit to the multi-homed CE
   in blocked status (no forwarding).

   These election algorithms operate on VPLS advertisements, which
   include both the NLRI and attached BGP attributes.  These election
   algorithms are applicable to all VPLS NLRIs, and not just to CE
   NLRIs.
   In order to simplify the explanation of these algorithms, we will use
   a number of variables derived from fields in the VPLS advertisement.
   These variables are: RD, SITE-ID, VBO, DOM, ACS, PREF and PE-ID.  The
   notation

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   means that from a received VPLS advertisement ADV, the respective
   variables were derived.  The following sections describe two
   attributes needed for DF election, then describe the variables and
   how they are derived from fields in VPLS advertisement ADV, and
   finally describe how DF election is done.

3.3.1.  Attributes

   The procedures below refer to two attributes: the Route Origin
   community (see Section 4.1) and the L2-info community (see
   Section 4.2).  These attributes are required for inter-AS operation;
   for generality, the procedures below show how they are to be used.
   The procedures also outline how to handle the case that either or
   both are not present.

   For BGP-based multi-homing, ADV MUST contain an L2-info extended
   community as specified in [RFC4761].  Within this community are
   various control flags.  Two new control flags are proposed in this
   document.  Figure 3 shows the position of the new 'D' and 'F' flags.

   Control Flags Bit Vector

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |D|Z|F|Z|Z|Z|C|S| (Z = MUST Be Zero)
   +-+-+-+-+-+-+-+-+

                                Figure 3

   1.  'D' (Down): Indicates connectivity status.  In case of CE NLRI,
       the connectivity status is between a CE site and a VPLS PE.  In
       case of VE NLRI, the connectivity status is for the VPLS
       instance.  In case of CE NLRI, the bit MUST be set to one if all
       the attachment circuits connecting a CE site to a VPLS PE are
       down.  In case of VE NLRI, the bit must be set to one if the VPLS
       instance is operationally down.  Note that a VPLS instance that
       has no connectivity to any of its sites must be considered as
       operationally down.

   2.  'F' (Flush): Indicates when to flush MAC state.
       A designated forwarder must set the F bit and a non-designated
       forwarder must clear the F bit when sending BGP CE NLRIs for
       multi-homed sites.  A state transition from one to zero for the F
       bit can be used by a remote PE to flush all the MACs learned from
       the PE that is transitioning from designated forwarder to
       non-designated forwarder.  Refer to Section 5 for more details on
       the use case.

3.3.2.  Variables Used

3.3.2.1.  RD

   RD is simply set to the Route Distinguisher field in the NLRI part of
   ADV.

3.3.2.2.  SITE-ID

   SITE-ID is simply set to the VE-ID field in the NLRI part of ADV.

   Note that no distinction is made whether the VE-ID is for a
   multi-homed site or not.

3.3.2.3.  VBO

   VBO is simply set to the VE Block Offset field in the NLRI part of
   ADV.

3.3.2.4.  DOM

   This variable, indicating the VPLS domain to which ADV belongs, is
   derived by applying BGP policy to the Route Target extended
   communities in ADV.  The details of how this is done are outside the
   scope of this document.

3.3.2.5.  ACS

   ACS is the status of the attachment circuits for a given site of a
   VPLS.  ACS = 1 if all attachment circuits for the site are down, and
   0 otherwise.

   For an ADV that carries a CE NLRI, ACS is set to the value of the 'D'
   bit in ADV.  If ADV belongs to the base VPLS instance (VE NLRI) with
   non-zero label block values, ACS must not be changed.

3.3.2.6.  PREF

   PREF is derived from the Local Preference (LP) attribute in ADV as
   well as the VPLS Preference field (VP) in the L2-info extended
   community.  If the Local Preference attribute is missing, LP is set
   to 0; if the L2-info community is missing, VP is set to 0.  The
   following table shows how PREF is computed from LP and VP.

   +---------+---------------+----------+------------------------------+
   | VP      | LP Value      | PREF     | Comment                      |
   | Value   |               | Value    |                              |
   +---------+---------------+----------+------------------------------+
   | 0       | 0             | 0        | malformed advertisement,     |
   |         |               |          | unless ACS=1                 |
   |         |               |          |                              |
   | 0       | 1 to (2^16-1) | LP       | backwards compatibility      |
   |         |               |          |                              |
   | 0       | 2^16 to       | (2^16-1) | backwards compatibility      |
   |         | (2^32-1)      |          |                              |
   |         |               |          |                              |
   | >0      | LP same as VP | VP       | Implementation supports VP   |
   |         |               |          |                              |
   | >0      | LP != VP      | 0        | malformed advertisement      |
   +---------+---------------+----------+------------------------------+

                                Table 1

3.3.2.7.  PE-ID

   If ADV contains a Route Origin (RO) community (see Section 4.1) with
   type 0x01, then PE-ID is set to the Global Administrator sub-field of
   the RO.  Otherwise, if ADV has an ORIGINATOR_ID attribute, then PE-ID
   is set to the ORIGINATOR_ID.  Otherwise, PE-ID is set to the BGP
   Identifier.

3.3.3.  Election Procedures

   The election procedures described in this section apply equally to
   BGP VPLS and LDP VPLS.  A distinction MUST NOT be made on whether the
   NLRI is a multi-homing NLRI or not.  The subset of these procedures
   described under standard BGP path selection deals with general BGP
   route selection processing as defined in [RFC4271].  A separate part
   of the algorithm, described under VPLS DF election, is specific to
   the designated forwarder election procedures performed on VPLS
   advertisements.  A concept of bucketization is introduced to define
   route selection rules for VPLS advertisements.  Note that this is a
   conceptual description of the process; an implementation MAY choose
   to realize this differently as long as the semantics are preserved.

3.3.3.1.  Bucketization for standard BGP path selection

   An advertisement

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   is put into the bucket for <RD, SITE-ID, VBO>.
   In other words, the information in BGP path selection consists of
   <RD, SITE-ID, VBO>, and only advertisements with exactly the same
   <RD, SITE-ID, VBO> are candidates for the BGP path selection
   procedure as defined in [RFC4271].

3.3.3.2.  Bucketization for VPLS DF Election

   An advertisement

      ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID>

   is discarded if DOM is not of interest to the VPLS PE.  Otherwise,
   ADV is put into the bucket for <DOM, SITE-ID>.  In other words, all
   advertisements for a particular VPLS domain that have the same
   SITE-ID are candidates for VPLS DF election.

3.3.3.3.  Tie-breaking Rules

   This section describes the tie-breaking rules for VPLS DF election.
   Tie-breaking rules for VPLS DF election are applied to candidate
   advertisements by all VPLS PEs, and the actions taken by VPLS PEs
   based on the VPLS DF election result are described in Section 3.4.

   Given two advertisements ADV1 and ADV2 from a given bucket, first
   compute the variables needed for DF election:

      ADV1 -> <RD1, SITE-ID1, VBO1, DOM1, ACS1, PREF1, PE-ID1>
      ADV2 -> <RD2, SITE-ID2, VBO2, DOM2, ACS2, PREF2, PE-ID2>

   Note that SITE-ID1 = SITE-ID2 and DOM1 = DOM2, since ADV1 and ADV2
   came from the same bucket.  Then the following tie-breaking rules
   MUST be applied in the given order.

   1.  if (ACS1 != 1) AND (ACS2 == 1) ADV1 wins; stop
       else if (ACS1 == 1) AND (ACS2 != 1) ADV2 wins; stop
       else continue

   2.  if (PREF1 > PREF2) ADV1 wins; stop;
       else if (PREF1 < PREF2) ADV2 wins; stop;
       else continue

   3.  if (PE-ID1 < PE-ID2) ADV1 wins; stop;
       else if (PE-ID1 > PE-ID2) ADV2 wins; stop;
       else ADV1 and ADV2 are from the same VPLS PE

   If there is no winner and ADV1 and ADV2 are from the same PE, a VPLS
   PE MUST retain both ADV1 and ADV2.

3.4.  DF Election on PEs

   The DF election algorithm MUST be run by all multi-homed VPLS PEs.
   In addition, all other PEs SHOULD also run the DF election algorithm.
   As a result of the DF election, multi-homed PEs that lose the DF
   election for a SITE-ID MUST put the ACs associated with the SITE-ID
   in non-forwarding state.
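
   The PREF derivation of Table 1, the DF-election bucketization and the
   tie-breaking rules can be sketched in Python as follows.  This is a
   minimal illustration under stated assumptions, not a definitive
   implementation: the Adv class and all function names are
   hypothetical, PE-ID is modelled as a plain unsigned integer, and
   advertisements that Table 1 marks as malformed simply receive PREF 0
   because their further handling is not described here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Adv:
    """Variables derived from one VPLS advertisement (hypothetical model)."""
    dom: str            # VPLS domain derived from the Route Targets
    site_id: int        # VE-ID field of the NLRI
    acs: int            # 1 if all attachment circuits are down, else 0
    lp: Optional[int]   # Local Preference; None if the attribute is missing
    vp: Optional[int]   # VPLS Preference; None if L2-info is missing
    pe_id: int          # PE identifier as an unsigned integer (assumption)


def compute_pref(lp: Optional[int], vp: Optional[int]) -> int:
    """Derive PREF from LP and VP following Table 1."""
    lp = 0 if lp is None else lp
    vp = 0 if vp is None else vp
    if vp == 0:
        return min(lp, 2**16 - 1)   # LP is used, capped at 2^16-1
    return vp if lp == vp else 0    # LP != VP is malformed, PREF 0


def tie_break(a: Adv, b: Adv) -> Optional[Adv]:
    """Apply rules 1 to 3 in order; None means both come from the same PE."""
    if a.acs != 1 and b.acs == 1:
        return a
    if a.acs == 1 and b.acs != 1:
        return b
    pref_a, pref_b = compute_pref(a.lp, a.vp), compute_pref(b.lp, b.vp)
    if pref_a != pref_b:
        return a if pref_a > pref_b else b   # higher PREF wins
    if a.pe_id != b.pe_id:
        return a if a.pe_id < b.pe_id else b  # lower PE-ID wins
    return None


def elect_df(advs: Iterable[Adv], domains: Set[str]) -> Dict[Tuple[str, int], Adv]:
    """Bucket advertisements by (DOM, SITE-ID) and keep the winner per bucket."""
    winners: Dict[Tuple[str, int], Adv] = {}
    for adv in advs:
        if adv.dom not in domains:
            continue  # DOM is not of interest to this PE: discard
        key = (adv.dom, adv.site_id)
        current = winners.get(key)
        winner = adv if current is None else tie_break(current, adv)
        if winner is not None:
            winners[key] = winner
    return winners
```

   Folding the pairwise comparison over a bucket yields a single winner
   because the three rules together impose a total order on
   advertisements from different PEs.  A multi-homed PE whose own
   advertisement is not the winner for a SITE-ID would then place the
   associated ACs in non-forwarding state.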

   The DF election result on the egress PEs can be used in traffic
   forwarding decisions.  Figure 2 shows two customer sites, CE1 and
   CE4, connected to PE1 with CE1 multi-homed to PE1 and PE2.  If PE1 is
   the designated forwarder for CE1, based on the DF election result,
   PE3 can choose not to send unknown unicast and multicast traffic to
   PE2, as PE2 is not the designated forwarder for any customer site and
   it has no other single-homed sites connected to it.

4.  Multi-AS VPLS

   This section describes multi-homing in an inter-AS context.

4.1.  Route Origin Extended Community

   Due to lack of information about the PEs that originate the VPLS
   NLRIs in inter-AS operations, the Route Origin Extended Community
   [RFC4360] is used to carry the source PE's IP address.

   To use the Route Origin Extended Community for carrying the
   originator VPLS PE's loopback address, the type field of the
   community MUST be set to 0x01 and the Global Administrator sub-field
   MUST be set to the PE's loopback IP address.

4.2.  VPLS Preference

   When multiple PEs are assigned the same site ID for multi-homing, it
   is often desired to be able to control the selection of a particular
   PE as the designated forwarder.  Section 3.5 in [RFC4761] describes
   the use of BGP Local Preference in path selection to choose a
   particular NLRI, where Local Preference indicates the degree of
   preference for a particular VE.  The use of Local Preference is
   inadequate when VPLS PEs are spread across multiple ASes, as Local
   Preference is not carried across AS boundaries.  A new field, VPLS
   preference (VP), is introduced in this document that can be used to
   accomplish this.  VPLS preference indicates a degree of preference
   for a particular customer site.  VPLS preference is not mandatory for
   intra-AS operation; the algorithm explained in Section 3.3 will work
   with or without the presence of VPLS preference.

   Section 3.2.4 in [RFC4761] describes the Layer2 Info Extended
   Community that carries control information about the pseudowires.
   The last two octets, which were reserved, now carry the VPLS
   preference as shown in Figure 4.

   +------------------------------------+
   | Extended community type (2 octets) |
   +------------------------------------+
   | Encaps Type (1 octet)              |
   +------------------------------------+
   | Control Flags (1 octet)            |
   +------------------------------------+
   | Layer-2 MTU (2 octets)             |
   +------------------------------------+
   | VPLS Preference (2 octets)         |
   +------------------------------------+

                Figure 4: Layer2 Info Extended Community

   A VPLS preference is a 2-octet unsigned integer.  A value of zero
   indicates absence of a VP and is not a valid preference value.  This
   interpretation is required for backwards compatibility.
   Implementations using the Layer2 Info Extended Community as described
   in Section 3.2.4 of [RFC4761] MUST set the last two octets to zero,
   since it was a reserved field.

   For backwards compatibility, if VPLS preference is used, then BGP
   Local Preference MUST be set to the value of VPLS preference.  Note
   that a Local Preference value of zero for a CE-ID is not valid unless
   the 'D' bit in the control flags is set (see
   [I-D.kothari-l2vpn-auto-site-id]).  In addition, a Local Preference
   value greater than or equal to 2^16 for VPLS advertisements is not
   valid.

4.3.  Use of BGP attributes in Inter-AS Methods

   Section 3.4 in [RFC4761] and section 4 in [RFC6074] describe three
   methods (a, b and c) to connect sites in a VPLS to PEs that are
   spread across multiple ASes.  Since VPLS advertisements in method (a)
   do not cross AS boundaries, multi-homing operations for method (a)
   remain exactly the same as they are within an AS.  However, for
   methods (b) and (c), VPLS advertisements do cross AS boundaries.
   This section describes the VPLS operations for method (b) and method
   (c).  Consider Figure 5 for inter-AS VPLS with multi-homed customer
   sites.

4.3.1.  Inter-AS Method (b): EBGP Redistribution of VPLS Information
        between ASBRs

                        AS1                 AS2
                      ........            ........
        CE2 _______  .        .          .        .
                ___ PE1        .        .        PE3 --- CE3
               /    :           .      .           :
            __/    :             :    :             :
        CE1 __     :          ASBR1 --- ASBR2       :
              \     :            :    :             :
               \___ PE2        .        .        PE4 ---- CE4
                     .        .          .        .
                      ........            ........

                        Figure 5: Inter-AS VPLS

   A customer has four sites, CE1, CE2, CE3 and CE4.  CE1 is multi-homed
   to PE1 and PE2 in AS1.  CE2 is single-homed to PE1.  CE3 and CE4 are
   also single-homed, to PE3 and PE4 respectively, in AS2.  Assume that
   in addition to the base LDP/BGP VPLS addressing (VSI-IDs/VE-IDs),
   CE-ID 1 is assigned for CE1.  After running the DF election
   algorithm, all four VPLS PEs must elect the same designated forwarder
   for the CE1 site.  Since BGP Local Preference is not carried across
   AS boundaries, VPLS preference as described in Section 4.2 MUST be
   used for carrying site preference in inter-AS VPLS operations.

   For Inter-AS method (b), ASBR1 will send a VPLS NLRI received from
   PE1 to ASBR2 with itself as the BGP nexthop.  ASBR2 will send the
   NLRI received from ASBR1 to PE3 and PE4 with itself as the BGP
   nexthop.  Since VPLS PEs use BGP Local Preference in DF election, for
   backwards compatibility, ASBR2 MUST set the Local Preference value in
   the VPLS advertisements it sends to PE3 and PE4 to the VPLS
   preference value contained in the VPLS advertisement it receives from
   ASBR1.  ASBR1 MUST do the same for the NLRIs it sends to PE1 and PE2.
   If ASBR1 receives a VPLS advertisement without a valid VPLS
   preference from a PE within its AS, then ASBR1 MUST set the VPLS
   preference in the advertisement to the Local Preference value before
   sending it to ASBR2.
   Similarly, ASBR2 must do the same for advertisements without a VPLS
   preference that it receives from PEs within its AS.  Thus, in
   method (b), ASBRs MUST update the VPLS preference and the Local
   Preference based on the advertisements they receive, either from an
   ASBR or from a PE within their AS.

   In Figure 5, PE1 will send the VPLS advertisements with a Route
   Origin Extended Community containing its loopback address.  PE2 will
   do the same.  Even though PE3 receives the VPLS advertisements for
   VE-ID 1 and 2 from the same BGP nexthop, ASBR2, the source PE address
   contained in the Route Origin Extended Community is different for the
   CE1 and CE2 advertisements, and thus PE3 creates two PWs, one for CE1
   (for VE-ID 1) and another one for CE2 (for VE-ID 2).

4.3.2.  Inter-AS Method (c): Multi-Hop EBGP Redistribution of VPLS
        Information between ASes

   In this method, there is a multi-hop EBGP peering between the PEs or
   Route Reflectors in AS1 and the PEs or Route Reflectors in AS2.
   There is no VPLS state in either the control or the data plane on the
   ASBRs.  The multi-homing operations on the PEs in this method are
   exactly the same as they are in the intra-AS scenario.  However,
   since Local Preference (LP) is not carried across AS boundaries, the
   translation of LP to VPLS preference (VP) and vice versa MUST be done
   by the RR, if an RR is used to reflect VPLS advertisements to other
   ASes.  This is exactly the same as what an ASBR does in the case of
   method (b).  An RR must set the VP to the LP value in an
   advertisement before sending it to other ASes and must set the LP to
   the VP value in an advertisement that it receives from other ASes
   before sending it to the PEs within the AS.

5.  MAC Flush Operations

   In a service provider VPLS network, customer MAC learning is confined
   to PE devices, and intermediate nodes, such as a Route Reflector, do
   not have any state for MAC addresses.
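
   As a rough illustration only, the MAC state held on a PE can be
   pictured as a table that records, for each learned customer MAC
   address, the remote PE it was learned from.  The Python sketch below
   is not part of any specification; the class PeMacTable and its method
   names are hypothetical.  It shows that removing every entry learned
   from one remote PE is a simple operation on such a table, which is
   the granularity of flushing that the procedures in this section work
   with.

```python
class PeMacTable:
    """Hypothetical per-VPLS MAC table on a PE (illustrative only)."""

    def __init__(self):
        # Maps a customer MAC address to the remote PE it was learned from.
        self._mac_to_pe = {}

    def learn(self, mac, remote_pe):
        """Record (or move) a MAC address as reachable via remote_pe."""
        self._mac_to_pe[mac] = remote_pe

    def lookup(self, mac):
        """Return the remote PE for a MAC, or None if it is unknown."""
        return self._mac_to_pe.get(mac)

    def flush_pe(self, remote_pe):
        """Delete every MAC learned from remote_pe; return the count removed."""
        stale = [mac for mac, pe in self._mac_to_pe.items() if pe == remote_pe]
        for mac in stale:
            del self._mac_to_pe[mac]
        return len(stale)


if __name__ == "__main__":
    table = PeMacTable()
    table.learn("00:00:5e:00:53:01", "PE1")
    table.learn("00:00:5e:00:53:02", "PE2")
    # Flushing PE2 leaves the entry learned from PE1 untouched.
    removed = table.flush_pe("PE2")
    print(removed, table.lookup("00:00:5e:00:53:01"))
```

   Keying the flush on the remote PE rather than on a customer site
   keeps the sketch aligned with a flush of all MAC addresses associated
   with a PE; a per-site flush would need extra state that this sketch
   does not model.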

   Topology changes either in the service provider's network or in the
   customer's network can result in the movement of MAC addresses from
   one PE device to another.  Such events can result in traffic being
   dropped due to stale MAC address state on the PE devices.  Age-out
   timers that clear the stale state will eventually restore traffic
   forwarding, but age-out timers are typically on the order of minutes,
   and convergence on the order of minutes can severely impact the
   customer's service.  To handle such events and expedite convergence
   of traffic, flushing of the affected MAC addresses is highly
   desirable.

5.1.  MAC Flush Indicators

   If the 'D' bit in the control flags is set in a received VE NLRI, the
   receiving PE SHOULD flush all the MAC addresses learned from the PE
   advertising the failure.

   Anytime a designated forwarder change occurs, a remote PE SHOULD
   flush all the MAC addresses it learned from the PE that lost the DF
   election (the old designated forwarder).  If multiple customer sites
   are connected to the same PE (PE1 as shown in Figure 2) and
   redundancy per site is desired when the multi-homing procedures
   described in this document are in effect, then it is desirable to
   flush just the relevant MAC addresses from a particular site when
   connectivity to that site is lost.  However, procedures for flushing
   a limited set of MAC addresses are beyond the scope of this document.
   Use of either the 'D' or the 'F' bit in the control flags only allows
   flushing all MAC addresses associated with a PE.

   A designated forwarder change can occur in the absence of failures,
   such as when an attachment circuit comes up.  Consider the case in
   Figure 2 where the PE1-CE1 link is non-operational and PE2 is the
   designated forwarder for CE1.  Also assume that the Local Preference
   of PE1 is higher than that of PE2.  When the PE1-CE1 link becomes
   operational, PE1 will send a BGP CE advertisement for CE1 to all its
   peers.
   If PE3 performs the DF election before PE2, there is a chance that
   PE3 might learn MAC addresses from PE2 after it has elected PE1.
   This can happen because PE2 has not yet processed the BGP CE
   advertisement from PE1 and as a result continues to send traffic to
   PE3.  This can cause traffic from PE3 to CE1 to be black-holed until
   those MAC addresses are deleted by the age-out timers.  Therefore, to
   avoid such race conditions, a designated forwarder must set the 'F'
   bit and a non-designated forwarder must clear the 'F' bit when
   sending BGP CE advertisements.  A state transition from one to zero
   for the 'F' bit can be used by a remote PE to flush all the MACs
   learned from the PE that is transitioning from designated forwarder
   to non-designated forwarder.

5.2.  Minimizing the effects of fast link transitions

   Certain failure scenarios may result in fast transitions of the link
   towards the multi-homed CE, which in turn will generate fast status
   transitions of one or multiple multi-homed sites, reflected through
   multiple BGP CE advertisements and LDP MAC Flush messages.

   It is recommended that a timer to damp the link flaps be used for the
   port towards the multi-homed CE to minimize the number of MAC Flush
   events in the remote PEs and the occurrences of BGP state compression
   for 'F' bit transitions.  A timer value greater than the time it
   takes BGP to converge in the network is recommended.

6.  Backwards Compatibility

   No forwarding loops are formed when PEs or Route Reflectors that do
   not support the procedures defined in this section coexist in the
   network with PEs or Route Reflectors that do support them.

6.1.  BGP based VPLS

   As explained in this section, PEs multi-homed to the same customer
   site MUST assign the same CE-ID, and the related NLRI SHOULD carry a
   block offset, block size and label base of zero.
   Remote PEs that lack support for the multi-homing operations
   specified in this document will fail to create any PWs for the
   multi-homed CE-IDs due to the label value of zero, and thus the
   multi-homing NLRI should have no impact on the operation of such
   remote PEs.

   For compatibility with PEs that use multiple VE-IDs with non-zero
   label block values for multi-homing operation, it is a requirement
   that a PE receiving such advertisements must use the labels in the
   NLRIs associated with the lowest VE-ID for PW creation.  It is
   possible that maintaining the PW association with the lowest VE-ID
   can result in PW flaps and, thus, traffic loss.  However, it is
   necessary to maintain the association of the PW with the lowest VE-ID
   as it provides deterministic DF election among all the VPLS PEs.

6.2.  LDP VPLS with BGP Auto-discovery

   The BGP-AD NLRI has a prefix length of 12, containing only an 8-byte
   RD and a 4-byte VSI-ID.  If an LDP VPLS PE running BGP-AD lacks
   support for the multi-homing operations specified in this document,
   it SHOULD ignore a CE NLRI with a length field of 17.  As a result,
   it will not ask LDP to create any PWs for the multi-homed Site-ID,
   and thus the multi-homing NLRI should have no impact on LDP VPLS
   operation.  MH PEs may use the existing LDP MAC Flush to flush the
   remote LDP VPLS PEs or may use the MAC Flush procedures described in
   Section 5.

7.  Security Considerations

   No new security issues are introduced beyond those that are described
   in [RFC4761] and [RFC4762].

8.  IANA Considerations

   At this time, this memo includes no request to IANA.

9.  Acknowledgments

   The authors would like to thank Yakov Rekhter, Nischal Sheth, Mitali
   Singh, Ian Cowburn and Jonathan Hardwick for their insightful
   comments and probing questions.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4761]  Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private
              LAN Service (VPLS) Using BGP for Auto-Discovery and
              Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007.

   [RFC6074]  Rosen, E., Davie, B., Radoaca, V., and W. Luo,
              "Provisioning, Auto-Discovery, and Signaling in Layer 2
              Virtual Private Networks (L2VPNs)", RFC 6074,
              DOI 10.17487/RFC6074, January 2011.

10.2.  Informative References

   [I-D.kothari-l2vpn-auto-site-id]
              Kothari, B., Kompella, K., and T. IV, "Automatic
              Generation of Site IDs for Virtual Private LAN Service",
              draft-kothari-l2vpn-auto-site-id-01 (work in progress),
              October 2008.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364,
              February 2006.

   [RFC4762]  Lasserre, M., Ed. and V. Kompella, Ed., "Virtual Private
              LAN Service (VPLS) Using Label Distribution Protocol (LDP)
              Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

Authors' Addresses

   Bhupesh Kothari
   Gainspeed
   295 Santa Ana Court
   Sunnyvale, CA  94085
   US

   Email: bhupesh@gainspeed.com


   Kireeti Kompella
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA  94089
   US

   Email: kireeti.kompella@gmail.com


   Wim Henderickx
   Alcatel-Lucent

   Email: wim.henderickx@alcatel-lucent.be


   Florin Balus
   Alcatel-Lucent

   Email: florin.balus@alcatel-lucent.com


   James Uttaro
   AT&T
   200 S. Laurel Avenue
   Middletown, NJ  07748
   US

   Email: uttaro@att.com


   Senad Palislamovic
   Alcatel-Lucent

   Email: senad.palislamovic@alcatel-lucent.com


   Wen Lin
   Juniper Networks

   Email: wlin@juniper.net