idnits 2.17.1 draft-ietf-l2vpn-vpls-multihoming-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The draft header indicates that this document updates RFC4761, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC4761, updated by this document, for RFC5378 checks: 2003-07-22) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 25, 2013) is 4078 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC4456' is defined on line 848, but no explicit reference was found in the text Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 3 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group B. Kothari 3 Internet-Draft Cohere Networks 4 Updates: 4761 (if approved) K. Kompella 5 Intended status: Standards Track Juniper Networks 6 Expires: August 29, 2013 W. Henderickx 7 F. Balus 8 Alcatel-Lucent 9 J. Uttaro 10 AT&T 11 S. Palislamovic 12 Alcatel-Lucent 13 W. Lin 14 Juniper Networks 15 February 25, 2013 17 BGP based Multi-homing in Virtual Private LAN Service 18 draft-ietf-l2vpn-vpls-multihoming-05.txt 20 Abstract 22 Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private 23 Network (VPN) that gives its customers the appearance that their 24 sites are connected via a Local Area Network (LAN). It is often 25 required for the Service Provider (SP) to give the customer redundant 26 connectivity to some sites, often called "multi-homing". This memo 27 shows how BGP-based multi-homing can be offered in the context of LDP 28 and BGP VPLS solutions. 30 Status of this Memo 32 This Internet-Draft is submitted in full conformance with the 33 provisions of BCP 78 and BCP 79. 35 Internet-Drafts are working documents of the Internet Engineering 36 Task Force (IETF). Note that other groups may also distribute 37 working documents as Internet-Drafts. The list of current Internet- 38 Drafts is at http://datatracker.ietf.org/drafts/current/. 40 Internet-Drafts are draft documents valid for a maximum of six months 41 and may be updated, replaced, or obsoleted by other documents at any 42 time. It is inappropriate to use Internet-Drafts as reference 43 material or to cite them other than as "work in progress." 45 This Internet-Draft will expire on August 29, 2013. 47 Copyright Notice 48 Copyright (c) 2013 IETF Trust and the persons identified as the 49 document authors. All rights reserved. 
51 This document is subject to BCP 78 and the IETF Trust's Legal 52 Provisions Relating to IETF Documents 53 (http://trustee.ietf.org/license-info) in effect on the date of 54 publication of this document. Please review these documents 55 carefully, as they describe your rights and restrictions with respect 56 to this document. Code Components extracted from this document must 57 include Simplified BSD License text as described in Section 4.e of 58 the Trust Legal Provisions and are provided without warranty as 59 described in the Simplified BSD License. 61 Table of Contents 63 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 64 1.1. General Terminology . . . . . . . . . . . . . . . . . . . 4 65 1.2. Conventions . . . . . . . . . . . . . . . . . . . . . . . 5 66 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 6 67 2.1. Scenarios . . . . . . . . . . . . . . . . . . . . . . . . 6 68 2.2. VPLS Multi-homing Considerations . . . . . . . . . . . . . 7 69 3. Multi-homing Operation . . . . . . . . . . . . . . . . . . . . 8 70 3.1. Multi-homing NLRI . . . . . . . . . . . . . . . . . . . . 8 71 3.2. Provisioning Model . . . . . . . . . . . . . . . . . . . . 9 72 3.3. Designated Forwarder Election . . . . . . . . . . . . . . 10 73 3.3.1. Attributes . . . . . . . . . . . . . . . . . . . . . . 10 74 3.3.2. Variables Used . . . . . . . . . . . . . . . . . . . . 11 75 3.3.3. Election Procedures . . . . . . . . . . . . . . . . . 12 76 3.4. DF Election on PEs . . . . . . . . . . . . . . . . . . . . 14 77 4. Multi-AS VPLS . . . . . . . . . . . . . . . . . . . . . . . . 15 78 4.1. Route Origin Extended Community . . . . . . . . . . . . . 15 79 4.2. VPLS Preference . . . . . . . . . . . . . . . . . . . . . 15 80 4.3. Use of BGP-MH attributes in Inter-AS Methods . . . . . . . 16 81 4.3.1. Inter-AS Method (b): EBGP Redistribution of VPLS 82 Information between ASBRs . . . . . . . . . . . . . . 16 83 4.3.2. 
Inter-AS Method (c): Multi-Hop EBGP Redistribution 84 of VPLS Information between ASes . . . . . . . . . . . 17 85 5. MAC Flush Operations . . . . . . . . . . . . . . . . . . . . . 19 86 5.1. MAC List Flush . . . . . . . . . . . . . . . . . . . . . . 19 87 5.2. Implicit MAC Flush . . . . . . . . . . . . . . . . . . . . 19 88 5.3. Minimizing the effects of fast link transitions . . . . . 20 89 6. Backwards Compatibility . . . . . . . . . . . . . . . . . . . 21 90 6.1. BGP based VPLS . . . . . . . . . . . . . . . . . . . . . . 21 91 6.2. LDP VPLS with BGP Auto-discovery . . . . . . . . . . . . . 21 92 7. Security Considerations . . . . . . . . . . . . . . . . . . . 22 93 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23 94 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 24 95 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 25 96 10.1. Normative References . . . . . . . . . . . . . . . . . . . 25 97 10.2. Informative References . . . . . . . . . . . . . . . . . . 25 98 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 26 100 1. Introduction 102 Virtual Private LAN Service (VPLS) is a Layer 2 Virtual Private 103 Network (VPN) that gives its customers the appearance that their 104 sites are connected via a Local Area Network (LAN). It is often 105 required for a Service Provider (SP) to give the customer redundant 106 connectivity to one or more sites, often called "multi-homing". 107 [RFC4761] explains how VPLS can be offered using BGP for auto-discovery 108 and signaling; section 3.5 of that document describes how 109 multi-homing can be achieved in this context. [RFC6074] explains how 110 VPLS can be offered using BGP for auto-discovery (BGP-AD) and 111 [RFC4762] explains how VPLS can be offered using LDP for signaling. 112 This document provides a BGP-based multi-homing solution applicable 113 to both BGP and LDP VPLS technologies. 
Note that BGP MH can be used 114 for LDP VPLS without the use of the BGP-AD solution. 116 Section 2 lays out some of the scenarios for multi-homing, other ways 117 that this can be achieved, and some of the expectations of BGP-based 118 multi-homing. Section 3 defines the components of BGP-based multi- 119 homing, and the procedures required to achieve this. Section 7 may 120 someday discuss security considerations. 122 1.1. General Terminology 124 Some general terminology is defined here; most is from [RFC4761], 125 [RFC4762] or [RFC4364]. Terminology specific to this memo is 126 introduced as needed in later sections. 128 A "Customer Edge" (CE) device, typically located on customer 129 premises, connects to a "Provider Edge" (PE) device, which is owned 130 and operated by the SP. A "Provider" (P) device is also owned and 131 operated by the SP, but has no direct customer connections. A "VPLS 132 Edge" (VE) device is a PE that offers VPLS services. 134 A VPLS domain represents a bridging domain per customer. A Route 135 Target community as described in [RFC4360] is typically used to 136 identify all the PE routers participating in a particular VPLS 137 domain. A VPLS site is a grouping of ports on a PE that belong to 138 the same VPLS domain. A Multi-homed (MH) site is uniquely identified 139 by a MH site ID (MH-ID). Sites are referred to as local or remote 140 depending on whether they are configured on the PE router in context 141 or on one of the remote PE routers (network peers). The terms "VPLS 142 instance" and "VPLS domain" are used interchangeably in this 143 document. 145 1.2. Conventions 147 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 148 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 149 document are to be interpreted as described in [RFC2119]. 151 2. Background 153 This section describes various scenarios where multi-homing may be 154 required, and the implications thereof. 
It also describes some of 155 the singular properties of VPLS multi-homing, and what that means 156 from both an operational point of view and an implementation point of 157 view. There are other approaches for providing multi-homing such as 158 Spanning Tree Protocol, and this document specifies use of BGP for 159 multi-homing. Comprehensive comparison among the approaches is 160 outside the scope of this document. 162 2.1. Scenarios 164 CE1 is a VPLS CE that is dual-homed to both PE1 and PE2 for redundant 165 connectivity. 167 ............... 168 . . ___ CE2 169 ___ PE1 . / 170 / : PE3 171 __/ : Service : 172 CE1 __ : Provider PE4 173 \ : : \___ CE3 174 \___ PE2 . 175 . . 176 ............... 178 Figure 1: Scenario 1 180 CE1 is a VPLS CE that is dual-homed to both PE1 and PE2 for redundant 181 connectivity. However, CE4, which is also in the same VPLS domain, 182 is single-homed to just PE1. 184 CE4 ------- ............... 185 \ . . ___ CE2 186 ___ PE1 . / 187 / : PE3 188 __/ : Service : 189 CE1 __ : Provider PE4 190 \ : : \___ CE3 191 \___ PE2 . 192 . . 193 ............... 195 Figure 2: Scenario 2 197 2.2. VPLS Multi-homing Considerations 199 The first (perhaps obvious) fact about a multi-homed VPLS CE, such as 200 CE1 in Figure 1 is that if CE1 is an Ethernet switch or bridge, a 201 loop has been created in the customer VPLS. This is a dangerous 202 situation for an Ethernet network, and the loop must be broken. Even 203 if CE1 is a router, it will get duplicates every time a packet is 204 flooded, which is clearly undesirable. 206 The next is that (unlike the case of IP-based multi-homing) only one 207 of PE1 and PE2 can be actively sending traffic, either towards CE1 or 208 into the SP cloud. That is to say, load balancing techniques will 209 not work. All other PEs MUST choose the same designated forwarder 210 for a multi-homed site. Call the PE that is chosen to send traffic 211 to/from CE1 the "designated forwarder". 
213 In Figure 2, CE1 and CE4 must be dealt with independently, since CE1 214 is dual-homed, but CE4 is not. 216 3. Multi-homing Operation 218 This section describes procedures for electing a designated forwarder 219 among the set of PEs that are multi-homed to a customer site. The 220 procedures described in this section are applicable to BGP based 221 VPLS, LDP based VPLS with BGP-AD or a VPLS that contains a mix of 222 both BGP and LDP signaled PWs. 224 3.1. Multi-homing NLRI 226 Section 3.2.2 in [RFC4761] specifies a NLRI to be used for BGP based 227 VPLS (BGP VPLS NLRI). The format of the BGP VPLS NLRI is shown 228 below. 230 +------------------------------------+ 231 | Length (2 octets) | 232 +------------------------------------+ 233 | Route Distinguisher (8 octets) | 234 +------------------------------------+ 235 | VE ID (2 octets) | 236 +------------------------------------+ 237 | VE Block Offset (2 octets) | 238 +------------------------------------+ 239 | VE Block Size (2 octets) | 240 +------------------------------------+ 241 | Label Base (3 octets) | 242 +------------------------------------+ 244 BGP VPLS NLRI 246 For multi-homing operation, a multi-homing NLRI (MH NLRI) is proposed 247 that uses BGP VPLS NLRI with the following fields set to zero: VE 248 Block Offset, VE Block Size and Label Base. In addition, the VE-ID 249 field of the NLRI is set to MH-ID. Thus, the MH NLRI contains 2 250 octets indicating the length, 8 octets for Route Distinguisher, 2 251 octets for MH-ID and 7 octets with value zero. 253 It is valid to have non-zero VE block offset, VE block size and label 254 base in the VPLS NLRI for a multi-homed site. VPLS operations, 255 including multi-homing, in such a case are outside the scope of this 256 document. However, for interoperability with existing deployments 257 that use non-zero VE block offset, VE block size and label base for 258 multi-homing operation, Section 6.1 provides more detail. 260 3.2. 
Provisioning Model 262 Each instance within a VPLS domain MUST be 263 provisioned with a unique Route Distinguisher value. A unique Route 264 Distinguisher allows VPLS advertisements from different VPLS PEs to 265 be distinct even if the advertisements have the same VE-ID, which can 266 occur in case of multi-homing. This allows standard BGP path 267 selection rules to be applied to VPLS advertisements. 269 Each VPLS PE must advertise a unique VE-ID with non-zero VE Block 270 Offset, VE Block Size and Label Base values in the BGP NLRI. The VE-ID 271 is associated with the base VPLS instance and the NLRI associated 272 with it must be used for creating PWs among VPLS PEs. Single-homed 273 customer sites connected to the VPLS instance do not require 274 any special addressing. Multi-homed customer sites connected to 275 the VPLS instance require special addressing, which is achieved by 276 use of MH-ID. A set of customer sites is distinguished as multi-homed 277 if they all have the same MH-ID. The following examples 278 illustrate the use of VE-ID and MH-ID. 280 Figure 1 shows a customer site, CE1, multi-homed to two VPLS PEs, PE1 281 and PE2. In order for all VPLS PEs to set up PWs to each other, each 282 VPLS PE must be configured with a unique VE-ID for its base VPLS 283 instance. In addition, in order for all VPLS PEs within the same 284 VPLS domain to elect one of the multi-homed PEs as the designated 285 forwarder, an indicator that the PEs are multi-homed to the same 286 customer site is required. This is achieved by assigning the same 287 multi-homed site ID (MH-ID) on PE1 and PE2 for CE1. When remote VPLS 288 PEs receive NLRI advertisements from PE1 and PE2 for CE1, the two NLRI 289 advertisements for CE1 are identified as candidates for designated 290 forwarder selection due to the same MH-ID. Thus, the same MH-ID MUST be 291 assigned on all VPLS PEs that are multi-homed to the same customer 292 site. 
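As an illustration of the MH NLRI layout from Section 3.1 combined with the provisioning rules above, the following Python sketch builds the MH NLRI that PE1 and PE2 might advertise for CE1. It is a minimal sketch, not a reference encoder. The helper names (rd_type0, mh_nlri), the AS number 65000, and the use of a type 0 Route Distinguisher are illustrative assumptions, as is the assumption that the Length field counts the 17 octets that follow it.

```python
import struct


def rd_type0(asn: int, number: int) -> bytes:
    """Hypothetical helper: 8-octet type 0 Route Distinguisher (2-octet ASN + 4-octet number)."""
    return struct.pack("!HHI", 0, asn, number)


def mh_nlri(rd: bytes, mh_id: int) -> bytes:
    """Build an MH NLRI: Length, RD, MH-ID in the VE-ID field, then 7 zero octets.

    The 7 zero octets stand for VE Block Offset (2), VE Block Size (2)
    and Label Base (3), all set to zero for multi-homing.
    """
    if len(rd) != 8:
        raise ValueError("Route Distinguisher must be 8 octets")
    if not 1 <= mh_id <= 0xFFFF:
        raise ValueError("MH-ID must be in 1..65535 (MH-ID=0 is invalid)")
    body = rd + struct.pack("!H", mh_id) + bytes(7)
    # Assumption: Length carries the number of octets after the Length field.
    return struct.pack("!H", len(body)) + body


# PE1 and PE2 use different RDs but the same MH-ID for CE1.
pe1_nlri = mh_nlri(rd_type0(65000, 1), mh_id=1)
pe2_nlri = mh_nlri(rd_type0(65000, 2), mh_id=1)
```

Because the two Route Distinguishers differ, the two NLRIs are distinct BGP routes, while the shared MH-ID in octets 10 and 11 is what lets remote PEs treat them as candidates for the same designated forwarder election.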
294 Figure 2 shows two customer sites, CE1 and CE4, connected to PE1 with 295 CE1 multi-homed to PE1 and PE2. Similar to the provisioning 296 model for Figure 1, each VPLS PE must be configured with a unique VE-ID for its 297 base VPLS instance. CE4 does not require special addressing on PE1. 298 However, CE1, which is multi-homed to PE1 and PE2, requires 299 configuration of MH-ID and both PE1 and PE2 MUST be provisioned with 300 the same MH-ID for CE1. 302 Note that MH-ID=0 is invalid and a PE should discard such an 303 advertisement. 305 Use of multiple VE-IDs per VPLS instance for either multi-homing 306 operation or for any other purpose is outside the scope of this 307 document. However, for interoperability with existing deployments 308 that use multiple VE-IDs, Section 6.1 provides more detail. 310 3.3. Designated Forwarder Election 312 BGP-based multi-homing for VPLS relies on standard BGP path selection 313 and VPLS DF election. The net result of doing both BGP path 314 selection and VPLS DF election is the election of a single 315 designated forwarder (DF) among the set of PEs to which a customer 316 site is multi-homed. All the PEs that are elected as non-designated 317 forwarders MUST keep their attachment circuit to the multi-homed CE 318 in blocked status (no forwarding). 320 These election algorithms operate on VPLS advertisements, which 321 include both the NLRI and attached BGP attributes. These election 322 algorithms are applicable to all VPLS NLRIs, and not just to MH 323 NLRIs. In order to simplify the explanation of these algorithms, we 324 will use a number of variables derived from fields in the VPLS 325 advertisement. These variables are: RD, SITE-ID, VBO, DOM, ACS, PREF 326 and PE-ID. The notation ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID> means that from a received VPLS advertisement ADV, the 328 respective variables were derived. 
The following sections describe 329 two attributes needed for DF election, then describe the variables 330 and how they are derived from fields in VPLS advertisement ADV, and 331 finally describe how DF election is done. 333 3.3.1. Attributes 335 The procedures below refer to two attributes: the Route Origin 336 community (see Section 4.1) and the L2-info community (see 337 Section 4.2). These attributes are required for inter-AS operation; 338 for generality, the procedures below show how they are to be used. 339 The procedures also outline how to handle the case that either or 340 both are not present. 342 For BGP-based Multi-homing, ADV MUST contain an L2-info extended 343 community as specified in [RFC4761]. Within this community are 344 various control flags. Two new control flags are proposed in this 345 document. Figure 3 shows the position of the new 'D' and 'F' flags. 347 Control Flags Bit Vector 349 0 1 2 3 4 5 6 7 350 +-+-+-+-+-+-+-+-+ 351 |D|Z|F|Z|Z|Z|C|S| (Z = MUST Be Zero) 352 +-+-+-+-+-+-+-+-+ 354 Figure 3 356 1. 'D' (Down): Indicates connectivity status between a CE site and a 357 VPLS PE. The bit MUST be set to one if all the attachment 358 circuits connecting a CE site to a VPLS PE are down. 360 2. 'F' (Flush): Indicates when to flush MAC state. A designated 361 forwarder must set the F bit and a non-designated forwarder must 362 clear the F bit when sending BGP MH advertisements. A state 363 transition from one to zero for the F bit can be used by a remote 364 PE to flush all the MACs learned from the PE that is 365 transitioning from designated forwarder to non-designated 366 forwarder. Refer to Section 5.2 for more details on the use 367 case. 369 3.3.2. Variables Used 371 3.3.2.1. RD 373 RD is simply set to the Route Distinguisher field in the NLRI part of 374 ADV. 376 3.3.2.2. SITE-ID 378 SITE-ID is simply set to the VE-ID field in the NLRI part of the ADV. 
380 Note that no distinction is made whether VE-ID is for a multi-homed 381 site or not. 383 3.3.2.3. VBO 385 VBO is simply set to the VE Block Offset field in the NLRI part of 386 ADV. 388 3.3.2.4. DOM 390 This variable, indicating the VPLS domain to which ADV belongs, is 391 derived by applying BGP policy to the Route Target extended 392 communities in ADV. The details of how this is done are outside the 393 scope of this document. 395 3.3.2.5. ACS 397 ACS is the status of the attachment circuits for a given site of a 398 VPLS. ACS = 1 if all attachment circuits for the site are down, and 399 0 otherwise. 401 ACS is set to the value of the 'D' bit in ADV that belongs to MH 402 NLRI. If ADV belongs to base VPLS instance with non-zero label block 403 values, no change must be made to ACS. 405 3.3.2.6. PREF 407 PREF is derived from the Local Preference (LP) attribute in ADV as 408 well as the VPLS Preference field (VP) in the L2-info extended 409 community. If the Local Preference attribute is missing, LP is set 410 to 0; if the L2-info community is missing, VP is set to 0. The 411 following table shows how PREF is computed from LP and VP. 413 +---------+---------------+----------+------------------------------+ 414 | VP | LP Value | PREF | Comment | 415 | Value | | Value | | 416 +---------+---------------+----------+------------------------------+ 417 | 0 | 0 | 0 | malformed advertisement, | 418 | | | | unless ACS=1 | 419 | | | | | 420 | 0 | 1 to (2^16-1) | LP | backwards compatibility | 421 | | | | | 422 | 0 | 2^16 to | (2^16-1) | backwards compatibility | 423 | | (2^32-1) | | | 424 | | | | | 425 | >0 | LP same as VP | VP | Implementation supports VP | 426 | | | | | 427 | >0 | LP != VP | 0 | malformed advertisement | 428 +---------+---------------+----------+------------------------------+ 430 Table 1 432 3.3.2.7. 
PE-ID 434 If ADV contains a Route Origin (RO) community (see Section 4.1) with 435 type 0x01, then PE-ID is set to the Global Administrator sub-field of 436 the RO. Otherwise, if ADV has an ORIGINATOR_ID attribute, then PE-ID 437 is set to the ORIGINATOR_ID. Otherwise, PE-ID is set to the BGP 438 Identifier. 440 3.3.3. Election Procedures 442 The election procedures described in this section apply equally to 443 BGP VPLS and LDP VPLS. A distinction MUST NOT be made on whether the 444 NLRI is a multi-homing NLRI or not. The subset of these procedures 445 documented under standard BGP best path selection deals with general IP 446 prefix BGP route selection processing as defined in [RFC4271]. A 447 separate part of the algorithm, defined under VPLS DF election, is 448 specific to designated forwarder election procedures performed on 449 VPLS advertisements. A concept of bucketization is introduced to 450 define route selection rules for VPLS advertisements. Note that this 451 is a conceptual description of the process; an implementation MAY 452 choose to realize this differently as long as the semantics are 453 preserved. 455 3.3.3.1. Bucketization for standard BGP path selection 457 An advertisement 459 ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID> 461 is put into the bucket for <RD, SITE-ID, VBO>. In other words, the 462 information in BGP path selection consists of <RD, SITE-ID, VBO> and 463 only advertisements with exactly the same <RD, SITE-ID, VBO> are candidates 464 for the BGP path selection procedure as defined in [RFC4271]. 466 3.3.3.2. Bucketization for VPLS DF Election 468 An advertisement 470 ADV -> <RD, SITE-ID, VBO, DOM, ACS, PREF, PE-ID> 472 is discarded if DOM is not of interest to the VPLS PE. Otherwise, 473 ADV is put into the bucket for <DOM, SITE-ID>. In other words, all 474 advertisements for a particular VPLS domain that have the same 475 SITE-ID are candidates for VPLS DF election. 477 3.3.3.3. Tie-breaking Rules 479 This section describes the tie-breaking rules for VPLS DF election. 
480 Tie-breaking rules for VPLS DF election are applied to candidate 481 advertisements by all VPLS PEs; the actions taken by VPLS PEs 482 based on the VPLS DF election result are described in Section 3.4. 484 Given two advertisements ADV1 and ADV2 from a given bucket, first 485 compute the variables needed for DF election: 487 ADV1 -> <RD1, SITE-ID1, VBO1, DOM1, ACS1, PREF1, PE-ID1> 488 ADV2 -> <RD2, SITE-ID2, VBO2, DOM2, ACS2, PREF2, PE-ID2> 490 Note that SITE-ID1 = SITE-ID2 and DOM1 = DOM2, since ADV1 and ADV2 491 came from the same bucket. Then the following tie-breaking rules 492 MUST be applied in the given order. 494 1. if (ACS1 != 1) AND (ACS2 == 1) ADV1 wins; stop 495 if (ACS1 == 1) AND (ACS2 != 1) ADV2 wins; stop 496 else continue 498 2. if (PREF1 > PREF2) ADV1 wins; stop; 499 else if (PREF1 < PREF2) ADV2 wins; stop; 500 else continue 502 3. if (PE-ID1 < PE-ID2) ADV1 wins; stop; 503 else if (PE-ID1 > PE-ID2) ADV2 wins; stop; 504 else ADV1 and ADV2 are from the same VPLS PE 506 If there is no winner and ADV1 and ADV2 are from the same PE, a VPLS 507 PE MUST retain both ADV1 and ADV2. 509 3.4. DF Election on PEs 511 The DF election algorithm MUST be run by all multi-homed VPLS PEs. In 512 addition, all other PEs SHOULD also run the DF election algorithm. 513 As a result of the DF election, multi-homed PEs that lose the DF 514 election for a SITE-ID MUST put the ACs associated with the SITE-ID 515 in non-forwarding state. 517 The DF election result on the egress PEs can be used in the traffic 518 forwarding decision. Figure 2 shows two customer sites, CE1 and CE4, 519 connected to PE1 with CE1 multi-homed to PE1 and PE2. If PE1 is the 520 designated forwarder for CE1, based on the DF election result, PE3 521 can choose to not send unknown unicast and multicast traffic to PE2 as 522 PE2 is not the designated forwarder for any customer site and it has 523 no other single-homed sites connected to it. 525 4. Multi-AS VPLS 527 This section describes multi-homing in an inter-AS context. 529 4.1. 
Route Origin Extended Community 531 Due to lack of information about the PEs that originate the VPLS 532 NLRIs in inter-AS operations, Route Origin Extended Community 533 [RFC4360] is used to carry the source PE's IP address. 535 To use Route Origin Extended Community for carrying the originator 536 VPLS PE's loopback address, the type field of the community MUST be 537 set to 0x01 and the Global Administrator sub-field MUST be set to the 538 PE's loopback IP address. 540 4.2. VPLS Preference 542 When multiple PEs are assigned the same site ID for multi-homing, it 543 is often desired to be able to control the selection of a particular 544 PE as the designated forwarder. Section 3.5 in [RFC4761] describes 545 the use of BGP Local Preference in path selection to choose a 546 particular NLRI, where Local Preference indicates the degree of 547 preference for a particular VE. The use of Local Preference is 548 inadequate when VPLS PEs are spread across multiple ASes as Local 549 Preference is not carried across an AS boundary. A new field, VPLS 550 preference (VP), is introduced in this document that can be used to 551 accomplish this. VPLS preference indicates a degree of preference 552 for a particular customer site. VPLS preference is not mandatory for 553 intra-AS operation; the algorithm explained in Section 3.3 will work 554 with or without the presence of VPLS preference. 556 Section 3.2.4 in [RFC4761] describes the Layer2 Info Extended 557 Community that carries control information about the pseudowires. 558 The last two octets, which were reserved, now carry the VPLS preference as 559 shown in Figure 4. 
561 +------------------------------------+ 562 | Extended community type (2 octets) | 563 +------------------------------------+ 564 | Encaps Type (1 octet) | 565 +------------------------------------+ 566 | Control Flags (1 octet) | 567 +------------------------------------+ 568 | Layer-2 MTU (2 octets) | 569 +------------------------------------+ 570 | VPLS Preference (2 octets) | 571 +------------------------------------+ 573 Figure 4: Layer2 Info Extended Community 575 A VPLS preference is a 2-octet unsigned integer. A value of zero 576 indicates absence of a VP and is not a valid preference value. This 577 interpretation is required for backwards compatibility. 578 Implementations using Layer2 Info Extended Community as described in 579 Section 3.2.4 of [RFC4761] MUST set the last two octets to zero since 580 it was a reserved field. 582 For backwards compatibility, if VPLS preference is used, then BGP 583 Local Preference MUST be set to the value of VPLS preference. Note 584 that a Local Preference value of zero for a MH-ID is not valid unless 585 the 'D' bit in the control flags is set (see 586 [I-D.kothari-l2vpn-auto-site-id]). In addition, a Local Preference 587 value greater than or equal to 2^16 for VPLS advertisements is not 588 valid. 590 4.3. Use of BGP-MH attributes in Inter-AS Methods 592 Section 3.4 in [RFC4761] and section 4 in [RFC6074] describe three 593 methods (a, b and c) to connect sites in a VPLS to PEs that are 594 across multiple ASes. Since VPLS advertisements in method (a) do not 595 cross AS boundaries, multi-homing operations for method (a) remain 596 exactly the same as they are within an AS. However, for methods (b) 597 and (c), VPLS advertisements do cross AS boundaries. This section 598 describes the VPLS operations for method (b) and method (c). 599 Consider Figure 5 for inter-AS VPLS with multi-homed customer sites. 601 4.3.1. Inter-AS Method (b): EBGP Redistribution of VPLS Information 602 between ASBRs 604 AS1 AS2 605 ........ ........ 
606 CE2 _______ . . . . 607 ___ PE1 . . PE3 --- CE3 608 / : . . : 609 __/ : : : : 610 CE1 __ : ASBR1 --- ASBR2 : 611 \ : : : : 612 \___ PE2 . . PE4 ---- CE4 613 . . . . 614 ........ ........ 616 Figure 5: Inter-AS VPLS 618 A customer has four sites, CE1, CE2, CE3 and CE4. CE1 is multi-homed 619 to PE1 and PE2 in AS1. CE2 is single-homed to PE1. CE3 and CE4 are 620 also single homed to PE3 and PE4 respectively in AS2. Assume that in 621 addition to the base LDP/BGP VPLS addressing (VSI-IDs/VE-IDs), MH ID 622 1 is assigned for CE1. After running DF election algorithm, all four 623 VPLS PEs must elect the same designated forwarder for CE1 site. 624 Since BGP Local Preference is not carried across AS boundary, VPLS 625 preference as described in Section 4.2 MUST be used for carrying site 626 preference in inter-AS VPLS operations. 628 For Inter-AS method (b) ASBR1 will send a VPLS NLRI received from PE1 629 to ASBR2 with itself as the BGP nexthop. ASBR2 will send the 630 received NLRI from ASBR1 to PE3 and PE4 with itself as the BGP 631 nexthop. Since VPLS PEs use BGP Local Preference in DF election, for 632 backwards compatibility, ASBR2 MUST set the Local Preference value in 633 the VPLS advertisements it sends to PE3 and PE4 to the VPLS 634 preference value contained in the VPLS advertisement it receives from 635 ASBR1. ASBR1 MUST do the same for the NLRIs it sends to PE1 and PE2. 636 If ASBR1 receives a VPLS advertisement without a valid VPLS 637 preference from a PE within its AS, then ASBR1 MUST set the VPLS 638 preference in the advertisements to the Local Preference value before 639 sending it to ASBR2. Similarly, ASBR2 must do the same for 640 advertisements without VPLS Preference it receives from PEs within 641 its AS. Thus, in method (b), ASBRs MUST update the VPLS and Local 642 Preference based on the advertisements they receive either from an 643 ASBR or a PE within their AS. 
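The preference handling described above for method (b) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the Adv class and the function names are hypothetical, PE-ID is taken to be an IPv4 loopback address carried in the Route Origin community, and clamping a Local Preference above 2^16-1 when copying it into the 2-octet VP field is an assumption modelled on Table 1. The pref function follows Table 1 and winner follows the tie-breaking rules of Section 3.3.3.3.

```python
from dataclasses import dataclass, replace
from ipaddress import IPv4Address
from typing import Optional

VP_MAX = 0xFFFF  # VPLS Preference is a 2-octet field


@dataclass(frozen=True)
class Adv:
    pe_id: IPv4Address   # originating PE, e.g. from the Route Origin community
    lp: Optional[int]    # Local Preference; None when not carried
    vp: int = 0          # VPLS Preference; 0 means absent
    acs: int = 0         # 1 if all attachment circuits for the site are down


def asbr_to_remote_as(adv: Adv) -> Adv:
    """ASBR sending to the peer ASBR: fill in VP from LP if VP is absent."""
    vp = adv.vp if adv.vp > 0 else min(adv.lp or 0, VP_MAX)  # clamp is an assumption
    return replace(adv, vp=vp, lp=None)  # LP does not cross the AS boundary


def asbr_to_local_pes(adv: Adv) -> Adv:
    """ASBR sending to PEs in its own AS: set LP to the received VP."""
    return replace(adv, lp=adv.vp)


def pref(adv: Adv) -> int:
    """PREF as computed in Table 1."""
    lp = adv.lp or 0
    if adv.vp == 0:
        return min(lp, VP_MAX)
    return adv.vp if lp == adv.vp else 0


def winner(a: Adv, b: Adv) -> Adv:
    """Tie-breaking rules 1 to 3; returns a when both come from the same PE."""
    if a.acs != 1 and b.acs == 1:
        return a
    if a.acs == 1 and b.acs != 1:
        return b
    if pref(a) != pref(b):
        return a if pref(a) > pref(b) else b
    return a if a.pe_id <= b.pe_id else b


# PE1 and PE2 in AS1 advertise CE1 with Local Preference only (no VP).
pe1 = Adv(IPv4Address("192.0.2.1"), lp=200)
pe2 = Adv(IPv4Address("192.0.2.2"), lp=100)

# What PE3 and PE4 in AS2 see after ASBR1 and ASBR2 have processed the routes.
seen_in_as2 = [asbr_to_local_pes(asbr_to_remote_as(a)) for a in (pe1, pe2)]

df_in_as1 = winner(pe1, pe2).pe_id
df_in_as2 = winner(*seen_in_as2).pe_id
```

With these example values, the PEs in both ASes compute the same PREF for each advertisement and therefore pick the same designated forwarder, which is the property the translation rules are meant to preserve.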
645 In Figure 5, PE1 will send the VPLS advertisements with Route Origin 646 Extended Community containing its loopback address. PE2 will do the 647 same. Even though PE3 receives the VPLS advertisements for VE-ID 1 648 and 2 from the same BGP nexthop, ASBR2, the source PE address 649 contained in the Route Origin Extended Community is different for the 650 CE1 and CE2 advertisements, and thus, PE3 creates two PWs, one for 651 CE1 (for VE-ID 1) and another one for CE2 (for VE-ID 2). 653 4.3.2. Inter-AS Method (c): Multi-Hop EBGP Redistribution of VPLS 654 Information between ASes 656 In this method, there is a multi-hop E-BGP peering between the PEs or 657 Route Reflectors in AS1 and the PEs or Route Reflectors in AS2. 658 There is no VPLS state in either control or data plane on the ASBRs. 659 The multi-homing operations on the PEs in this method are exactly the 660 same as they are in the intra-AS scenario. However, since Local 661 Preference is not carried across an AS boundary, the translation of LP 662 to VP and vice versa MUST be done by the RR, if an RR is used to reflect 663 VPLS advertisements to other ASes. This is exactly the same as what 664 an ASBR does in case of method (b). An RR must set the VP to the LP 665 value in an advertisement before sending it to other ASes and must 666 set the LP to the VP value in an advertisement that it receives from 667 other ASes before sending to the PEs within the AS. 669 5. MAC Flush Operations 671 In a service provider VPLS network, customer MAC learning is confined 672 to PE devices and any intermediate nodes, such as a Route Reflector, 673 do not have any state for MAC addresses. 675 Topology changes either in the service provider's network or in 676 the customer's network can result in the movement of MAC addresses from 677 one PE device to another. Such events can result in traffic being 678 dropped due to stale state of MAC addresses on the PE devices. 
Age-out timers that clear the stale state will eventually restore traffic forwarding, but age-out timers are typically on the order of minutes, and convergence on the order of minutes can severely impact the customer's service. To handle such events and expedite convergence of traffic, flushing of the affected MAC addresses is highly desirable.

This section describes the scenarios where VPLS flush is desirable and the specific VPLS Flush TLVs that provide the capability to flush the affected MAC addresses on the PE devices. All operations described in this section are in the context of a particular VPLS domain and not across multiple VPLS domains. Mechanisms for MAC flush are described in [I-D.kothari-l2vpn-vpls-flush] for BGP-based VPLS and in [RFC4762] for LDP-based VPLS.

5.1. MAC List Flush

If multiple customer sites are connected to the same PE, PE1 as shown in Figure 2, and redundancy per site is desired when multi-homing procedures described in this document are in effect, then it is desirable to flush just the relevant MAC addresses from a particular site when the site connectivity is lost.

To flush a particular set of MAC addresses, a PE SHOULD originate a flush message with a MAC list that contains the MAC addresses that need to be flushed. In Figure 2, if connectivity between CE1 and PE1 goes down and if PE1 was the designated forwarder for CE1, PE1 MAY send a list of MAC addresses that belong to CE1 to all its BGP peers.

It is RECOMMENDED that, in case of excessive link flap of a customer attachment circuit in a short duration, a PE have a means to throttle advertisements of flush messages so that excessive flooding of such advertisements does not occur.

5.2. Implicit MAC Flush

Implicit MAC Flush refers to the use of BGP MH advertisements by the PEs to flush the MAC addresses learned from the previous designated forwarder.
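As a rough sketch of what this implies for a remote PE, the MAC table can be indexed by the PE from which each address was learned, so that everything learned from the previous designated forwarder can be removed in one step. The `MacTable` class and its method names are hypothetical and shown only for illustration; real forwarding tables are organized differently.

```python
from collections import defaultdict
from typing import Dict, Set


class MacTable:
    """Hypothetical per-VPLS MAC table on a remote PE, indexed by source PE."""

    def __init__(self) -> None:
        self._by_pe: Dict[str, Set[str]] = defaultdict(set)

    def learn(self, mac: str, pe: str) -> None:
        # A MAC is reachable via one PE at a time; relearning moves the entry.
        for macs in self._by_pe.values():
            macs.discard(mac)
        self._by_pe[pe].add(mac)

    def flush_pe(self, pe: str) -> int:
        """Remove every MAC learned from `pe`; return how many were removed."""
        return len(self._by_pe.pop(pe, set()))

    def macs_from(self, pe: str) -> Set[str]:
        """Return a copy of the MACs currently learned from `pe`."""
        return set(self._by_pe.get(pe, set()))


table = MacTable()
table.learn("00:00:5e:00:53:01", "PE2")
table.learn("00:00:5e:00:53:02", "PE2")
table.learn("00:00:5e:00:53:03", "PE4")

# A BGP MH advertisement indicates that PE2 is no longer the designated
# forwarder, so the remote PE drops everything it learned from PE2.
removed = table.flush_pe("PE2")
```

Entries learned from other PEs (PE4 in this sketch) are left untouched, which is the point of keying the table by source PE.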
In case of a failure, when connectivity to a customer site is lost, remote PEs learn that a particular site is no longer reachable. The local PE either withdraws the VPLS NLRI that it previously advertised for the site or sends a BGP update message for the site's VPLS NLRI with the 'D' bit set. In such cases, the remote PEs can flush all the MACs that were learned from the PE that reported the failure.

However, when a designated forwarder change occurs in the absence of failures, such as when an attachment circuit comes up, the BGP MH advertisement from the PE reporting the change is not sufficient for MAC flush procedures. Consider the case in Figure 2 where the PE1-CE1 link is non-operational and PE2 is the designated forwarder for CE1. Also assume that the Local Preference of PE1 is higher than that of PE2. When the PE1-CE1 link becomes operational, PE1 will send a BGP MH advertisement to all its peers. If PE3 elects PE1 as the new designated forwarder for CE1 and as a result flushes all the MACs learned from PE2 before PE2 elects itself as the non-designated forwarder, there is a chance that PE3 might again learn MAC addresses from PE2 and as a result may black-hole traffic until those MAC addresses are deleted by age-out timers.

A designated forwarder must set the F bit and a non-designated forwarder must clear the F bit when sending BGP MH advertisements. A state transition from one to zero for the F bit can be used by a remote PE to flush all the MACs learned from the PE that is transitioning from designated forwarder to non-designated forwarder.

5.3. Minimizing the effects of fast link transitions

Certain failure scenarios may result in fast transitions of the link towards the multi-homing CE, which in turn will generate fast status transitions of one or multiple multi-homed sites, reflected through multiple BGP MH advertisements and LDP MAC Flush messages.

It is recommended that a timer to damp the link flaps be used for the port towards the multi-homed CE to minimize the number of MAC Flush events in the remote PEs and the occurrences of BGP state compressions for F bit transitions. A timer value greater than the time it takes BGP to converge in the network is recommended.

6. Backwards Compatibility

No forwarding loops are formed when PEs or Route Reflectors that do not support the procedures defined in this section coexist in the network with PEs or Route Reflectors that do support them.

6.1. BGP based VPLS

As explained in this section, PEs multi-homed to the same customer site MUST assign the same MH-ID, and the related NLRI SHOULD contain the block offset, block size and label base set to zero. Remote PEs that lack support of the multi-homing operations specified in this document will fail to create any PWs for the multi-homed MH-IDs due to the label value of zero, and thus the multi-homing NLRI should have no impact on the operation of such remote PEs.

For compatibility with PEs that use multiple VE-IDs with non-zero label block values for multi-homing operation, it is a requirement that a PE receiving such advertisements must use the labels in the NLRIs associated with the lowest VE-ID for PW creation. It is possible that maintaining PW association with the lowest VE-ID can result in PW flap, and thus, traffic loss.
However, it is necessary to maintain the association of the PW with the lowest VE-ID as it provides deterministic DF election among all the VPLS PEs.

6.2. LDP VPLS with BGP Auto-discovery

The BGP-AD NLRI has a prefix length of 12, containing only an 8-byte RD and a 4-byte VSI-ID. If an LDP VPLS PE running BGP-AD lacks support of the multi-homing operations specified in this document, it SHOULD ignore an MH NLRI with a length field of 17. As a result, it will not ask LDP to create any PWs for the multi-homed Site-ID, and thus the multi-homing NLRI should have no impact on LDP VPLS operation. MH PEs may use existing LDP MAC Flush to flush the remote LDP VPLS PEs or may use the implicit MAC Flush procedure.

7. Security Considerations

No new security issues are introduced beyond those that are described in [RFC4761] and [RFC4762].

8. IANA Considerations

At this time, this memo includes no request to IANA.

9. Acknowledgments

The authors would like to thank Yakov Rekhter, Nischal Sheth, Mitali Singh and Ian Cowburn for their insightful comments and probing questions.

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007.

[RFC6074] Rosen, E., Davie, B., Radoaca, V., and W. Luo, "Provisioning, Auto-Discovery, and Signaling in Layer 2 Virtual Private Networks (L2VPNs)", RFC 6074, January 2011.

10.2. Informative References

[I-D.kothari-l2vpn-vpls-flush] Kothari, B. and R. Fernando, "VPLS Flush in BGP-based Virtual Private LAN Service", draft-kothari-l2vpn-vpls-flush-00 (work in progress), October 2008.

[I-D.kothari-l2vpn-auto-site-id] Kothari, B., Kompella, K., and T.
IV, "Automatic Generation of Site IDs for Virtual Private LAN Service", draft-kothari-l2vpn-auto-site-id-01 (work in progress), October 2008.

[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended Communities Attribute", RFC 4360, February 2006.

[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route Reflection: An Alternative to Full Mesh Internal BGP (IBGP)", RFC 4456, April 2006.

[RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762, January 2007.

[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.

Authors' Addresses

Bhupesh Kothari
Cohere Networks
295 Santa Ana Court
Sunnyvale, CA 94085
US

Email: bhupesh@cohere.net

Kireeti Kompella
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
US

Email: kireeti.kompella@gmail.com

Wim Henderickx
Alcatel-Lucent

Email: wim.henderickx@alcatel-lucent.be

Florin Balus
Alcatel-Lucent

Email: florin.balus@alcatel-lucent.com

James Uttaro
AT&T
200 S. Laurel Avenue
Middletown, NJ 07748
US

Email: uttaro@att.com

Senad Palislamovic
Alcatel-Lucent

Email: senad.palislamovic@alcatel-lucent.com

Wen Lin
Juniper Networks

Email: wlin@juniper.net