BESS Workgroup                                           J. Rabadan, Ed.
Internet Draft                                                     Nokia
                                                         S. Mohanty, Ed.
Intended status: Standards Track                              A. Sajassi
                                                                   Cisco
                                                                J. Drake
                                                                 Juniper
                                                              K. Nagaraj
                                                            S. Sathappan
                                                                   Nokia

Expires: November 25, 2018                                  May 24, 2018


     Framework for EVPN Designated Forwarder Election Extensibility
             draft-ietf-bess-evpn-df-election-framework-03

Abstract

   The Designated Forwarder (DF) in EVPN networks is the PE responsible
   for sending broadcast, unknown unicast and multicast (BUM) traffic to
   a multi-homed CE, on a given VLAN on a particular Ethernet Segment
   (ES). The DF is selected out of a list of candidate PEs that
   advertise the same Ethernet Segment Identifier (ESI) to the EVPN
   network. By default, EVPN uses a DF Election algorithm referred to as
   "Service Carving" and it is based on a modulus function (V mod N)
   that takes the number of PEs in the ES (N) and the VLAN value (V) as
   input. This default DF Election algorithm has some inefficiencies
   that this document addresses by defining a new DF Election algorithm
   and a capability to influence the DF Election result for a VLAN,
   depending on the state of the associated Attachment Circuit (AC). In
   addition, this document creates a registry with IANA, for future DF
   Election Algorithms and Capabilities. It also presents a formal
   definition and clarification of the DF Election Finite State Machine.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on November 24, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Conventions and Terminology
   2. Introduction
     2.1. Default Designated Forwarder (DF) Election in EVPN
     2.2. Problem Statement
       2.2.1. Unfair Load-Balancing and Service Disruption
       2.2.2. Traffic Black-Holing on Individual AC Failures
     2.3. The Need for Extending the Default DF Election in EVPN
   3. Designated Forwarder Election Protocol and BGP Extensions
     3.1 The DF Election Finite State Machine (FSM)
     3.2 The DF Election Extended Community
     3.3 Auto-Derivation of ES-Import Route Target
   4. The Highest Random Weight DF Election Type
     4.1. HRW and Consistent Hashing
     4.2. HRW Algorithm for EVPN DF Election
   5. The Attachment Circuit Influenced DF Election Capability
     5.1. AC-Influenced DF Election Capability For VLAN-Aware
          Bundle Services
   6. Solution Benefits
   7. Security Considerations
   8. IANA Considerations
   9. References
     9.1. Normative References
     9.2. Informative References
   10. Acknowledgments
   11. Contributors
   Authors' Addresses

1. Conventions and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   o AC and ACS - Attachment Circuit and Attachment Circuit Status. An
     AC has an Ethernet Tag associated to it.

   o BUM - refers to the Broadcast, Unknown unicast and Multicast
     traffic.

   o DF, NDF and BDF - Designated Forwarder, Non-Designated Forwarder
     and Backup Designated Forwarder

   o Ethernet A-D per ES route - refers to [RFC7432] route type 1 or
     Auto-Discovery per Ethernet Segment route.

   o Ethernet A-D per EVI route - refers to [RFC7432] route type 1 or
     Auto-Discovery per EVPN Instance route.

   o ES and ESI - Ethernet Segment and Ethernet Segment Identifier.

   o EVI - EVPN Instance.

   o BD - Broadcast Domain. An EVI may be comprised of one (VLAN-Based
     or VLAN-Bundle services) or multiple (VLAN-Aware Bundle services)
     Broadcast Domains.

   o HRW - Highest Random Weight

   o VID and CE-VID - VLAN Identifier and Customer Equipment VLAN
     Identifier.

   o Ethernet Tag - used to represent a Broadcast Domain that is
     configured on a given ES for the purpose of DF election. Note that
     any of the following may be used to represent a Broadcast Domain:
     VIDs (including double Q-in-Q tags), configured IDs, VNI,
     normalized VID, I-SIDs, etc., as long as the representation of the
     broadcast domains is configured consistently across the multi-homed
     PEs attached to that ES.

   o DF Election Procedure and DF Algorithm - The Designated Forwarder
     Election Procedure, or simply DF Election, refers to the process in
     its entirety, including the discovery of the PEs in the ES, the
     creation and maintenance of the PE candidate list and the selection
     of a PE. The Designated Forwarder Algorithm is just a component of
     the DF Election Procedure and strictly refers to the selection of a
     PE for a given <ES, Ethernet Tag>.

   This document also assumes familiarity with the terminology of
   [RFC7432].

2. Introduction

2.1. Default Designated Forwarder (DF) Election in EVPN

   [RFC7432] defines the Designated Forwarder (DF) as the EVPN PE
   responsible for:

   o Flooding Broadcast, Unknown unicast and Multicast traffic (BUM), on
     a given Ethernet Tag on a particular Ethernet Segment (ES), to the
     CE. This is valid for single-active and all-active EVPN
     multi-homing.

   o Sending unicast traffic on a given Ethernet Tag on a particular ES
     to the CE. This is valid for single-active multi-homing.

   Figure 1 illustrates an example that we will use to explain the
   Designated Forwarder function.

                      +---------------+
                      |    IP/MPLS    |
                      |     CORE      |
        +----+ ES1 +----+           +----+
        | CE1|-----|    |-----------|    |____ES2
        +----+     | PE1|           | PE2|    \
                   |    |--------   +----+     \+----+
                   +----+        |    |         | CE2|
                      |          |  +----+     /+----+
                      |          |__|    |____/   |
                      |             | PE3|  ES2   /
                      |             +----+       /
                      |               |         /
                      +-------------+----+     /
                                    | PE4|____/ES2
                                    |    |
                                    +----+

                 Figure 1 Multi-homing Network of EVPN

   Figure 1 illustrates a case where there are two Ethernet Segments,
   ES1 and ES2. PE1 is attached to CE1 via Ethernet Segment ES1 whereas
   PE2, PE3 and PE4 are attached to CE2 via ES2, i.e., PE2, PE3 and PE4
   form a redundancy group. Since CE2 is multi-homed to different PEs on
   the same Ethernet Segment, it is necessary for PE2, PE3 and PE4 to
   agree on a DF to satisfy the above-mentioned requirements.

   Layer-2 devices are particularly susceptible to forwarding loops
   because of the broadcast nature of the Ethernet traffic. Therefore it
   is very important that, in case of multi-homing, only one of the
   links be used to direct traffic to/from the core.

   One of the pre-requisites for this support is that participating PEs
   must agree amongst themselves as to who would act as the Designated
   Forwarder (DF). This needs to be achieved through a distributed
   algorithm in which each participating PE independently and
   unambiguously selects one of the participating PEs as the DF, and
   all PEs must arrive at the same result.

   The default algorithm for DF election defined by [RFC7432] at the
   granularity of (ESI,EVI) is referred to as "service carving". In this
   document, the terms "service carving" and "default DF Election
   algorithm" are used interchangeably. With service carving, it is
   possible to elect multiple DFs per Ethernet Segment (one per EVI) in
   order to perform load-balancing of traffic destined to a given
   Segment.
   The objective is that the load-balancing procedures should carve up
   the BD space among the redundant PE nodes evenly, in such a way that
   every PE is the DF for a disjoint set of EVIs.

   The DF Election algorithm as described in [RFC7432] (Section 8.5) is
   based on a modulus operation. The PEs to which the ES (for which DF
   election is to be carried out per VLAN) is multi-homed form an
   ordered (ordinal) list in ascending order of the PE IP address
   values. For example, there are N PEs: PE0, PE1,... PE(N-1) ranked as
   per increasing IP addresses in the ordinal list; then for each VLAN
   with Ethernet Tag V, configured on the Ethernet Segment ES1, PEx is
   the DF for VLAN V on ES1 when x equals (V mod N). In the case of
   VLAN-Bundle only the lowest VLAN is used. In the case when the
   planned density is high (meaning there are a significant number of
   VLANs and the Ethernet Tags are uniformly distributed), the thinking
   is that the DF Election will be spread across the PEs hosting that
   Ethernet Segment and good service carving can be achieved.

   However, the described default DF Election algorithm has some
   undesirable properties and in some cases can be somewhat disruptive
   and unfair. This document describes some of those issues and proposes
   mechanisms for dealing with them. These mechanisms do involve
   changes to the default DF Election algorithm, but they do not require
   any changes to the EVPN Route exchange and have minimal changes to
   their content per se.

   In addition, there is a need to extend the DF Election procedures so
   that new algorithms and capabilities are possible. A single algorithm
   (the default DF Election algorithm) may not meet the requirements in
   all the use-cases.

   Note that while [RFC7432] elects a DF per <ESI, EVI>, this document
   elects a DF per <ESI, BD>.
   This means that unlike [RFC7432], where for a VLAN Aware Bundle
   service EVI there is only one DF for the EVI, this document specifies
   that there will be multiple DFs, one for each BD configured in that
   EVI.

2.2. Problem Statement

   This section describes some potential issues with the default DF
   Election algorithm.

2.2.1. Unfair Load-Balancing and Service Disruption

   There are three fundamental problems with the current default DF
   Election algorithm.

   1- First, the algorithm will not perform well when the Ethernet Tag
      follows a non-uniform distribution, for instance when the Ethernet
      Tags are all even or all odd. In such a case, assume that the ES
      is multi-homed to two PEs; all the VLANs will pick only one of the
      PEs as the DF. This is very sub-optimal. It defeats the purpose of
      service carving, as the DFs are not evenly spread across the PEs.
      In fact, in this particular case, one of the PEs is never elected
      as DF, so it does not take on any of the DF responsibilities.
      Consider another example where, referring to Figure 1, PE2, PE3
      and PE4 are in ascending order of IP address, and each VLAN
      configured on ES2 is associated with an Ethernet Tag of the form
      (3x+1), where x is an integer. This will result in PE3 always
      being selected as the DF.

   2- Even when the Ethernet Tag distribution is uniform, a PE going up
      or down results in re-computation (v mod (N-1) or v mod (N+1), as
      the case may be); the resulting modulus value need not be
      uniformly distributed because it can be subject to the primality
      of N-1 or N+1, as the case may be.

   3- The third problem is one of disruption. Consider a case when the
      same Ethernet Segment is multi-homed to a set of PEs.
      When the ES is down in one of the PEs, say PE1, or PE1 itself
      reboots, or the BGP process goes down or the connectivity between
      PE1 and an RR goes down, the effective number of PEs in the system
      now becomes N-1, and DFs are re-computed for all the VLANs that
      are configured on that Ethernet Segment. In general, if the DF for
      a VLAN v happens not to be PE1, but some other PE, say PE2, it is
      likely that some other PE will become the new DF. This is not
      desirable. Similarly when a new PE hosts the same Ethernet
      Segment, the mapping again changes because of the modulus
      operation. This results in needless churn. Again referring to
      Figure 1, say v1, v2 and v3 are VLANs configured on ES2 with
      associated Ethernet Tags of value 999, 1000 and 1001 respectively.
      So PE2, PE3 and PE4 are the DFs for v1, v2 and v3 respectively.
      Now when PE4 goes down, PE3 will become the DF for v1 and PE2 will
      become the DF for v2.

   One point to note is that the default DF election algorithm assumes
   that all the PEs that are multi-homed to the same Ethernet Segment
   (and interested in the DF Election by exchanging EVPN routes) use an
   Originating Router's IP Address of the same family. This does not
   need to be the case as the EVPN address-family can be carried over a
   v4 or v6 peering, and the PEs attached to the same ES may use an
   address of either family.

   Mathematically, a conventional hash function maps a key k to a number
   i representing one of m hash buckets through a function h(k), i.e.,
   i=h(k). In the EVPN case, h is simply a modulo-m hash function, viz.
   h(v) = v mod N, where N is the number of PEs that are multi-homed to
   the Ethernet Segment in discussion. It is well-known that for good
   hash distribution using the modulus operation, the modulus N should
   be a prime-number not too close to a power of 2 [CLRS2009].
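
   As a non-normative illustration of the modulus-based service carving
   discussed above, the following Python sketch orders the candidate PEs
   by IP address and picks the DF as (V mod N). The function names
   default_df and needless_moves, and the 192.0.2.x addresses standing
   in for PE2, PE3 and PE4 (the PEs attached to ES2 in Figure 1), are
   assumptions made for this example, and all addresses are assumed to
   belong to one address family.

```python
import ipaddress


def default_df(pe_addresses, ethernet_tag):
    """Pick the DF for one Ethernet Tag with modulus-based service carving."""
    # Ordinal list: candidate PEs in ascending order of IP address.
    # Assumption: all addresses belong to the same family (all IPv4 here).
    ordered = sorted(pe_addresses, key=ipaddress.ip_address)
    # The PE at position (V mod N) in the ordinal list is the DF.
    return ordered[ethernet_tag % len(ordered)]


def needless_moves(before, after, ethernet_tags):
    """Tags whose DF changes although the previous DF is still a candidate."""
    moved = []
    for tag in ethernet_tags:
        old_df = default_df(before, tag)
        if old_df in after and default_df(after, tag) != old_df:
            moved.append(tag)
    return moved


# Hypothetical addresses standing in for PE2, PE3 and PE4 on ES2.
PE2, PE3, PE4 = "192.0.2.2", "192.0.2.3", "192.0.2.4"

# Ethernet Tags of the form (3x+1) all elect the same PE.
skewed = {default_df([PE2, PE3, PE4], 3 * x + 1) for x in range(100)}

# DFs for tags 999, 1000 and 1001 before and after PE4 leaves the ES.
before = [default_df([PE2, PE3, PE4], tag) for tag in (999, 1000, 1001)]
after = [default_df([PE2, PE3], tag) for tag in (999, 1000, 1001)]
churn = needless_moves([PE2, PE3, PE4], [PE2, PE3], (999, 1000, 1001))
```

   In this sketch every tag of the form (3x+1) lands on the second PE of
   the ordinal list, and removing the last PE moves the DF for tags 999
   and 1000 even though their previous DFs are still candidates.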

   When the effective number of PEs changes from N to N-1 (or vice
   versa), all the objects (VLAN V) will be remapped except those for
   which V mod N and V mod (N-1) refer to the same PE in the previous
   and subsequent ordinal rankings respectively. From a forwarding
   perspective, this is a churn, as it results in programming the PE
   side ports as blocking or non-blocking at potentially all PEs when
   the DF changes.

   This document addresses this problem and furnishes a solution to this
   undesirable behavior.

2.2.2. Traffic Black-Holing on Individual AC Failures

   As discussed in section 2.1, the default DF Election algorithm
   defined by [RFC7432] takes into account only two variables in the
   modulus function for a given ES: the existence of the PE's IP address
   on the candidate list and the locally provisioned Ethernet Tags.

   If the DF for an <ESI, EVI> fails (due to physical link/node
   failures), an ES route withdrawal will make the Non-DF (NDF) PEs
   re-elect the DF for that <ESI, EVI> and the service will be
   recovered.

   However, the default DF election procedure does not provide
   protection against "logical" failures or human errors that may occur
   at service level on the DF, while the list of active PEs for a given
   ES does not change. These failures may have an impact not only on the
   local PE where the issue happens, but also on the rest of the PEs of
   the ES. Some examples of such logical failures are listed below:

   a) A given individual Attachment Circuit (AC) defined in an ES is
      accidentally shut down or even not provisioned yet (hence the
      Attachment Circuit Status - ACS - is DOWN), while the ES is
      operationally active (since the ES route is active).

   b) A given MAC-VRF - with a defined ES - is shut down or not
      provisioned yet, while the ES is operationally active (since the
      ES route is active). In this case, the ACS of all the ACs defined
      in that MAC-VRF is considered to be DOWN.

   Neither (a) nor (b) will trigger the DF re-election on the remote
   multi-homed PEs for a given ES since the ACS is not taken into
   account in the DF election procedures. While the ACS is used as a DF
   election tie-breaker and trigger in VPLS multi-homing procedures
   [VPLS-MH], there is no procedure defined in EVPN [RFC7432] to trigger
   the DF re-election based on the ACS change on the DF.

   Figure 2 illustrates the described issue with an example.

                            +---+
                            |CE4|
                            +---+
                              |
                         PE4  |
                        +-----+-----+
        +---------------|  +-----+  |---------------+
        |               |  | BD-1|  |               |
        |               +-----------+               |
        |                                           |
        |                   EVPN                    |
        |                                           |
        | PE1                PE2               PE3  |
        | (NDF)              (DF)              (NDF)|
    +-----------+       +-----------+       +-----------+
    |  | BD-1|  |       |  | BD-1|  |       |  | BD-1|  |
    |  +-----+  |-------|  +-----+  |-------|  +-----+  |
    +-----------+       +-----------+       +-----------+
           AC1\   ES12   /AC2  AC3\   ES23   /AC4
               \        /          \        /
                \      /            \      /
                 +----+              +----+
                 |CE12|              |CE23|
                 +----+              +----+

         Figure 2 Default DF Election and Traffic Black-Holing

   BD-1 is defined in PE1, PE2, PE3 and PE4. CE12 is a multi-homed CE
   connected to ES12 in PE1 and PE2. Similarly CE23 is multi-homed to
   PE2 and PE3 using ES23. Both CE12 and CE23 are connected to BD-1
   through VLAN-based service interfaces: CE12-VID 1 (VLAN ID 1 on CE12)
   is associated to AC1 and AC2 in BD-1, whereas CE23-VID 1 is
   associated to AC3 and AC4 in BD-1. Assume that, although not
   represented, there are other ACs defined on these ES mapped to
   different BDs.

   After running the [RFC7432] default DF election algorithm, PE2 turns
   out to be the DF for ES12 and ES23 in BD-1. The following issues may
   arise:

   a) If AC2 is accidentally shut down or even not configured, CE12
      traffic will be impacted. In case of all-active multi-homing, the
      BUM traffic to CE12 will be "black-holed", whereas for
      single-active multi-homing, all the traffic to/from CE12 will be
      discarded.
      This is due to the fact that a logical failure in PE2's AC2 may
      not trigger an ES route withdrawal for ES12 (since there are still
      other ACs active on ES12) and therefore PE1 will not re-run the DF
      election procedures.

   b) If the Bridge Table for BD-1 is administratively shut down or even
      not configured yet on PE2, CE12 and CE23 will both be impacted:
      BUM traffic to both CEs will be discarded in case of all-active
      multi-homing and all traffic will be discarded to/from the CEs in
      case of single-active multi-homing. This is due to the fact that
      PE1 and PE3 will not re-run the DF election procedures and will
      keep assuming PE2 is the DF.

   Quoting [RFC7432], "when an Ethernet Tag is decommissioned on an
   Ethernet Segment, then the PE MUST withdraw the Ethernet A-D per EVI
   route(s) announced for the <ESI, Ethernet Tags> that are impacted by
   the decommissioning". However, while this A-D per EVI route
   withdrawal is used at the remote PEs performing aliasing or backup
   procedures, it is not used to influence the DF election for the
   affected EVIs.

   This document adds an optional modification of the DF Election
   procedure so that the ACS may be taken into account as a variable in
   the DF election, and therefore EVPN can provide protection against
   logical failures.

2.3. The Need for Extending the Default DF Election in EVPN

   Section 2.2 describes some of the issues that exist in the default DF
   Election procedures. In order to address those issues, this document
   introduces a new DF Election framework. This framework allows the PEs
   to agree on a common DF election type, as well as the capabilities to
   enable during the DF Election procedure. In general, "DF Election
   Type" refers to the type of DF election algorithm that takes a number
   of parameters as input and determines the DF PE.
   A "DF Election capability" refers to an additional feature that can
   be executed along with the DF election algorithm, such as modifying
   the inputs (or list of candidate PEs) before the DF Election
   algorithm chooses the DF.

   Within this framework, this document defines a new DF Election
   algorithm and a new capability that can influence the DF Election
   result:

   o The new DF Election algorithm is referred to as "Highest Random
     Weight" (HRW). The HRW procedures are described in section 4.

   o The new DF Election capability is referred to as "AC-Influenced DF
     Election" (AC-DF). The AC-DF procedures are described in section 5.

   o HRW and AC-DF mechanisms are independent of each other. Therefore,
     a PE MAY support either HRW or AC-DF independently or MAY support
     both of them together. A PE MAY also support AC-DF capability along
     with the default DF election algorithm per [RFC7432].

   In addition, this document defines a way to indicate the support of
   HRW and/or AC-DF along with the EVPN ES routes advertised for a given
   ES. Refer to section 3.2 for more details.

3. Designated Forwarder Election Protocol and BGP Extensions

   This section describes the BGP extensions required to support the new
   DF Election procedures. In addition, since the specification in EVPN
   [RFC7432] does leave several questions open as to the precise final
   state machine behavior of the DF election, section 3.1 describes
   precisely the intended behavior.

3.1 The DF Election Finite State Machine (FSM)

   Per [RFC7432], the FSM described in Figure 3 is executed per
   <ESI,VLAN> in case of VLAN-based service or <ESI,[VLANs in
   VLAN-Bundle]> in case of VLAN-Bundle on each participating PE.
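
   The states, events and transitions that the rest of this section
   defines (Figure 3 and the lists that follow it) can be summarized as
   a transition table. The Python sketch below is illustrative only: the
   names State, Event, TRANSITIONS and next_state are hypothetical,
   timers and the election itself are omitted, and it assumes that any
   (state, event) pair without a listed transition leaves the state
   unchanged and that ES_DOWN returns the FSM to INIT as drawn in
   Figure 3.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    DF_WAIT = auto()
    DF_CALC = auto()
    DF_DONE = auto()


class Event(Enum):
    ES_UP = auto()
    ES_DOWN = auto()
    VLAN_CHANGE = auto()
    DF_TIMER = auto()
    RCVD_ES = auto()
    LOST_ES = auto()
    CALCULATED = auto()


# Only the state-changing transitions are listed; every other pair is
# treated as "do nothing" (an assumption of this sketch).
TRANSITIONS = {
    (State.INIT, Event.ES_UP): State.DF_WAIT,
    (State.DF_WAIT, Event.DF_TIMER): State.DF_CALC,
    (State.DF_CALC, Event.RCVD_ES): State.DF_WAIT,
    (State.DF_CALC, Event.CALCULATED): State.DF_DONE,
    (State.DF_DONE, Event.VLAN_CHANGE): State.DF_CALC,
    (State.DF_DONE, Event.LOST_ES): State.DF_CALC,
    (State.DF_DONE, Event.RCVD_ES): State.DF_WAIT,
}


def next_state(state, event):
    """Return the state reached from `state` when `event` occurs."""
    # ES_DOWN applies in any state; Figure 3 draws it as an arrow to INIT.
    if event is Event.ES_DOWN:
        return State.INIT
    return TRANSITIONS.get((state, event), state)


# Walk one election: ES comes up, a route arrives during the wait, the
# DF Wait timer fires, and the calculation completes.
state = State.INIT
for event in (Event.ES_UP, Event.RCVD_ES, Event.DF_TIMER, Event.CALCULATED):
    state = next_state(state, event)
```

   Keeping the transitions in a table makes it easy to compare an
   implementation against the action list one entry at a time.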

   Observe that currently the VLANs are derived from local configuration
   and the FSM does not provide any protection against misconfiguration
   where the same (EVI,ESI) combination has a different set of VLANs on
   different participating PEs or one of the PEs elects to consider
   VLANs as VLAN-Bundle and another as separate VLANs for election
   purposes (service type mismatch).

   The FSM is conceptual and any design or implementation MUST comply
   with a behavior equivalent to the one outlined in this FSM.

                                              LOST_ES
                   RCVD_ES                    RCVD_ES
                   LOST_ES                    +----+
                   +----+                     |    v
                   |    |                    ++----++  RCVD_ES
                   |  +-+----+     ES_UP     |  DF  +<--------+
                   +->+ INIT +---------------> WAIT |         |
                      ++-----+               +----+-+         |
                       ^                          |           |
   +-----------+       |                          |DF_TIMER   |
   | ANY STATE +-------+        VLAN_CHANGE       |           |
   +-----------+ ES_DOWN    +-----------------+   |           ^
                            |   LOST_ES       v   v           |
                      +-----++               ++---+-+         |
                      |  DF  |               |  DF  +---------+
                      | DONE +<--------------+ CALC +v-+      |
                      +-+----+  CALCULATED   +----+-+  |      |
                        |                         |    |      |
                        |                         +----+      |
                        |                         LOST_ES     |
                        |                         VLAN_CHANGE |
                        |                                     |
                        +-------------------------------------+

               Figure 3 DF Election Finite State Machine

   States:

   1. INIT: Initial State

   2. DF_WAIT: State in which the participant waits for enough
      information to perform the DF election for the EVI/ESI/VLAN
      combination.

   3. DF_CALC: State in which the new DF is recomputed.

   4. DF_DONE: State in which the DF for the EVI/ESI/VLAN combination
      has been elected.

   Events:

   1. ES_UP: The ESI has been locally configured as 'up'.

   2. ES_DOWN: The ESI has been locally configured as 'down'.

   3. VLAN_CHANGE: The VLANs configured in a bundle (that uses the ESI)
      changed. This event is necessary for VLAN-Bundles only.

   4. DF_TIMER: DF Wait timer has expired.

   5. RCVD_ES: A new or changed Ethernet Segment Route is received in a
      BGP REACH UPDATE. Receiving an unchanged UPDATE MUST NOT trigger
      this event.

   6. LOST_ES: A BGP UNREACH UPDATE for a previously received Ethernet
      Segment route has been received. If an UNREACH is seen for a
      route that has not been advertised previously, the event MUST NOT
      be triggered.

   7. CALCULATED: DF has been successfully calculated.

   Corresponding actions when transitions are performed or states
   entered/exited:

   1. ANY STATE on ES_DOWN: (i) stop DF timer (ii) assume non-DF for
      local PE.

   2. INIT on ES_UP: transition to DF_WAIT.

   3. INIT on RCVD_ES, LOST_ES: do nothing.

   4. DF_WAIT on entering the state: (i) start DF timer if not started
      already or expired (ii) assume non-DF for local PE.

   5. DF_WAIT on RCVD_ES, LOST_ES: do nothing.

   6. DF_WAIT on DF_TIMER: transition to DF_CALC.

   7. DF_CALC on entering or re-entering the state: (i) rebuild
      candidate list, hash and perform election (ii) Afterwards FSM
      generates CALCULATED event against itself.

   8. DF_CALC on LOST_ES or VLAN_CHANGE: do nothing.

   9. DF_CALC on RCVD_ES: transition to DF_WAIT.

   10. DF_CALC on CALCULATED: mark election result for VLAN or bundle,
       and transition to DF_DONE.

   11. DF_DONE on exiting the state: (i) if [RFC7432] election or new
       election and lost primary DF then assume non-DF for local PE for
       VLAN or VLAN-Bundle.

   12. DF_DONE on VLAN_CHANGE or LOST_ES: transition to DF_CALC.

   13. DF_DONE on RCVD_ES: transition to DF_WAIT.

3.2 The DF Election Extended Community

   For the DF election procedures to be globally consistent and
   unanimous, it is necessary that all the participating PEs agree on
   the DF Election type and capabilities to be used. For instance, it is
   not possible that some PEs continue to use the default DF Election
   algorithm and some PEs use HRW. For brown-field deployments and for
   interoperability with legacy boxes, it is important that all PEs
   have the capability to fall back on the Default DF Election.
   A PE can indicate its willingness to support HRW and/or AC-DF by
   signaling a DF Election Extended Community along with the Ethernet
   Segment Route (Type-4).

   The DF Election Extended Community is a new BGP transitive extended
   community attribute [RFC4360] that is defined to identify the DF
   election procedure to be used for the Ethernet Segment. Figure 4
   shows the encoding of the DF Election Extended Community.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type(0x06)|   DF Type     |    Bitmap     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Reserved = 0                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 4 DF Election Extended Community

   Where:

   o Type is 0x06 as registered with IANA for EVPN Extended Communities.

   o Sub-Type is 0x06 - "DF Election Extended Community" as requested by
     this document to IANA.

   o DF Type (1 octet) - Encodes the DF Election algorithm values
     (between 0 and 255) that the advertising PE desires to use for the
     ES. This document requests IANA to set up a registry called "DF
     Type Registry" and solicits the following values:

     - Type 0: Default DF Election algorithm, or modulus-based algorithm
       as in [RFC7432].

     - Type 1: HRW algorithm (explained in this document).

     - Types 2-254: Unassigned.

     - Type 255: Reserved for Experimental Use.

   o Bitmap (1 octet) - Encodes "capabilities" associated to the DF
     Election algorithm in the field "DF Type". This document requests
     IANA to create a registry for the Bitmap field, called "DF Election
     Capabilities" and solicits the following values:

     - Bit 24: Unassigned.

     - Bit 25: AC-DF (AC-Influenced DF Election, explained in this
       document). When set to 1, it indicates the desire to use
       AC-Influenced DF Election with the rest of the PEs in the ES.

     - Bits 26-31: Unassigned.

   The DF Election Extended Community is used as follows:

   o A PE SHOULD attach the DF Election Extended Community to any
     advertised ES route and the Extended Community MUST be sent if the
     ES is locally configured with a DF election type different from the
     Default Election algorithm or if a capability is required to be
     used. In the Extended Community, the PE indicates the desired "DF
     Type" algorithm and "Bitmap" capabilities to be used for the ES.

     - Only one DF Election Extended Community can be sent along with an
       ES route. Note that the intent is not for the advertising PE to
       indicate all the supported DF Types and capabilities, but to
       signal the preferred ones.

     - DF Types 0 and 1 can both be used with bit AC-DF set to 0 or 1.

     - In general, a specific DF Type MAY determine the use of the
       reserved bits in the Extended Community. In case of DF Type HRW,
       the reserved bits will be sent as 0 and will be ignored on
       reception.

   o When a PE receives the ES Routes from all the other PEs for the ES
     in question, it checks to see if all the advertisements have the
     extended community with the same DF Type and Bitmap:

     - In the case that they do, this particular PE MUST follow the
       procedures for the advertised DF Type and capabilities. For
       instance, if all ES routes for a given ES indicate DF Type HRW
       and AC-DF set to 1, the receiving PE and by induction all the
       other PEs in the ES will proceed to do DF Election as per the HRW
       Algorithm and following the AC-DF procedures.

     - Otherwise if even a single advertisement for the type-4 route is
       not received with the locally configured DF Type and capability,
       the default DF Election (modulus) algorithm MUST be used as in
       [RFC7432].

     - The absence of the DF Election Extended Community MUST be
       interpreted by a receiving PE as an indication of the default DF
       Election algorithm on the sending PE, that is, DF Type 0 and no
       DF Election capabilities.

   o When all the PEs in an ES advertise DF Type 255, they will rely on
     the local policy to decide how to proceed with the DF Election.

   o For any new capability defined in the future, the
     applicability/compatibility of this new capability to the existing
     DF types must be assessed on a case-by-case basis.

   o Likewise, for any new DF type defined in the future, its
     applicability/compatibility to the existing capabilities must be
     assessed on a case-by-case basis.

3.3 Auto-Derivation of ES-Import Route Target

   Section 7.6 of [RFC7432] describes how the value of the ES-Import
   Route Target for ESI types 1, 2, and 3 can be auto-derived by using
   the high-order six bytes of the nine-byte ESI value. The same auto-
   derivation procedure can be extended to ESI types 0, 4, and 5 as long
   as it is ensured that the auto-derived values for ES-Import RT among
   different ES types don't overlap.

4. The Highest Random Weight DF Election Type

   The procedure discussed in this section is applicable to the DF
   Election in EVPN Services [RFC7432] and EVPN Virtual Private Wire
   Services [RFC8214].

   Highest Random Weight (HRW), as defined in [HRW1999], was originally
   proposed in the context of Internet caching and proxy server load
   balancing. Given an object name and a set of servers, HRW maps a
   request to a server using the object-name (object-id) and server-name
   (server-id) rather than the state of the servers. HRW forms a hash
   out of the server-id and the object-id and forms an ordered list of
   the servers for the particular object-id.
   The server for which the hash value is highest serves as the primary
   server responsible for that particular object, and the server with
   the next highest value in that hash serves as the backup server. HRW
   always maps a given object name to the same server within a given
   cluster; consequently it can be used at client sites to achieve
   global consensus on object-server mappings. When that server goes
   down, the backup server takes over responsibility for the object.

   Choosing an appropriate hash function that is statistically oblivious
   to the key distribution and imparts a good uniform distribution of
   the hash output is an important aspect of the algorithm. Fortunately
   many such hash functions exist. [HRW1999] provides pseudo-random
   functions based on Unix utilities rand and srand and easily
   constructed XOR functions that perform well. This imparts very good
   properties in the load balancing context. Also each server
   independently and unambiguously arrives at the primary server
   selection. HRW is already used in multicast and ECMP [RFC2991]
   [RFC2992].

4.1. HRW and Consistent Hashing

   HRW is not the only algorithm that addresses the object to server
   mapping problem with goals of fair load distribution, redundancy and
   fast access. There is another family of algorithms that also
   addresses this problem; these fall under the umbrella of the
   Consistent Hashing Algorithms [CHASH]. These will not be considered
   here.

4.2. HRW Algorithm for EVPN DF Election

   The applicability of HRW to DF Election is described here. Let DF(v)
   denote the Designated Forwarder and BDF(v) the Backup Designated
   Forwarder for the Ethernet Tag v, where v is the VLAN, Si is the IP
   address of server i, Es denotes the Ethernet Segment Identifier and
   Weight is a pseudo-random function of v, Es and Si.
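
   As an illustration of the selection described above, the following
   Python sketch ranks the candidate PEs by weight and returns the DF
   and BDF for one Ethernet Tag. It uses the first candidate function,
   Wrand, and the digest D(v,Es) that are specified later in this
   section. Several details are assumptions made only for the sketch:
   zlib.crc32 stands in for the CRC-32 of [HRW1999], the helper names
   digest, wrand and elect are hypothetical, and the addresses in the
   usage note are documentation examples.

```python
import ipaddress
import struct
import zlib

MOD = 1 << 31  # Wrand is computed modulo 2^31


def digest(vlan: int, esi: bytes) -> int:
    """31-bit digest D(v, Es) of the 14-byte stream: 4-byte tag, 10-byte ESI."""
    if len(esi) != 10:
        raise ValueError("ESI must be exactly 10 bytes")
    stream = struct.pack("!I", vlan) + esi  # network byte order
    # Assumption: zlib's CRC-32 is used as "CRC-32"; the most significant
    # bit is discarded so that 31 bits remain.
    return zlib.crc32(stream) & 0x7FFFFFFF


def wrand(vlan: int, esi: bytes, pe_ip: str) -> int:
    """Candidate hash function (1): weight of one PE for <v, Es>."""
    si = int(ipaddress.ip_address(pe_ip))  # works for IPv4 and IPv6
    inner = (1103515245 * si + 12345) ^ digest(vlan, esi)
    return (1103515245 * inner + 12345) % MOD


def elect(vlan: int, esi: bytes, pe_ips: list) -> tuple:
    """Return (DF, BDF); BDF is None when there is a single candidate."""
    # Highest weight first; ties go to the numerically lowest IP address.
    ranked = sorted(
        pe_ips,
        key=lambda ip: (-wrand(vlan, esi, ip), int(ipaddress.ip_address(ip))),
    )
    df = ranked[0]
    bdf = ranked[1] if len(ranked) > 1 else None
    return df, bdf
```

   For example, elect(10, bytes(range(10)), ["192.0.2.1", "192.0.2.2",
   "192.0.2.3"]) returns the same (DF, BDF) pair on every PE regardless
   of the order in which the candidates are listed, and removing a PE
   that is neither DF nor BDF leaves that pair unchanged.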

   Note that while the DF election algorithm in [RFC7432] uses PE
   address and VLAN as inputs, this document uses PE address, ESI, and
   VLAN as inputs. This is because if the same set of PEs are multi-
   homed to the same set of ESes, then the DF election algorithm used in
   [RFC7432] would result in the same PE being elected DF for the same
   set of broadcast domains on each ES, which can have adverse side-
   effects on both load balancing and redundancy. Including ESI in the
   DF election algorithm introduces additional entropy which
   significantly reduces the probability of the same PE being elected DF
   for the same set of broadcast domains on each ES. Therefore, the ESI
   value in the Weight function below SHOULD be set to that of the
   corresponding ES. The ESI value MAY be set to all 0's in the Weight
   function below if the operator chooses so.

   In case of a VLAN-Bundle service, v denotes the lowest VLAN, similar
   to the 'lowest VLAN in bundle' logic of [RFC7432].

   1. DF(v) = Si: Weight(v, Es, Si) >= Weight(v, Es, Sj), for all j. In
      case of a tie, choose the PE whose IP address is numerically the
      least. Note 0 <= i,j < Number of PEs in the redundancy group.

   2. BDF(v) = Sk: Weight(v, Es, Si) >= Weight(v, Es, Sk) and Weight(v,
      Es, Sk) >= Weight(v, Es, Sj), for all j other than i. In case of
      a tie, choose the PE whose IP address is numerically the least.

   Since the Weight is a pseudo-random function with domain as the
   three-tuple (v, Es, S), it is an efficient deterministic algorithm
   which is independent of the Ethernet Tag v sample space distribution.
   Choosing a good hash function for the pseudo-random function is an
   important consideration for this algorithm to perform provably better
   than the default algorithm. As mentioned previously, such functions
   are described in the HRW paper. We take as candidate hash functions
   two of the ones that are preferred in [HRW1999].

   1. Wrand(v, Es, Si) = (1103515245((1103515245.Si+12345) XOR
      D(v,Es))+12345) (mod 2^31) and

   2. Wrand2(v, Es, Si) = (1103515245((1103515245.D(v,Es)+12345) XOR
      Si)+12345) (mod 2^31)

   Here D(v,Es) is the 31-bit digest (CRC-32 and discarding the MSB as
   in [HRW1999]) of the 14-byte stream, the Ethernet Tag v (4 bytes)
   followed by the Ethernet Segment Identifier (10 bytes). It is
   mandated that the 14-byte stream is formed by concatenation of the
   Ethernet Tag and the Ethernet Segment Identifier in network byte
   order. The CRC should proceed as if the architecture is in network
   byte order (big-endian). Si is the address of the ith server. The
   server's IP address length does not matter as only the low-order 31
   bits are modulo significant. Although both the above hash functions
   perform similarly, we select the first hash function (1) as the hash
   function of choice, as the hash function has to be the same in all
   the PEs participating in the DF election.

   A point to note is that the Weight function takes into consideration
   the combination of the Ethernet Tag, Ethernet Segment and the PE IP
   address, and the actual length of the server IP address (whether
   IPv4 or IPv6) is not really relevant. The default algorithm in
   [RFC7432] cannot employ both IPv4 and IPv6 PE addresses, since
   [RFC7432] does not specify how to decide on the ordering (the ordinal
   list) when both IPv4 and IPv6 PEs are present.

   HRW solves the disadvantage pointed out in Section 2.2.1 and ensures:

   o With very high probability, that the task of DF election for
     the respective VLANs is more or less equally distributed among the
     PEs, even for the 2 PE case.

   o If a PE that hosts some VLANs on a given ES, but is neither the DF
     nor the BDF for a VLAN, goes down or loses its connection to the
     ES, no DF or BDF reassignment takes place on the other PEs for
     that VLAN. This saves computation, especially in the case when the
     connection flaps.

   o More importantly, it avoids the needless disruption case of Section
     2.2.1 (3) that is inherent in the existing default DF Election.

   o In addition to the DF, the algorithm also furnishes the BDF, which
     would be the DF if the current DF fails.

5. The Attachment Circuit Influenced DF Election Capability

   The procedure discussed in this section is applicable to the DF
   Election in EVPN Services [RFC7432] and EVPN Virtual Private Wire
   Services [RFC8214].

   The AC-DF capability MAY be used with any "DF Type" algorithm. It
   MUST modify the DF Election procedures by removing from consideration
   any candidate PE in the ES that cannot forward traffic on the AC that
   belongs to the BD. This section is applicable to VLAN-Based and VLAN-
   Bundle service interfaces. Section 5.1 describes the procedures for
   VLAN-Aware Bundle interfaces.

   In particular, when used with the default DF Type, the AC-DF
   capability modifies Step 3 in the DF Election procedure described
   in [RFC7432] Section 8.5, as follows:

   3. When the timer expires, each PE builds an ordered "candidate" list
      of the IP addresses of all the PE nodes connected to the Ethernet
      Segment (including itself), in increasing numeric value. The
      candidate list is based on the Originator Router's IP addresses of
      the ES routes, excluding all the PEs for which no Ethernet A-D per
      ES route has been received, or for which the route has been
      withdrawn. Afterwards, the DF Election algorithm is applied on a
      per <ES, VLAN> or <ES, VLAN-bundle> basis; however, the IP address
      for a PE will not be considered a candidate for a given
      <ES, VLAN> or <ES, VLAN-bundle> until the corresponding Ethernet
      A-D per EVI route has been received from that PE. In other words,
      the ACS on the ES for a given PE must be UP so that the PE is
      considered as candidate for a given BD.
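
   The modified Step 3 can be sketched in Python as follows, for one
   VLAN on one ES and with the default (modulus-based) DF Type. This is
   only a sketch under stated assumptions: the function names and the
   three input collections (originators of the ES routes, PEs with an
   active Ethernet A-D per ES route, PEs with an active Ethernet A-D
   per EVI route for the VLAN) are hypothetical, and a real
   implementation would derive them from its BGP tables.

```python
import ipaddress


def candidates(es_route_origins, ad_per_es, ad_per_evi_for_vlan):
    """Ordered candidate list for one VLAN on one ES with AC-DF in use.

    All three arguments are hypothetical collections of PE IP address
    strings; a PE is kept only if it appears in all of them.
    """
    eligible = {
        ip for ip in es_route_origins
        if ip in ad_per_es and ip in ad_per_evi_for_vlan
    }
    # Increasing numeric value of the IP address.
    return sorted(eligible, key=lambda ip: int(ipaddress.ip_address(ip)))


def default_df(vlan, ordered):
    """Default DF Type: the PE at ordinal (V mod N) in the list is the DF."""
    if not ordered:
        return None  # no PE can currently forward for this VLAN
    return ordered[vlan % len(ordered)]
```

   With three PEs on the ES and the Ethernet A-D per EVI route of the
   second one missing, candidates() returns only the first and third
   PE, and default_df() then applies V mod 2 instead of V mod 3.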

   The above paragraph differs from [RFC7432] Section 8.5, Step 3, in
   two aspects:

   o Any DF Type algorithm can be used, and not only the modulus-based
     one (which is the default DF Election, or DF Type 0 in this
     document).

   o The candidate list is pruned based on the Ethernet A-D routes: a
     PE's IP address MUST be removed from the ES candidate list if its
     Ethernet A-D per ES route is withdrawn. A PE's IP address MUST NOT
     be considered as candidate DF for an <ES, VLAN> or
     <ES, VLAN-bundle> if its Ethernet A-D per EVI route for the
     <ES, VLAN> or <ES, VLAN-bundle>, respectively, is withdrawn.

   The following example illustrates the AC-DF behavior applied to the
   Default DF election algorithm, assuming the network in Figure 2:

   a) When PE1 and PE2 discover ES12, they advertise an ES route for
      ES12 with the associated ES-import extended community and the DF
      Election Extended Community indicating AC-DF=1; they start a timer
      at the same time. Likewise, PE2 and PE3 advertise an ES route for
      ES23 with AC-DF=1 and start a timer.

   b) PE1/PE2 advertise an Ethernet A-D per ES route for ES12, and
      PE2/PE3 advertise an Ethernet A-D per ES route for ES23.

   c) In addition, PE1/PE2/PE3 advertise an Ethernet A-D per EVI route
      for AC1, AC2, AC3 and AC4 as soon as the ACs are enabled. Note
      that the AC can be associated with a single customer VID (e.g.
      VLAN-based service interfaces) or a bundle of customer VIDs (e.g.
      VLAN-Bundle service interfaces).

   d) When the timer expires, each PE builds an ordered "candidate" list
      of the IP addresses of all the PE nodes connected to the Ethernet
      Segment (including itself) as explained above for [RFC7432] Step
      3. All the PEs for which no Ethernet A-D per ES route has been
      received are pruned from the list.

   e) When electing the DF for a given BD, a PE will not be considered
      a candidate until an Ethernet A-D per EVI route has been received
      from that PE.
      In other words, the ACS on the ES for a given PE must be UP so
      that the PE is considered as candidate for a given BD. For
      example, PE1 will not consider PE2 as candidate for DF election
      for a given <ES, VLAN> until an Ethernet A-D per EVI route is
      received from PE2 for that <ES, VLAN>.

   f) Once the PEs with ACS = DOWN for a given BD have been removed from
      the candidate list, the DF Election can be applied for the
      remaining N candidates.

   Note that this procedure only modifies the existing EVPN control
   plane by adding and processing the DF Election Extended Community,
   and by pruning the candidate list of PEs that take part in the DF
   election.

   In addition to the events defined in the FSM in Section 3.1, the
   following events SHALL modify the candidate PE list and trigger the
   DF re-election in a PE for a given <ES, VLAN> or <ES, VLAN-bundle>.
   In the FSM of Figure 3, the events below MUST trigger a transition
   from DF_DONE to DF_CALC:

   i.   Local AC going DOWN/UP.

   ii.  Reception of a new Ethernet A-D per EVI update/withdraw for the
        <ES, VLAN> or <ES, VLAN-bundle>.

   iii. Reception of a new Ethernet A-D per ES update/withdraw for the
        ES.

5.1. AC-Influenced DF Election Capability For VLAN-Aware Bundle Services

   The procedure described in Section 5 works for VLAN-based and
   VLAN-Bundle service interfaces since, for those service types, a PE
   advertises only one Ethernet A-D per EVI route per <ES, VLAN> or
   <ES, VLAN-bundle>. The withdrawal of such a route means that the PE
   cannot forward traffic on that particular <ES, VLAN> or
   <ES, VLAN-bundle>, therefore the PE can be removed from consideration
   for DF.

   According to [RFC7432], in VLAN-aware bundle services, the PE
   advertises multiple Ethernet A-D per EVI routes per <ES, VLAN-bundle>
   (one route per Ethernet Tag), while the DF Election is still
   performed per <ES, VLAN-bundle>. The withdrawal of an individual
   route only indicates the unavailability of a specific AC but not
   necessarily all the ACs in the <ES, VLAN-bundle>.

   This document modifies the DF Election for VLAN-Aware Bundle services
   in the following way:

   o After confirming that all the PEs in the ES advertise the AC-DF
     capability, a PE will perform a DF Election per <ES, VLAN>, as
     opposed to per <ES, VLAN-bundle> in [RFC7432]. Now, the withdrawal
     of an Ethernet A-D per EVI route for a VLAN will indicate that the
     advertising PE's ACS is DOWN and the rest of the PEs in the ES can
     remove the PE from consideration for DF in the <ES, VLAN>.

   o The PEs will now follow the procedures in Section 5.

   For example, assuming three bridge tables in PE1 for the same MAC-VRF
   (each one associated with a different Ethernet Tag, e.g. VLAN-1,
   VLAN-2 and VLAN-3), PE1 will advertise three Ethernet A-D per EVI
   routes for ES12. Each of the three routes will indicate the status of
   each of the three ACs in ES12. PE1 will be considered as a valid
   candidate PE for DF election in <ES12, VLAN-1>, <ES12, VLAN-2> and
   <ES12, VLAN-3> as long as its three routes are active. For instance,
   if PE1 withdraws the Ethernet A-D per EVI route for one of these
   VLANs, the PEs in ES12 will not consider PE1 as a suitable DF
   candidate for that <ES, VLAN>.

6. Solution Benefits

   The solution described in this document provides the following
   benefits:

   a) It extends the DF Election in [RFC7432] to address the unfair
      load-balancing and potential black-holing issues of the default DF
      Election algorithm. The solution is applicable to the DF Election
      in EVPN Services [RFC7432] and EVPN Virtual Private Wire Services
      [RFC8214].

   b) It defines a way to signal the DF Election algorithm and
      capabilities intended by the advertising PE. This is done by
      defining the DF Election Extended Community, which allows
      signaling of the capabilities supported by this document as well
      as any other future DF Election algorithms and capabilities.

   c) The solution is backwards compatible with the procedures defined
      in [RFC7432].
      If one or more PEs in the ES do not support the new procedures,
      all the PEs in the ES will follow the [RFC7432] DF Election.

7. Security Considerations

   The same Security Considerations described in [RFC7432] are valid for
   this document.

8. IANA Considerations

   IANA is requested to:

   o Allocate Sub-Type value 0x06 as "DF Election Extended Community" in
     the "EVPN Extended Community Sub-Types" registry.

   o Set up a registry "DF Type" for the DF Type octet in the Extended
     Community. The following values in that registry are requested:

     - Type 0: Default DF Election.
     - Type 1: HRW algorithm.
     - Type 255: Reserved for Experimental use.

   o Set up a registry "DF Election Capabilities" for the Bitmap octet
     in the Extended Community. The following values in that registry
     are requested:

     - Bit 25: AC-DF capability.

   o The registration policy for the two registries is "Specification
     Required".

9. References

9.1. Normative References

   [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
   Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
   VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015.

   [RFC8214] Boutros, S., Sajassi, A., Salam, S., Drake, J., and J.
   Rabadan, "Virtual Private Wire Service Support in Ethernet VPN", RFC
   8214, DOI 10.17487/RFC8214, August 2017.

   [HRW1999] Thaler, D. and C. Ravishankar, "Using Name-Based Mappings
   to Increase Hit Rates", IEEE/ACM Transactions on Networking, Volume
   6, Issue 1, February 1998.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March
   1997.

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
   2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

   [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
   Communities Attribute", RFC 4360, DOI 10.17487/RFC4360, February
   2006.

9.2. Informative References

   [VPLS-MH] Kothari, Henderickx et al., "BGP based Multi-homing in
   Virtual Private LAN Service", draft-ietf-bess-vpls-multihoming-
   01.txt, work in progress, January, 2016.

   [CHASH] Karger, D., Lehman, E., Leighton, T., Panigrahy, R., Levine,
   M., and D. Lewin, "Consistent Hashing and Random Trees: Distributed
   Caching Protocols for Relieving Hot Spots on the World Wide Web", ACM
   Symposium on Theory of Computing, ACM Press, New York, May 1997.

   [CLRS2009] Cormen, T., Leiserson, C., Rivest, R., and C. Stein,
   "Introduction to Algorithms (3rd ed.)", MIT Press and McGraw-Hill,
   ISBN 0-262-03384-4, February 2009.

   [RFC2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
   Multicast Next-Hop Selection", RFC 2991, DOI 10.17487/RFC2991,
   November 2000.

   [RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path
   Algorithm", RFC 2992, DOI 10.17487/RFC2992, November 2000.

10. Acknowledgments

   The authors want to thank Sriram Venkateswaran, Laxmi Padakanti,
   Ranganathan Boovaraghavan, Tamas Mondal, Sami Boutros, Jakob Heitz,
   Mrinmoy Ghosh, Leo Mermelstein, Mankamana Mishra and Samir Thoria for
   their review and contributions. Special thanks to Stephane Litkowski
   for his thorough review and detailed contributions.

11. Contributors

   In addition to the authors listed on the front page, the following
   coauthors have also contributed to this document:

   Antoni Przygienda
   Juniper Networks, Inc.
   1194 N. Mathilda Drive
   Sunnyvale, CA 95134
   USA
   Email: prz@juniper.net

   Vinod Prabhu
   Nokia
   Email: vinod.prabhu@nokia.com

   Wim Henderickx
   Nokia
   Email: wim.henderickx@nokia.com

   Wen Lin
   Juniper Networks, Inc.
   Email: wlin@juniper.net

   Patrice Brissette
   Cisco Systems
   Email: pbrisset@cisco.com

   Keyur Patel
   Arrcus, Inc
   Email: keyur@arrcus.com

   Autumn Liu
   Ciena
   Email: hliu@ciena.com

Authors' Addresses

   Jorge Rabadan
   Nokia
   777 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jorge.rabadan@nokia.com

   Satya Mohanty
   Cisco Systems, Inc.
   225 West Tasman Drive
   San Jose, CA 95134
   USA
   Email: satyamoh@cisco.com

   Ali Sajassi
   Cisco Systems, Inc.
   225 West Tasman Drive
   San Jose, CA 95134
   USA
   Email: sajassi@cisco.com

   John Drake
   Juniper Networks, Inc.
   1194 N. Mathilda Drive
   Sunnyvale, CA 95134
   USA
   Email: jdrake@juniper.net

   Kiran Nagaraj
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: kiran.nagaraj@nokia.com

   Senthil Sathappan
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: senthil.sathappan@nokia.com