BESS Working Group                                            S. Mohanty
Internet-Draft                                                  K. Patel
Intended status: Standards Track                              A. Sajassi
Expires: April 13, 2018                              Cisco Systems, Inc.
                                                                J. Drake
                                                  Juniper Networks, Inc.
                                                                      A.
Przygienda
                                                                 Juniper
                                                        October 10, 2017

          A new Designated Forwarder Election for the EVPN
                  draft-ietf-bess-evpn-df-election-03

Abstract

   This document describes an improved EVPN Designated Forwarder (DF)
   Election algorithm which can be used to enhance operational
   experience in terms of convergence speed and robustness over a WAN
   deploying EVPN.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 13, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Finite State Machine  . . . . . . . . . . . . . . . . . .   4
     1.2.  Requirements Language . . . . . . . . . . . . . . . . . .
4
   2.  The modulus based DF Election Algorithm . . . . . . . . . . .   4
   3.  Problems with the modulus based DF Election Algorithm . . . .   5
   4.  Highest Random Weight . . . . . . . . . . . . . . . . . . . .   6
   5.  HRW and Consistent Hashing  . . . . . . . . . . . . . . . . .   7
   6.  HRW Algorithm for EVPN DF Election  . . . . . . . . . . . . .   7
   7.  Protocol Considerations . . . . . . . . . . . . . . . . . . .   9
     7.1.  Finite State Machine  . . . . . . . . . . . . . . . . . .  10
   8.  Auto-Derivation of ES-Import Route Target . . . . . . . . . .  12
   9.  Operational Considerations  . . . . . . . . . . . . . . . . .  12
   10. Security Considerations . . . . . . . . . . . . . . . . . . .  12
   11. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  12
   12. References  . . . . . . . . . . . . . . . . . . . . . . . . .  13
     12.1.  Normative References . . . . . . . . . . . . . . . . . .  13
     12.2.  Informative References . . . . . . . . . . . . . . . . .  13
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  14

1.  Introduction

   MPLS-based Ethernet VPN (EVPN) [RFC7432] is an emerging technology
   that is gaining prominence in Internet Service Provider IP/MPLS
   networks.  In EVPN, MAC addresses are disseminated as routes across
   the geographical area via the Border Gateway Protocol (BGP)
   [RFC4271], using the familiar L3VPN model [RFC4364].  An EVPN
   instance that spans across PEs is defined as an EVI.  Constrained
   Route Distribution [RFC4684] can be used in conjunction to
   selectively advertise the routes to where they are needed.  One of
   the major advantages of EVPN over VPLS [RFC4761], [RFC6624] is that
   it provides a solution for minimizing the flooding of unknown
   traffic and also provides an All-Active mode of operation so that
   traffic can truly be multi-homed.  In technologies such as EVPN or
   VPLS, managing Broadcast, Unknown unicast and Multicast (BUM)
   traffic is a key requirement.
In the case where the customer edge (CE) router is multi-homed to one
   or more Provider Edge (PE) routers, it is necessary that one and only
   one of the PE routers forwards BUM traffic into the core or towards
   the CE, as and when appropriate.

   Specifically, per Section 8.5 of [RFC7432], consider a CE that is a
   host or a router that is multi-homed directly to more than one PE in
   an EVPN instance on a given Ethernet segment.  One or more Ethernet
   Tags may be configured on the Ethernet segment.  In this scenario,
   only one of the PEs, referred to as the Designated Forwarder (DF), is
   responsible for certain actions:

   a.  Sending multicast and broadcast traffic, on a given Ethernet Tag
       on a particular Ethernet segment, to the CE.

   b.  Flooding unknown unicast traffic (i.e., traffic for which a PE
       does not know the destination MAC address), on a given Ethernet
       Tag on a particular Ethernet segment, to the CE, if the
       environment requires flooding of unknown unicast traffic.

                           +---------------+
                           |    IP/MPLS    |
                           |     CORE      |
           +----+  ES1     +----+          +----+
           | CE1|----------|    |----------|    |____ES2
           +----+          | PE1|          | PE2|    \
                           |    |------+   +----+     \+----+
                           +----+      |               | CE2|
                                       |   +----+     /+----+
                                       |___|    |____/    |
                                           | PE3|  ES2   /
                                           +----+       /
                                             |         /
                           +-----------------+----+   /
                                             | PE4|__/ES2
                                             |    |
                                             +----+

                 Figure 1: Multi-homing Network of EVPN

   Figure 1 illustrates a case where there are two Ethernet Segments,
   ES1 and ES2.  PE1 is attached to CE1 via Ethernet Segment ES1,
   whereas PE2, PE3 and PE4 are attached to CE2 via ES2, i.e., PE2, PE3
   and PE4 form a redundancy group.  Since CE2 is multi-homed to
   different PEs on the same Ethernet Segment, it is necessary for PE2,
   PE3 and PE4 to agree on a DF to satisfy the above-mentioned
   requirements.

   Layer 2 devices are particularly susceptible to forwarding loops
   because of the broadcast nature of Ethernet traffic.
Therefore it is very important that, in the case of multi-homing,
   only one of the links be used to direct traffic to/from the core.

   One of the prerequisites for this support is that the participating
   PEs must agree amongst themselves as to who will act as the
   Designated Forwarder.  This needs to be achieved through a
   distributed algorithm in which each participating PE independently
   and unambiguously selects one of the participating PEs as the DF,
   and the result is unanimous.

   The DF election algorithm described in [RFC7432] has some
   undesirable properties and in some cases can be somewhat disruptive
   and unfair.  This document describes those issues and proposes a
   mechanism for dealing with them.  The mechanism does involve changes
   to the DF Election algorithm, but it requires no protocol changes to
   the EVPN route exchange and only minimal changes to the content of
   the routes.

1.1.  Finite State Machine

   Since the specification in [RFC7432] leaves several questions open
   as to the precise finite state machine behavior of the DF election,
   this document also includes a section describing precisely the
   intended behavior.  The finite state machine is presented in
   Section 7.1.

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  The modulus based DF Election Algorithm

   The default procedure for DF election at the granularity of (ESI,
   EVI) is referred to as "service carving".  With service carving, it
   is possible to elect multiple DFs per Ethernet Segment (one per EVI)
   in order to perform load-balancing of multi-destination traffic
   destined to a given Segment.
The objective is that the load-balancing procedures should carve up
   the EVI space among the redundant PE nodes evenly, in such a way
   that every PE is the DF for a disjoint set of EVIs.

   The existing DF election algorithm, described in Section 8.5 of
   [RFC7432], is based on a modulus operation.  The PEs to which the ES
   (for which DF election is to be carried out per VLAN) is multi-homed
   form an ordered (ordinal) list in ascending order of the PE IP
   address values.  Say there are N PEs, P0, P1, ... PN-1, ranked as
   per increasing IP addresses in the ordinal list; then for each VLAN
   with Ethernet tag v configured on the Ethernet segment ES1, PEx is
   the DF for VLAN v on ES1 when x equals (v mod N).  In the case of a
   VLAN bundle, only the lowest VLAN is used.  When the VLAN density is
   high, meaning there is a significant number of VLANs and the VLAN-id
   or Ethernet tag is uniformly distributed, the expectation is that
   the DF election will be spread across the PEs hosting that Ethernet
   segment and good service carving can be achieved.

3.  Problems with the modulus based DF Election Algorithm

   There are three fundamental problems with the current DF Election.

   First, the algorithm will not perform well when the Ethernet tags
   follow a non-uniform distribution, for instance when the Ethernet
   tags are all even or all odd.  In such a case, assume that the ES
   is multi-homed to two PEs; all the VLANs will pick only one of the
   PEs as the DF.  This is very sub-optimal.  It defeats the purpose
   of service carving, as the DFs are not evenly spread across the
   PEs.  In this particular case, one of the PEs does not get elected
   as the DF at all, so it does not participate in the DF
   responsibilities in any way.
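   This skew is easy to reproduce.  The short sketch below applies the
   (v mod N) rule of Section 8.5 of [RFC7432] to an all-even tag
   distribution; the PE addresses and tag values are purely
   illustrative:

```python
# Sketch of the (v mod N) service-carving rule.  PE addresses and
# Ethernet tag values are illustrative, not taken from this document.

def modulus_df(vlans, pes):
    """Map each Ethernet tag v to the PE at position (v mod N) in the
    list of PEs ordered by ascending IP address (string sort suffices
    for these example addresses)."""
    ranked = sorted(pes)
    return {v: ranked[v % len(ranked)] for v in vlans}

pes = ["10.0.0.1", "10.0.0.2"]          # ES multi-homed to two PEs
even_tags = [100, 102, 104, 106, 108]   # all-even tag distribution
carving = modulus_df(even_tags, pes)

# Every even tag satisfies v mod 2 == 0, so the first-ranked PE
# carries all the VLANs and the other PE is never elected DF.
assert set(carving.values()) == {"10.0.0.1"}
```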
Consider another example where, referring to Figure 1, PE2, PE3 and
   PE4 are in ascending order of IP address, and each VLAN configured
   on ES2 is associated with an Ethernet Tag of the form (3x+1), where
   x is an integer.  This will result in PE3 always being selected as
   the DF.

   Second, even when the Ethernet tag distribution is uniform, a PE
   going down or coming up results in a re-computation, (v mod (N-1))
   or (v mod (N+1)) as the case may be.  The resulting modulus values
   need not be uniformly distributed, as that is subject to the
   primality of N-1 or N+1, as the case may be.

   The third problem is one of disruption.  Consider a case where the
   same Ethernet Segment is multi-homed to a set of PEs.  When the ES
   goes down on one of the PEs, say PE1, or PE1 itself reboots, or its
   BGP process goes down, or the connectivity between PE1 and a Route
   Reflector (RR) goes down, the effective number of PEs in the system
   becomes N-1, and DFs are recomputed for all the VLANs that are
   configured on that Ethernet segment.  In general, even if the DF
   for a VLAN v happens not to be PE1 but some other PE, say PE2, it
   is likely that yet another PE will become the new DF.  This is not
   desirable.  Similarly, when a new PE starts hosting the same
   Ethernet segment, the mapping again changes because of the mod
   operation.  This results in needless churn.  Again referring to
   Figure 1, say v1, v2 and v3 are VLANs configured on ES2 with
   associated Ethernet tags of value 999, 1000 and 1001 respectively.
   Then PE2, PE3 and PE4 are the DFs for v1, v2 and v3 respectively.
   Now when PE3 goes down, PE4 becomes the DF for v1 and PE2 becomes
   the DF for v2, even though neither VLAN was previously assigned to
   the failed PE3.

   One point to note is that the current DF election algorithm assumes
   that all the PEs that are multi-homed to the same Ethernet Segment
   and take part in the DF Election by exchanging EVPN routes have an
   IPv4 peering with each other, either directly or via a Route
   Reflector.
This need not be the case, as there can equally be an IPv6 peering
   carrying the EVPN address family.

   Mathematically, a conventional hash function maps a key k to a
   number i representing one of m hash buckets through a function h(k),
   i.e., i = h(k).  In the EVPN case, h is simply a modulo-N hash
   function, viz. h(v) = v mod N, where N is the number of PEs that are
   multi-homed to the Ethernet Segment in question.  It is well known
   that for a good hash distribution using the modulus operation, the
   modulus N should be a prime number not too close to a power of 2
   [CLRS2009].  When the effective number of PEs changes from N to N-1
   (or vice versa), all the objects (VLANs v) will be remapped except
   those for which v mod N and v mod (N-1) refer to the same PE in the
   previous and subsequent ordinal rankings respectively.

   From a forwarding perspective this is churn, as it results in
   programming the CE- and PE-side ports as blocking or non-blocking at
   potentially all PEs when the DF changes, either because (i) a new PE
   is added or (ii) another one goes down, loses connectivity, or
   otherwise cannot take part in the DF election process.  This draft
   addresses this problem and furnishes a solution to this undesirable
   behavior.

4.  Highest Random Weight

   Highest Random Weight (HRW), as defined in [HRW1999], was originally
   proposed in the context of Internet caching and proxy server load
   balancing.  Given an object name and a set of servers, HRW maps a
   request to a server using the object name (object-id) and server
   name (server-id) rather than the current state of the servers.  HRW
   computes a hash over the server-id and object-id together and from
   it derives an ordered list of the servers for the particular
   object-id.
The server with the highest hash value serves as the primary server
   responsible for that particular object, and the server with the next
   highest value in that hash serves as the backup server.  HRW always
   maps a given object name to the same server within a given cluster;
   consequently it can be used at client sites to achieve global
   consensus on object-server mappings.  When the primary server goes
   down, the backup server becomes the responsible designate.

   Choosing an appropriate hash function that is statistically
   oblivious to the key distribution and imparts a good uniform
   distribution of the hash output is an important aspect of the
   algorithm.  Fortunately many such hash functions exist.  [HRW1999]
   provides pseudorandom functions based on the Unix utilities rand and
   srand, and easily constructed XOR functions that perform
   considerably well.  This imparts very good properties in the load-
   balancing context.  Also, each server independently and
   unambiguously arrives at the primary server selection.  HRW already
   finds use in multicast and ECMP [RFC2991], [RFC2992].

   In the existing DF algorithm (Section 2), whenever a new PE comes up
   or an existing PE goes down, there is a significant interval before
   the change is noticed by all peer PEs, as it has to be conveyed by a
   BGP update message involving the Type-4 route.  There is a timer to
   batch all the messages before triggering the service carving
   procedures.  When the timer expires, each PE builds the ordered list
   and follows the procedures for DF Election.  In the proposed method,
   which we describe shortly, this "jittered" behavior is retained.

5.  HRW and Consistent Hashing

   HRW is not the only algorithm that addresses the object-to-server
   mapping problem with goals of fair load distribution, redundancy and
   fast access.
There is another family of algorithms that also addresses this
   problem; these fall under the umbrella of the Consistent Hashing
   algorithms [CHASH].  They will not be considered here.

6.  HRW Algorithm for EVPN DF Election

   The applicability of HRW to DF Election is described here.  Let
   DF(v) denote the Designated Forwarder and BDF(v) the Backup
   Designated Forwarder for the Ethernet tag v, where v is the VLAN, Si
   is the IP address of PE i, and Weight is a pseudorandom function of
   v and Si.  In the case of a VLAN bundle service, v denotes the
   lowest VLAN, similar to the 'lowest VLAN in bundle' logic of
   [RFC7432].

   1.  DF(v) = Si: Weight(v, Si) >= Weight(v, Sj), for all j.  In case
       of a tie, choose the PE whose IP address is numerically the
       least.  Note 0 <= i, j < number of PEs in the redundancy group.

   2.  BDF(v) = Sk: Weight(v, Si) >= Weight(v, Sk) and Weight(v, Sk) >=
       Weight(v, Sj), for all j not equal to i or k.  In case of a tie,
       choose the PE whose IP address is numerically the least.

   Since the Weight is a pseudorandom function whose domain is the
   concatenation of (v, S), it is an efficient deterministic algorithm
   that is independent of the Ethernet tag v sample space distribution.
   Choosing a good hash function for the pseudorandom function is an
   important consideration for this algorithm to perform provably
   better than the existing algorithm.  As mentioned previously, such
   functions are described in the HRW paper.  We take as candidate hash
   functions two of the ones that are preferred in [HRW1999]:

   1.  Wrand(v, Si) = (1103515245((1103515245.Si+12345) XOR
       D(v))+12345)(mod 2^31), and

   2.  Wrand2(v, Si) = (1103515245((1103515245.D(v)+12345) XOR
       Si)+12345)(mod 2^31)

   Here D(v) is the 31-bit digest (CRC-32 with the MSB discarded, as in
   [HRW1999]) of the Ethernet tag v, and Si is the address of the i-th
   server.
The server's IP address length does not matter, as only the
   low-order 31 bits are significant under the modulus.  Although both
   of the above hash functions perform similarly, we select the first
   hash function (1), as the hash function has to be the same on all
   the PEs.

   A point to note is that the domain of the Weight function is the
   concatenation of the Ethernet tag and the PE IP address, and the
   actual length of the PE IP address (whether IPv4 or IPv6) is not
   really relevant, so long as the hash algorithm operates on the
   concatenated string.  The existing algorithm in [RFC7432], as is,
   cannot employ both IPv4 and IPv6 neighbor peering addresses.

   HRW solves the disadvantages pointed out in Section 3 and ensures:

   o  with very high probability, that the task of DF election for the
      respective VLANs is more or less equally distributed among the
      PEs, even in the 2-PE case;

   o  that if a PE hosting some VLANs on a given ES, but being neither
      the DF nor the BDF for a given VLAN, goes down or loses its
      connection to the ES, this does not result in a DF and BDF
      reassignment at the other PEs.  This saves computation,
      especially in the case when the connection flaps;

   o  more importantly, that the needless disruption (the third problem
      of Section 3) inherent in the existing modulus based algorithm is
      avoided;

   o  that, in addition to the DF, the algorithm also furnishes the
      BDF, which becomes the DF if the current DF fails.

7.  Protocol Considerations

   Note that for the DF election procedures to be globally convergent
   and unanimous, it is necessary that all the participating PEs agree
   on the DF Election algorithm to be used.  It is not possible for
   some PEs to continue to use the existing modulus based DF election
   while some newer PEs use HRW.
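   As a non-normative sketch, the election of Section 6 can be
   exercised with candidate hash function (1).  The PE addresses below
   are illustrative, and packing the Ethernet tag as a 4-octet
   big-endian value for the CRC-32 digest is an assumption of this
   sketch:

```python
import socket
import struct
import zlib

def digest(v):
    # 31-bit digest D(v): CRC-32 of the Ethernet tag with the MSB
    # discarded.  Packing v as 4 octets big-endian is an assumption.
    return zlib.crc32(struct.pack("!I", v)) & 0x7FFFFFFF

def wrand(v, pe_ip):
    # Candidate hash function (1): pseudorandom weight over (v, Si).
    si = int.from_bytes(socket.inet_aton(pe_ip), "big")
    return (1103515245 * ((1103515245 * si + 12345) ^ digest(v))
            + 12345) % (1 << 31)

def hrw_election(v, pes):
    # Highest weight wins the DF role, second highest the BDF role;
    # ties are broken by the numerically lowest IP address.
    ranked = sorted(pes, key=lambda ip: (-wrand(v, ip),
                                         socket.inet_aton(ip)))
    return ranked[0], (ranked[1] if len(ranked) > 1 else None)

pes = ["10.0.0.2", "10.0.0.3", "10.0.0.4"]
df, bdf = hrw_election(1000, pes)

# Removing a PE that is neither DF nor BDF does not change the
# election: the relative order of the two highest weights is kept.
assert hrw_election(1000, [df, bdf]) == (df, bdf)
```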
For brownfield deployments and for interoperability with legacy
   boxes, it is important that all PEs have the capability to fall back
   on the modulus algorithm.  A PE (one with a newer version of the
   software) can indicate its willingness to support HRW by signaling a
   new extended community along with the Ethernet Segment route
   (Type-4).  This extended community is explained in the next
   paragraph.  When a PE receives the Ethernet Segment routes from all
   the other PEs for the Ethernet segment in question, it checks
   whether all the advertisements carry the extended community; if they
   do, this particular PE, and by induction all the other PEs, proceed
   to perform DF Election as per the HRW algorithm.  Otherwise, if even
   a single advertisement of the Type-4 route is received without the
   extended community, or the received DF types (including the locally
   configured type) do not ALL match a single value, the default
   modulus algorithm is used as before.  Also, the HRW algorithm is
   only executed after the "batching" time.

   A new BGP extended community attribute [RFC4360] needs to be defined
   to identify the DF election procedure to be used for the Ethernet
   Segment.  We propose to name this extended community the DF Election
   Extended Community.  It is a new transitive extended community where
   the Type field is 0x06 and the Sub-Type is to be defined.  It may be
   advertised along with Ethernet Segment routes.
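   The fallback rule above can be sketched as follows.  This is an
   illustration only: the Sub-Type value is still TBD, so the 0x00
   placeholder used here is an assumption, as are the function names:

```python
import struct

DF_TYPE_MOD = 0  # default modulus-based election
DF_TYPE_HRW = 1  # HRW election

def encode_df_election_ec(df_type, subtype=0x00):
    # Type 0x06, a placeholder Sub-Type (the real value is TBD), a
    # one-octet DF Type, and five reserved octets set to zero.
    return struct.pack("!BBB5x", 0x06, subtype, df_type)

def use_hrw(received_df_types, local_df_type=DF_TYPE_HRW):
    # HRW is used only if the local type and every received DF type
    # agree on HRW; a missing community (modeled here as None) forces
    # the fallback to the RFC 7432 modulus procedure.
    return (set(received_df_types) | {local_df_type}) == {DF_TYPE_HRW}

assert len(encode_df_election_ec(DF_TYPE_HRW)) == 8
assert use_hrw([DF_TYPE_HRW, DF_TYPE_HRW])
assert not use_hrw([DF_TYPE_HRW, DF_TYPE_MOD])
assert not use_hrw([None, DF_TYPE_HRW])
```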
   Each DF Election Extended Community is encoded as an 8-octet value
   as follows:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type=0x06   | Sub-Type(TBD) |    DF Type    |  Reserved=0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Reserved = 0                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                 Figure 2

   The DF Type is encoded as one octet.  A value of 0 means that the
   default (modulus based) DF election procedure is used, and a value
   of 1 means that the HRW algorithm is employed.  A request needs to
   be registered with IANA for the Sub-Type
   [I-D.ietf-idr-extcomm-iana].

7.1.  Finite State Machine

   Per [RFC7432], the FSM described in Figure 3 is executed per
   ESI/VLAN in the case of a VLAN-aware service, or per ESI/[VLANs in
   VLAN bundle] in the case of a VLAN bundle, on each participating PE.

   Observe that currently the VLANs are derived from local
   configuration, and the FSM does not provide any protection against
   misconfiguration where the same (EVI, ESI) combination has a
   different set of VLANs on different participating PEs, or where one
   of the PEs elects to consider the VLANs as a VLAN bundle and another
   as separate VLANs for election purposes (service type mismatch).

   The FSM is normative in the sense that any design or implementation
   MUST behave, towards external peers and in terms of externally
   observable behavior (the DF), in a manner equivalent to this FSM.
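   As a non-normative illustration, the main transitions of the FSM in
   Figure 3 below can be captured in a small table.  This is one
   possible reading of the figure; in particular, sending DF CALC back
   to DF WAIT on RCVD_ES (to re-batch) is this sketch's interpretation:

```python
# One reading of the DF election FSM of Figure 3.
# (state, event) -> next state; unlisted pairs leave the state as-is,
# and ES_DOWN returns to INIT from any state.
TRANSITIONS = {
    ("INIT", "ES_UP"): "DF_WAIT",
    ("DF_WAIT", "DF_TIMER"): "DF_CALC",
    ("DF_CALC", "CALCULATED"): "DF_DONE",
    ("DF_CALC", "LOST_ES"): "DF_CALC",      # re-enter: re-run election
    ("DF_CALC", "VLAN_CHANGE"): "DF_CALC",  # re-enter: re-run election
    ("DF_CALC", "RCVD_ES"): "DF_WAIT",      # interpretation: re-batch
    ("DF_DONE", "LOST_ES"): "DF_CALC",
    ("DF_DONE", "VLAN_CHANGE"): "DF_CALC",
}

def step(state, event):
    if event == "ES_DOWN":  # ANY STATE on ES_DOWN
        return "INIT"
    return TRANSITIONS.get((state, event), state)

s = "INIT"
for ev in ("ES_UP", "DF_TIMER", "CALCULATED"):
    s = step(s, ev)
assert s == "DF_DONE"
```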
                                     LOST_ES
                                     RCVD_ES
                                     +----+
                                     |    v
                                  ++-+---++   RCVD_ES
              +------+   ES_UP    |  DF   +<---------+
           +->+ INIT +----------->+  WAIT |          |
           |  +------+            +---+---+          |
           ^                          |              |
   +-------+---+                      | DF_TIMER     |
   | ANY STATE |                      |              |
   +-----------+ ES_DOWN              v              |
                                  +---+---+  RCVD_ES |
      +-------+                   |  DF   +----------+
      |  DF   |    CALCULATED     |  CALC +<--+  LOST_ES
      | DONE  +<------------------+       +---+  VLAN_CHANGE
      +---+---+                   +---+---+
          |                           ^
          |      LOST_ES              |
          |      VLAN_CHANGE          |
          +---------------------------+

                                 Figure 3

   States:

   1.  INIT: Initial state.

   2.  DF WAIT: State in which the participant waits for enough
       information to perform the DF election for the EVI/ESI/VLAN
       combination.

   3.  DF CALC: State in which the new DF is recomputed.

   4.  DF DONE: State in which the corresponding DF for the
       EVI/ESI/VLAN combination has been elected.

   Events:

   1.  ES_UP: The ESI has been locally configured as 'up'.

   2.  ES_DOWN: The ESI has been locally configured as 'down'.

   3.  VLAN_CHANGE: The VLANs configured in a bundle that uses the ESI
       changed.  This event is necessary for VLAN bundles only.

   4.  DF_TIMER: The DF Wait timer has expired.

   5.  RCVD_ES: A new or changed Ethernet Segment route has been
       received in a BGP REACH UPDATE.  Receiving an unchanged UPDATE
       MUST NOT trigger this event.

   6.  LOST_ES: A BGP UNREACH UPDATE for a previously received Ethernet
       Segment route has been received.  If an UNREACH is seen for a
       route that has not been advertised previously, the event MUST
       NOT be triggered.

   7.  CALCULATED: The DF has been successfully calculated.

   Corresponding actions when transitions are performed or states are
   entered/exited:

   1.   ANY STATE on ES_DOWN: (i) stop the DF timer, (ii) assume non-DF
        for the local PE.

   2.   INIT on ES_UP: (i) do nothing.

   3.   INIT on RCVD_ES, LOST_ES: (i) do nothing.

   4.   DF_WAIT on entering the state: (i) start the DF timer if not
        already started or expired, (ii) assume non-DF for the local
        PE.

   5.   DF_WAIT on RCVD_ES, LOST_ES: do nothing.

   6.   DF_WAIT on DF_TIMER: do nothing.

   7.   DF_CALC on entering or re-entering the state: (i) rebuild the
        corresponding list and hashes and perform the election,
        (ii) the FSM generates the CALCULATED event against itself.

   8.   DF_CALC on LOST_ES or VLAN_CHANGE: do nothing.

   9.   DF_CALC on RCVD_ES: do nothing.

   10.  DF_CALC on CALCULATED: (i) mark the election result for the
        VLAN or bundle.

   11.  DF_DONE on exiting the state: (i) if the RFC7432 election is
        used, or a new election was performed and the primary DF role
        was lost, then assume non-DF for the local PE for the VLAN or
        VLAN bundle.

   12.  DF_DONE on VLAN_CHANGE or LOST_ES: do nothing.

8.  Auto-Derivation of ES-Import Route Target

   Section 7.6 of [RFC7432] describes how the value of the ES-Import
   Route Target for ESI types 1, 2, and 3 can be auto-derived by using
   the high-order six bytes of the nine-byte ESI value.  This document
   extends the same auto-derivation procedure to ESI types 0, 4, and 5.

9.  Operational Considerations

   TBD.

10.  Security Considerations

   This document raises no new security issues for EVPN.

11.  Acknowledgements

   The authors would like to thank Tamas Mondal, Sami Boutros, Jakob
   Heitz, Jorge Rabadan and Patrice Brissette for useful feedback and
   discussions.

12.  References

12.1.  Normative References

   [HRW1999]  Thaler, D. and C. Ravishankar, "Using Name-Based Mappings
              to Increase Hit Rates", IEEE/ACM Transactions on
              Networking, Volume 6, Issue 1, February 1998.

   [I-D.ietf-idr-extcomm-iana]
              Rosen, E. and Y. Rekhter, "IANA Registries for BGP
              Extended Communities", draft-ietf-idr-extcomm-iana-02
              (work in progress), December 2013.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.
   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006,
              <https://www.rfc-editor.org/info/rfc4271>.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006, <https://www.rfc-editor.org/info/rfc4360>.

   [RFC4761]  Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private
              LAN Service (VPLS) Using BGP for Auto-Discovery and
              Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007,
              <https://www.rfc-editor.org/info/rfc4761>.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015, <https://www.rfc-editor.org/info/rfc7432>.

12.2.  Informative References

   [CHASH]    Karger, D., Lehman, E., Leighton, T., Panigrahy, R.,
              Levine, M., and D. Lewin, "Consistent Hashing and Random
              Trees: Distributed Caching Protocols for Relieving Hot
              Spots on the World Wide Web", ACM Symposium on Theory of
              Computing, ACM Press, New York, May 1997.

   [CLRS2009] Cormen, T., Leiserson, C., Rivest, R., and C. Stein,
              "Introduction to Algorithms (3rd ed.)", MIT Press and
              McGraw-Hill, ISBN 0-262-03384-4, February 2009.

   [RFC2991]  Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
              Multicast Next-Hop Selection", RFC 2991,
              DOI 10.17487/RFC2991, November 2000,
              <https://www.rfc-editor.org/info/rfc2991>.

   [RFC2992]  Hopps, C., "Analysis of an Equal-Cost Multi-Path
              Algorithm", RFC 2992, DOI 10.17487/RFC2992, November
              2000, <https://www.rfc-editor.org/info/rfc2992>.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364,
              February 2006, <https://www.rfc-editor.org/info/rfc4364>.

   [RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
              R., Patel, K., and J. Guichard, "Constrained Route
              Distribution for Border Gateway Protocol/MultiProtocol
              Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
              Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
              November 2006, <https://www.rfc-editor.org/info/rfc4684>.
   [RFC6624]  Kompella, K., Kothari, B., and R. Cherukuri, "Layer 2
              Virtual Private Networks Using BGP for Auto-Discovery and
              Signaling", RFC 6624, DOI 10.17487/RFC6624, May 2012,
              <https://www.rfc-editor.org/info/rfc6624>.

Authors' Addresses

   Satya Ranjan Mohanty
   Cisco Systems, Inc.
   225 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: satyamoh@cisco.com

   Keyur Patel
   Cisco Systems, Inc.
   225 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: keyupate@cisco.com

   Ali Sajassi
   Cisco Systems, Inc.
   225 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: sajassi@cisco.com

   John Drake
   Juniper Networks, Inc.
   1194 N. Mathilda Drive
   Sunnyvale, CA 95134
   USA

   Email: jdrake@juniper.com

   Antoni Przygienda
   Juniper Networks, Inc.
   1194 N. Mathilda Drive
   Sunnyvale, CA 95134
   USA

   Email: prz@juniper.net