INTERNET-DRAFT                                          N. Malhotra, Ed.
                                                              A. Sajassi
                                                             A. Pattekar
Intended Status: Proposed Standard                               (Cisco)
                                                              A. Lingala
                                                                  (AT&T)
                                                              J. Rabadan
                                                                 (Nokia)
                                                                J. Drake
                                                      (Juniper Networks)

Expires: August 15, 2018                               February 11, 2018


               Extended Mobility Procedures for EVPN-IRB
           draft-malhotra-bess-evpn-irb-extended-mobility-02


Abstract

   The procedure to handle host mobility in a layer 2 Network with EVPN
   control plane is defined as part of RFC 7432.
   EVPN has since evolved to find wider applicability across various IRB
   use cases that include distributing both MAC and IP reachability via
   a common EVPN control plane. MAC Mobility procedures defined in RFC
   7432 are extensible to IRB use cases if a fixed 1:1 mapping between
   VM IP and MAC is assumed across VM moves. Generic mobility support
   for IP and MAC that allows these bindings to change across moves is
   required to support a broader set of EVPN IRB use cases, and requires
   further consideration. EVPN all-active multi-homing further
   introduces scenarios that require additional consideration from a
   mobility perspective. The intent of this draft is to enumerate a set
   of design considerations applicable to mobility across EVPN IRB use
   cases and to define generic sequence number assignment procedures to
   address these IRB use cases.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1 Terminology
   2. Optional MAC only RT-2
   3. Mobility Use Cases
     3.1 VM MAC+IP Move
     3.2 VM IP Move to new MAC
       3.2.1 VM Reload
       3.2.2 MAC Sharing
       3.2.3 Problem
     3.3 VM MAC move to new IP
       3.3.1 Problem
   4. EVPN All Active multi-homed ES
   5. Design Considerations
   6. Solution Components
     6.1 Sequence Number Inheritance
     6.2 MAC Sharing
     6.3 Multi-homing Mobility Synchronization
   7. Requirements for Sequence Number Assignment
     7.1 LOCAL MAC-IP learning
     7.2 LOCAL MAC learning
     7.3 Remote MAC OR MAC-IP Update
     7.4 REMOTE (SYNC) MAC update
     7.5 REMOTE (SYNC) MAC-IP update
     7.6 Inter-op
   8. Routed Overlay
   9. Duplicate Host Detection
     9.1 Scenario A
     9.2 Scenario B
       9.2.1 Duplicate IP Detection Procedure for Scenario B
     9.3 Scenario C
     9.4 Duplicate Host Recovery
       9.4.1 Route Un-freezing Configuration
       9.4.2 Route Clearing Configuration
   10. Security Considerations
   11. IANA Considerations
   12. References
     12.1 Normative References
     12.2 Informative References
   13. Acknowledgements
   Authors' Addresses
   Appendix A

1  Introduction

   EVPN-IRB enables the capability to advertise both MAC and IP routes
   via a single MAC+IP RT-2 advertisement. The MAC is imported into the
   local bridge MAC table and enables L2 bridged traffic across the
   network overlay. The IP is imported into the local ARP table in an
   asymmetric IRB design or into the IP routing table in a symmetric IRB
   design, and enables routed traffic across the layer 2 network overlay.
   Please refer to [EVPN-INTER-SUBNET] for more background on EVPN IRB
   forwarding modes.

   To support the EVPN mobility procedure, a single sequence number
   mobility attribute is advertised with the combined MAC+IP route. A
   single sequence number advertised with the combined MAC+IP route to
   resolve both MAC and IP reachability implicitly assumes a 1:1 fixed
   mapping between IP and MAC. While a fixed 1:1 mapping between IP and
   MAC is a common use case that could be addressed via the existing MAC
   mobility procedure, additional IRB scenarios that do not necessarily
   adhere to this assumption need to be considered. The following IRB
   mobility scenarios are considered:

   o  VM move results in VM IP and MAC moving together

   o  VM move results in VM IP moving to a new MAC association

   o  VM move results in VM MAC moving to a new IP association

   While the existing MAC mobility procedure can be leveraged for the
   MAC+IP move in the first scenario, the subsequent scenarios result in
   a new MAC-IP association. As a result, a single sequence number
   assigned independently per-[MAC, IP] is not sufficient to determine
   the most recent reachability for both MAC and IP, unless the sequence
   number assignment algorithm is designed to allow for changing MAC-IP
   bindings across moves.

   The purpose of this draft is to define additional sequence number
   assignment and handling procedures to adequately address generic
   mobility support across EVPN-IRB overlay use cases that allow MAC-IP
   bindings to change across VM moves and can support mobility for both
   MAC and IP components carried in an EVPN RT-2 for these use cases.

   In addition, for hosts on an ESI multi-homed to multiple GW devices,
   an additional procedure is proposed to ensure synchronized sequence
   number assignments across the multi-homing devices.

   The content presented in this draft is independent of the data plane
   encapsulation used in the overlay, whether MPLS or NVO Tunnels.
   It is also largely independent of whether the EVPN IRB solution is
   based on a symmetric or asymmetric IRB design as defined in
   [EVPN-INTER-SUBNET]. In addition to symmetric and asymmetric IRB, a
   mobility solution for a routed overlay, where traffic to an end host
   in the overlay is always IP routed using EVPN RT-5, is also presented
   in section 8.

   To summarize, this draft covers mobility for the following,
   independent of the overlay encapsulation being MPLS or an NVO Tunnel:

   o  Symmetric EVPN IRB overlay

   o  Asymmetric EVPN IRB overlay

   o  Routed EVPN overlay

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   o  ARP is widely referred to in this document. This is simply for
      ease of reading; these references are equally applicable to ND
      (neighbor discovery).

   o  GW: used widely in the document, refers to an IRB GW that is
      doing routing and bridging between an access network and an EVPN
      enabled overlay network.

   o  RT-2: EVPN route type 2 carrying both MAC and IP reachability

   o  RT-5: EVPN route type 5 carrying IP prefix reachability

   o  ES: EVPN Ethernet Segment

   o  MAC-IP: IP association for a MAC. The IP referred to in this
      document may be IPv4, IPv6, or both.

2. Optional MAC only RT-2

   In an EVPN IRB scenario, where a single MAC+IP RT-2 advertisement
   carries both IP and MAC routes, a MAC only RT-2 advertisement is
   redundant for host MACs that are advertised via MAC+IP RT-2. As a
   result, a MAC only RT-2 is an optional route that may not be
   advertised from or received at an IRB GW. This is an important
   consideration for mobility scenarios discussed in subsequent
   sections.

   MAC only RT-2 may still be advertised for non-IP host MACs that are
   not advertised via MAC+IP RT-2.

3. Mobility Use Cases

   This section describes the IRB mobility use cases considered in this
   document. Procedures to address them are covered later in section 6
   and section 7.

   o  VM move results in VM IP and MAC moving together

   o  VM move results in VM IP moving to a new MAC association

   o  VM move results in VM MAC moving to a new IP association

3.1 VM MAC+IP Move

   This is the baseline case, wherein a VM move results in both VM MAC
   and IP moving together with no change in MAC-IP binding across a
   move. Existing MAC mobility defined in RFC 7432 may be leveraged and
   applied to the corresponding MAC+IP route to support this mobility
   scenario.

3.2 VM IP Move to new MAC

   This is the case where a VM move results in VM IP moving to a new
   MAC binding.

3.2.1 VM Reload

   A VM reload or an orchestrated VM move that results in the VM being
   re-spawned at a new location may result in the VM getting a new MAC
   assignment while maintaining its existing IP address. This results in
   a VM IP move to a new MAC binding:

      IP-a, MAC-a ---> IP-a, MAC-b

3.2.2 MAC Sharing

   This takes into account scenarios where multiple hosts, each with a
   unique IP, may share a common MAC binding, and a host move results in
   a new MAC binding for the host IP.

   As an example, host VMs running on a single physical server, each
   with a unique IP, may share the same physical server MAC. In yet
   another scenario, an L2 access network may be behind a firewall, such
   that all host IPs on the access network are learnt with a common
   firewall MAC. In all such "shared MAC" use cases, multiple local
   MAC-IP ARP entries may be learnt with the same MAC. A VM IP move in
   such scenarios (e.g., to a new physical server) could result in a new
   MAC association for the VM IP.

3.2.3 Problem

   In both of the above scenarios, a combined MAC+IP EVPN RT-2
   advertised with a single sequence number attribute implicitly assumes
   a fixed IP to MAC mapping. A host IP move to a new MAC breaks this
   assumption and results in a new MAC+IP route. If this new MAC+IP
   route is independently assigned a new sequence number, the sequence
   number can no longer be used to determine the most recent host IP
   reachability in a symmetric EVPN-IRB design or the most recent IP to
   MAC binding in an asymmetric EVPN-IRB design.

                     +------------------------+
                     | Underlay Network Fabric|
                     +------------------------+

   +-----+   +-----+      +-----+   +-----+      +-----+   +-----+
   | GW1 |   | GW2 |      | GW3 |   | GW4 |      | GW5 |   | GW6 |
   +-----+   +-----+      +-----+   +-----+      +-----+   +-----+
      \         /            \         /            \         /
       \ ESI-1 /              \ ESI-2 /              \ ESI-3 /
        \     /                \     /                \     /
        +\---/+                +\---/+                +\---/+
        | \ / |                | \ / |                | \ / |
        +--+--+                +--+--+                +--+--+
           |                      |                      |
      Server-MAC1            Server-MAC2            Server-MAC3
           |                      |                      |
   [VM-IP1, VM-IP2]       [VM-IP3, VM-IP4]       [VM-IP5, VM-IP6]

                              Figure 1

   As an example, consider the topology shown in Figure 1, with host VMs
   sharing the physical server MAC. In steady state, the [IP1, MAC1]
   route is learnt at [GW1, GW2] and advertised to remote GWs with a
   sequence number N. Now, VM-IP1 is moved to Server-MAC2. ARP or ND
   based local learning at [GW3, GW4] would now result in a new
   [IP1, MAC2] route being learnt. If route [IP1, MAC2] is learnt as a
   new MAC+IP route and assigned a new sequence number of, say, 0, the
   mobility procedure for VM-IP1 will not trigger across the overlay
   network.

   A clear sequence number assignment procedure needs to be defined to
   unambiguously determine the most recent IP reachability, IP to MAC
   binding, and MAC reachability for such a MAC sharing scenario.
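
   As a non-normative illustration of the example above, the following
   Python sketch models a hypothetical remote GW that keeps one best
   route per [MAC, IP] key and compares sequence numbers only within
   that key. The class and function names, and the value N = 5, are
   assumptions made for illustration; they are not defined by this
   draft or by RFC 7432.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rt2:
    """Illustrative MAC+IP RT-2 with a single sequence number."""
    mac: str
    ip: str
    seq: int
    next_hop: str


def install(table: dict, route: Rt2) -> None:
    """Keep the highest-sequence route per (MAC, IP) key, as a naive
    per-[MAC, IP] sequence number scheme would."""
    key = (route.mac, route.ip)
    current = table.get(key)
    if current is None or route.seq > current.seq:
        table[key] = route


def routes_for_ip(table: dict, ip: str) -> list:
    """All installed routes that claim reachability for the given IP."""
    return sorted((r for r in table.values() if r.ip == ip),
                  key=lambda r: r.mac)


N = 5  # arbitrary steady-state sequence number (assumption)
table = {}
install(table, Rt2("MAC1", "IP1", N, "GW1/GW2"))  # steady state
install(table, Rt2("MAC2", "IP1", 0, "GW3/GW4"))  # after VM-IP1 moves

# Both routes coexist because their keys differ, so no mobility event
# is triggered for IP1.
candidates = routes_for_ip(table, "IP1")

# Comparing sequence numbers across the two routes would even prefer
# the stale location, since N > 0.
winner = max(candidates, key=lambda r: r.seq)
```

   In this sketch the stale [IP1, MAC1] route is never superseded,
   which is the gap that the assignment procedures in section 7 are
   intended to close.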

3.3 VM MAC move to new IP

   This is a scenario where a host move or re-provisioning behind a new
   gateway location may result in the same VM MAC getting a new IP
   address assigned.

3.3.1 Problem

   The complication with this scenario is that MAC reachability could
   be carried via a combined MAC+IP route while a MAC only route may not
   be advertised at all. A single sequence number association with the
   MAC+IP route again implicitly assumes a fixed mapping between MAC and
   IP. A MAC move resulting in a new IP association for the host MAC
   breaks this assumption and results in a new MAC+IP route. If this new
   MAC+IP route independently assumes a new sequence number, this
   mobility attribute can no longer be used to distinguish the most
   recent host MAC reachability from the older existing MAC
   reachability.

                     +------------------------+
                     | Underlay Network Fabric|
                     +------------------------+

   +-----+   +-----+      +-----+   +-----+      +-----+   +-----+
   | GW1 |   | GW2 |      | GW3 |   | GW4 |      | GW5 |   | GW6 |
   +-----+   +-----+      +-----+   +-----+      +-----+   +-----+
      \         /            \         /            \         /
       \ ESI-1 /              \ ESI-2 /              \ ESI-3 /
        \     /                \     /                \     /
        +\---/+                +\---/+                +\---/+
        | \ / |                | \ / |                | \ / |
        +--+--+                +--+--+                +--+--+
           |                      |                      |
        Server1                Server2                Server3
           |                      |                      |
[VM-IP1-M1, VM-IP2-M2] [VM-IP3-M3, VM-IP4-M4] [VM-IP5-M5, VM-IP6-M6]

   As an example, IP1-M1 is learnt locally at [GW1, GW2] and currently
   advertised to remote GWs with a sequence number N. Consider a
   scenario where a VM with MAC M1 is re-provisioned at Server2 and, as
   part of this re-provisioning, is assigned a different IP address,
   say IP7. [IP7, M1] is learnt as a new route at [GW3, GW4] and
   advertised to remote GWs with a sequence number of 0. As a result,
   L3 reachability to IP7 would be established across the overlay;
   however, the MAC mobility procedure for M1 will not trigger as a
   result of this MAC-IP route advertisement.
   If an optional MAC only route is also advertised, the sequence
   number associated with the MAC only route would trigger MAC mobility
   as per [RFC7432]. However, in the absence of an additional MAC only
   route advertisement, a single sequence number advertised with a
   combined MAC+IP route would not be sufficient to update MAC
   reachability across the overlay.

   A MAC-IP sequence number assignment procedure needs to be defined to
   unambiguously determine the most recent MAC reachability in such a
   scenario without a MAC only route being advertised.

   Further, GW1/GW2, on learning new reachability for [IP7, M1] via
   GW3/GW4, MUST probe and delete any local IPs associated with MAC M1,
   such as [IP1, M1] in the above example.

   Arguably, the MAC mobility sequence number defined in [RFC7432] could
   be interpreted to apply only to the MAC part of a MAC-IP route, and
   would hence cover this scenario. It could hence be interpreted as a
   clarification to [RFC7432] and as one of the considerations for a
   common sequence number assignment procedure across all MAC-IP
   mobility scenarios detailed in this document.

4. EVPN All Active multi-homed ES

                     +------------------------+
                     | Underlay Network Fabric|
                     +------------------------+

   +-----+   +-----+      +-----+   +-----+      +-----+   +-----+
   | GW1 |   | GW2 |      | GW3 |   | GW4 |      | GW5 |   | GW6 |
   +-----+   +-----+      +-----+   +-----+      +-----+   +-----+
      \         /            \         /            \         /
       \ ESI-1 /              \ ESI-2 /              \ ESI-3 /
        \     /                \     /                \     /
        +\---/+                +\---/+                +\---/+
        | \ / |                | \ / |                | \ / |
        +--+--+                +--+--+                +--+--+
           |                      |                      |
       Server-1               Server-2               Server-3

                              Figure 2

   Consider the EVPN-IRB overlay network shown in Figure 2, with hosts
   multi-homed to two or more leaf GW devices via an all-active
   multi-homed ES. MAC and ARP entries learnt on a local ESI may also be
   synchronized across the multi-homing GW devices sharing this ESI.
   This MAC and ARP SYNC enables local switching of intra- and
   inter-subnet ECMP traffic flows from remote hosts. In other words,
   local MAC and ARP entries on a given Ethernet segment (ES) may be
   learnt via local learning and/or sync from another GW device sharing
   the same ES.

   For a host that is multi-homed to multiple GW devices via an
   all-active ES interface, local learning of the host MAC and MAC-IP at
   each GW device is an independent asynchronous event that depends on
   traffic flow and/or the ARP/ND response from the host hashing to a
   directly connected GW on the MC-LAG interface. As a result, the
   sequence number mobility attribute value assigned to a locally learnt
   MAC or MAC-IP route (as per RFC 7432) at each device may not always
   be the same, depending on transient states on the device at the time
   of local learning.

   As an example, consider a host VM that is deleted from ESI-2 and
   moved to ESI-1. It is possible for the host to be learnt on, say, GW1
   following deletion of the remote route from [GW3, GW4], while being
   learnt on GW2 prior to deletion of the remote route from [GW3, GW4].
   If so, GW1 would process local host route learning as a new route and
   assign a sequence number of 0, while GW2 would process local host
   route learning as a remote to local move and assign a sequence number
   of N+1, N being the existing sequence number assigned at [GW3, GW4].
   Inconsistent sequence numbers advertised from multi-homing devices
   introduce ambiguity with respect to sequence number based mobility
   procedures across the overlay:

   o  Ambiguity with respect to how the remote ToRs should handle
      paths with the same ESI and different sequence numbers. A remote
      ToR may not program ECMP paths if it receives routes with
      different sequence numbers from a set of multi-homing GWs sharing
      the same ESI.

   o  Breaks consistent route versioning across the network overlay
      that is needed for EVPN mobility procedures to work.

   As an example, in this inconsistent state, GW2 would drop a remote
   route received for the same host with sequence number N (as its local
   sequence number is N+1), while GW1 would install it as the best route
   (as its local sequence number is 0).

   There is a need for a mechanism to ensure consistency of sequence
   numbers advertised from a set of multi-homing devices for EVPN
   mobility to work reliably.

   In order to support mobility for multi-homed hosts using the sequence
   number mobility attribute, local MAC and MAC-IP routes MUST be
   advertised with the same sequence number by all GW devices that the
   ESI is multi-homed to.

5. Design Considerations

   To summarize, the sequence number assignment scheme and
   implementation must take the following considerations into account:

   o  A MAC+IP may be learnt on an ESI multi-homed to multiple GW
      devices, which requires sequence numbers to be synchronized
      across the multi-homing GW devices.

   o  MAC only RT-2 is optional in an IRB scenario and may not
      necessarily be advertised in addition to MAC+IP RT-2.

   o  A single MAC may be associated with multiple IPs, i.e., multiple
      host IPs may share a common MAC.

   o  A host IP move could result in the host moving to a new MAC,
      resulting in a new IP to MAC association and a new MAC+IP route.

   o  A host MAC move to a new location could result in the host MAC
      being associated with a different IP address, resulting in a new
      MAC to IP association and a new MAC+IP route.

   o  A LOCAL MAC-IP learn via ARP would always be accompanied by a
      LOCAL MAC learn event resulting from the ARP packet.
      MAC and MAC-IP learning, however, could happen in any order.

   o  Use cases discussed earlier that do not maintain a constant 1:1
      MAC-IP mapping across moves could potentially be addressed by
      using separate sequence numbers associated with the MAC and IP
      components of the MAC+IP route. Maintaining two separate sequence
      numbers, however, adds significant overhead with respect to
      complexity, debuggability, and backward compatibility. It is
      therefore the goal of the solution presented here to address
      these requirements via a single sequence number attribute.

6. Solution Components

   This section goes over the main components of the EVPN IRB mobility
   solution proposed in this draft. Later sections go over the exact
   sequence number assignment procedures resulting from the concepts
   described in this section.

6.1 Sequence Number Inheritance

   The main idea presented here is to view a LOCAL MAC-IP route as a
   child of the corresponding LOCAL MAC only route that inherits the
   sequence number attribute from the parent LOCAL MAC only route:

      Mx-IPx -----> Mx (seq# = N)

   As a result, both the parent MAC and child MAC-IP routes share one
   common sequence number associated with the parent MAC route. Doing so
   ensures that a single sequence number attribute carried in a combined
   MAC+IP route represents the sequence number for both a MAC only route
   and a MAC+IP route, and hence makes the MAC only route truly
   optional. As a result, an optional MAC only route with its own
   sequence number is not required to establish the most recent
   reachability for a MAC in the overlay network. Specifically, this
   enables a MAC to assume a different IP address on a move and still be
   able to establish the most recent reachability to the MAC across the
   overlay network via the mobility attribute associated with the MAC+IP
   route advertisement.
   As an example, when Mx moves to a new location, it would result in
   LOCAL Mx being assigned a higher sequence number at its new location
   as per RFC 7432. If this move results in Mx assuming a different IP
   address, IPz, the LOCAL Mx+IPz route would inherit the new sequence
   number from Mx.

   LOCAL MAC and LOCAL MAC-IP routes would typically be sourced from
   data plane learning and ARP learning respectively, and could be
   learnt in the control plane in any order. An implementation could
   either replicate the inherited sequence number in each MAC-IP entry
   or maintain a single attribute in the parent MAC by creating a
   forward reference LOCAL MAC object for cases where a LOCAL MAC-IP is
   learnt before the LOCAL MAC.

   Arguably, this inheritance may be assumed from RFC 7432, in which
   case the above may be interpreted as a clarification with respect to
   the interpretation of a MAC sequence number in a MAC-IP route.

6.2 MAC Sharing

   Further, for the shared MAC scenario, this would result in multiple
   LOCAL MAC-IP siblings inheriting the sequence number attribute from a
   common parent MAC route:

      Mx-IP1 -----
       |         |
      Mx-IP2 -----
       .         |
       .         +---> Mx (seq# = N)
       .         |
      Mx-IPw -----
       |         |
      Mx-IPx -----

   In such a case, a host-IP move to a different physical server would
   result in the IP moving to a new MAC binding. A new MAC-IP route
   resulting from this move must now be advertised with a sequence
   number that is higher than that of the previous MAC-IP route for this
   IP, advertised from the prior location. As an example, consider a
   route Mx-IPx that is currently advertised with sequence number N from
   GW1. IPx moving to a new physical server behind GW2 results in IPx
   being associated with MAC Mz. A new local Mz-IPx route resulting from
   this move at GW2 must now be advertised with a sequence number higher
   than N.
   This is so that GW devices, including GW1, GW2, and other remote GW
   devices that are part of the overlay, can clearly determine and
   program the most recent MAC binding and reachability for the IP. On
   receiving this new Mz-IPx route with sequence number, say, N+1, GW1
   would update IPx reachability via GW2 in forwarding in the symmetric
   IRB case, and would update IPx's ARP binding to Mz in the asymmetric
   IRB case. In addition, GW1 would clear and withdraw the stale Mx-IPx
   route with the lower sequence number.

   This also implies that the sequence number associated with local MAC
   Mz and all local MAC-IP children of Mz at GW2 must now be incremented
   to N+1 and re-advertised across the overlay. While this
   re-advertisement of all local MAC-IP children routes affected by the
   parent MAC route is an overhead, it avoids the need for two separate
   sequence number attributes to be maintained and advertised for the IP
   and MAC components of MAC+IP RT-2. An implementation would need to be
   able to look up MAC-IP routes for a given IP and update the sequence
   number for its parent MAC and that MAC's MAC-IP children.

6.3 Multi-homing Mobility Synchronization

   In order to support mobility for multi-homed hosts, local MAC and
   MAC-IP routes learnt on the shared ESI MUST be advertised with the
   same sequence number by all GW devices that the ESI is multi-homed
   to. This also applies to local MAC only routes. LOCAL MAC and MAC-IP
   may be learnt natively via data plane and ARP/ND respectively, as
   well as via SYNC from another multi-homing GW to achieve local
   switching. Local and SYNC route learning can happen in any order.
   Local MAC-IP routes advertised by all multi-homing GW devices sharing
   the ESI must carry the same sequence number, independent of the order
   in which they are learnt.
   This implies:

   o  On local or sync MAC-IP route learning, the sequence number for
      the local MAC-IP route MUST be compared and updated to the higher
      value.

   o  On local or sync MAC route learning, the sequence number for the
      local MAC route MUST be compared and updated to the higher value.

   If an update to the local MAC-IP sequence number is required as a
   result of the above comparison with the sync MAC-IP route, it would
   essentially amount to a sequence number update on the parent local
   MAC, resulting in the inherited sequence number update on the MAC-IP
   route.

7. Requirements for Sequence Number Assignment

   The following sections summarize the sequence number assignment
   procedure needed on local and sync MAC and MAC-IP route learning
   events in order to accomplish the above.

7.1 LOCAL MAC-IP learning

   A local Mx-IPx learning via ARP or ND should result in computation or
   re-computation of the parent MAC Mx's sequence number, following
   which the MAC-IP route Mx-IPx would simply inherit the parent MAC's
   sequence number. The parent MAC Mx sequence number should be computed
   as follows:

   o  MUST be higher than any existing remote MAC route for Mx, as per
      RFC 7432.

   o  MUST be at least equal to the corresponding SYNC MAC sequence
      number if one is present.

   o  If the IP is also associated with a different remote MAC "Mz",
      MUST be higher than the "Mz" sequence number.

   Once the new sequence number for MAC route Mx is computed as per
   above, all LOCAL MAC-IPs associated with MAC Mx MUST inherit the
   updated sequence number.

7.2 LOCAL MAC learning

   The local MAC Mx sequence number should be computed as follows:

   o  MUST be higher than any existing remote MAC route for Mx, as per
      RFC 7432.

   o  MUST be at least equal to the corresponding SYNC MAC sequence
      number if one is present.

   o  Once the new sequence number for MAC route Mx is computed as per
      the above, all LOCAL MAC-IPs associated with MAC Mx MUST inherit
      the updated sequence number.

   Note that the local MAC sequence number might already be present if
   a local MAC-IP was learnt prior to the local MAC, in which case the
   above may not result in any change to the local MAC's sequence
   number.

7.3 Remote MAC OR MAC-IP Update

   On receiving a remote MAC or MAC-IP route update associated with a
   MAC Mx, with a sequence number that is higher than that of a LOCAL
   route for MAC Mx:

   o  GW MUST trigger the probe and deletion procedure for all LOCAL
      IPs associated with MAC Mx.

   o  GW MUST trigger the deletion procedure for the LOCAL MAC route
      for Mx.

7.4 REMOTE (SYNC) MAC update

   The corresponding local MAC Mx sequence number (if present) should
   be re-computed as follows:

   o  If the current sequence number is less than the received SYNC MAC
      sequence number, it MUST be increased to be equal to the received
      SYNC MAC sequence number.

   o  If a LOCAL MAC sequence number is updated as a result of the
      above, all LOCAL MAC-IPs associated with MAC Mx MUST inherit the
      updated sequence number.

7.5 REMOTE (SYNC) MAC-IP update

   If this is a SYNCed MAC-IP on a local ESI, it would also result in a
   derived SYNC MAC Mx route entry, as the MAC-only RT-2 advertisement
   is optional. The corresponding local MAC Mx sequence number (if
   present) should be re-computed as follows:

   o  If the current sequence number is less than the received SYNC MAC
      sequence number, it MUST be increased to be equal to the received
      SYNC MAC sequence number.

   o  If a LOCAL MAC sequence number is updated as a result of the
      above, all LOCAL MAC-IPs associated with MAC Mx MUST inherit the
      updated sequence number.
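
   As an illustration only, the assignment rules of Sections 7.1
   through 7.5 could be sketched as follows. The MacState structure,
   its field names, and the function names are hypothetical and are not
   defined by this document. The sketch also assumes that a MAC with no
   known sequence number starts at 0 and that an existing local value
   is never lowered on re-computation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MacState:
    """Hypothetical per-MAC state at a GW; field names are illustrative."""
    local_seq: Optional[int] = None   # LOCAL MAC route, None if not learnt
    remote_seq: Optional[int] = None  # highest REMOTE MAC route for this MAC
    sync_seq: Optional[int] = None    # SYNC MAC route on a shared ESI


def on_local_learn(mac: MacState, other_mac_seq: Optional[int] = None) -> int:
    """Sections 7.1 and 7.2: (re)compute the LOCAL MAC sequence number.

    other_mac_seq is the sequence number of a different remote MAC "Mz"
    currently bound to the learnt IP (Section 7.1 only).
    """
    # Assumption: start from the existing local value, or 0 if none.
    seq = mac.local_seq if mac.local_seq is not None else 0
    if mac.remote_seq is not None:
        seq = max(seq, mac.remote_seq + 1)   # higher than any remote route
    if mac.sync_seq is not None:
        seq = max(seq, mac.sync_seq)         # at least equal to SYNC
    if other_mac_seq is not None:
        seq = max(seq, other_mac_seq + 1)    # higher than Mz
    mac.local_seq = seq
    return seq  # all LOCAL MAC-IP children inherit this value


def on_sync_update(mac: MacState, sync_seq: int) -> bool:
    """Sections 7.4 and 7.5: return True if the LOCAL value was raised."""
    mac.sync_seq = sync_seq
    if mac.local_seq is not None and mac.local_seq < sync_seq:
        mac.local_seq = sync_seq
        return True   # children must inherit and be re-advertised
    return False


def remote_supersedes_local(mac: MacState, remote_seq: int) -> bool:
    """Section 7.3: True if probe and deletion of LOCAL routes is needed."""
    return mac.local_seq is not None and remote_seq > mac.local_seq
```

   For example, a MAC with a remote route at sequence number 4 and a
   SYNC route at sequence number 7 would be advertised locally with
   sequence number 7, and a later SYNC update carrying 12 would raise
   the local value, and that of its MAC-IP children, to 12.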

7.6 Inter-op

   In general, if all GW nodes in the overlay network follow the above
   sequence number assignment procedure, and the GW is advertising both
   MAC+IP and MAC routes, the sequence numbers advertised with the MAC
   and MAC+IP routes for the same MAC would always be the same.
   However, an inter-op scenario with a different implementation could
   arise, where a GW implementation non-compliant with this document or
   with RFC 7432 assigns and advertises independent sequence numbers to
   MAC and MAC+IP routes. To handle this case, if different sequence
   numbers are received for remote MAC+IP and corresponding remote MAC
   routes from a remote GW, the sequence number associated with the
   remote MAC route should be computed as:

   o  The highest of all the sequence numbers received with remote
      MAC+IP and MAC routes with the same MAC.

   o  The MAC sequence number would be re-computed on a MAC or MAC+IP
      route withdrawal as per the above.

   A MAC and/or IP move to the local GW would now result in the MAC
   (and hence all MAC-IP) sequence numbers being incremented from the
   remote MAC sequence number computed above.

8. Routed Overlay

   An additional use case is possible, such that traffic to an end host
   in the overlay is always IP routed. In a purely routed overlay such
   as this:

   o  A host MAC is never advertised in the EVPN overlay control plane.

   o  Host /32 or /128 IP reachability is distributed across the
      overlay via EVPN route type 5 (RT-5), along with a zero or
      non-zero ESI.

   o  An overlay IP subnet may still be stretched across the underlay
      fabric; however, intra-subnet traffic across the stretched
      overlay is never bridged.

   o  Both inter-subnet and intra-subnet traffic in the overlay is IP
      routed at the EVPN GW.

   Please refer to [RFC7814] for more details.

   Host mobility within the stretched subnet would still need to be
   supported for this use case.
   In the absence of any host MAC routes, the sequence number mobility
   EXT-COMM specified in [RFC7432], section 7.7, may be associated with
   a /32 or /128 host IP prefix advertised via EVPN route type 5. MAC
   mobility procedures defined in RFC 7432 can now be applied as is to
   host IP prefixes:

   o  On LOCAL learning of a host IP on a new ESI, the host IP MUST be
      advertised with a sequence number attribute that is higher than
      what is currently advertised with the old ESI.

   o  On receiving a host IP route advertisement with a higher sequence
      number, a PE MUST trigger the ARP/ND probe and deletion procedure
      on any LOCAL route for that IP with a lower sequence number. A PE
      would essentially move the forwarding entry to point to the
      remote route with the higher sequence number and send an ARP/ND
      PROBE for the local IP route. If the IP has indeed moved, the
      PROBE would time out and the local IP host route would be
      deleted.

   Note that there is still only one sequence number associated with a
   host route at any time. For earlier use cases where a host MAC is
   advertised along with the host IP, a sequence number is only
   associated with a MAC. Only if the MAC is not advertised at all, as
   in this use case, is a sequence number associated with a host IP.

   Note that this mobility procedure would not apply to "anycast IPv6"
   hosts advertised via NA messages with O-bit=0.
   Please refer to [EVPN-PROXY-ARP].

9.1 Scenario A

   For all use cases where duplicate hosts have the same MAC, the MAC
   is detected as duplicate via the duplicate MAC detection procedure
   described in RFC 7432. Corresponding MAC-IP routes with the same MAC
   do not require duplicate detection and MUST simply inherit the
   DUPLICATE property from the corresponding MAC route. In other words,
   if a MAC route is in DUPLICATE state, all corresponding MAC-IP
   routes MUST also be treated as DUPLICATE. The duplicate detection
   procedure need only be applied to MAC routes.

9.2 Scenario B

   Due to misconfiguration, a situation may arise where hosts with
   different MACs are configured with the same IP. This scenario would
   not be detected by the existing duplicate MAC detection procedure
   and would result in incorrect forwarding of routed traffic destined
   to this IP.

   Such a situation, on LOCAL MAC-IP learning, would be detected as a
   move scenario via the following local MAC sequence number
   computation procedure described earlier in section 7.1:

   o  If the IP is also associated with a different remote MAC "Mz",
      MUST be higher than the "Mz" sequence number.

   Such a move, which results in a sequence number increment on the
   local MAC because of a remote MAC-IP route associated with a
   different MAC, MUST be counted as an "IP move" against the "IP",
   independent of the MAC. The duplicate detection procedure described
   in RFC 7432 can now be applied to an "IP" entity independent of the
   MAC. Once an IP is detected as DUPLICATE, the corresponding MAC-IP
   route should be treated as DUPLICATE. Associated MAC routes and any
   other MAC-IP routes associated with this MAC should not be affected.

9.2.1 Duplicate IP Detection Procedure for Scenario B

   The duplicate IP detection procedure for such a scenario is
   specified in [EVPN-PROXY-ARP].
   What counts as an "IP move" in this scenario is further clarified as
   follows:

   o  On learning a LOCAL MAC-IP route Mx-IPx, check if there is an
      existing REMOTE or LOCAL route for IPx with a different MAC
      association, say, Mz-IPx. If so, count this as an "IP move" for
      IPx, independent of the MAC.

   o  On learning a REMOTE MAC-IP route Mz-IPx, check if there is an
      existing LOCAL route for IPx with a different MAC association,
      say, Mx-IPx. If so, count this as an "IP move" for IPx,
      independent of the MAC.

   A MAC-IP route SHOULD be treated as DUPLICATE if either of the
   following two conditions is met:

   o  The corresponding MAC route is marked as DUPLICATE via the
      existing duplicate detection procedure.

   o  The corresponding IP is marked as DUPLICATE via the extended
      procedure described above.

9.3 Scenario C

   For the purely routed overlay scenario described in section 8, where
   only a host IP is advertised via EVPN RT-5 together with a sequence
   number mobility attribute, the duplicate MAC detection procedures
   specified in RFC 7432 can be applied directly to IP-only host routes
   for the purpose of duplicate IP detection.

   o  On learning a LOCAL host IP route IPx, check if there is an
      existing REMOTE or LOCAL route for IPx with a different ESI
      association. If so, count this as an "IP move" for IPx.

   o  On learning a REMOTE host IP route IPx, check if there is an
      existing LOCAL route for IPx with a different ESI association. If
      so, count this as an "IP move" for IPx.

   o  With configurable parameters "N" and "M", if "N" IP moves are
      detected within "M" seconds for IPx, treat IPx as DUPLICATE.

9.4 Duplicate Host Recovery

   Once a MAC or IP is marked as DUPLICATE and FROZEN, corrective
   action must be taken to un-provision one of the duplicate MACs or
   IPs. Un-provisioning a duplicate MAC or IP in this context refers to
   a corrective action taken on the host side.
   Once one of the duplicate MACs or IPs is un-provisioned, normal
   operation would not resume until the duplicate MAC or IP ages out,
   unless additional action is taken to speed up recovery.

   This section lists possible additional corrective actions that could
   be taken to achieve faster recovery to normal operation.

9.4.1 Route Un-freezing Configuration

   Unfreezing the DUPLICATE and FROZEN MAC or IP via a CLI can be
   leveraged to recover from the DUPLICATE and FROZEN state following
   corrective un-provisioning of the duplicate MAC or IP.

   Unfreezing the frozen MAC or IP via a CLI at a GW should result in
   that MAC or IP being advertised with a sequence number that is
   higher than the sequence number advertised from the other location
   of that MAC or IP.

   Two possible corrective un-provisioning scenarios exist:

   o  Scenario A: A duplicate MAC or IP may have been un-provisioned at
      the location where it was NOT marked as DUPLICATE and FROZEN.

   o  Scenario B: A duplicate MAC or IP may have been un-provisioned at
      the location where it was marked as DUPLICATE and FROZEN.

   Unfreezing the DUPLICATE and FROZEN MAC or IP following the above
   corrective un-provisioning scenarios would result in recovery to
   steady state as follows:

   o  Scenario A: If the duplicate MAC or IP was un-provisioned at the
      location where it was NOT marked as DUPLICATE, unfreezing the
      route at the FROZEN location will result in the route being
      advertised with a higher sequence number. This would in turn
      result in automatic clearing of the local route at the GW
      location where the host was un-provisioned, via the ARP/ND PROBE
      and DELETE procedure specified earlier in section 8 and in
      [RFC7432].

   o  Scenario B: If the duplicate host is un-provisioned at the
      location where it was marked as DUPLICATE, unfreezing the route
      will trigger an advertisement with a higher sequence number to
      the other location. This would in turn trigger re-learning of the
      local route at the remote location, resulting in another
      advertisement with a higher sequence number from the remote
      location. The route at the local location would now be cleared on
      receiving this remote route advertisement, following the ARP/ND
      PROBE.

9.4.2 Route Clearing Configuration

   In addition to the above, route clearing CLIs may also be leveraged
   to clear the local MAC or IP route, to be executed AFTER the
   duplicate host is un-provisioned:

   o  clear MAC CLI: A clear MAC CLI can be leveraged to clear a
      DUPLICATE MAC route, to recover from a duplicate MAC scenario.

   o  clear ARP/ND CLI: A clear ARP/ND CLI may be leveraged to clear a
      DUPLICATE IP route, to recover from a duplicate IP scenario.

   Note that the route unfreeze CLI may still need to be run if the
   route was un-provisioned and cleared from the NON-DUPLICATE /
   NON-FROZEN location. Given that unfreezing the route via the
   unfreeze CLI would anyway result in auto-clearing of the route from
   the "un-provisioned" location, as explained in the prior section,
   the route clearing CLI is optional for recovery from the DUPLICATE /
   FROZEN state.

10. Security Considerations

11. IANA Considerations

12. References

12.1 Normative References

   [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
             Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
             Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
             2015.

   [EVPN-PROXY-ARP] Rabadan et al., "Operational Aspects of Proxy-
             ARP/ND in EVPN Networks", draft-ietf-bess-evpn-proxy-arp-
             nd-02, work in progress, April 2017.

   [EVPN-INTER-SUBNET] Sajassi et al., "Integrated Routing and Bridging
             in EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03,
             work in progress, Feb 2017.

   [RFC7814] Xu, X., Jacquenet, C., Raszuk, R., Boyes, T., Fee, B.,
             "Virtual Subnet: A BGP/MPLS IP VPN-Based Subnet Extension
             Solution", RFC 7814, March 2016.

12.2 Informative References

13. Acknowledgements

   Authors would like to thank Vibov Bhan and Patrice Brisset for
   feedback and comments through the process.

Authors' Addresses

   Neeraj Malhotra (Editor)
   EMail: neeraj.ietf@gmail.com

   Ali Sajassi
   Cisco
   EMail: sajassi@cisco.com

   Aparna Pattekar
   Cisco
   EMail: apjoshi@cisco.com

   Avinash Lingala
   AT&T
   EMail: ar977m@att.com

   Jorge Rabadan
   Nokia
   EMail: jorge.rabadan@nokia.com

   John Drake
   Juniper Networks
   EMail: jdrake@juniper.net

Appendix A

   An alternative approach considered was to associate two independent
   sequence number attributes with the MAC and IP components of a
   MAC-IP route. However, the approach of enabling IRB mobility
   procedures using a single sequence number associated with a MAC, as
   specified in this document, was preferred for the following reasons:

   o  Maintaining two separate sequence numbers at all times, only to
      address scenarios with changing MAC-IP bindings, adds procedural
      overhead and complexity that is unnecessary for topologies where
      MAC-IP bindings never change.

   o  Using a single sequence number associated with the MAC is much
      simpler and adds no overhead for topologies where MAC-IP bindings
      never change.

   o  Using a single sequence number associated with the MAC is aligned
      with existing MAC mobility implementations. In other words, it is
      an easier implementation extension to the existing MAC mobility
      procedure.