idnits 2.17.1

draft-ietf-bess-evpn-proxy-arp-nd-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == The 'Updates: ' line in the draft header should list only the _numbers_
     of the RFCs which will be updated by this document (if approved); it
     should not include the word 'RFC' in the list.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (October 9, 2020) is 1295 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 6820

  == Outdated reference: A later version (-09) exists of
     draft-ietf-bess-evpn-na-flags-07

     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

BESS Workgroup                                           J. Rabadan, Ed.
Internet-Draft                                              S. Sathappan
Updates: RFC7432 (if approved)                                K. Nagaraj
Intended status: Standards Track                              G. Hankins
Expires: April 12, 2021                                            Nokia
                                                                 T. King
                                                                  DE-CIX
                                                         October 9, 2020

          Operational Aspects of Proxy-ARP/ND in EVPN Networks
                  draft-ietf-bess-evpn-proxy-arp-nd-09

Abstract

   The EVPN MAC/IP Advertisement route can optionally carry IPv4 and
   IPv6 addresses associated with a MAC address. Remote PEs importing
   those routes in the same Broadcast Domain (BD) can use this
   information to reply locally (act as proxy) to IPv4 ARP requests and
   IPv6 Neighbor Solicitation messages (or 'unicast-forward' them to the
   owner of the MAC) and reduce/suppress the flooding produced by the
   Address Resolution procedure. This EVPN capability is extremely
   useful in Internet Exchange Points (IXPs) and Data Centers (DCs) with
   large BDs, where the amount of ARP/ND flooded traffic causes issues
   on connected routers and CEs. This document describes the EVPN
   Proxy-ARP/ND function augmented by the capability of the ARP/ND
   Extended Community, which together help IXPs and other operators to
   deal with the issues derived from Address Resolution in large BDs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 12, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Terminology
   2.  Introduction
     2.1.  The DC Use-Case
     2.2.  The IXP Use-Case
   3.  Solution Requirements
   4.  Solution Description
     4.1.  Learning Sub-Function
       4.1.1.  Proxy-ND and the NA Flags
     4.2.  Reply Sub-Function
     4.3.  Unicast-forward Sub-Function
     4.4.  Maintenance Sub-Function
     4.5.  Flooding (to Remote PEs) Reduction/Suppression
     4.6.  Duplicate IP Detection
   5.  Solution Benefits
   6.  Deployment Scenarios
     6.1.  All Dynamic Learning
     6.2.  Dynamic Learning with Proxy-ARP/ND
     6.3.  Hybrid Dynamic Learning and Static Provisioning with
           Proxy-ARP/ND
     6.4.  All Static Provisioning with Proxy-ARP/ND
     6.5.  Deployment Scenarios in IXPs
     6.6.  Deployment Scenarios in DCs
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgments
   10. Contributors
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   BUM: Broadcast, Unknown unicast and Multicast layer-2 traffic.

   BD: Broadcast Domain.

   ARP: Address Resolution Protocol.

   GARP: Gratuitous ARP message.

   ND: Neighbor Discovery Protocol.

   NS: Neighbor Solicitation message.

   NA: Neighbor Advertisement.

   IXP: Internet eXchange Point.

   IXP-LAN: the IXP's large Broadcast Domain to which Internet routers
   are connected.

   DC: Data Center.

   IP->MAC: an IP address associated with a MAC address. IP->MAC
   entries are programmed in Proxy-ARP/ND tables and may be of three
   different types: dynamic, static or EVPN-learned.

   SN-multicast address: Solicited-Node IPv6 multicast address used by
   NS messages.

   NUD: Neighbor Unreachability Detection, as per [RFC4861].

   DAD: Duplicate Address Detection, as per [RFC4861].

   SLLA: Source Link Layer Address, as per [RFC4861].

   TLLA: Target Link Layer Address, as per [RFC4861].

   R Flag: Router Flag in NA messages, as per [RFC4861].

   O Flag: Override Flag in NA messages, as per [RFC4861].

   S Flag: Solicited Flag in NA messages, as per [RFC4861].

   RT2: EVPN Route type 2 or MAC/IP Advertisement route, as per
   [RFC7432].

   MAC or IP DA: MAC or IP Destination Address.

   MAC or IP SA: MAC or IP Source Address.

   AS-MAC: Anti-spoofing MAC.

   LAG: Link Aggregation Group.

   This document assumes familiarity with the terminology used in
   [RFC7432].

2.  Introduction

   As specified in [RFC7432], the IP Address field in the MAC/IP
   Advertisement route may optionally carry one of the IP addresses
   associated with the MAC address. A PE may learn local IP->MAC pairs
   and advertise them in EVPN MAC/IP Advertisement routes. Remote PEs
   importing those routes in the same Broadcast Domain (BD) may add
   those IP->MAC pairs to their Proxy-ARP/ND tables and reply to local
   ARP requests or Neighbor Solicitations (or 'unicast-forward' those
   packets to the owner MAC), reducing and in some cases even
   suppressing the flooding in the EVPN network.

   EVPN and its associated Proxy-ARP/ND function are extremely useful in
   Data Centers (DCs) or Internet Exchange Points (IXPs) with large
   broadcast domains, where the amount of ARP/ND flooded traffic causes
   issues on connected routers and CEs. [RFC6820] describes the Address
   Resolution problems in large Data Center networks.

   This document describes the Proxy-ARP/ND function in [RFC7432]
   networks, augmented by the capability of the ARP/ND Extended
   Community [I-D.ietf-bess-evpn-na-flags].

   Proxy-ARP/ND may be implemented to help IXPs, DCs and other operators
   deal with the issues derived from Address Resolution in large
   broadcast domains.

2.1.  The DC Use-Case

   As described in [RFC6820], IPv4 and IPv6 Address Resolution can
   create many issues in large DCs. In particular, the issues created
   by the IPv4 Address Resolution Protocol procedures may be
   significant.

   On one hand, ARP Requests use broadcast MAC addresses; therefore, any
   Tenant System in a large Broadcast Domain will see a large amount of
   ARP traffic, most of which is not addressed to it.

   On the other hand, the flooding issue becomes even worse if some
   Tenant Systems disappear from the broadcast domain, since some
   implementations will persistently retry sending ARP Requests. As
   [RFC6820] states, there are no clear requirements for retransmitting
   ARP Requests in the absence of replies; hence an implementation may
   choose to keep retrying endlessly even if there are no replies.

   The amount of flooding that Address Resolution creates can be
   mitigated with the use of EVPN and its Proxy-ARP/ND function.

2.2.  The IXP Use-Case

   The implementation described in this document is especially useful in
   IXP networks.

   A typical IXP provides access to a large layer-2 peering network,
   where (hundreds of) Internet routers are connected. Because all
   routers must connect to a single layer-2 network, the peering
   networks use IPv4 subnets with prefix lengths ranging from /21 to /24
   (and even larger subnets for IPv6), which can create very large
   broadcast domains. This peering network is transparent to the
   Customer Edge (CE) devices and therefore floods any ARP Request or NS
   message to all the CEs in the network. Unsolicited GARP and NA
   messages are flooded to all the CEs too.

   In these IXP networks, most of the CEs are typically peering routers
   and nearly all the BUM traffic is originated by the ARP and ND
   address resolution procedures.
   This ARP/ND BUM traffic causes significant data volumes that reach
   every single router in the peering network. Since the ARP/ND
   messages are processed in "slow path" software processors and take
   high priority in the routers, heavy loads of ARP/ND traffic can cause
   some routers to run out of resources. CEs disappearing from the
   network may cause Address Resolution explosions that can make a
   router with limited processing power fail to keep BGP sessions
   running.

   The issue may be less severe for IPv6, since ND uses SN-multicast
   addresses in NS messages; ARP, however, uses broadcast and has to be
   processed by all the routers in the network. Some routers may also
   be configured to broadcast periodic GARPs [RFC5227]. The amount of
   ARP/ND flooded traffic grows with the number of IXP participants;
   therefore, the issue can only get worse as new CEs are added.

   In order to deal with this issue, IXPs have developed certain
   solutions over the past years. One example is the ARP-Sponge daemon
   [ARP-Sponge], which can significantly reduce the number of ARP
   messages sent to an absent router. While these solutions may
   mitigate the issues of Address Resolution in large broadcast
   domains, EVPN provides new, more efficient possibilities to IXPs.
   EVPN and its Proxy-ARP/ND function may help solve the issue in a
   distributed and scalable way, fully integrated with the PE network.

3.  Solution Requirements

   The distributed EVPN Proxy-ARP/ND function described in this document
   meets the following requirements:

   o  The solution supports the learning of the CE IP->MAC entries on
      the EVPN PEs via the management, control or data planes. An
      implementation should allow each of those learning mechanisms to
      be intentionally enabled or disabled.

   o  The solution may completely suppress the flooding of the ARP/ND
      messages in the EVPN network, assuming that all the CE IP->MAC
      addresses local to the PEs are known or provisioned on the PEs
      from a management system. Note that in this case, the unknown
      unicast flooded traffic can also be suppressed, since all the
      expected unicast traffic will be destined to known MAC addresses
      in the PE BDs.

   o  The solution significantly reduces the flooding of the ARP/ND
      messages in the EVPN network, assuming that some or all the CE
      IP->MAC addresses are learned on the data plane by snooping ARP/ND
      messages issued by the CEs.

   o  The solution provides a way to periodically refresh the CE IP->MAC
      entries learned through the data plane, so that the IP->MAC
      entries are not withdrawn by EVPN when they age out unless the CE
      is no longer active. This option helps reduce the EVPN control
      plane overhead in a network with active CEs that do not send
      packets frequently.

   o  The solution provides a mechanism to detect duplicate IP
      addresses.

      In case of duplication, the detecting PE should not reply to
      requests for the duplicate IP. Instead, the PE should alert the
      operator and may optionally prevent any other CE from sending
      traffic to the duplicate IP.

   o  The solution should not change any existing behavior in the CEs
      connected to the EVPN PEs.

4.  Solution Description

   Figure 1 illustrates an example EVPN network where the Proxy-ARP/ND
   function is enabled.

                                                           BD1
                                                       Proxy-ARP/ND
                                                      +------------+
   IP1/M1     +----------------------------+          |IP1->M1 EVPN|
   GARP --->Proxy-ARP/ND                   |          |IP2->M2 EVPN|
   +---+      +----+---+   RT2(IP1/M1)     |          |IP3->M3 sta |
   |CE1+------+  BD1   |    ------>        +------+---|IP4->M4 dyn |
   +---+      +--------+                   |          +------------+
                 PE1                       | +--------+ Who has IP1?
              |          EVPN              | |  BD1   | <-----  +---+
              |          EVI1              | |        |   |     |CE3|
   IP2/M2     |                            | |        | ----->  +---+
   GARP --->Proxy-ARP/ND                   | +--------+   |  IP1->M1
   +---+      +--------+   RT2(IP2/M2)     |              |
   |CE2+------+  BD1   |    ------>        +--------------+
   +---+      +--------+                        PE3|    +---+
                 PE2                       |       +----+CE4|
              +----------------------------+            +---+
                                              <---IP4/M4 GARP

                 Figure 1: Proxy-ARP/ND network example

   When the Proxy-ARP/ND function is enabled in a BD (Broadcast Domain)
   of the EVPN PEs, each PE creates a Proxy table specific to that BD
   that can contain three types of Proxy-ARP/ND entries:

   a.  Dynamic entries: learned by snooping the CEs' ARP and ND
       messages. For instance, IP4->M4 in Figure 1.

   b.  Static entries: provisioned on the PE by the management system.
       For instance, IP3->M3 in Figure 1.

   c.  EVPN-learned entries: learned from the IP/MAC information encoded
       in the RT2s received from remote PEs. For instance, IP1->M1 and
       IP2->M2 in Figure 1.

   As a high-level example, the operation of the EVPN Proxy-ARP/ND
   function in the network of Figure 1 is described below. In this
   example we assume IP1, IP2 and IP3 are IPv4 addresses:

   1.  Proxy-ARP/ND is enabled in BD1 of PE1, PE2 and PE3.

   2.  The PEs start adding dynamic, static and EVPN-learned entries to
       their Proxy tables:

       A.  PE3 adds IP1->M1 and IP2->M2 based on the EVPN routes
           received from PE1 and PE2. Those entries were previously
           learned as dynamic entries in PE1 and PE2 respectively, and
           advertised in BGP EVPN.
       B.  PE3 adds IP4->M4 as dynamic. This entry is learned by
           snooping the corresponding ARP messages sent by CE4.
       C.  An operator also provisions the static entry IP3->M3.

   3.  When CE3 sends an ARP Request asking for the MAC address of IP1,
       PE3 will:

       A.  Intercept the ARP Request and perform a Proxy-ARP lookup for
           IP1.
       B.  If the lookup is successful (as in Figure 1), PE3 will send
           an ARP Reply with IP1->M1.
           The ARP Request will not be flooded to the EVPN network or
           any other local CEs.
       C.  If the lookup is not successful, PE3 will flood the ARP
           Request to the EVPN network and the other local CEs.

   As PE3 learns more and more host entries in the Proxy-ARP/ND table,
   the flooding of ARP Request messages is reduced and in some cases it
   can even be suppressed. In a network where most of the participant
   CEs are not moving between PEs and they advertise their presence with
   GARPs or unsolicited NA messages, the ARP/ND flooding as well as the
   unknown unicast flooding can practically be suppressed. In an EVPN-
   based IXP network, where all the entries are Static, the ARP/ND
   flooding is in fact totally suppressed.

   The Proxy-ARP/ND function can be structured in six sub-functions or
   procedures:

   1.  Learning sub-function

   2.  Reply sub-function

   3.  Unicast-forward sub-function

   4.  Maintenance sub-function

   5.  Flooding reduction/suppression sub-function

   6.  Duplicate IP detection sub-function

   A Proxy-ARP/ND implementation MAY support all those sub-functions or
   only a subset of them. The following sections describe each
   individual sub-function.

4.1.  Learning Sub-Function

   A Proxy-ARP/ND implementation SHOULD support static, dynamic and
   EVPN-learned entries.

   Static entries are provisioned from the management plane. The
   provisioned static IP->MAC entry SHOULD be advertised in EVPN with an
   ARP/ND extended community where the Immutable ARP/ND Binding Flag (I)
   is set to 1, as per [I-D.ietf-bess-evpn-na-flags]. When the I flag
   in the ARP/ND extended community is 1, the advertising PE indicates
   that the IP address MUST NOT be associated with a MAC other than the
   one included in the MAC/IP Advertisement route.
   The advertisement of I=1 in the ARP/ND extended community is
   compatible with any value of the Sticky bit (S) or Sequence Number in
   the [RFC7432] MAC Mobility extended community. Note that the I bit
   in the ARP/ND extended community refers to the immutable configured
   association between the IP and the MAC address in the IP->MAC
   binding, whereas the S bit in the MAC Mobility extended community
   refers to the fact that the advertised MAC address is not subject to
   the [RFC7432] mobility procedures.

   An entry MAY associate a configured static IP with a list of
   potential MACs, i.e., IP1->(MAC1,MAC2..MACN). When there is more
   than one MAC in the list of allowed MACs, the PE will not advertise
   any IP->MAC in EVPN until a local ARP/NA message or any other frame
   is received from the CE. Upon receiving traffic from the CE, the PE
   will check that the source MAC is included in the list of allowed
   MACs. Only in that case will the PE activate the IP->MAC and
   advertise it in EVPN.

   EVPN-learned entries MUST be learned from received valid EVPN MAC/IP
   Advertisement routes containing a MAC and IP address.

   Dynamic entries are learned in different ways depending on whether
   the entry contains an IPv4 or IPv6 address:

   a.  Proxy-ARP dynamic entries:

       They SHOULD be learned by snooping any ARP packet (Ethertype
       0x0806) received from the CEs attached to the BD. The Learning
       function will add the Sender MAC and Sender IP of the snooped ARP
       packet to the Proxy-ARP table. Note that MACs and IPs with value
       0 SHOULD NOT be learned.

   b.  Proxy-ND dynamic entries:

       They SHOULD be learned from the Target Address and TLLA
       information in NA messages (Ethertype 0x86DD, ICMPv6 type 136)
       received from the CEs attached to the BD. A Proxy-ND
       implementation SHOULD NOT learn IP->MAC entries from NS messages,
       since they don't contain the R Flag required by the Proxy-ND
       reply function.
       See Section 4.1.1 for more information about the R Flag.

       Note that if the O Flag is zero in the received NA message, the
       IP->MAC SHOULD only be learned in case IPv6 'anycast' is enabled
       in the EVI.

   The following procedure associated with the Learning sub-function is
   RECOMMENDED:

   o  When a new Proxy-ARP/ND EVPN or static active entry is learned (or
      provisioned), the PE SHOULD send an unsolicited GARP or NA message
      to the access CEs. The PE SHOULD send an unsolicited GARP/NA
      message for dynamic entries only if the ARP/NA message creating
      the entry was not flooded before. This unsolicited GARP/NA
      message makes sure the CE ARP/ND caches are updated even if the
      ARP/NS/NA messages from remote CEs are not flooded in the EVPN
      network.

   Note that if a Static entry is provisioned with the same IP as an
   existing EVPN-learned or Dynamic entry, the Static entry takes
   precedence.

4.1.1.  Proxy-ND and the NA Flags

   [RFC4861] describes the use of the R Flag in IPv6 Address Resolution:

   o  Nodes capable of routing IPv6 packets must reply to NS messages
      with NA messages where the R Flag is set (R Flag=1).

   o  Hosts that are not able to route IPv6 packets must indicate that
      inability by replying with NA messages that contain R Flag=0.

   The use of the R Flag in NA messages has an impact on how hosts
   select their default gateways when sending packets off-link:

   o  Hosts build a Default Router List based on the received RAs and
      NAs with R Flag=1. Each cache entry has an IsRouter flag, which
      must be set based on the R Flag in the received NAs. A host can
      choose one or more Default Routers when sending packets off-link.

   o  In those cases where the IsRouter flag changes from TRUE to FALSE
      as a result of an NA update, the node MUST remove that router from
      the Default Router List and update the Destination Cache entries
      for all destinations using that neighbor as a router, as specified
      in [RFC4861] section 7.3.3. This is needed to detect when a node
      that is used as a router stops forwarding packets due to being
      configured as a host.

   The R Flag and O Flag will be learned in the following ways:

   o  Static entries SHOULD have the R Flag information added by the
      management interface. The O Flag information MAY also be added by
      the management interface.

   o  Dynamic entries SHOULD learn the R Flag and MAY learn the O Flag
      from the snooped NA messages used to learn the IP->MAC itself.

   o  EVPN-learned entries SHOULD learn the R Flag and MAY learn the O
      Flag from the ARP/ND Extended Community
      [I-D.ietf-bess-evpn-na-flags] received from EVPN along with the
      RT2 used to learn the IP->MAC itself. If no ARP/ND extended
      community is received, the PE will add a configured R Flag/O Flag
      to the entry. This configured R Flag SHOULD be an administrative
      choice with a default value of 1.

   Note that the O Flag SHOULD only be learned if 'anycast' is enabled
   in the BD. If so, Duplicate IP Detection must be disabled so that
   the PE is able to learn the same IP mapped to different MACs in the
   same Proxy-ND table. If 'anycast' is disabled, NA messages with O
   Flag = 0 will not create a Proxy-ND entry; hence no EVPN
   advertisement with an ND extended community will be generated.

4.2.  Reply Sub-Function

   This sub-function will reply to Address Resolution requests/
   solicitations upon successful lookup in the Proxy-ARP/ND table for a
   given IP address. The following considerations should be taken into
   account:

   a.  When replying to ARP Request or NS messages, the PE SHOULD use
       the Proxy-ARP/ND entry MAC address as MAC SA. This is
       RECOMMENDED so that the resolved MAC can be learned in the MAC
       FIB of potential layer-2 switches sitting between the PE and the
       CE requesting the Address Resolution.

   b.  A PE SHOULD NOT reply to a request/solicitation received on the
       same attachment circuit over which the IP->MAC is learned. In
       this case the requester and the requested IP are assumed to be
       connected to the same layer-2 switch/access network linked to the
       PE's attachment circuit, and therefore the requested IP owner
       will receive the request directly.

   c.  A PE SHOULD reply to broadcast/multicast Address Resolution
       messages, that is, ARP-Request and NS messages as well as DAD NS
       messages. A PE SHOULD NOT reply to unicast Address Resolution
       requests (for instance, NUD NS messages).

   d.  A PE SHOULD include the R Flag learned for the IP->MAC entry in
       the NA messages (see Section 4.1.1). The S Flag will be set/
       unset as per [RFC4861]. The O Flag will be included if IPv6
       'anycast' is enabled in the BD and it is learned for the IP->MAC
       entry. If 'anycast' is enabled and there is more than one MAC
       for a given IP, the PE will reply to NS messages with as many NA
       responses as there are 'anycast' entries in the Proxy-ND table.

   e.  A PE SHOULD NOT reply to ARP probes received from the CEs. An
       ARP probe is an ARP request constructed with an all-zero sender
       IP address that may be used by hosts for IPv4 Address Conflict
       Detection [RFC5227].

   f.  A PE SHOULD only reply to ARP-Request and NS messages with the
       format specified in [RFC0826] and [RFC4861] respectively.
       Received ARP-Requests and NS messages with unknown options SHOULD
       be either forwarded (as unicast packets) to the owner of the
       requested IP (assuming the MAC is known in the Proxy-ARP/ND table
       and BD) or discarded.
       An administrative option to control this behavior ('unicast-
       forward' or 'discard') SHOULD be supported. The 'unicast-
       forward' option is described in Section 4.3.

4.3.  Unicast-forward Sub-Function

   As discussed in Section 4.2, in some cases the operator may want to
   'unicast-forward' certain ARP-Request and NS messages as opposed to
   replying to them. The operator SHOULD be able to activate this
   option with one of the following parameters:

   a.  unicast-forward always

   b.  unicast-forward unknown-options

   If 'unicast-forward always' is enabled, the PE will perform a Proxy-
   ARP/ND table lookup and in case of a hit, the PE will forward the
   packet to the owner of the MAC found in the Proxy-ARP/ND table. This
   is irrespective of the options carried in the ARP/ND packet. This
   option provides total transparency in the BD and yet reduces the
   amount of flooding significantly.

   If 'unicast-forward unknown-options' is enabled, upon a successful
   Proxy-ARP/ND lookup, the PE will perform a 'unicast-forward' action
   only if the ARP-Request or NS messages carry unknown options, as
   explained in Section 4.2. As an example, this would allow Proxy-ND
   and Secure ND [RFC3971] to be enabled in the same EVI. The 'unicast-
   forward unknown-options' configuration allows the support of new
   applications using ARP/ND in the BD while still reducing the
   flooding.

4.4.  Maintenance Sub-Function

   The Proxy-ARP/ND tables SHOULD follow a number of maintenance
   procedures so that the dynamic IP->MAC entries are kept if the owner
   is active and flushed if the owner is no longer in the network. The
   following procedures are RECOMMENDED:

   a.  Age-time

       A dynamic Proxy-ARP/ND entry MUST be flushed out of the table if
       the IP->MAC has not been refreshed within a given age-time. The
       entry is refreshed if an ARP or NA message is received for the
       same IP->MAC entry.
       The age-time is an administrative option and its value should be
       carefully chosen depending on the specific use-case: in IXP
       networks (where the CE routers are fairly static) the age-time
       may normally be longer than in DC networks (where mobility is
       required).

   b.  Send-refresh option

       The PE MAY send periodic refresh messages (ARP/ND "probes") to
       the owners of the dynamic Proxy-ARP/ND entries, so that the
       entries can be refreshed before they age out. The owner of the
       IP->MAC entry would reply to the ARP/ND probe and the age-time of
       the corresponding entry would be reset. The periodic send-
       refresh timer is an administrative option and is RECOMMENDED to
       be a third of the age-time, or half of the age-time in scaled
       networks.

       An ARP refresh issued by the PE will be an ARP-Request message
       with the Sender's IP = 0 sent from the PE's MAC SA. If the PE
       has an IP address in the subnet, for instance on an IRB
       interface, then it MAY use it as a source for the ARP request
       (instead of Sender's IP = 0). An ND refresh will be an NS
       message issued from the PE's MAC SA and a Link Local Address
       associated with the PE's MAC.

       The refresh request messages SHOULD be sent only for dynamic
       entries and not for static or EVPN-learned entries. Even though
       the refresh request messages are broadcast or multicast, the PE
       SHOULD only send the message to the attachment circuit associated
       with the MAC in the IP->MAC entry.

   The age-time and send-refresh options are used in EVPN networks to
   avoid unnecessary EVPN RT2 withdrawals: if refresh messages are sent
   before the corresponding BD FIB and Proxy-ARP/ND age-times for a
   given entry expire, inactive but existing hosts will reply,
   refreshing the entry and therefore avoiding unnecessary EVPN MAC/IP
   Advertisement withdrawals. Both entries (MAC in the BD and IP->MAC
   in Proxy-ARP/ND) are reset when the owner replies to the ARP/ND
   probe.
   If there is no response to the ARP/ND probe, the MAC and IP->MAC
   entries will be legitimately flushed and the RT2s withdrawn.

4.5.  Flooding (to Remote PEs) Reduction/Suppression

   The Proxy-ARP/ND function implicitly helps reduce the flooding of
   ARP Request and NS messages to remote PEs in an EVPN network.
   However, in certain use-cases, the flooding of ARP/NS/NA messages
   (and even the unknown unicast flooding) to remote PEs can be
   suppressed completely in an EVPN network.

   For instance, in an IXP network, since all the participant CEs are
   well known and will not move to a different PE, the IP->MAC entries
   may be all provisioned by a management system. Assuming the entries
   for the CEs are all provisioned on the local PE, a given Proxy-ARP/ND
   table will only contain static and EVPN-learned entries. In this
   case, the operator may choose to suppress the flooding of ARP/NS/NA
   to remote PEs completely.

   The flooding may also be suppressed completely in IXP networks with
   dynamic Proxy-ARP/ND entries assuming that all the CEs are directly
   connected to the PEs and they all advertise their presence with a
   GARP/unsolicited-NA when they connect to the network.

   In networks where fast mobility is expected (DC use-case), it is NOT
   RECOMMENDED to suppress the flooding of unknown ARP-Requests/NS or
   GARPs/unsolicited-NAs. Unknown ARP-Requests/NS refer to those ARP-
   Request/NS messages for which the Proxy-ARP/ND lookups for the
   requested IPs do not succeed.

   In order to give the operator the choice to suppress/allow the
   flooding to remote PEs, a PE MAY support administrative options to
   individually suppress/allow the flooding of:

   o  Unknown ARP-Request and NS messages.

   o  GARP and unsolicited-NA messages.

   The operator will use these options based on the expected behavior of
   the CEs.

4.6.  Duplicate IP Detection

   The Proxy-ARP/ND function SHOULD support duplicate IP detection so
   that ARP/ND-spoofing attacks or duplicate IPs due to human errors can
   be detected.

   ARP/ND spoofing is a technique whereby an attacker sends "fake" ARP/
   ND messages onto a broadcast domain. Generally the aim is to
   associate the attacker's MAC address with the IP address of another
   host, causing any traffic meant for that IP address to be sent to the
   attacker instead.

   The distributed nature of EVPN and Proxy-ARP/ND allows the easy
   detection of duplicated IPs in the network, in a similar way to the
   MAC duplication function supported by [RFC7432] for MAC addresses.

   Duplicate IP detection monitors "IP-moves" in the Proxy-ARP/ND table
   in the following way:

   a.  When an existing active IP1->MAC1 entry is modified, a PE starts
       an M-second timer (default value of M=180), and if it detects N
       IP moves before the timer expires (default value of N=5), it
       concludes that a duplicate IP situation has occurred. An IP move
       is considered when, for instance, IP1->MAC1 is replaced by
       IP1->MAC2 in the Proxy-ARP/ND table. Static IP->MAC entries,
       that is, locally provisioned or EVPN-learned entries (with I=1 in
       the ARP/ND extended community), are not subject to this
       procedure. Static entries MUST NOT be overridden by dynamic
       Proxy-ARP/ND entries.

   b.  In order to detect the duplicate IP faster, the PE MAY send a
       CONFIRM message to the former owner of the IP. A CONFIRM message
       is a unicast ARP-Request/NS message sent by the PE to the MAC
       addresses that previously owned the IP, when the MAC changes in
       the Proxy-ARP/ND table. The CONFIRM message uses a sender's IP
       0.0.0.0 in case of ARP (if the PE has an IP address in the subnet
       then it MAY use it) and an IPv6 Link Local Address in case of NS.


       If the PE does not receive an answer within a given timer, the
       new entry will be confirmed and activated.  In case of spoofing,
       for instance, if IP1->MAC1 moves to IP1->MAC2, the PE may send a
       unicast ARP-Request/NS message for IP1 with MAC DA= MAC1 and MAC
       SA= PE's MAC.  This will force the legitimate owner to respond
       if the move to MAC2 was spoofed, and make the PE issue another
       CONFIRM message, this time to MAC DA= MAC2.  If both the
       legitimate owner and the spoofer keep replying to the CONFIRM
       message, the PE will detect the duplicate IP within the M timer:

       -  If the IP1->MAC1 pair was previously owned by the spoofer and
          the new IP1->MAC2 was from a valid CE, then the issued
          CONFIRM message would trigger a response from the spoofer.

       -  If it were the other way around, that is, IP1->MAC1 was
          previously owned by a valid CE, the CONFIRM message would
          trigger a response from the CE.

          Either way, if this process continues, then duplicate
          detection will kick in.

   c.  Upon detecting a duplicate IP situation:

       1.  The entry in duplicate detected state cannot be updated with
           new dynamic or EVPN-learned entries for the same IP.  The
           operator MAY, however, override the entry with a static
           IP->MAC.

       2.  The PE SHOULD alert the operator and stop responding to
           ARP-Requests/NS messages for the duplicate IP until a
           corrective action is taken.

       3.  Optionally the PE MAY associate an "anti-spoofing-mac"
           (AS-MAC) to the duplicate IP.  The PE will send a
           GARP/unsolicited-NA message with IP1->AS-MAC to the local
           CEs as well as an RT2 (with IP1->AS-MAC) to the remote PEs.
           This will force all the CEs in the EVI to use the AS-MAC as
           MAC DA for IP1, and prevent the spoofer from attracting any
           traffic for IP1.  Since the AS-MAC is a managed MAC address
           known by all the PEs in the EVI, all the PEs MAY apply
           filters to drop and/or log any frame with MAC DA= AS-MAC.
           The advertisement of the AS-MAC as a "black-hole MAC" that
           can be used directly in the BD to drop frames is for further
           study.

   d.  The duplicate IP situation will be cleared when a corrective
       action is taken by the operator, or alternatively after a
       HOLD-DOWN timer (default value of 540 seconds).

   The values of M, N and the HOLD-DOWN timer SHOULD be configurable
   administrative options to allow for the required flexibility in
   different scenarios.

   For Proxy-ND, Duplicate IP Detection SHOULD only monitor IP moves for
   IP->MACs learned from NA messages with O Flag=1.  NA messages with O
   Flag=0 would not override the ND cache entries for an existing IP.
   Duplicate IP Detection for IPv6 SHOULD be disabled when IPv6
   'anycast' is activated in a given EVI.

5.  Solution Benefits

   The solution described in this document provides the following
   benefits:

   a.  It may completely suppress the flooding of the ARP/ND and
       unknown-unicast messages in the EVPN network, in cases where all
       the CE IP->MAC addresses local to the PEs are known and
       provisioned on the PEs from a management system.

   b.  It significantly reduces the flooding of the ARP/ND and
       unknown-unicast messages in the EVPN network, in cases where all
       the CE IP->MAC addresses local to the PEs are known and
       provisioned on the PEs from a management system.

   c.  It reduces the control plane overhead and unnecessary BGP MAC/IP
       Advertisements and Withdrawals in a network with active CEs that
       do not send packets frequently.

   d.  It provides a mechanism to detect duplicate IP addresses and
       avoid ARP/ND-spoof attacks or the effects of duplicate addresses
       due to human errors.

6.  Deployment Scenarios

   Four deployment scenarios with different levels of ARP/ND control are
   available to operators using this solution, depending on their
   requirements to manage ARP/ND: all dynamic learning, all dynamic
   learning with Proxy-ARP/ND, hybrid dynamic learning and static
   provisioning with Proxy-ARP/ND, and all static provisioning with
   Proxy-ARP/ND.

6.1.  All Dynamic Learning

   In this scenario for minimum security and mitigation, EVPN is
   deployed in the peering network with the Proxy-ARP/ND function shut
   down.  PEs do not intercept ARP/ND requests and flood all requests,
   as in a conventional layer-2 network.  While no ARP/ND mitigation is
   used in this scenario, the IXP can still take advantage of EVPN
   features such as control plane learning and all-active multihoming
   in the peering network.  Existing mitigation solutions, such as the
   ARP-Sponge daemon [ARP-Sponge], MAY also be used in this scenario.

   Although this option does not require any of the procedures described
   in this document, it is added as a baseline/default option for
   completeness.  This option is equivalent to VPLS as far as ARP/ND is
   concerned.  The options described in Section 6.2, Section 6.3 and
   Section 6.4 are only possible in EVPN networks in combination with
   their Proxy-ARP/ND capabilities.

6.2.  Dynamic Learning with Proxy-ARP/ND

   This scenario minimizes flooding while enabling dynamic learning of
   IP->MAC entries.  The Proxy-ARP/ND function is enabled in the BDs of
   the EVPN PEs, so that the PEs intercept and respond to CE requests.

   The solution MAY further reduce the flooding of the ARP/ND messages
   in the EVPN network by snooping ARP/ND messages issued by the CEs.

   PEs will flood requests if the entry is not in their Proxy table.
   Any unknown source MAC->IP entries will be learnt and advertised in
   EVPN, and traffic to unknown entries is discarded at the ingress PE.

6.3.  Hybrid Dynamic Learning and Static Provisioning with Proxy-ARP/ND

   Some IXPs want to protect particular hosts on the peering network
   while allowing dynamic learning of peering router addresses.  For
   example, an IXP may want to configure static MAC->IP entries for
   management and infrastructure hosts that provide critical services.
   In this scenario, static entries are provisioned from the management
   plane for protected MAC->IP addresses, and dynamic learning with
   Proxy-ARP/ND is enabled as described in Section 6.2 on the peering
   network.

6.4.  All Static Provisioning with Proxy-ARP/ND

   For a solution that maximizes security and eliminates flooding and
   unknown unicast in the peering network, all MAC->IP entries are
   provisioned from the management plane.  The Proxy-ARP/ND function is
   enabled in the BDs of the EVPN PEs, so that the PEs intercept and
   respond to CE requests.  Dynamic learning and ARP/ND snooping are
   disabled so that traffic to unknown entries is discarded at the
   ingress PE.  This scenario provides an IXP the most control over
   MAC->IP entries and allows an IXP to manage all entries from a
   management system.

6.5.  Deployment Scenarios in IXPs

   Nowadays, almost all IXPs have installed some security rules in
   order to protect the IXP-LAN.  These rules are often called port
   security.  Port security summarizes different operational steps that
   limit the access to the IXP-LAN and to the customer router, and
   control the kind of traffic that the routers are allowed to exchange
   (e.g., Ethernet, IPv4, IPv6).  Due to this, the deployment scenario
   as described in Section 6.4 "All Static Provisioning with
   Proxy-ARP/ND" is the predominant scenario for IXPs.

   In addition to the "All Static Provisioning" behavior, in IXP
   networks it is recommended to configure the Reply Sub-Function to
   'discard' ARP-Requests/NS messages with unrecognized options.
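
   As a non-normative illustration of the "All Static Provisioning"
   behavior combined with the 'discard' setting of the Reply
   Sub-Function, the following Python sketch shows one possible
   decision flow.  The function name, the option names and the return
   values are hypothetical and are not defined by this document; the
   sketch also assumes that the operator has suppressed the flooding of
   unknown ARP-Requests/NS messages, so a lookup miss is discarded
   rather than flooded.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Hypothetical set of option names this PE recognizes (assumption).
RECOGNIZED_OPTIONS: FrozenSet[str] = frozenset({"source-link-layer-address"})


def reply_sub_function(
    static_table: Dict[str, str],
    target_ip: str,
    options: FrozenSet[str] = frozenset(),
) -> Tuple[str, Optional[str]]:
    """Decide how to handle one ARP-Request/NS in an all-static BD.

    static_table maps a provisioned IP to its MAC.  Returns a pair
    (action, mac), where action is "reply" or "discard".
    """
    # Requests carrying any unrecognized option are discarded.
    if not options <= RECOGNIZED_OPTIONS:
        return ("discard", None)

    mac = static_table.get(target_ip)
    if mac is None:
        # Assumption: flooding of unknown requests is suppressed, so a
        # miss in the static table is dropped at the ingress PE.
        return ("discard", None)

    # Known static entry: the PE answers on behalf of the owner.
    return ("reply", mac)
```

   With a table containing only 192.0.2.10, a request for that address
   is answered locally, while a request for any other address, or one
   carrying an unrecognized option, is dropped.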

   At IXPs, customers usually follow a certain operational life-cycle.
   For each step of the operational life-cycle specific operational
   procedures are executed.

   The following describes the operational procedures that are needed to
   guarantee port security throughout the life-cycle of a customer with
   focus on EVPN features:

   1.  A new customer is connected for the first time to the IXP:

       Before the connection between the customer router and the
       IXP-LAN is activated, the MAC of the router is white-listed on
       the IXP's switch port.  All other MAC addresses are blocked.
       Pre-defined IPv4 and IPv6 addresses of the IXP's peering network
       space are configured at the customer router.  The IP->MAC static
       entries (IPv4 and IPv6) are configured in the management system
       of the IXP for the customer's port in order to support
       Proxy-ARP/ND.

       In case a customer uses multiple ports aggregated to a single
       logical port (LAG), some vendors randomly select the MAC address
       of the LAG from the different MAC addresses assigned to the
       ports.  In this case the static entry will be associated with a
       list of allowed MACs.

   2.  Replacement of customer router:

       If a customer router is about to be replaced, the new MAC
       address(es) must be installed in the management system in
       addition to the MAC address(es) of the currently connected
       router.  This allows the customer to replace the router without
       any active involvement of the IXP operator.  For this, static
       entries are also used.  After the replacement takes place, the
       MAC address(es) of the replaced router can be removed.

   3.  Decommissioning a customer router:

       If a customer router is decommissioned, the router is
       disconnected from the IXP PE.  Right after that, the MAC
       address(es) of the router and IP->MAC bindings can be removed
       from the management system.

6.6.  Deployment Scenarios in DCs

   DCs normally have different requirements than IXPs in terms of
   Proxy-ARP/ND.  Some differences are listed below:

   a.  The required mobility in virtualized DCs makes the "Dynamic
       Learning" or "Hybrid Dynamic and Static Provisioning" models
       more appropriate than the "All Static Provisioning" model.

   b.  IPv6 'anycast' may be required in DCs, while it is not a
       requirement in IXP networks.  Therefore if the DC needs IPv6
       'anycast' it will be explicitly enabled in the Proxy-ND
       function, and the Proxy-ND sub-functions are modified
       accordingly.  For instance, if IPv6 'anycast' is enabled in the
       Proxy-ND function, Duplicate IP Detection must be disabled.

   c.  DCs may require special options on ARP/ND as opposed to the
       Address Resolution function, which is the only one typically
       required in IXPs.  Based on that, the Reply Sub-function may be
       modified to forward or discard unknown options.

7.  Security Considerations

   The procedures in this document reduce the amount of ARP/ND message
   flooding, which in itself provides a protection to "slow path"
   software processors of routers and Tenant Systems in large BDs.  The
   ARP/ND requests that are replied to by the Proxy-ARP/ND function
   (hence not flooded) are normally targeted to existing hosts in the
   BD.  ARP/ND requests targeted to absent hosts are still normally
   flooded; however, the suppression of Unknown ARP-Requests and NS
   messages described in Section 4.5 can provide an additional level of
   security against ARP-Requests/NS messages issued to non-existing
   hosts.

   The solution also provides protection against Denial of Service
   attacks that use ARP/ND-spoofing as a first step.  The Duplicate IP
   Detection and the use of an AS-MAC as explained in Section 4.6
   protect the BD against ARP/ND spoofing.
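
   As a non-normative illustration, the IP-move counting behind the
   Duplicate IP Detection of Section 4.6 can be sketched in Python as
   follows.  The class and method names are hypothetical, time is
   passed in explicitly to keep the sketch deterministic, and two
   points are assumptions rather than statements of this document: the
   modification that starts the M-second timer is counted as the first
   IP move, and the entry keeps its last confirmed MAC while it is in
   duplicate state.  CONFIRM messages, static entries and the AS-MAC
   are out of scope of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class _IpState:
    mac: str                              # MAC currently bound to the IP
    window_start: Optional[float] = None  # start of the M-second timer
    moves: int = 0                        # IP moves seen in this window
    dup_until: Optional[float] = None     # end of HOLD-DOWN, if duplicate


class DuplicateIpDetector:
    """Counts IP moves for dynamic entries (defaults M=180, N=5, 540 s)."""

    def __init__(self, m_seconds: float = 180.0, n_moves: int = 5,
                 hold_down: float = 540.0) -> None:
        self.m_seconds = m_seconds
        self.n_moves = n_moves
        self.hold_down = hold_down
        self._table: Dict[str, _IpState] = {}

    def observe(self, ip: str, mac: str, now: float) -> str:
        """Process one learned IP->MAC pair and return the outcome."""
        state = self._table.get(ip)
        if state is None:
            self._table[ip] = _IpState(mac=mac)
            return "learned"

        if state.dup_until is not None:
            if now < state.dup_until:
                return "duplicate"        # frozen: update is ignored
            # HOLD-DOWN expired: clear the duplicate condition.
            state.dup_until = None
            state.window_start = None
            state.moves = 0

        if mac == state.mac:
            return "refreshed"

        # The MAC changed for a known IP: this is an IP move.
        expired = (state.window_start is None
                   or now - state.window_start >= self.m_seconds)
        if expired:
            state.window_start = now      # (re)start the M-second timer
            state.moves = 0
        state.moves += 1

        if state.moves >= self.n_moves:
            state.dup_until = now + self.hold_down
            return "duplicate"

        state.mac = mac
        return "moved"
```

   For example, with the default values, an IP that flaps between two
   MACs once per second is reported as a duplicate on the fifth move
   and stays frozen until the HOLD-DOWN timer expires.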

   When EVPN and its associated Proxy-ARP/ND function are used in IXP
   networks, they provide ARP/ND security and mitigation.  IXPs MUST
   still employ additional security mechanisms that protect the peering
   network and SHOULD follow established BCPs such as the ones described
   in [Euro-IX-BCP].

   For example, IXPs should disable all unneeded control protocols, and
   block unwanted protocols from CEs so that only IPv4, ARP and IPv6
   Ethertypes are permitted on the peering network.  In addition, port
   security features and ACLs can provide an additional level of
   security.

8.  IANA Considerations

   No IANA considerations.

9.  Acknowledgments

   The authors want to thank Ranganathan Boovaraghavan, Sriram
   Venkateswaran, Manish Krishnan, Seshagiri Venugopal, Tony Przygienda,
   Robert Raszuk and Iftekhar Hussain for their review and
   contributions.  Thank you to Oliver Knapp as well, for his detailed
   review.

10.  Contributors

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Wim Henderickx
   Nokia

   Daniel Melzer
   DE-CIX Management GmbH

   Erik Nordmark
   Zededa

11.  References

11.1.  Normative References

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
              DOI 10.17487/RFC4861, September 2007.

   [RFC0826]  Plummer, D., "An Ethernet Address Resolution Protocol: Or
              Converting Network Protocol Addresses to 48.bit Ethernet
              Address for Transmission on Ethernet Hardware", STD 37,
              RFC 826, DOI 10.17487/RFC0826, November 1982.

   [RFC6820]  Narten, T., Karir, M., and I. Foo, "Address Resolution
              Problems in Large Data Center Networks", RFC 6820,
              DOI 10.17487/RFC6820, January 2013.

   [RFC3971]  Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
              "SEcure Neighbor Discovery (SEND)", RFC 3971,
              DOI 10.17487/RFC3971, March 2005.

   [RFC5227]  Cheshire, S., "IPv4 Address Conflict Detection", RFC 5227,
              DOI 10.17487/RFC5227, July 2008.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [I-D.ietf-bess-evpn-na-flags]
              Rabadan, J., Sathappan, S., Nagaraj, K., and W. Lin,
              "Propagation of ARP/ND Flags in EVPN",
              draft-ietf-bess-evpn-na-flags-07 (work in progress),
              October 2020.

11.2.  Informative References

   [ARP-Sponge]
              N., W. M. A. S., "Effects of IPv4 and IPv6 address
              resolution on AMS-IX and the ARP Sponge", July 2009.

   [Euro-IX-BCP]
              Euro-IX, "European Internet Exchange Association Best
              Practises".

Authors' Addresses

   Jorge Rabadan (editor)
   Nokia
   777 Middlefield Road
   Mountain View, CA 94043
   USA

   Email: jorge.rabadan@nokia.com

   Senthil Sathappan
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA

   Email: senthil.sathappan@nokia.com

   Kiran Nagaraj
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA

   Email: kiran.nagaraj@nokia.com

   Greg Hankins
   Nokia

   Email: greg.hankins@nokia.com

   Thomas King
   DE-CIX Management GmbH

   Email: thomas.king@de-cix.net