idnits 2.17.1

draft-ietf-bess-evpn-proxy-arp-nd-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (July 8, 2019) is 1754 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC7342' is defined on line 978, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-09) exists of
     draft-ietf-bess-evpn-na-flags-04

     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

BESS Workgroup                                           J. Rabadan, Ed.
Internet Draft                                              S. Sathappan
                                                              K. Nagaraj
Intended status: Informational                                G. Hankins
                                                                   Nokia

                                                                 T. King
                                                                  DE-CIX

Expires: January 9, 2020                                    July 8, 2019


          Operational Aspects of Proxy-ARP/ND in EVPN Networks
                  draft-ietf-bess-evpn-proxy-arp-nd-07


Abstract

   The EVPN MAC/IP Advertisement route can optionally carry IPv4 and
   IPv6 addresses associated with a MAC address. Remote PEs can use this
   information to reply locally (act as proxy) to IPv4 ARP requests and
   IPv6 Neighbor Solicitation messages (or 'unicast-forward' them to the
   owner of the MAC) and reduce/suppress the flooding produced by the
   Address Resolution procedure. This EVPN capability is extremely
   useful in Internet Exchange Points (IXPs) and Data Centers (DCs) with
   large broadcast domains, where the amount of ARP/ND flooded traffic
   causes issues on routers and CEs. This document describes how the
   EVPN Proxy-ARP/ND function may be implemented to help IXPs and other
   operators deal with the issues derived from Address Resolution in
   large broadcast domains.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 9, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Terminology
   2. Introduction
     2.1. The DC Use-Case
     2.2. The IXP Use-Case
   3. Solution Requirements
   4. Solution Description
     4.1. Learning Sub-Function
       4.1.1. Proxy-ND and the NA Flags
     4.2. Reply Sub-Function
     4.3. Unicast-forward Sub-Function
     4.4. Maintenance Sub-Function
     4.5. Flooding (to Remote PEs) Reduction/Suppression
     4.6. Duplicate IP Detection
   5. Solution Benefits
   6. Deployment Scenarios
     6.1. All Dynamic Learning
     6.2. Dynamic Learning with Proxy-ARP/ND
     6.3. Hybrid Dynamic Learning and Static Provisioning with
          Proxy-ARP/ND
     6.4. All Static Provisioning with Proxy-ARP/ND
     6.5. Deployment Scenarios in IXPs
     6.6. Deployment Scenarios in DCs
   7. Security Considerations
   8. IANA Considerations
   9. References
     9.1. Normative References
     9.2. Informative References
   10. Acknowledgments
   11. Contributors
   Authors' Addresses

1. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   BUM: Broadcast, Unknown unicast and Multicast layer-2 traffic.

   ARP: Address Resolution Protocol.

   GARP: Gratuitous ARP message.

   ND: Neighbor Discovery Protocol.

   NS: Neighbor Solicitation message.

   NA: Neighbor Advertisement message.

   IXP: Internet eXchange Point.

   IXP-LAN: refers to the IXP's large Broadcast Domain to which
   Internet routers are connected.

   DC: Data Center.

   IP->MAC: refers to an IP address associated with a MAC address. The
   entries may be of three different types: dynamic, static or
   EVPN-learned.

   SN-multicast address: refers to the Solicited-Node IPv6 multicast
   address used by NS messages.

   NUD: Neighbor Unreachability Detection, as per [RFC4861].

   DAD: Duplicate Address Detection, as per [RFC4861].

   SLLA: Source Link Layer Address, as per [RFC4861].

   TLLA: Target Link Layer Address, as per [RFC4861].

   R-bit: Router Flag in NA messages, as per [RFC4861].

   O-bit: Override Flag in NA messages, as per [RFC4861].

   S-bit: Solicited Flag in NA messages, as per [RFC4861].

   RT2: EVPN Route type 2 or MAC/IP Advertisement route, as per
   [RFC7432].

   MAC or IP DA: MAC or IP Destination Address.

   MAC or IP SA: MAC or IP Source Address.

   AS-MAC: Anti-spoofing MAC.

   LAG: Link Aggregation Group.

   BD: Broadcast Domain.

   This document assumes familiarity with the terminology used in
   [RFC7432].

2. Introduction

   As specified in [RFC7432], the IP Address field in the MAC/IP
   Advertisement route may optionally carry one of the IP addresses
   associated with the MAC address. A PE may learn local IP->MAC pairs
   and advertise them in EVPN MAC/IP routes. The remote PEs may add
   those IP->MAC pairs to their Proxy-ARP/ND tables and reply to local
   ARP requests or Neighbor Solicitations (or 'unicast-forward' those
   packets to the owner MAC), reducing and even suppressing in some
   cases the flooding in the EVPN network.

   EVPN and its associated Proxy-ARP/ND function are extremely useful in
   Data Centers (DCs) or Internet Exchange Points (IXPs) with large
   broadcast domains, where the amount of ARP/ND flooded traffic causes
   issues on routers and CEs. [RFC6820] describes the Address Resolution
   problems in large Data Center networks.

   This document describes how the [RFC7432] Proxy-ARP/ND function may
   be implemented to help IXPs, DCs and other operators deal with the
   issues derived from Address Resolution in large broadcast domains.

2.1. The DC Use-Case

   As described in [RFC6820], IPv4 and IPv6 Address Resolution can
   create a lot of issues in large DCs. In particular, the issues
   created by the IPv4 Address Resolution Protocol procedures may be
   significant.

   On one hand, ARP Requests use broadcast MAC addresses; therefore any
   Tenant System in a large Broadcast Domain will see a large amount of
   ARP traffic, most of which is not addressed to it.

   On the other hand, the flooding issue becomes even worse if some
   Tenant Systems disappear from the broadcast domain, since some
   implementations will persistently retry sending ARP Requests. As
   [RFC6820] states, there are no clear requirements for retransmitting
   ARP Requests in the absence of replies, hence an implementation may
   choose to keep retrying endlessly even if there are no replies.

   The amount of flooding that Address Resolution creates can be
   mitigated with the use of EVPN and its Proxy-ARP/ND function.

2.2. The IXP Use-Case

   The implementation described in this document is especially useful in
   IXP networks.

   A typical IXP provides access to a large layer-2 peering network,
   where (hundreds of) Internet routers are connected. Because of the
   requirement to connect all routers to a single layer-2 network, the
   peering networks use IPv4 subnets with prefix lengths ranging from
   /21 to /24 (and even larger subnets for IPv6), which can create very
   large broadcast domains. This peering network is transparent to the
   Customer Edge (CE) devices and therefore floods any ARP request or NS
   message to all the CEs in the network. Unsolicited GARP and NA
   messages are flooded to all the CEs too.

   In these IXP networks, most of the CEs are typically peering routers
   and nearly all the BUM traffic originates from the ARP and ND
   address resolution procedures. This ARP/ND BUM traffic causes
   significant data volumes that reach every single router in the
   peering network. Since the ARP/ND messages are processed in "slow
   path" software processors and they take high priority in the routers,
   heavy loads of ARP/ND traffic can cause some routers to run out of
   resources.
   CEs disappearing from the network may cause Address
   Resolution explosions that can make a router with limited processing
   power fail to keep BGP sessions running.

   The issue may be less severe for IPv6, since ND uses SN-multicast
   addresses in NS messages, whereas ARP uses broadcast and has to be
   processed by all the routers in the network. Some routers may also be
   configured to broadcast periodic GARPs [RFC5227]. The amount of
   ARP/ND flooded traffic grows with the number of IXP participants;
   therefore the issue can only get worse as new CEs are added.

   In order to deal with this issue, IXPs have developed certain
   solutions over the past years. One example is the ARP-Sponge daemon
   [ARP-Sponge], which can significantly reduce the amount of ARP
   messages sent to an absent router. While these solutions may mitigate
   the issues of Address Resolution in large broadcast domains, EVPN
   provides new, more efficient possibilities to IXPs. EVPN and its
   Proxy-ARP/ND function may help solve the issue in a distributed and
   scalable way, fully integrated with the PE network.

3. Solution Requirements

   The distributed EVPN Proxy-ARP/ND function described in this document
   meets the following requirements:

   o The solution supports the learning of the CE IP->MAC entries on the
     EVPN PEs via the management, control or data planes. An
     implementation should allow each of those learning mechanisms to
     be intentionally enabled or disabled.

   o The solution may completely suppress the flooding of the ARP/ND
     messages in the EVPN network, assuming that all the CE IP->MAC
     addresses local to the PEs are known or provisioned on the PEs from
     a management system. Note that in this case, the unknown unicast
     flooded traffic can also be suppressed, since all the expected
     unicast traffic will be destined to known MAC addresses in the PE
     BDs.

   o The solution significantly reduces the flooding of the ARP/ND
     messages in the EVPN network, assuming that some or all the CE
     IP->MAC addresses are learned on the data plane by snooping ARP/ND
     messages issued by the CEs.

   o The solution provides a way to periodically refresh the CE IP->MAC
     entries learned through the data plane, so that the IP->MAC entries
     are not withdrawn by EVPN when they age out unless the CE is not
     active anymore. This option helps reduce the EVPN control plane
     overhead in a network with active CEs that do not send packets
     frequently.

   o The solution provides a mechanism to detect duplicate IP addresses.
     In case of duplication, the detecting PE should not reply to
     requests for the duplicate IP. Instead, the PE should alert the
     operator and may optionally prevent any other CE from sending
     traffic to the duplicate IP.

   o The solution should not change any existing behavior in the CEs
     connected to the EVPN PEs.

4. Solution Description

   Figure 1 illustrates an example EVPN network where the Proxy-ARP/ND
   function is enabled.

                                                         BD1
                                                     Proxy-ARP/ND
                                                    +------------+
   IP1/M1          +----------------------------+   |IP1->M1 EVPN|
   GARP --->Proxy-ARP/ND                        |   |IP2->M2 EVPN|
   +---+      +----+---+   RT2(IP1/M1)          |   |IP3->M3 sta |
   |CE1+------+  BD1   |    ------>      +------+---|IP4->M4 dyn |
   +---+      +--------+                 |          +------------+
               PE1 |                       +--------+  Who has IP1?
                   |        EVPN         | |  BD1   |  <-----     +---+
                   |        EVI1         | |        |   |         |CE3|
   IP2/M2          |                     | |        |  ----->     +---+
   GARP --->Proxy-ARP/ND                 | +--------+   |   IP1->M1
   +---+    +--------+   RT2(IP2/M2)     |              |
   |CE2+----+  BD1   |    ------>        +--------------+
   +---+    +--------+                       PE3|    +---+
               PE2 |                            +----+CE4|
                   +----------------------------+    +---+
                                               <---IP4/M4 GARP

               Figure 1: Proxy-ARP/ND network example

   When the Proxy-ARP/ND function is enabled in a BD (Broadcast Domain)
   of the EVPN PEs, each PE creates a Proxy table specific to that BD
   that can contain three types of Proxy-ARP/ND entries:

   a) Dynamic entries: learned by snooping the CEs' ARP and ND
      messages. For instance, IP4->M4 in Figure 1.

   b) Static entries: provisioned on the PE by the management system.
      For instance, IP3->M3 in Figure 1.

   c) EVPN-learned entries: learned from the IP/MAC information encoded
      in the RT2s received from remote PEs. For instance, IP1->M1 and
      IP2->M2 in Figure 1.

   As a high level example, the operation of the EVPN Proxy-ARP/ND
   function in the network of Figure 1 is described below. In this
   example we assume IP1, IP2 and IP3 are IPv4 addresses:

   1. Proxy-ARP/ND is enabled in BD1 of PE1, PE2 and PE3.

   2. The PEs start adding dynamic, static and EVPN-learned entries to
      their Proxy tables:

      a. PE3 adds IP1->M1 and IP2->M2 based on the EVPN routes received
         from PE1 and PE2. Those entries were previously learned as
         dynamic entries in PE1 and PE2 respectively, and advertised in
         BGP EVPN.
      b. PE3 adds IP4->M4 as dynamic. This entry is learned by snooping
         the corresponding ARP messages sent by CE4.
      c. An operator also provisions the static entry IP3->M3.

   3. When CE3 sends an ARP Request asking for the MAC address of IP1,
      PE3 will:

      a. Intercept the ARP Request and perform a Proxy-ARP lookup for
         IP1.
      b. If the lookup is successful (as in Figure 1), PE3 will send an
         ARP Reply with IP1->M1.
         The ARP Request will not be flooded to
         the EVPN network or any other local CEs.
      c. If the lookup is not successful, PE3 will flood the ARP Request
         in the EVPN network and the other local CEs.

   As PE3 learns more and more host entries in the Proxy-ARP/ND table,
   the flooding of ARP Request messages is reduced and in some cases it
   can even be suppressed. In a network where most of the participant
   CEs are not moving between PEs and they advertise their presence with
   GARPs or unsolicited NA messages, the ARP/ND flooding as well as the
   unknown unicast flooding can practically be suppressed. In an
   EVPN-based IXP network, where all the entries are Static, the ARP/ND
   flooding is in fact totally suppressed.

   The Proxy-ARP/ND function can be structured in six sub-functions or
   procedures:

   1. Learning sub-function
   2. Reply sub-function
   3. Unicast-forward sub-function
   4. Maintenance sub-function
   5. Flooding reduction/suppression sub-function
   6. Duplicate IP detection sub-function

   A Proxy-ARP/ND implementation MAY support all those sub-functions or
   only a subset of them. The following sections describe each
   individual sub-function.

4.1. Learning Sub-Function

   A Proxy-ARP/ND implementation SHOULD support static, dynamic and
   EVPN-learned entries.

   Static entries are provisioned from the management plane. The
   provisioned static IP->MAC entry SHOULD be advertised in EVPN with an
   ARP/ND extended community where the Immutable ARP/ND Binding Flag
   (I) is set to 1, as per [EVPN-ARP-ND-FLAGS]. When the I flag in
   the ARP/ND extended community is 1, the advertising PE indicates that
   the IP address MUST NOT be associated with a MAC other than the one
   included in the MAC/IP route. The advertisement of I=1 in the ARP/ND
   extended community is compatible with any value of the Sticky bit (S)
   or Sequence Number in the [RFC7432] MAC Mobility extended community.
   Note that the I bit in the ARP/ND extended community refers to the
   immutable configured association between the IP and the MAC address
   in the IP->MAC binding, whereas the S bit in the MAC Mobility
   extended community refers to the fact that the advertised MAC address
   is not subject to the [RFC7432] mobility procedures.

   An entry MAY associate a configured static IP with a list of
   potential MACs, i.e. IP1->(MAC1,MAC2..MACN). When there is more than
   one MAC in the list of allowed MACs, the PE will not advertise any
   IP->MAC in EVPN until a local ARP/NA message or any other frame is
   received from the CE. Upon receiving traffic from the CE, the PE will
   check that the source MAC is included in the list of allowed MACs.
   Only in that case will the PE activate the IP->MAC and advertise it
   in EVPN.

   EVPN-learned entries MUST be learned from received valid EVPN MAC/IP
   Advertisement routes containing a MAC and IP address.

   Dynamic entries are learned in different ways depending on whether
   the entry contains an IPv4 or IPv6 address:

   a) Proxy-ARP dynamic entries:

      They SHOULD be learned by snooping any ARP packet (Ethertype
      0x0806) received from the CEs attached to the BD. The Learning
      function will add the Sender MAC and Sender IP of the snooped ARP
      packet to the Proxy-ARP table. Note that MAC and IP addresses
      with value 0 SHOULD NOT be learned.

   b) Proxy-ND dynamic entries:

      They SHOULD be learned from the Target Address and TLLA
      information in NA messages (Ethertype 0x86DD, ICMPv6 type 136)
      received from the CEs attached to the BD. A Proxy-ND
      implementation SHOULD NOT learn IP->MAC entries from NS messages,
      since they don't contain the R-bit Flag required by the Proxy-ND
      reply function. See section 4.1.1 for more information about the
      R-bit flag.

      Note that if the O-bit is zero in the received NA message, the
      IP->MAC SHOULD only be learned in case IPv6 'anycast' is enabled
      in the EVI.

   The following procedure associated with the Learning sub-function is
   RECOMMENDED:

   o When a new Proxy-ARP/ND EVPN or static active entry is learned (or
     provisioned), the PE SHOULD send an unsolicited GARP or NA message
     to the access CEs. The PE SHOULD send an unsolicited GARP/NA
     message for dynamic entries only if the ARP/NA message creating the
     entry was NOT flooded before. This unsolicited GARP/NA message
     makes sure the CE ARP/ND caches are updated even if the ARP/NS/NA
     messages from remote CEs are not flooded in the EVPN network.

   Note that if a Static entry is provisioned with the same IP as an
   existing EVPN-learned or Dynamic entry, the Static entry takes
   precedence.

4.1.1. Proxy-ND and the NA Flags

   [RFC4861] describes the use of the R-bit flag in IPv6 Address
   Resolution:

   o Nodes capable of routing IPv6 packets must reply to NS messages
     with NA messages where the R-bit flag is set (R-bit=1).

   o Hosts that are not able to route IPv6 packets must indicate that
     inability by replying with NA messages that contain R-bit=0.

   The use of the R-bit flag in NA messages has an impact on how hosts
   select their default gateways when sending packets off-link:

   o Hosts build a Default Router List based on the received RAs and NAs
     with R-bit=1. Each cache entry has an IsRouter flag, which must be
     set based on the R-bit flag in the received NAs. A host can choose
     one or more Default Routers when sending packets off-link.

   o In those cases where the IsRouter flag changes from TRUE to FALSE
     as a result of an NA update, the node MUST remove that router from
     the Default Router List and update the Destination Cache entries
     for all destinations using that neighbor as a router, as specified
     in [RFC4861] section 7.3.3.
     This is needed to detect when a node
     that is used as a router stops forwarding packets due to being
     configured as a host.

   The R-bit and O-bit will be learned in the following ways:

   o Static entries SHOULD have the R-bit information added by the
     management interface. The O-bit information MAY also be added by
     the management interface.

   o Dynamic entries SHOULD learn the R-bit and MAY learn the O-bit from
     the snooped NA messages used to learn the IP->MAC itself.

   o EVPN-learned entries SHOULD learn the R-bit and MAY learn the O-bit
     from the ND Extended Community received from EVPN along with the
     RT2 used to learn the IP->MAC itself. Please refer to
     [EVPN-ARP-ND-FLAGS]. If no ND extended community is received, the
     PE will add the default R-bit/O-bit to the entry. The default R-bit
     SHOULD be an administrative choice. The default O-bit SHOULD be 1.

   Note that the O-bit SHOULD only be learned if 'anycast' is enabled in
   the EVI. If so, Duplicate IP Detection must be disabled so that the
   PE is able to learn the same IP mapped to different MACs in the same
   Proxy-ND table. If 'anycast' is disabled, NA messages with O-bit = 0
   will not create a Proxy-ND entry, hence no EVPN advertisement with ND
   extended community will be generated.

4.2. Reply Sub-Function

   This sub-function will reply to Address Resolution
   requests/solicitations upon successful lookup in the Proxy-ARP/ND
   table for a given IP address. The following considerations should be
   taken into account:

   a) When replying to ARP Request or NS messages, the PE SHOULD use the
      Proxy-ARP/ND entry MAC address as MAC SA. This is RECOMMENDED so
      that the resolved MAC can be learned in the MAC FIB of potential
      layer-2 switches sitting between the PE and the CE requesting the
      Address Resolution.

   b) A PE SHOULD NOT reply to a request/solicitation received on the
      same attachment circuit over which the IP->MAC is learned. In this
      case the requester and the requested IP are assumed to be
      connected to the same layer-2 switch/access network linked to the
      PE's attachment circuit, and therefore the requested IP owner will
      receive the request directly.

   c) A PE SHOULD reply to broadcast/multicast Address Resolution
      messages, that is, ARP-Request, NS messages as well as DAD NS
      messages. A PE SHOULD NOT reply to unicast Address Resolution
      requests (for instance, NUD NS messages).

   d) A PE SHOULD include the R-bit learned for the IP->MAC entry in the
      NA messages (see section 4.1.1). The S-bit will be set/unset as
      per [RFC4861]. The O-bit will be included if IPv6 'anycast' is
      enabled in the EVI and it is learned for the IP->MAC entry. If
      'anycast' is enabled and there is more than one MAC for a given
      IP, the PE will reply to NS messages with as many NA responses as
      there are 'anycast' entries in the Proxy-ND table.

   e) A PE SHOULD NOT reply to ARP probes received from the CEs. An ARP
      probe is an ARP request constructed with an all-zero sender IP
      address that may be used by hosts for IPv4 Address Conflict
      Detection [RFC5227].

   f) A PE SHOULD only reply to ARP-Request and NS messages with the
      format specified in [RFC0826] and [RFC4861] respectively. Received
      ARP-Requests and NS messages with unknown options SHOULD be either
      forwarded (as unicast packets) to the owner of the requested IP
      (assuming the MAC is known in the Proxy-ARP/ND table and BD) or
      discarded. An administrative option to control this behavior
      ('unicast-forward' or 'discard') SHOULD be supported. The
      'unicast-forward' option is described in section 4.3.

4.3. Unicast-forward Sub-Function

   As discussed in section 4.2,
   in some cases the operator may want to
   'unicast-forward' certain ARP-Request and NS messages as opposed to
   replying to them. The operator SHOULD be able to activate this option
   with one of the following parameters:

   a) unicast-forward always
   b) unicast-forward unknown-options

   If 'unicast-forward always' is enabled, the PE will perform a
   Proxy-ARP/ND table lookup and in case of a hit, the PE will forward
   the packet to the owner of the MAC found in the Proxy-ARP/ND table.
   This is irrespective of the options carried in the ARP/ND packet.
   This option provides total transparency in the EVI and yet reduces
   the amount of flooding significantly.

   If 'unicast-forward unknown-options' is enabled, upon a successful
   Proxy-ARP/ND lookup, the PE will perform a 'unicast-forward' action
   only if the ARP-Request or NS messages carry unknown options, as
   explained in section 4.2. As an example, this would allow Proxy-ND
   and Secure ND [RFC3971] to be enabled in the same EVI. The
   'unicast-forward unknown-options' configuration allows the support of
   new applications using ARP/ND in the EVI while still reducing the
   flooding.

4.4. Maintenance Sub-Function

   The Proxy-ARP/ND tables SHOULD follow a number of maintenance
   procedures so that the dynamic IP->MAC entries are kept if the owner
   is active and flushed if the owner is no longer in the network. The
   following procedures are RECOMMENDED:

   a) Age-time

      A dynamic Proxy-ARP/ND entry MUST be flushed out of the table if
      the IP->MAC has not been refreshed within a given age-time. The
      entry is refreshed if an ARP or NA message is received for the
      same IP->MAC entry.
      The age-time is an administrative option and
      its value should be carefully chosen depending on the specific
      use-case: in IXP networks (where the CE routers are fairly static)
      the age-time may normally be longer than in DC networks (where
      mobility is required).

   b) Send-refresh option

      The PE MAY send periodic refresh messages (ARP/ND "probes") to the
      owners of the dynamic Proxy-ARP/ND entries, so that the entries
      can be refreshed before they age out. The owner of the IP->MAC
      entry would reply to the ARP/ND probe and the corresponding entry
      age-time would be reset. The periodic send-refresh timer is an
      administrative option and is RECOMMENDED to be a third of the
      age-time or a half of the age-time in scaled networks.

      An ARP refresh issued by the PE will be an ARP-Request message
      with the Sender's IP = 0 sent from the PE's MAC SA. If the PE has
      an IP address in the subnet, for instance on an IRB interface,
      then it MAY use it as a source for the ARP request (instead of
      Sender's IP = 0). An ND refresh will be an NS message issued from
      the PE's MAC SA and a Link Local Address associated with the PE's
      MAC.

      The refresh request messages SHOULD be sent only for dynamic
      entries and not for static or EVPN-learned entries. Even though
      the refresh request messages are broadcast or multicast, the PE
      SHOULD only send the message to the attachment circuit associated
      with the MAC in the IP->MAC entry.

   The age-time and send-refresh options are used in EVPN networks to
   avoid unnecessary EVPN RT2 withdrawals: if refresh messages are sent
   before the corresponding BD FIB and Proxy-ARP/ND age-times for a
   given entry expire, inactive but existing hosts will reply,
   refreshing the entry and therefore avoiding unnecessary MAC and
   MAC-IP withdrawals in EVPN. Both entries (MAC in the BD and IP->MAC
   in Proxy-ARP/ND) are reset when the owner replies to the ARP/ND
   probe.
If there is no 624 response to the ARP/ND probe, the MAC and IP->MAC entries will be 625 legitimately flushed and the RT2s withdrawn. 627 4.5. Flooding (to Remote PEs) Reduction/Suppression 629 The Proxy-ARP/ND function implicitly helps reducing the flooding of 630 ARP Request and NS messages to remote PEs in an EVPN network. 631 However, in certain use-cases, the flooding of ARP/NS/NA messages 632 (and even the unknown unicast flooding) to remote PEs can be 633 suppressed completely in an EVPN network. 635 For instance, in an IXP network, since all the participant CEs are 636 well known and will not move to a different PE, the IP->MAC entries 637 may be all provisioned by a management system. Assuming the entries 638 for the CEs are all provisioned on the local PE, a given Proxy-ARP/ND 639 table will only contain static and EVPN-learned entries. In this 640 case, the operator may choose to suppress the flooding of ARP/NS/NA 641 to remote PEs completely. 643 The flooding may also be suppressed completely in IXP networks with 644 dynamic Proxy-ARP/ND entries assuming that all the CEs are directly 645 connected to the PEs and they all advertise their presence with a 646 GARP/unsolicited-NA when they connect to the network. 648 In networks where fast mobility is expected (DC use-case), it is NOT 649 RECOMMENDED to suppress the flooding of unknown ARP-Requests/NS or 650 GARPs/unsolicited-NAs. Unknown ARP-Requests/NS refer to those 651 ARP-Request/NS messages for which the Proxy-ARP/ND lookups for the 652 requested IPs do not succeed. 654 In order to give the operator the choice to suppress/allow the 655 flooding to remote PEs, a PE MAY support administrative options to 656 individually suppress/allow the flooding of: 658 o Unknown ARP-Request and NS messages. 659 o GARP and unsolicited-NA messages. 661 The operator will use these options based on the expected behavior in 662 the CEs. 664 4.6. 
Duplicate IP Detection

   The Proxy-ARP/ND function SHOULD support duplicate IP detection so
   that ARP/ND-spoofing attacks or duplicate IPs due to human errors can
   be detected.

   ARP/ND spoofing is a technique whereby an attacker sends "fake"
   ARP/ND messages onto a broadcast domain. Generally the aim is to
   associate the attacker's MAC address with the IP address of another
   host, causing any traffic meant for that IP address to be sent to the
   attacker instead.

   The distributed nature of EVPN and Proxy-ARP/ND allows the easy
   detection of duplicated IPs in the network, in a similar way to the
   MAC duplication function supported by [RFC7432] for MAC addresses.

   Duplicate IP detection monitors "IP-moves" in the Proxy-ARP/ND table
   in the following way:

   o When an existing active IP1->MAC1 entry is modified, a PE starts an
     M-second timer (default value of M=180), and if it detects N IP
     moves before the timer expires (default value of N=5), it concludes
     that a duplicate IP situation has occurred. An IP move occurs
     when, for instance, IP1->MAC1 is replaced by IP1->MAC2 in the
     Proxy-ARP/ND table. Static IP->MAC entries, that is, locally
     provisioned or EVPN-learned entries (with I=1 in the ARP/ND
     extended community), are not subject to this procedure. Static
     entries MUST NOT be overridden by dynamic Proxy-ARP/ND entries.

   o In order to detect the duplicate IP faster, the PE MAY send a
     CONFIRM message to the former owner of the IP. A CONFIRM message is
     a unicast ARP-Request/NS message sent by the PE to the MAC
     addresses that previously owned the IP, when the MAC changes in the
     Proxy-ARP/ND table. The CONFIRM message uses a sender's IP 0.0.0.0
     in case of ARP (if the PE has an IP address in the subnet then it
     MAY use it) and an IPv6 Link Local Address in case of NS.
     If the PE does not receive an answer within a given timer, the new
     entry will be confirmed and activated. In case of spoofing, for
     instance, if IP1->MAC1 moves to IP1->MAC2, the PE may send a
     unicast ARP-Request/NS message for IP1 with MAC DA = MAC1 and
     MAC SA = PE's MAC. This will force the legitimate owner to respond
     if the move to MAC2 was spoofed, and make the PE issue another
     CONFIRM message, this time to MAC DA = MAC2. If both the
     legitimate owner and the spoofer keep replying to the CONFIRM
     message, the PE will detect the duplicate IP within the M timer:

     - If the IP1->MAC1 pair was previously owned by the spoofer and the
       new IP1->MAC2 was from a valid CE, then the issued CONFIRM
       message would trigger a response from the spoofer.

     - If it were the other way around, that is, IP1->MAC1 was
       previously owned by a valid CE, the CONFIRM message would trigger
       a response from the CE.

     Either way, if this process continues, then duplicate detection
     will kick in.

   o Upon detecting a duplicate IP situation:

     a) The entry in duplicate detected state cannot be updated with new
        dynamic or EVPN-learned entries for the same IP. The operator
        MAY, however, override the entry with a static IP->MAC.

     b) The PE SHOULD alert the operator and stop responding to
        ARP-Requests/NS for the duplicate IP until a corrective action
        is taken.

     c) Optionally the PE MAY associate an "anti-spoofing-mac" (AS-MAC)
        with the duplicate IP. The PE will send a GARP/unsolicited-NA
        message with IP1->AS-MAC to the local CEs as well as an RT2
        (with IP1->AS-MAC) to the remote PEs. This will force all the
        CEs in the EVI to use the AS-MAC as MAC DA for IP1, and prevent
        the spoofer from attracting any traffic for IP1. Since the
        AS-MAC is a managed MAC address known by all the PEs in the EVI,
        all the PEs MAY apply filters to drop and/or log any frame with
        MAC DA = AS-MAC.
        The advertisement of the AS-MAC as a "black-hole MAC" that can
        be used directly in the BD to drop frames is for further study.

   o The duplicate IP situation will be cleared when a corrective action
     is taken by the operator, or alternatively after a HOLD-DOWN timer
     (default value of 540 seconds).

   The values of M, N and the HOLD-DOWN timer SHOULD be configurable
   administrative options to allow for the required flexibility in
   different scenarios.

   For Proxy-ND, Duplicate IP Detection SHOULD only monitor IP moves for
   IP->MACs learned from NA messages with O-bit=1. NA messages with
   O-bit=0 would not override the ND cache entries for an existing IP.
   Duplicate IP Detection for IPv6 SHOULD be disabled when IPv6
   'anycast' is activated in a given EVI.

5. Solution Benefits

   The solution described in this document provides the following
   benefits:

   a) The solution may completely suppress the flooding of the ARP/ND
      and unknown-unicast messages in the EVPN network, in cases where
      all the CE IP->MAC addresses local to the PEs are known and
      provisioned on the PEs from a management system.

   b) The solution significantly reduces the flooding of the ARP/ND
      messages in the EVPN network, in cases where some or all the CE
      IP->MAC addresses are learned on the data plane by snooping ARP/ND
      messages issued by the CEs.

   c) The solution reduces the control plane overhead and unnecessary
      BGP MAC/IP Advertisements and Withdrawals in a network with active
      CEs that do not send packets frequently.

   d) The solution provides a mechanism to detect duplicate IP addresses
      and avoid ARP/ND-spoof attacks or the effects of duplicate
      addresses due to human errors.

6.
Deployment Scenarios

   Four deployment scenarios with different levels of ARP/ND control are
   available to operators using this solution, depending on their
   requirements to manage ARP/ND: all dynamic learning, all dynamic
   learning with Proxy-ARP/ND, hybrid dynamic learning and static
   provisioning with Proxy-ARP/ND, and all static provisioning with
   Proxy-ARP/ND.

6.1. All Dynamic Learning

   In this scenario for minimum security and mitigation, EVPN is
   deployed in the peering network with the Proxy-ARP/ND function
   shut down. PEs do not intercept ARP/ND requests and flood all
   requests, as in a conventional layer-2 network. While no ARP/ND
   mitigation is used in this scenario, the IXP can still take advantage
   of EVPN features such as control plane learning and all-active
   multihoming in the peering network. Existing mitigation solutions,
   such as the ARP-Sponge daemon [ARP-Sponge], MAY also be used in this
   scenario.

   Although this option does not require any of the procedures described
   in this document, it is added as a baseline/default option for
   completeness. This option is equivalent to VPLS as far as ARP/ND is
   concerned. The options described in 6.2, 6.3 and 6.4 are only
   possible in EVPN networks in combination with their Proxy-ARP/ND
   capabilities.

6.2. Dynamic Learning with Proxy-ARP/ND

   This scenario minimizes flooding while enabling dynamic learning of
   IP->MAC entries. The Proxy-ARP/ND function is enabled in the BDs of
   the EVPN PEs, so that the PEs intercept and respond to CE requests.

   The solution MAY further reduce the flooding of the ARP/ND messages
   in the EVPN network by snooping ARP/ND messages issued by the CEs.

   PEs will flood requests if the entry is not in their Proxy table. Any
   unknown source MAC->IP entries will be learnt and advertised in EVPN,
   and traffic to unknown entries is discarded at the ingress PE.

6.3.
Hybrid Dynamic Learning and Static Provisioning with Proxy-ARP/ND

   Some IXPs want to protect particular hosts on the peering network
   while allowing dynamic learning of peering router addresses. For
   example, an IXP may want to configure static MAC->IP entries for
   management and infrastructure hosts that provide critical services.
   In this scenario, static entries are provisioned from the management
   plane for protected MAC->IP addresses, and dynamic learning with
   Proxy-ARP/ND is enabled as described in section 6.2 on the peering
   network.

6.4. All Static Provisioning with Proxy-ARP/ND

   For a solution that maximizes security and eliminates flooding and
   unknown unicast in the peering network, all MAC-IP entries are
   provisioned from the management plane. The Proxy-ARP/ND function is
   enabled in the BDs of the EVPN PEs, so that the PEs intercept and
   respond to CE requests. Dynamic learning and ARP/ND snooping are
   disabled so that traffic to unknown entries is discarded at the
   ingress PE. This scenario provides an IXP the most control over
   MAC->IP entries and allows an IXP to manage all entries from a
   management system.

6.5. Deployment Scenarios in IXPs

   Nowadays, almost all IXPs have installed some security rules in order
   to protect the IXP-LAN. These rules are often called port security.
   Port security summarizes different operational steps that limit
   access to the IXP-LAN and to the customer router, and that control
   the kind of traffic that the routers are allowed to exchange (e.g.,
   Ethernet, IPv4, IPv6). Due to this, the deployment scenario described
   in 6.4, "All Static Provisioning with Proxy-ARP/ND", is the
   predominant scenario for IXPs.

   In addition to the "All Static Provisioning" behavior, in IXP
   networks it is recommended to configure the Reply Sub-Function to
   'discard' ARP-Requests/NS messages with unrecognized options.
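
   As an informal illustration only, the four scenarios in 6.1 through
   6.4 can be viewed as option profiles applied per BD. The sketch below
   is a minimal Python model under stated assumptions: the names
   ScenarioProfile, PROFILES and handle_request, and the returned action
   strings, are hypothetical and are not defined by this document or by
   any implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioProfile:
    """Hypothetical per-BD option set; field names are illustrative only."""
    proxy_enabled: bool        # Proxy-ARP/ND function enabled or shut down
    dynamic_learning: bool     # IP->MAC learned by snooping CE ARP/ND
    static_provisioning: bool  # IP->MAC provisioned from management
    flood_unknown: bool        # flood requests that miss the proxy table


# One profile per deployment scenario (keys are the section numbers).
PROFILES = {
    "6.1": ScenarioProfile(False, False, False, True),  # proxy shut down
    "6.2": ScenarioProfile(True, True, False, True),
    "6.3": ScenarioProfile(True, True, True, True),
    "6.4": ScenarioProfile(True, False, True, False),
}


def handle_request(profile, proxy_table, target_ip):
    """Return the assumed PE action for an ARP-Request/NS for target_ip."""
    if not profile.proxy_enabled:
        return "flood"      # behaves like a conventional layer-2 network
    if target_ip in proxy_table:
        return "reply"      # the PE answers on behalf of the owner
    # Lookup failed: flood or suppress, depending on the profile.
    return "flood" if profile.flood_unknown else "suppress"
```

   With a table holding a single entry, a request for that IP is
   flooded under the 6.1 profile and answered locally under 6.2, while
   a request for an unknown IP is flooded under 6.3 and suppressed
   under 6.4.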

   At IXPs, customers usually follow a certain operational life-cycle.
   For each step of the operational life-cycle, specific operational
   procedures are executed.

   The following describes the operational procedures that are needed to
   guarantee port security throughout the life-cycle of a customer, with
   a focus on EVPN features:

   1. A new customer is connected for the first time to the IXP:

      Before the connection between the customer router and the IXP-LAN
      is activated, the MAC of the router is white-listed on the IXP's
      switch port. All other MAC addresses are blocked. Pre-defined IPv4
      and IPv6 addresses of the IXP's peering network space are
      configured at the customer router. The IP->MAC static entries
      (IPv4 and IPv6) are configured in the management system of the IXP
      for the customer's port in order to support Proxy-ARP/ND.

      In case a customer uses multiple ports aggregated to a single
      logical port (LAG), some vendors randomly select the MAC address
      of the LAG from the different MAC addresses assigned to the ports.
      In this case, the static entry will be associated with a list of
      allowed MACs.

   2. Replacement of customer router:

      If a customer router is about to be replaced, the new MAC
      address(es) must be installed in the management system alongside
      the MAC address(es) of the currently connected router. This allows
      the customer to replace the router without any active involvement
      of the IXP operator. For this, static entries are also used. After
      the replacement takes place, the MAC address(es) of the replaced
      router can be removed.

   3. Decommissioning a customer router:

      If a customer router is decommissioned, the router is disconnected
      from the IXP PE. Right after that, the MAC address(es) of the
      router and IP->MAC bindings can be removed from the management
      system.
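
   The three life-cycle steps above can be sketched as operations on
   the IXP management system. This is a minimal Python model, assuming
   one static entry per customer port that carries the provisioned IPs
   and a list of allowed MACs. The class and method names (StaticEntry,
   IxpManagement, connect, begin_replacement, finish_replacement,
   decommission, is_allowed) are hypothetical and chosen only for
   illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StaticEntry:
    """Static Proxy-ARP/ND binding for one customer port (hypothetical)."""
    port: str
    ips: set                                   # IPv4/IPv6 peering addresses
    allowed_macs: set = field(default_factory=set)


class IxpManagement:
    """Toy management system holding static IP->MAC entries per port."""

    def __init__(self):
        self.entries = {}

    def connect(self, port, ips, macs):
        # Step 1: white-list the MAC(s) and provision the static entries.
        self.entries[port] = StaticEntry(port, set(ips), set(macs))

    def begin_replacement(self, port, new_macs):
        # Step 2: install the new MAC(s) alongside the current ones.
        self.entries[port].allowed_macs |= set(new_macs)

    def finish_replacement(self, port, old_macs):
        # Step 2 (after the swap): remove the replaced router's MAC(s).
        self.entries[port].allowed_macs -= set(old_macs)

    def decommission(self, port):
        # Step 3: remove the MAC(s) and the IP->MAC bindings.
        self.entries.pop(port, None)

    def is_allowed(self, port, mac):
        # All MAC addresses not on the port's list are blocked.
        entry = self.entries.get(port)
        return entry is not None and mac in entry.allowed_macs
```

   In this model a LAG whose MAC may be any of its member MACs is
   handled by passing all member MACs to connect(), and a router
   replacement needs no operator action between begin_replacement() and
   finish_replacement().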

6.6. Deployment Scenarios in DCs

   DCs normally have different requirements than IXPs in terms of
   Proxy-ARP/ND. Some differences are listed below:

   a) The required mobility in virtualized DCs makes the "Dynamic
      Learning" or "Hybrid Dynamic and Static Provisioning" models more
      appropriate than the "All Static Provisioning" model.

   b) IPv6 'anycast' may be required in DCs, while it is not a
      requirement in IXP networks. Therefore, if the DC needs IPv6
      'anycast', it will be explicitly enabled in the Proxy-ND function
      and the Proxy-ND sub-functions will be modified accordingly. For
      instance, if IPv6 'anycast' is enabled in the Proxy-ND function,
      Duplicate IP Detection must be disabled.

   c) DCs may require special options on ARP/ND as opposed to the
      Address Resolution function, which is the only one typically
      required in IXPs. Based on that, the Reply Sub-function may be
      modified to forward or discard unknown options.

7. Security Considerations

   The procedures in this document reduce the amount of ARP/ND message
   flooding, which in itself provides protection to "slow path"
   software processors of routers and Tenant Systems in large BDs. The
   ARP/ND requests that are replied to by the Proxy-ARP/ND function
   (hence not flooded) are normally targeted to existing hosts in the
   BD. ARP/ND requests targeted to absent hosts are still normally
   flooded; however, the suppression of Unknown ARP-Requests and NS
   messages described in Section 4.5 can provide an additional level of
   security against ARP-Requests/NS messages issued to non-existing
   hosts.

   The solution also provides protection against Denial Of Service
   attacks that use ARP/ND-spoofing as a first step. The Duplicate IP
   Detection and the use of an AS-MAC as explained in Section 4.6
   protect the BD against ARP/ND spoofing.
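
   To make the Duplicate IP Detection window of Section 4.6 concrete,
   the following is a minimal Python sketch, not a definitive
   implementation. It uses the default values M=180, N=5 and HOLD-DOWN
   of 540 seconds. It assumes that the move that starts the M-second
   timer counts as the first of the N moves, and that only dynamic
   entries are passed in (static entries are not subject to the
   procedure). The class name DuplicateIpDetector, the method learn and
   the returned state strings are hypothetical. CONFIRM messages and
   the AS-MAC are not modeled.

```python
M_SECONDS = 180   # detection window (default M)
N_MOVES = 5       # IP moves within the window that signal a duplicate
HOLD_DOWN = 540   # seconds after which the duplicate state is cleared


class DuplicateIpDetector:
    """Counts IP moves per IP for dynamic Proxy-ARP/ND entries only."""

    def __init__(self, m=M_SECONDS, n=N_MOVES, hold_down=HOLD_DOWN):
        self.m, self.n, self.hold_down = m, n, hold_down
        self.table = {}      # ip -> mac currently in the proxy table
        self.window = {}     # ip -> (window start time, moves counted)
        self.duplicate = {}  # ip -> time the duplicate was declared

    def learn(self, ip, mac, now):
        """Process a dynamically learned IP->MAC; return the entry state."""
        declared = self.duplicate.get(ip)
        if declared is not None:
            if now - declared < self.hold_down:
                return "duplicate"         # frozen: no dynamic updates
            del self.duplicate[ip]         # HOLD-DOWN timer expired
            self.window.pop(ip, None)

        old = self.table.get(ip)
        self.table[ip] = mac
        if old is None or old == mac:
            return "active"                # new entry or refresh, no move

        # An IP move: IP->old is replaced by IP->mac.
        start, count = self.window.get(ip, (now, 0))
        if now - start >= self.m:
            start, count = now, 0          # previous window expired
        count += 1
        self.window[ip] = (start, count)
        if count >= self.n:
            self.duplicate[ip] = now
            return "duplicate"
        return "active"
```

   For example, if an IP flips between two MACs every 10 seconds, the
   fifth move falls inside the 180-second window and the entry enters
   the duplicate state; if the same flips are 200 seconds apart, each
   one starts a new window and no duplicate is declared.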

   When EVPN and its associated Proxy-ARP/ND function are used in IXP
   networks, they provide ARP/ND security and mitigation. IXPs MUST
   still employ additional security mechanisms that protect the peering
   network and SHOULD follow established BCPs such as the ones described
   in [Euro-IX BCP].

   For example, IXPs should disable all unneeded control protocols, and
   block unwanted protocols from CEs so that only IPv4, ARP and IPv6
   Ethertypes are permitted on the peering network. In addition, port
   security features and ACLs can provide an additional level of
   security.

8. IANA Considerations

   No IANA considerations.

9. References

9.1. Normative References

   [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
   Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
   VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015.

   [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
   "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, DOI
   10.17487/RFC4861, September 2007.

   [RFC0826] Plummer, D., "Ethernet Address Resolution Protocol: Or
   Converting Network Protocol Addresses to 48.bit Ethernet Address for
   Transmission on Ethernet Hardware", STD 37, RFC 826, DOI
   10.17487/RFC0826, November 1982.

   [RFC6820] Narten, T., Karir, M., and I. Foo, "Address Resolution
   Problems in Large Data Center Networks", RFC 6820, DOI
   10.17487/RFC6820, January 2013.

   [RFC7342] Dunbar, L., Kumari, W., and I. Gashinsky, "Practices for
   Scaling ARP and Neighbor Discovery (ND) in Large Data Centers",
   RFC 7342, DOI 10.17487/RFC7342, August 2014.

   [RFC3971] Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
   "SEcure Neighbor Discovery (SEND)", RFC 3971, DOI 10.17487/RFC3971,
   March 2005.

   [RFC5227] Cheshire, S., "IPv4 Address Conflict Detection", RFC 5227,
   DOI 10.17487/RFC5227, July 2008.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March
   1997.

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC2119
   Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

9.2. Informative References

   [ARP-Sponge] Wessel M. and Sijm N., Universiteit van Amsterdam,
   "Effects of IPv4 and IPv6 address resolution on AMS-IX and the ARP
   Sponge", July 2009.

   [EVPN-ARP-ND-FLAGS] Sathappan S., Nagaraj K. and Rabadan J.,
   "Propagation of ARP/ND Flags in EVPN",
   draft-ietf-bess-evpn-na-flags-04, Work in Progress, July 2019.

   [Euro-IX BCP] https://www.euro-ix.net/pages/28/1/bcp_ixp.html

10. Acknowledgments

   The authors want to thank Ranganathan Boovaraghavan, Sriram
   Venkateswaran, Manish Krishnan, Seshagiri Venugopal, Tony Przygienda,
   Robert Raszuk and Iftekhar Hussain for their review and
   contributions. Thank you to Oliver Knapp as well, for his detailed
   review.

11. Contributors

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Wim Henderickx
   Nokia

   Daniel Melzer
   DE-CIX Management GmbH

   Erik Nordmark
   Zededa

Authors' Addresses

   Jorge Rabadan (Editor)
   Nokia
   777 E. Middlefield Road
   Mountain View, CA 94043 USA
   Email: jorge.rabadan@nokia.com

   Senthil Sathappan
   Nokia
   Email: senthil.sathappan@nokia.com

   Kiran Nagaraj
   Nokia
   Email: kiran.nagaraj@nokia.com

   Greg Hankins
   Nokia
   Email: greg.hankins@nokia.com

   Thomas King
   DE-CIX Management GmbH
   Lichtstrasse 43i, Cologne 50825, Germany
   Email: thomas.king@de-cix.net