BESS Workgroup                                           J. Rabadan, Ed.
Internet-Draft                                              S. Sathappan
Updates: 7432 (if approved)                                   K. Nagaraj
Intended status: Standards Track                              G. Hankins
Expires: July 8, 2021                                              Nokia
                                                                 T. King
                                                                  DE-CIX
                                                         January 4, 2021


          Operational Aspects of Proxy-ARP/ND in EVPN Networks
                  draft-ietf-bess-evpn-proxy-arp-nd-10

Abstract

   This document describes the EVPN Proxy-ARP/ND function, augmented by
   the capability of the ARP/ND Extended Community. Together, these
   help operators of Internet Exchange Points (IXPs), Data Centers
   (DCs), and other networks deal with IPv4 and IPv6 address resolution
   issues associated with large Broadcast Domains (BDs) by reducing and
   even suppressing the flooding produced by address resolution in the
   EVPN network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 8, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Terminology
   2.  Introduction
     2.1.  The DC Use-Case
     2.2.  The IXP Use-Case
   3.  Solution Description
     3.1.  Learning Sub-Function
       3.1.1.  Proxy-ND and the NA Flags
     3.2.  Reply Sub-Function
     3.3.  Unicast-forward Sub-Function
     3.4.  Maintenance Sub-Function
     3.5.  Flooding (to Remote PEs) Reduction/Suppression
     3.6.  Duplicate IP Detection
   4.  Solution Benefits
   5.  Deployment Scenarios
     5.1.  All Dynamic Learning
     5.2.  Dynamic Learning with Proxy-ARP/ND
     5.3.  Hybrid Dynamic Learning and Static Provisioning with
           Proxy-ARP/ND
     5.4.  All Static Provisioning with Proxy-ARP/ND
     5.5.  Deployment Scenarios in IXPs
     5.6.  Deployment Scenarios in DCs
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Contributors
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   BUM: Broadcast, Unknown unicast and Multicast layer-2 traffic.

   BD: Broadcast Domain.

   ARP: Address Resolution Protocol.

   GARP: Gratuitous ARP message.

   ND: Neighbor Discovery Protocol.

   NS: Neighbor Solicitation message.

   NA: Neighbor Advertisement.

   IXP: Internet eXchange Point.

   IXP-LAN: the IXP's large Broadcast Domain to which Internet routers
   are connected.

   DC: Data Center.

   IP->MAC: an IP address associated with a MAC address. IP->MAC
   entries are programmed in Proxy-ARP/ND tables and may be of three
   different types: dynamic, static or EVPN-learned.

   SN-multicast address: Solicited-Node IPv6 multicast address used by
   NS messages.

   NUD: Neighbor Unreachability Detection, as per [RFC4861].

   DAD: Duplicate Address Detection, as per [RFC4861].

   SLLA: Source Link Layer Address, as per [RFC4861].

   TLLA: Target Link Layer Address, as per [RFC4861].

   R Flag: Router Flag in NA messages, as per [RFC4861].

   O Flag: Override Flag in NA messages, as per [RFC4861].

   S Flag: Solicited Flag in NA messages, as per [RFC4861].

   RT2: EVPN Route type 2 or EVPN MAC/IP Advertisement route, as per
   [RFC7432].

   MAC or IP DA: MAC or IP Destination Address.

   MAC or IP SA: MAC or IP Source Address.

   AS-MAC: Anti-spoofing MAC.

   LAG: Link Aggregation Group.

   This document assumes familiarity with the terminology used in
   [RFC7432].

2.  Introduction

   As specified in [RFC7432] the IP Address field in the EVPN MAC/IP
   Advertisement route may optionally carry one of the IP addresses
   associated with the MAC address.
   A PE may learn local IP->MAC pairs and advertise them in EVPN MAC/IP
   Advertisement routes. Remote PEs importing those routes in the same
   Broadcast Domain (BD) may add those IP->MAC pairs to their
   Proxy-ARP/ND tables and reply to local ARP requests or Neighbor
   Solicitations (or 'unicast-forward' those packets to the owner MAC),
   reducing, and in some cases even suppressing, the flooding in the
   EVPN network.

   EVPN and its associated Proxy-ARP/ND function are extremely useful in
   Data Centers (DCs) or Internet Exchange Points (IXPs) with large
   broadcast domains, where the amount of ARP/ND flooded traffic causes
   issues on connected routers and CEs. [RFC6820] describes the Address
   Resolution problems in Large Data Center networks.

   This document describes the Proxy-ARP/ND function in [RFC7432]
   networks, augmented by the capability of the ARP/ND Extended
   Community [I-D.ietf-bess-evpn-na-flags].

   Proxy-ARP/ND may be implemented to help IXPs, DCs and other operators
   deal with the issues derived from Address Resolution in large
   broadcast domains.

2.1.  The DC Use-Case

   As described in [RFC6820] the IPv4 and IPv6 Address Resolution can
   create a lot of issues in large DCs. In particular, the issues
   created by the IPv4 Address Resolution Protocol procedures may be
   significant.

   On the one hand, ARP Requests use broadcast MAC addresses, therefore
   any Tenant System in a large Broadcast Domain will see a large amount
   of ARP traffic, most of which is not addressed to it.

   On the other hand, the flooding issue becomes even worse if some
   Tenant Systems disappear from the broadcast domain, since some
   implementations will persistently retry sending ARP Requests. As
   [RFC6820] states, there are no clear requirements for retransmitting
   ARP Requests in the absence of replies, hence an implementation may
   choose to keep retrying endlessly even if there are no replies.

   The amount of flooding that Address Resolution creates can be
   mitigated by the use of EVPN and its Proxy-ARP/ND function.

2.2.  The IXP Use-Case

   The implementation described in this document is especially useful in
   IXP networks.

   A typical IXP provides access to a large layer-2 peering network,
   where (hundreds of) Internet routers are connected. Because of the
   requirement to connect all routers to a single layer-2 network, the
   peering networks use IPv4 layer-3 addresses with prefix lengths
   ranging from /21 to /24 (and even larger subnets for IPv6), which can
   create very large broadcast domains. This peering network is
   transparent to the Customer Edge (CE) devices, and therefore floods
   any ARP Request or NS message to all the CEs in the network.
   Unsolicited GARP and NA messages are flooded to all the CEs too.

   In these IXP networks, most of the CEs are typically peering routers
   and nearly all the BUM traffic is originated by the ARP and ND
   address resolution procedures. This ARP/ND BUM traffic causes
   significant data volumes that reach every single router in the
   peering network. Since the ARP/ND messages are processed in "slow
   path" software processors and they take high priority in the routers,
   heavy loads of ARP/ND traffic can cause some routers to run out of
   resources. CEs disappearing from the network may cause Address
   Resolution explosions that can make a router with limited processing
   power fail to keep BGP sessions running.

   The issue may be less severe for IPv6 routers, since ND uses the
   SN-multicast address in NS messages; however, ARP uses broadcast and
   has to be processed by all the routers in the network. Some routers
   may also be configured to broadcast periodic GARPs [RFC5227]. The
   amount of ARP/ND flooded traffic grows exponentially with the number
   of IXP participants, therefore the issue can only get worse as new
   CEs are added.

   In order to deal with this issue, IXPs have developed certain
   solutions over the past years. One example is the ARP-Sponge daemon
   [ARP-Sponge], which can significantly reduce the amount of ARP
   messages sent to an absent router. While these solutions may
   mitigate the issues of Address Resolution in large broadcast
   domains, EVPN provides new, more efficient possibilities to IXPs.
   EVPN and its Proxy-ARP/ND function may help solve the issue in a
   distributed and scalable way, fully integrated with the PE network.

3.  Solution Description

   Figure 1 illustrates an example EVPN network where the Proxy-ARP/ND
   function is enabled.

                                                         BD1
                                                     Proxy-ARP/ND
                                                    +------------+
   IP1/M1          +----------------------------+   |IP1->M1 EVPN|
   GARP --->Proxy-ARP/ND                        |   |IP2->M2 EVPN|
   +---+      +----+---+   RT2(IP1/M1)          |   |IP3->M3 sta |
   |CE1+------+  BD1   |     ------>     +------+---|IP4->M4 dyn |
   +---+      +--------+                 |          +------------+
               PE1 |                        +--------+ Who has IP1?
                   |       EVPN          |  |  BD1   | <-----  +---+
                   |       EVI1          |  |        | ----->  |CE3|
   IP2/M2          |                     |  |        | IP1->M1 +---+
   GARP --->Proxy-ARP/ND                 |  +--------+  |     IP3/M3
   +---+    +--------+   RT2(IP2/M2)     |              |
   |CE2+----+  BD1   |     ------>       +--------------+
   +---+    +--------+                       PE3|    +---+
              PE2  |                            +----+CE4|
                   +----------------------------+    +---+
                                                  <---IP4/M4 GARP

                 Figure 1: Proxy-ARP/ND network example

   When the Proxy-ARP/ND function is enabled in a BD (Broadcast Domain)
   of the EVPN PEs, each PE creates a Proxy table specific to that BD
   that can contain three types of Proxy-ARP/ND entries:

   a.  Dynamic entries: learned by snooping the CEs' ARP and ND
       messages. For instance, IP4->M4 in Figure 1.

   b.  Static entries: provisioned on the PE by the management system.
       For instance, IP3->M3 in Figure 1.

   c.  EVPN-learned entries: learned from the IP/MAC information encoded
       in the received RT2s coming from remote PEs. For instance,
       IP1->M1 and IP2->M2 in Figure 1.

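   The three entry types can be pictured as a small per-BD lookup
   table. The following is only an illustrative sketch, not part of the
   specification: the names EntryType, ProxyEntry and ProxyTable are
   hypothetical, and the table is populated with the entries that PE3
   holds for BD1 in Figure 1.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class EntryType(Enum):
    """Hypothetical labels for the three Proxy-ARP/ND entry types."""
    DYNAMIC = "dyn"   # snooped from the CEs' ARP/ND messages
    STATIC = "sta"    # provisioned by the management system
    EVPN = "EVPN"     # learned from RT2s received from remote PEs


@dataclass(frozen=True)
class ProxyEntry:
    ip: str
    mac: str
    kind: EntryType


class ProxyTable:
    """Hypothetical per-BD Proxy-ARP/ND table keyed by IP address."""

    def __init__(self, bd: str) -> None:
        self.bd = bd
        self._entries: Dict[str, ProxyEntry] = {}

    def add(self, ip: str, mac: str, kind: EntryType) -> None:
        # One binding per IP in this simplified model.
        self._entries[ip] = ProxyEntry(ip, mac, kind)

    def lookup(self, ip: str) -> Optional[ProxyEntry]:
        # Returns None on a miss; the caller then decides whether to flood.
        return self._entries.get(ip)


# PE3's table for BD1, mirroring the box shown in Figure 1.
pe3_bd1 = ProxyTable("BD1")
pe3_bd1.add("IP1", "M1", EntryType.EVPN)
pe3_bd1.add("IP2", "M2", EntryType.EVPN)
pe3_bd1.add("IP3", "M3", EntryType.STATIC)
pe3_bd1.add("IP4", "M4", EntryType.DYNAMIC)
```

   A lookup for IP1 on this table returns the EVPN-learned binding to
   M1, which is what lets PE3 answer CE3 without flooding. A real
   implementation would also have to track the attachment circuit, the
   NA flags and the entry age, which this sketch leaves out.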
   As a high level example, the operation of the EVPN Proxy-ARP/ND
   function in the network of Figure 1 is described below. In this
   example we assume IP1, IP2 and IP3 are IPv4 addresses:

   1.  Proxy-ARP/ND is enabled in BD1 of PE1, PE2 and PE3.

   2.  The PEs start adding dynamic, static and EVPN-learned entries to
       their Proxy tables:

       A.  PE3 adds IP1->M1 and IP2->M2 based on the EVPN routes
           received from PE1 and PE2. Those entries were previously
           learned as dynamic entries in PE1 and PE2 respectively, and
           advertised in BGP EVPN.
       B.  PE3 adds IP4->M4 as dynamic. This entry is learned by
           snooping the corresponding ARP messages sent by CE4.
       C.  An operator also provisions the static entry IP3->M3.

   3.  When CE3 sends an ARP Request asking for the MAC address of IP1,
       PE3 will:

       A.  Intercept the ARP Request and perform a Proxy-ARP lookup for
           IP1.
       B.  If the lookup is successful (as in Figure 1), PE3 will send
           an ARP Reply with IP1->M1. The ARP Request will not be
           flooded to the EVPN network or any other local CEs.
       C.  If the lookup is not successful, PE3 will flood the ARP
           Request in the EVPN network and the other local CEs.

   As PE3 learns more and more host entries in the Proxy-ARP/ND table,
   the flooding of ARP Request messages is reduced and in some cases it
   can even be suppressed. In a network where most of the participant
   CEs are not moving between PEs and they advertise their presence with
   GARPs or unsolicited NA messages, the ARP/ND flooding as well as the
   unknown unicast flooding can practically be suppressed. In an EVPN-
   based IXP network, where all the entries are Static, the ARP/ND
   flooding is in fact totally suppressed.

   The Proxy-ARP/ND function can be structured in six sub-functions or
   procedures:

   1.  Learning sub-function

   2.  Reply sub-function

   3.  Unicast-forward sub-function

   4.  Maintenance sub-function

   5.  Flooding reduction/suppression sub-function

   6.  Duplicate IP detection sub-function

   A Proxy-ARP/ND implementation MAY support all those sub-functions or
   only a subset of them. The following sections describe each
   individual sub-function.

3.1.  Learning Sub-Function

   A Proxy-ARP/ND implementation SHOULD support static, dynamic and
   EVPN-learned entries.

   Static entries are provisioned from the management plane. The
   provisioned static IP->MAC entry SHOULD be advertised in EVPN with an
   ARP/ND Extended Community where the Immutable ARP/ND Binding Flag
   (I) is set to 1, as per [I-D.ietf-bess-evpn-na-flags]. When the
   I flag in the ARP/ND Extended Community is 1, the advertising PE
   indicates that the IP address must not be associated with any MAC
   other than the one included in the EVPN MAC/IP Advertisement route.
   The advertisement of I=1 in the ARP/ND Extended Community is
   compatible with any value of the Sticky bit (S) or Sequence Number in
   the [RFC7432] MAC Mobility Extended Community. Note that the I bit
   in the ARP/ND Extended Community refers to the immutable configured
   association between the IP and the MAC address in the IP->MAC
   binding, whereas the S bit in the MAC Mobility Extended Community
   refers to the fact that the advertised MAC address is not subject to
   the [RFC7432] mobility procedures.

   An entry MAY associate a configured static IP to a list of potential
   MACs, i.e. IP1->(MAC1,MAC2..MACN). When there is more than one MAC
   in the list of allowed MACs, the PE will not advertise any IP->MAC in
   EVPN until a local ARP/NA message or any other frame is received from
   the CE. Upon receiving traffic from the CE, the PE will check that
   the source MAC is included in the list of allowed MACs. Only in that
   case, the PE will activate the IP->MAC and advertise it in EVPN.

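   The activation rule for a static IP with a list of allowed MACs can
   be sketched as follows. This is a minimal illustration under stated
   assumptions, not a normative procedure: the class StaticBinding and
   its method on_frame_from_ce are hypothetical names, and the sketch
   assumes that an entry with a single allowed MAC is active as soon as
   it is provisioned.

```python
from typing import FrozenSet, Iterable, Optional


class StaticBinding:
    """Hypothetical static entry: one IP and its list of allowed MACs."""

    def __init__(self, ip: str, allowed_macs: Iterable[str]) -> None:
        self.ip = ip
        self.allowed_macs: FrozenSet[str] = frozenset(allowed_macs)
        if not self.allowed_macs:
            raise ValueError("a static entry needs at least one MAC")
        # Assumption: with exactly one allowed MAC there is nothing to
        # disambiguate, so the binding is active from the start.
        self.active_mac: Optional[str] = (
            next(iter(self.allowed_macs))
            if len(self.allowed_macs) == 1
            else None
        )

    def on_frame_from_ce(self, source_mac: str) -> bool:
        """Return True when this frame activates the binding.

        A True result is the point at which the PE would advertise
        the IP->MAC in EVPN.
        """
        if self.active_mac is not None:
            return False          # already active, nothing new to advertise
        if source_mac not in self.allowed_macs:
            return False          # source MAC is not in the allowed list
        self.active_mac = source_mac
        return True


binding = StaticBinding("IP1", ["MAC1", "MAC2"])
ignored = binding.on_frame_from_ce("MAC9")     # not allowed, stays inactive
activated = binding.on_frame_from_ce("MAC2")   # allowed, binding activates
```

   Keeping the binding inactive until an allowed source MAC is seen
   means that nothing is advertised in EVPN for an IP whose owner has
   not yet appeared behind the PE.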
   EVPN-learned entries MUST be learned from received valid EVPN MAC/IP
   Advertisement routes containing a MAC and IP address.

   Dynamic entries are learned in different ways depending on whether
   the entry contains an IPv4 or IPv6 address:

   a.  Proxy-ARP dynamic entries:

       They SHOULD be learned by snooping any ARP packet (Ethertype
       0x0806) received from the CEs attached to the BD. The
       Learning function will add the Sender MAC and Sender IP of the
       snooped ARP packet to the Proxy-ARP table. Note that MAC and
       IPs with value 0 SHOULD NOT be learned.

   b.  Proxy-ND dynamic entries:

       They SHOULD be learned out of the Target Address and TLLA
       information in NA messages (Ethertype 0x86DD, ICMPv6 type 136)
       received from the CEs attached to the BD. A Proxy-ND
       implementation SHOULD NOT learn IP->MAC entries from NS
       messages, since they don't contain the R Flag required by the
       Proxy-ND reply function. See Section 3.1.1 for more
       information about the R Flag.

       Note that if the O Flag is zero in the received NA message,
       the IP->MAC SHOULD only be learned in case IPv6 'anycast' is
       enabled in the EVI.

   The following procedure associated to the Learning sub-function is
   RECOMMENDED:

   o  When a new Proxy-ARP/ND EVPN or static active entry is learned (or
      provisioned), the PE SHOULD send an unsolicited GARP or NA message
      to all the connected access CEs. The PE SHOULD send an
      unsolicited GARP/NA message for dynamic entries only if the ARP/NA
      message creating the entry was NOT flooded before. This
      unsolicited GARP/NA message makes sure the CE ARP/ND caches are
      updated even if the ARP/NS/NA messages from remote CEs are not
      flooded in the EVPN network.

   Note that if a Static entry is provisioned with the same IP as an
   existing EVPN-learned or Dynamic entry, the Static entry takes
   precedence.

3.1.1.  Proxy-ND and the NA Flags

   [RFC4861] describes the use of the R Flag in IPv6 Address Resolution:

   o  Nodes capable of routing IPv6 packets must reply to NS messages
      with NA messages where the R Flag is set (R Flag=1).

   o  Hosts that are not able to route IPv6 packets must indicate that
      inability by replying with NA messages that contain R Flag=0.

   The use of the R Flag in NA messages has an impact on how hosts
   select their default gateways when sending packets off-link:

   o  Hosts build a Default Router List based on the received RAs and
      NAs with R Flag=1. Each cache entry has an IsRouter flag, which
      must be set based on the R Flag in the received NAs. A host can
      choose one or more Default Routers when sending packets off-link.

   o  In those cases where the IsRouter flag changes from TRUE to FALSE
      as a result of a NA update, the node MUST remove that router from
      the Default Router List and update the Destination Cache entries
      for all destinations using that neighbor as a router, as specified
      in [RFC4861] section 7.3.3. This is needed to detect when a node
      that is used as a router stops forwarding packets due to being
      configured as a host.

   The R Flag and O Flag will be learned in the following ways:

   o  Static entries SHOULD have the R Flag information added by the
      management interface. The O Flag information MAY also be added by
      the management interface.

   o  Dynamic entries SHOULD learn the R Flag and MAY learn the O Flag
      from the snooped NA messages used to learn the IP->MAC itself.

   o  EVPN-learned entries SHOULD learn the R Flag and MAY learn the O
      Flag from the ARP/ND Extended Community
      [I-D.ietf-bess-evpn-na-flags] received from EVPN along with the
      RT2 used to learn the IP->MAC itself. If no ARP/ND Extended
      Community is received, the PE will add a configured R Flag/O Flag
      to the entry.
      This configured R Flag SHOULD be an administrative choice with a
      default value of 1.

   Note that the O Flag SHOULD only be learned if 'anycast' is enabled
   in the BD. If so, Duplicate IP Detection must be disabled so that
   the PE is able to learn the same IP mapped to different MACs in the
   same Proxy-ND table. If 'anycast' is disabled, NA messages with O
   Flag = 0 will not create a Proxy-ND entry, hence no EVPN
   advertisement with ARP/ND Extended Community will be generated.

3.2.  Reply Sub-Function

   This sub-function will reply to Address Resolution requests/
   solicitations upon successful lookup in the Proxy-ARP/ND table for a
   given IP address. The following considerations should be taken into
   account:

   a.  When replying to ARP Request or NS messages, the PE SHOULD use
       the Proxy-ARP/ND entry MAC address as MAC SA. This is
       RECOMMENDED so that the resolved MAC can be learned in the MAC
       FIB of potential layer-2 switches sitting between the PE and the
       CE requesting the Address Resolution.

   b.  A PE SHOULD NOT reply to a request/solicitation received on the
       same attachment circuit over which the IP->MAC is learned. In
       this case the requester and the requested IP are assumed to be
       connected to the same layer-2 switch/access network linked to the
       PE's attachment circuit, and therefore the requested IP owner
       will receive the request directly.

   c.  A PE SHOULD reply to broadcast/multicast Address Resolution
       messages, that is, ARP-Request, NS messages as well as DAD NS
       messages. A PE SHOULD NOT reply to unicast Address Resolution
       requests (for instance, NUD NS messages).

   d.  A PE SHOULD include the R Flag learned for the IP->MAC entry in
       the NA messages (see Section 3.1.1). The S Flag will be set/
       unset as per [RFC4861]. The O Flag will be included if IPv6
       'anycast' is enabled in the BD and it is learned for the IP->MAC
       entry.
       If 'anycast' is enabled and there is more than one MAC for a
       given IP, the PE will reply to NS messages with as many NA
       responses as there are 'anycast' entries in the Proxy-ND table.

   e.  A PE SHOULD NOT reply to ARP probes received from the CEs. An
       ARP probe is an ARP request constructed with an all-zero sender
       IP address that may be used by hosts for IPv4 Address Conflict
       Detection [RFC5227].

   f.  A PE SHOULD only reply to ARP-Request and NS messages with the
       format specified in [RFC0826] and [RFC4861] respectively.
       Received ARP-Requests and NS messages with unknown options SHOULD
       be either forwarded (as unicast packets) to the owner of the
       requested IP (assuming the MAC is known in the Proxy-ARP/ND table
       and BD) or discarded. An administrative option to control this
       behavior ('unicast-forward' or 'discard') SHOULD be supported.
       The 'unicast-forward' option is described in Section 3.3.

3.3.  Unicast-forward Sub-Function

   As discussed in Section 3.2, in some cases the operator may want to
   'unicast-forward' certain ARP-Request and NS messages as opposed to
   replying to them. The operator SHOULD be able to activate this
   option with one of the following parameters:

   a.  unicast-forward always

   b.  unicast-forward unknown-options

   If 'unicast-forward always' is enabled, the PE will perform a Proxy-
   ARP/ND table lookup and in case of a hit, the PE will forward the
   packet to the owner of the MAC found in the Proxy-ARP/ND table. This
   is irrespective of the options carried in the ARP/ND packet. This
   option provides total transparency in the BD and yet reduces the
   amount of flooding significantly.

   If 'unicast-forward unknown-options' is enabled, upon a successful
   Proxy-ARP/ND lookup, the PE will perform a 'unicast-forward' action
   only if the ARP-Request or NS messages carry unknown options, as
   explained in Section 3.2.
   The 'unicast-forward unknown-options' configuration allows the
   support of new applications using ARP/ND in the BD while still
   reducing the flooding.

3.4.  Maintenance Sub-Function

   The Proxy-ARP/ND tables SHOULD follow a number of maintenance
   procedures so that the dynamic IP->MAC entries are kept if the owner
   is active and flushed if the owner is no longer in the network. The
   following procedures are RECOMMENDED:

   a.  Age-time

       A dynamic Proxy-ARP/ND entry MUST be flushed out of the table if
       the IP->MAC has not been refreshed within a given age-time. The
       entry is refreshed if an ARP or NA message is received for the
       same IP->MAC entry. The age-time is an administrative option and
       its value should be carefully chosen depending on the specific
       use-case: in IXP networks (where the CE routers are fairly
       static) the age-time may normally be longer than in DC networks
       (where mobility is required).

   b.  Send-refresh option

       The PE MAY send periodic refresh messages (ARP/ND "probes") to
       the owners of the dynamic Proxy-ARP/ND entries, so that the
       entries can be refreshed before they age out. The owner of the
       IP->MAC entry would reply to the ARP/ND probe and the
       corresponding entry age-time would be reset. The periodic
       send-refresh timer is an administrative option and is
       RECOMMENDED to be a third of the age-time or a half of the
       age-time in scaled networks.

       An ARP refresh issued by the PE will be an ARP-Request message
       with the Sender's IP = 0 sent from the PE's MAC SA. If the PE
       has an IP address in the subnet, for instance on an IRB
       interface, then it MAY use it as a source for the ARP request
       (instead of Sender's IP = 0). An ND refresh will be a NS message
       issued from the PE's MAC SA and a Link Local Address associated
       to the PE's MAC.

       The refresh request messages SHOULD be sent only for dynamic
       entries and not for static or EVPN-learned entries.
       Even though the refresh request messages are broadcast or
       multicast, the PE SHOULD only send the message to the attachment
       circuit associated to the MAC in the IP->MAC entry.

   The age-time and send-refresh options are used in EVPN networks to
   avoid unnecessary EVPN RT2 withdrawals: if refresh messages are sent
   before the corresponding BD FIB and Proxy-ARP/ND age-time for a given
   entry expires, inactive but existing hosts will reply, refreshing the
   entry and therefore avoiding unnecessary EVPN MAC/IP Advertisement
   withdrawals in EVPN. Both entries (MAC in the BD and IP->MAC in
   Proxy-ARP/ND) are reset when the owner replies to the ARP/ND probe.
   If there is no response to the ARP/ND probe, the MAC and IP->MAC
   entries will be legitimately flushed and the RT2s withdrawn.

3.5.  Flooding (to Remote PEs) Reduction/Suppression

   The Proxy-ARP/ND function implicitly helps reduce the flooding of
   ARP Request and NS messages to remote PEs in an EVPN network.
   However, in certain use-cases, the flooding of ARP/NS/NA messages
   (and even the unknown unicast flooding) to remote PEs can be
   suppressed completely in an EVPN network.

   For instance, in an IXP network, since all the participant CEs are
   well known and will not move to a different PE, the IP->MAC entries
   may be all provisioned by a management system. Assuming the entries
   for the CEs are all provisioned on the local PE, a given Proxy-ARP/ND
   table will only contain static and EVPN-learned entries. In this
   case, the operator may choose to suppress the flooding of ARP/NS/NA
   to remote PEs completely.

   The flooding may also be suppressed completely in IXP networks with
   dynamic Proxy-ARP/ND entries assuming that all the CEs are directly
   connected to the PEs and they all advertise their presence with a
   GARP/unsolicited-NA when they connect to the network.

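   The suppression choice discussed in this section can be sketched as
   a small policy check. This is only an illustration under stated
   assumptions: FloodPolicy, its two fields and flood_to_remote_pes are
   hypothetical names modelling two independent administrative options
   (one for unknown ARP-Request/NS messages, one for GARP/unsolicited-NA
   messages), and the sketch covers flooding towards remote PEs only.

```python
from dataclasses import dataclass


@dataclass
class FloodPolicy:
    """Hypothetical per-BD options for flooding towards remote PEs."""
    suppress_unknown_requests: bool = False  # unknown ARP-Request / NS
    suppress_unsolicited: bool = False       # GARP / unsolicited NA


def flood_to_remote_pes(message: str, proxy_hit: bool,
                        policy: FloodPolicy) -> bool:
    """Decide whether a snooped message is flooded to remote PEs.

    message is "request" (ARP-Request or NS) or "unsolicited"
    (GARP or unsolicited NA); proxy_hit tells whether the
    Proxy-ARP/ND lookup for the requested IP succeeded.
    """
    if message == "request":
        if proxy_hit:
            # Answered or unicast-forwarded locally, so never flooded.
            return False
        return not policy.suppress_unknown_requests
    if message == "unsolicited":
        return not policy.suppress_unsolicited
    raise ValueError(f"unexpected message class: {message!r}")


# An IXP with all entries provisioned may suppress both classes,
# while a DC with fast mobility would typically leave both allowed.
ixp_static = FloodPolicy(suppress_unknown_requests=True,
                         suppress_unsolicited=True)
dc_mobile = FloodPolicy()
```

   With ixp_static nothing is flooded to remote PEs, whereas dc_mobile
   still floods unknown requests and unsolicited announcements so that
   moved hosts can be found.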
   In networks where fast mobility is expected (DC use-case), it is NOT
   RECOMMENDED to suppress the flooding of unknown ARP-Requests/NS or
   GARPs/unsolicited-NAs. Unknown ARP-Requests/NS refer to those ARP-
   Request/NS messages for which the Proxy-ARP/ND lookups for the
   requested IPs do not succeed.

   In order to give the operator the choice to suppress/allow the
   flooding to remote PEs, a PE MAY support administrative options to
   individually suppress/allow the flooding of:

   o  Unknown ARP-Request and NS messages.

   o  GARP and unsolicited-NA messages.

   The operator will use these options based on the expected behavior on
   the CEs.

3.6.  Duplicate IP Detection

   The Proxy-ARP/ND function SHOULD support duplicate IP detection so
   that ARP/ND-spoofing attacks or duplicate IPs due to human errors can
   be detected.

   ARP/ND spoofing is a technique whereby an attacker sends "fake" ARP/
   ND messages onto a broadcast domain. Generally the aim is to
   associate the attacker's MAC address with the IP address of another
   host causing any traffic meant for that IP address to be sent to the
   attacker instead.

   The distributed nature of EVPN and Proxy-ARP/ND allows the easy
   detection of duplicated IPs in the network, in a similar way to the
   MAC duplication function supported by [RFC7432] for MAC addresses.

   Duplicate IP detection monitors "IP-moves" in the Proxy-ARP/ND table
   in the following way:

   a.  When an existing active IP1->MAC1 entry is modified, a PE starts
       an M-second timer (default value of M=180), and if it detects N
       IP moves before the timer expires (default value of N=5), it
       concludes that a duplicate IP situation has occurred. An IP move
       occurs when, for instance, IP1->MAC1 is replaced by IP1->MAC2 in
       the Proxy-ARP/ND table.
       Static IP->MAC entries, that is, locally provisioned or
       EVPN-learned entries (with I=1 in the ARP/ND Extended
       Community), are not subject to this procedure. Static entries
       MUST NOT be overridden by dynamic Proxy-ARP/ND entries.

   b.  In order to detect the duplicate IP faster, the PE MAY send a
       CONFIRM message to the former owner of the IP. A CONFIRM message
       is a unicast ARP-Request/NS message sent by the PE to the MAC
       addresses that previously owned the IP, when the MAC changes in
       the Proxy-ARP/ND table. The CONFIRM message uses a sender's IP
       0.0.0.0 in case of ARP (if the PE has an IP address in the subnet
       then it MAY use it) and an IPv6 Link Local Address in case of NS.
       If the PE does not receive an answer within a given timer, the
       new entry will be confirmed and activated. In case of spoofing,
       for instance, if IP1->MAC1 moves to IP1->MAC2, the PE may send a
       unicast ARP-Request/NS message for IP1 with MAC DA= MAC1 and MAC
       SA= PE's MAC. This will force the legitimate owner to respond if
       the move to MAC2 was spoofed, and make the PE issue another
       CONFIRM message, this time to MAC DA= MAC2. If both the
       legitimate owner and the spoofer keep replying to the CONFIRM
       message, the PE will detect the duplicate IP within the M-second
       timer:

       -  If the IP1->MAC1 pair was previously owned by the spoofer and
          the new IP1->MAC2 was from a valid CE, then the issued CONFIRM
          message would trigger a response from the spoofer.

       -  If it were the other way around, that is, IP1->MAC1 was
          previously owned by a valid CE, the CONFIRM message would
          trigger a response from the CE.

          Either way, if this process continues, then duplicate
          detection will kick in.

   c.  Upon detecting a duplicate IP situation:

       1.  The entry in duplicate detected state cannot be updated with
           new dynamic or EVPN-learned entries for the same IP.
           The operator MAY, however, override the entry with a static
           IP->MAC.

       2.  The PE SHOULD alert the operator and stop responding to
           ARP-Requests/NS for the duplicate IP until a corrective
           action is taken.

       3.  Optionally the PE MAY associate an "anti-spoofing-mac"
           (AS-MAC) with the duplicate IP. The PE will send a
           GARP/unsolicited-NA message with IP1->AS-MAC to the local CEs
           as well as an RT2 (with IP1->AS-MAC) to the remote PEs. This
           will force all the CEs in the EVI to use the AS-MAC as MAC DA
           for IP1, and prevent the spoofer from attracting any traffic
           for IP1. Since the AS-MAC is a managed MAC address known by
           all the PEs in the EVI, all the PEs MAY apply filters to drop
           and/or log any frame with MAC DA=AS-MAC. The advertisement
           of the AS-MAC as a "black-hole MAC" that can be used directly
           in the BD to drop frames is for further study.

   d.  The duplicate IP situation will be cleared when a corrective
       action is taken by the operator, or alternatively after a
       HOLD-DOWN timer expires (default value of 540 seconds).

   The values of M, N and the HOLD-DOWN timer SHOULD be configurable
   administrative options to allow for the required flexibility in
   different scenarios.

   For Proxy-ND, Duplicate IP Detection SHOULD only monitor IP moves for
   IP->MACs learned from NA messages with O Flag=1. NA messages with O
   Flag=0 would not override the ND cache entries for an existing IP.
   Duplicate IP Detection for IPv6 SHOULD be disabled when IPv6
   'anycast' is activated in a given EVI.

4.  Solution Benefits

   The solution described in this document provides the following
   benefits:

   a.  The solution may suppress completely the flooding of the ARP/ND
       messages in the EVPN network, assuming that all the CE IP->MAC
       addresses local to the PEs are known or provisioned on the PEs
       from a management system.
       Note that in this case, the unknown unicast flooded traffic can
       also be suppressed, since all the expected unicast traffic will
       be destined to known MAC addresses in the PE BDs.

   b.  The solution reduces significantly the flooding of the ARP/ND
       messages in the EVPN network, assuming that some or all the CE
       IP->MAC addresses are learned on the data plane by snooping
       ARP/ND messages issued by the CEs.

   c.  The solution provides a way to refresh periodically the CE
       IP->MAC entries learned through the data plane, so that the
       IP->MAC entries are not withdrawn by EVPN when they age out
       unless the CE is not active anymore. This option helps reduce
       the EVPN control plane overhead in a network with active CEs that
       do not send packets frequently.

   d.  The solution provides a mechanism to detect duplicate IP
       addresses and avoid ARP/ND-spoof attacks or the effects of
       duplicate addresses due to human errors.

5.  Deployment Scenarios

   Four deployment scenarios with different levels of ARP/ND control are
   available to operators using this solution, depending on their
   requirements to manage ARP/ND: all dynamic learning, all dynamic
   learning with Proxy-ARP/ND, hybrid dynamic learning and static
   provisioning with Proxy-ARP/ND, and all static provisioning with
   Proxy-ARP/ND.

5.1.  All Dynamic Learning

   In this scenario for minimum security and mitigation, EVPN is
   deployed in the peering network with the Proxy-ARP/ND function shut
   down. PEs do not intercept ARP/ND requests and flood all requests,
   as in a conventional layer-2 network. While no ARP/ND mitigation is
   used in this scenario, the IXP can still take advantage of EVPN
   features such as control plane learning and all-active multihoming
   in the peering network. Existing mitigation solutions, such as the
   ARP-Sponge daemon [ARP-Sponge], MAY also be used in this scenario.

   Although this option does not require any of the procedures described
   in this document, it is added as a baseline/default option for
   completeness. This option is equivalent to VPLS as far as ARP/ND is
   concerned. The options described in Section 5.2, Section 5.3 and
   Section 5.4 are only possible in EVPN networks in combination with
   their Proxy-ARP/ND capabilities.

5.2.  Dynamic Learning with Proxy-ARP/ND

   This scenario minimizes flooding while enabling dynamic learning of
   IP->MAC entries. The Proxy-ARP/ND function is enabled in the BDs of
   the EVPN PEs, so that the PEs intercept and respond to CE requests.

   The solution MAY further reduce the flooding of the ARP/ND messages
   in the EVPN network by snooping ARP/ND messages issued by the CEs.

   PEs will flood requests if the entry is not in their Proxy table.
   Any unknown source MAC->IP entries will be learnt and advertised in
   EVPN, and traffic to unknown entries is discarded at the ingress PE.

5.3.  Hybrid Dynamic Learning and Static Provisioning with Proxy-ARP/ND

   Some IXPs want to protect particular hosts on the peering network
   while allowing dynamic learning of peering router addresses. For
   example, an IXP may want to configure static MAC->IP entries for
   management and infrastructure hosts that provide critical services.
   In this scenario, static entries are provisioned from the management
   plane for protected MAC->IP addresses, and dynamic learning with
   Proxy-ARP/ND is enabled as described in Section 5.2 on the peering
   network.

5.4.  All Static Provisioning with Proxy-ARP/ND

   For a solution that maximizes security and eliminates flooding and
   unknown unicast in the peering network, all MAC-IP entries are
   provisioned from the management plane. The Proxy-ARP/ND function is
   enabled in the BDs of the EVPN PEs, so that the PEs intercept and
   respond to CE requests.
   Dynamic learning and ARP/ND snooping are disabled so that traffic to
   unknown entries is discarded at the ingress PE. This scenario
   provides an IXP the most control over MAC->IP entries and allows an
   IXP to manage all entries from a management system.

5.5.  Deployment Scenarios in IXPs

   Nowadays, almost all IXPs have installed some security rules in order
   to protect the IXP-LAN. These rules are often called port security.
   Port security summarizes different operational steps that limit the
   access to the IXP-LAN to the customer router and control the kind of
   traffic that the routers are allowed to exchange (e.g., Ethernet,
   IPv4, IPv6). Due to this, the deployment scenario as described in
   Section 5.4 "All Static Provisioning with Proxy-ARP/ND" is the
   predominant scenario for IXPs.

   In addition to the "All Static Provisioning" behavior, in IXP
   networks it is recommended to configure the Reply Sub-Function to
   'discard' ARP-Requests/NS messages with unrecognized options.

   At IXPs, customers usually follow a certain operational life-cycle.
   For each step of the operational life-cycle, specific operational
   procedures are executed.

   The following describes the operational procedures that are needed to
   guarantee port security throughout the life-cycle of a customer with
   focus on EVPN features:

   1.  A new customer is connected for the first time to the IXP:

       Before the connection between the customer router and the IXP-LAN
       is activated, the MAC of the router is white-listed on the IXP's
       switch port. All other MAC addresses are blocked. Pre-defined
       IPv4 and IPv6 addresses of the IXP peering network space are
       configured at the customer router. The IP->MAC static entries
       (IPv4 and IPv6) are configured in the management system of the
       IXP for the customer's port in order to support Proxy-ARP/ND.

       In case a customer uses multiple ports aggregated to a single
       logical port (LAG), some vendors randomly select the MAC address
       of the LAG from the different MAC addresses assigned to the
       ports. In this case, the static entry will be associated with a
       list of allowed MACs.

   2.  Replacement of customer router:

       If a customer router is about to be replaced, the new MAC
       address(es) must be installed in the management system in
       addition to the MAC address(es) of the currently connected
       router. This allows the customer to replace the router without
       any active involvement of the IXP operator. For this, static
       entries are also used. After the replacement takes place, the
       MAC address(es) of the replaced router can be removed.

   3.  Decommissioning a customer router:

       If a customer router is decommissioned, the router is
       disconnected from the IXP PE. Right after that, the MAC
       address(es) of the router and IP->MAC bindings can be removed
       from the management system.

5.6.  Deployment Scenarios in DCs

   DCs normally have different requirements than IXPs in terms of
   Proxy-ARP/ND. Some differences are listed below:

   a.  The required mobility in virtualized DCs makes the "Dynamic
       Learning" or "Hybrid Dynamic and Static Provisioning" models more
       appropriate than the "All Static Provisioning" model.

   b.  IPv6 'anycast' may be required in DCs, while it is not a
       requirement in IXP networks. Therefore, if the DC needs IPv6
       'anycast', it will be explicitly enabled in the Proxy-ND
       function, and the Proxy-ND sub-functions are modified
       accordingly. For instance, if IPv6 'anycast' is enabled in the
       Proxy-ND function, Duplicate IP Detection must be disabled.

   c.  DCs may require special options on ARP/ND as opposed to the
       Address Resolution function, which is the only one typically
       required in IXPs.
       Based on that, the Reply Sub-function may be modified to forward
       or discard unknown options.

6.  Security Considerations

   The procedures in this document reduce the amount of ARP/ND message
   flooding, which in itself provides a protection to "slow path"
   software processors of routers and Tenant Systems in large BDs. The
   ARP/ND requests that are answered by the Proxy-ARP/ND function (hence
   not flooded) are normally targeted to existing hosts in the BD.
   ARP/ND requests targeted to absent hosts are still normally flooded;
   however, the suppression of Unknown ARP-Requests and NS messages
   described in Section 3.5 can provide an additional level of security
   against ARP-Requests/NS messages issued to non-existing hosts.

   The solution also provides protection against Denial Of Service
   attacks that use ARP/ND-spoofing as a first step. The Duplicate IP
   Detection and the use of an AS-MAC as explained in Section 3.6
   protect the BD against ARP/ND spoofing.

   When EVPN and its associated Proxy-ARP/ND function are used in IXP
   networks, they provide ARP/ND security and mitigation. IXPs MUST
   still employ additional security mechanisms that protect the peering
   network and SHOULD follow established BCPs such as the ones described
   in [Euro-IX-BCP].

   For example, IXPs should disable all unneeded control protocols, and
   block unwanted protocols from CEs so that only IPv4, ARP and IPv6
   Ethertypes are permitted on the peering network. In addition, port
   security features and ACLs can provide an additional level of
   security.

7.  IANA Considerations

   No IANA considerations.

8.  Acknowledgments

   The authors want to thank Ranganathan Boovaraghavan, Sriram
   Venkateswaran, Manish Krishnan, Seshagiri Venugopal, Tony Przygienda,
   Robert Raszuk and Iftekhar Hussain for their review and
   contributions. Thank you to Oliver Knapp as well, for his detailed
   review.

9.  Contributors

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Wim Henderickx
   Nokia

   Daniel Melzer
   DE-CIX Management GmbH

   Erik Nordmark
   Zededa

10.  References

10.1.  Normative References

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015.

   [RFC4861]  Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
              "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
              DOI 10.17487/RFC4861, September 2007.

   [RFC0826]  Plummer, D., "An Ethernet Address Resolution Protocol: Or
              Converting Network Protocol Addresses to 48.bit Ethernet
              Address for Transmission on Ethernet Hardware", STD 37,
              RFC 826, DOI 10.17487/RFC0826, November 1982.

   [RFC5227]  Cheshire, S., "IPv4 Address Conflict Detection", RFC 5227,
              DOI 10.17487/RFC5227, July 2008.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [I-D.ietf-bess-evpn-na-flags]
              Rabadan, J., Sathappan, S., Nagaraj, K., and W. Lin,
              "Propagation of ARP/ND Flags in EVPN",
              draft-ietf-bess-evpn-na-flags-09 (work in progress),
              December 2020.

10.2.  Informative References

   [ARP-Sponge]
              N., W. M. A. S., "Effects of IPv4 and IPv6 address
              resolution on AMS-IX and the ARP Sponge", July 2009.

   [Euro-IX-BCP]
              Euro-IX, "European Internet Exchange Association Best
              Practises".

   [RFC6820]  Narten, T., Karir, M., and I. Foo, "Address Resolution
              Problems in Large Data Center Networks", RFC 6820,
              DOI 10.17487/RFC6820, January 2013.
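
   As a non-normative illustration of the IP-move counting described in
   Section 3.6, the following Python sketch models a Proxy-ARP/ND table
   with the default values M=180, N=5 and HOLD-DOWN=540 seconds. The
   class name DuplicateIpDetector, the method learn() and the returned
   status strings are hypothetical and are not defined by this
   document. The sketch assumes that the IP move that starts the
   M-second timer counts as the first of the N moves, and it does not
   model CONFIRM messages or the AS-MAC.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    mac: str
    static: bool = False                     # provisioned, or EVPN-learned with I=1
    window_start: Optional[float] = None     # start of the M-second timer
    moves: int = 0                           # IP moves seen in the current window
    duplicate_since: Optional[float] = None  # set while in duplicate state


class DuplicateIpDetector:
    """Hypothetical model of duplicate IP detection (Section 3.6)."""

    def __init__(self, m_seconds: float = 180.0, n_moves: int = 5,
                 hold_down: float = 540.0) -> None:
        self.m_seconds = m_seconds
        self.n_moves = n_moves
        self.hold_down = hold_down
        self.table: Dict[str, Entry] = {}

    def learn(self, ip: str, mac: str, now: float,
              static: bool = False) -> str:
        entry = self.table.get(ip)
        if entry is None:
            self.table[ip] = Entry(mac=mac, static=static)
            return "installed"

        # The HOLD-DOWN timer clears the duplicate state (step d).
        if (entry.duplicate_since is not None
                and now - entry.duplicate_since >= self.hold_down):
            entry.duplicate_since = None
            entry.window_start = None
            entry.moves = 0

        if static:
            # The operator may always override with a static entry.
            self.table[ip] = Entry(mac=mac, static=True)
            return "installed"
        if entry.static or entry.duplicate_since is not None:
            # Static entries and entries in duplicate state are not
            # updated by dynamic or EVPN-learned information.
            return "rejected"
        if entry.mac == mac:
            return "unchanged"

        # An IP move: start a new M-second window if none is running.
        if (entry.window_start is None
                or now - entry.window_start > self.m_seconds):
            entry.window_start = now
            entry.moves = 0
        entry.moves += 1
        entry.mac = mac
        if entry.moves >= self.n_moves:
            entry.duplicate_since = now
            return "duplicate"
        return "moved"
```

   In this sketch, the caller passes the current time explicitly, so
   the behavior can be exercised without real timers. An entry that
   alternates between two MAC addresses five times within 180 seconds
   is reported as "duplicate", and later dynamic updates for that IP
   are "rejected" until a static entry is installed or 540 seconds
   have elapsed.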

Authors' Addresses

   Jorge Rabadan (editor)
   Nokia
   777 Middlefield Road
   Mountain View, CA 94043
   USA

   Email: jorge.rabadan@nokia.com

   Senthil Sathappan
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA

   Email: senthil.sathappan@nokia.com

   Kiran Nagaraj
   Nokia
   701 E. Middlefield Road
   Mountain View, CA 94043 USA

   Email: kiran.nagaraj@nokia.com

   Greg Hankins
   Nokia

   Email: greg.hankins@nokia.com

   Thomas King
   DE-CIX Management GmbH

   Email: thomas.king@de-cix.net