BESS Workgroup                                           J. Rabadan, Ed.
Internet Draft                                              S. Sathappan
                                                              K. Nagaraj
Intended status: Informational                             W. Henderickx
                                                              G. Hankins
                                                                   Nokia

                                                                 T. King
                                                               D. Melzer
                                                                  DE-CIX

                                                             E. Nordmark
                                                         Arista Networks

Expires: October 8, 2017                                   April 6, 2017


          Operational Aspects of Proxy-ARP/ND in EVPN Networks
                  draft-ietf-bess-evpn-proxy-arp-nd-02

Abstract

   The MAC/IP Advertisement route specified in [RFC7432] can optionally
   carry IPv4 and IPv6 addresses associated with a MAC address.
   Remote PEs can use this information to reply locally (act as proxy)
   to IPv4 ARP requests and IPv6 Neighbor Solicitation messages (or
   'unicast-forward' them to the owner of the MAC) and reduce/suppress
   the flooding produced by the Address Resolution procedure. This EVPN
   capability is extremely useful in Internet Exchange Points (IXPs)
   and Data Centers (DCs) with large broadcast domains, where the
   amount of ARP/ND flooded traffic causes issues on routers and CEs,
   as explained in [RFC6820]. This document describes how the [RFC7432]
   EVPN proxy-ARP/ND function may be implemented to help IXPs and other
   operators deal with the issues derived from Address Resolution in
   large broadcast domains.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on October 8, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1. Terminology
   2. Introduction
      2.1. The DC Use-Case
      2.2. The IXP Use-Case
   3. Solution Requirements
   4. Solution Description
      4.1. Learning Sub-Function
         4.1.1. Proxy-ND and the NA Flags
      4.2. Reply Sub-Function
      4.3. Unicast-forward Sub-Function
      4.4. Maintenance Sub-Function
      4.5. Flooding (to Remote PEs) Reduction/Suppression
      4.6. Duplicate IP Detection
   5. Solution Benefits
   6. Deployment Scenarios
      6.1. All Dynamic Learning
      6.2. Dynamic Learning with Proxy-ARP/ND
      6.3. Hybrid Dynamic Learning and Static Provisioning with
           Proxy-ARP/ND
      6.4. All Static Provisioning with Proxy-ARP/ND
      6.5. Deployment Scenarios in IXPs
      6.6. Deployment Scenarios in DCs
   7. Conventions Used in this Document
   8. Security Considerations
   9. IANA Considerations
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments
   Authors' Addresses

1. Terminology

   BUM: Broadcast, Unknown unicast and Multicast layer-2 traffic.

   ARP: Address Resolution Protocol.

   GARP: Gratuitous ARP message.

   ND: Neighbor Discovery Protocol.

   NS: Neighbor Solicitation message.

   NA: Neighbor Advertisement message.

   IXP: Internet eXchange Point.

   IXP-LAN: refers to the IXP's large broadcast domain to which
   Internet routers are connected.

   DC: Data Center.

   IP->MAC: refers to an IP address associated with a MAC address. The
   entries may be of three different types: dynamic, static or
   EVPN-learned.

   SN-multicast address: refers to the Solicited-Node IPv6 multicast
   address used by NS messages.

   NUD: Neighbor Unreachability Detection, as per [RFC4861].

   DAD: Duplicate Address Detection, as per [RFC4861].

   SLLA: Source Link Layer Address, as per [RFC4861].

   TLLA: Target Link Layer Address, as per [RFC4861].

   R-bit: Router Flag in NA messages, as per [RFC4861].

   O-bit: Override Flag in NA messages, as per [RFC4861].

   S-bit: Solicited Flag in NA messages, as per [RFC4861].

   RT2: EVPN Route type 2 or MAC/IP Advertisement route, as per
   [RFC7432].

   MAC or IP DA: MAC or IP Destination Address.

   MAC or IP SA: MAC or IP Source Address.

   AS-MAC: Anti-spoofing MAC.

2. Introduction

   As specified in [RFC7432], the IP Address field in the MAC/IP
   Advertisement route may optionally carry one of the IP addresses
   associated with the MAC address.
   A PE may learn local IP->MAC pairs and advertise them in EVPN MAC/IP
   routes. The remote PEs may add those IP->MAC pairs to their
   Proxy-ARP/ND tables and reply to local ARP requests or Neighbor
   Solicitations (or 'unicast-forward' those packets to the owner MAC),
   reducing and in some cases even suppressing the flooding in the EVPN
   network.

   EVPN and its associated Proxy-ARP/ND function are extremely useful
   in Data Centers (DCs) or Internet Exchange Points (IXPs) with large
   broadcast domains, where the amount of ARP/ND flooded traffic causes
   issues on routers and CEs. [RFC6820] describes the Address
   Resolution problems in large Data Center networks.

   This document describes how the [RFC7432] proxy-ARP/ND function may
   be implemented to help IXPs, DCs and other operators deal with the
   issues derived from Address Resolution in large broadcast domains.

2.1. The DC Use-Case

   As described in [RFC6820], IPv4 and IPv6 Address Resolution can
   create many issues in large DCs. The amount of flooding that Address
   Resolution creates, as well as other associated issues, can be
   mitigated with the use of EVPN and its proxy-ARP/ND function.

2.2. The IXP Use-Case

   The implementation described in this document is especially useful
   in IXP networks.

   A typical IXP provides access to a large layer-2 peering network,
   where (hundreds of) Internet routers are connected. Because of the
   requirement to connect all routers to a single layer-2 network, the
   peering networks use IPv4 layer-3 addressing with prefix lengths
   ranging from /21 to /24, which can create very large broadcast
   domains. This peering network is transparent to the Customer Edge
   (CE) devices and therefore floods any ARP request or NS message to
   all the CEs in the network. Unsolicited GARP and NA messages are
   flooded to all the CEs too.

   In these IXP networks, most of the CEs are typically peering routers
   and nearly all the BUM traffic is originated by the ARP and ND
   address resolution procedures. This ARP/ND BUM traffic causes
   significant data volumes that reach every single router in the
   peering network. Since the ARP/ND messages are processed in software
   and take high priority in the routers, heavy loads of ARP/ND traffic
   can cause some routers to run out of resources. CEs disappearing
   from the network may cause Address Resolution explosions that can
   make a router with limited processing power fail to keep BGP
   sessions running.

   The issue may be less severe for IPv6 routers, since ND uses the
   SN-multicast address in NS messages. ARP, however, uses broadcast
   and has to be processed by all the routers in the network. Some
   routers may also be configured to broadcast periodic GARPs
   [RFC5227]. The amount of ARP/ND flooded traffic grows with the
   number of IXP participants, so the issue can only get worse as new
   CEs are added.

   In order to deal with this issue, IXPs have developed certain
   solutions over the past years. One example is the ARP-Sponge daemon
   [ARP-Sponge]. While these solutions may mitigate the issues of
   Address Resolution in large broadcast domains, EVPN provides new,
   more efficient possibilities to IXPs. EVPN and its proxy-ARP/ND
   function may help solve the issue in a distributed and scalable way,
   fully integrated with the PE network.

3. Solution Requirements

   The distributed EVPN proxy-ARP/ND function described in this
   document SHOULD meet the following requirements:

   o The solution SHOULD support the learning of the CE IP->MAC entries
     on the EVPN PEs via the management, control or data planes. An
     implementation SHOULD allow each of those learning mechanisms to
     be explicitly enabled or disabled.

   o The solution MAY suppress completely the flooding of the ARP/ND
     messages in the EVPN network, assuming that all the CE IP->MAC
     addresses local to the PEs are known or provisioned on the PEs
     from a management system. Note that in this case, the unknown
     unicast traffic can also be suppressed, since all the expected
     unicast traffic will be destined to known MAC addresses in the PE
     MAC-VRFs.

   o The solution MAY reduce significantly the flooding of the ARP/ND
     messages in the EVPN network, assuming that some or all the CE
     IP->MAC addresses are learned on the data plane by snooping ARP/ND
     messages issued by the CEs.

   o The solution MAY provide a way to refresh periodically the CE
     IP->MAC entries learned through the data plane, so that the
     IP->MAC entries are not withdrawn by EVPN when they age out unless
     the CE is no longer active. This option helps reduce the EVPN
     control plane overhead in a network with active CEs that do not
     send packets frequently.

   o The solution SHOULD provide a mechanism to detect duplicate IP
     addresses. In case of duplication, the detecting PE should not
     reply to requests for the duplicate IP. Instead, the PE should
     alert the operator and may optionally prevent any other CE from
     sending traffic to the duplicate IP.

   o The solution MUST NOT change any existing behavior in the CEs
     connected to the EVPN PEs.

4. Solution Description

   Figure 1 illustrates an example EVPN network where the Proxy-ARP/ND
   function is enabled.

                                                         MAC-VRF1
                                                       Proxy-ARP/ND
                                                      +------------+
   IP1/M1            +----------------------------+   |IP1->M1 EVPN|
   GARP --->Proxy-ARP/ND                          |   |IP2->M2 EVPN|
     +---+      +----+---+   RT2(IP1/M1)          |   |IP3->M3 sta |
     |CE1+------+MAC-VRF1|     ------>     +------+---|IP4->M4 dyn |
     +---+      +--------+                 |          +------------+
                   PE1                     |  +--------+    Who has IP1?
                     |        EVPN         |  |MAC-VRF1|  <-----   +---+
                     |        EVI1         |  |        |  |        |CE3|
   IP2/M2            |                     |  |        |  ----->   +---+
   GARP --->Proxy-ARP/ND                   |  +--------+  | IP1->M1
     +---+    +--------+     RT2(IP2/M2)   |              |
     |CE2+----+MAC-VRF1|       ------>     +--------------+
     +---+    +--------+                       PE3|    +---+
                 PE2 |                            +----+CE4|
                     +----------------------------+    +---+
                                                    <---IP4/M4 GARP

              Figure 1 Proxy-ARP/ND network example

   When the Proxy-ARP/ND function is enabled in the MAC-VRFs of the
   EVPN PEs, each PE creates a Proxy table specific to that MAC-VRF
   that can contain three types of Proxy-ARP/ND entries:

   a) Dynamic entries: learned by snooping the CEs' ARP and ND
      messages. For instance, IP4->M4 in Figure 1.

   b) Static entries: provisioned on the PE by the management system.
      For instance, IP3->M3 in Figure 1.

   c) EVPN-learned entries: learned from the IP/MAC information encoded
      in the RT2s received from remote PEs. For instance, IP1->M1 and
      IP2->M2 in Figure 1.

   As a high level example, the operation of the EVPN Proxy-ARP/ND
   function in the network of Figure 1 is described below. In this
   example we assume IP1, IP2 and IP3 are IPv4 addresses:

   1. Proxy-ARP/ND is enabled in MAC-VRF1 of PE1, PE2 and PE3.

   2. The PEs start adding dynamic, static and EVPN-learned entries to
      their Proxy tables:

      a. PE3 adds IP1->M1 and IP2->M2 based on the EVPN routes received
         from PE1 and PE2. Those entries were previously learned as
         dynamic entries in PE1 and PE2 respectively, and advertised in
         BGP EVPN.
      b. PE3 adds IP4->M4 as dynamic. This entry is learned by snooping
         the corresponding ARP messages sent by CE4.
      c. An operator also provisions the static entry IP3->M3.

   3. When CE3 sends an ARP Request asking for IP1, PE3 will:

      a. Intercept the ARP Request and perform a Proxy-ARP lookup for
         IP1.
      b. If the lookup is successful (as in Figure 1), PE3 will send an
         ARP Reply with IP1->M1.
         The ARP Request will not be flooded to the EVPN network or
         any other local CEs.
      c. If the lookup is not successful, PE3 will flood the ARP
         Request in the EVPN network and to the other local CEs.

   As PE3 learns more and more host entries in the Proxy-ARP/ND table,
   the flooding of ARP Request messages is reduced and in some cases it
   can even be suppressed. In a network where most of the participant
   CEs are not moving between PEs and they advertise their presence
   with GARPs or unsolicited NA messages, the ARP/ND flooding as well
   as the unknown unicast flooding can practically be suppressed. In an
   EVPN-based IXP network, where all the entries are Static, the ARP/ND
   flooding is in fact totally suppressed.

   The Proxy-ARP/ND function can be structured in six sub-functions or
   procedures:

   1. Learning sub-function
   2. Reply sub-function
   3. Unicast-forward sub-function
   4. Maintenance sub-function
   5. Flooding reduction/suppression sub-function
   6. Duplicate IP detection sub-function

   A Proxy-ARP/ND implementation MAY support all those sub-functions or
   only a subset of them. The following sections describe each
   individual sub-function.

4.1. Learning Sub-Function

   A Proxy-ARP/ND implementation SHOULD support static, dynamic and
   EVPN-learned entries.

   Static entries are provisioned from the management plane. The
   provisioned static IP->MAC entry SHOULD be advertised in EVPN with a
   MAC Mobility extended community where the static flag is set to 1,
   as per [RFC7432]. A static entry MAY associate an IP with a list of
   potential MACs, i.e. IP1->(MAC1,MAC2..MACN). When there is more than
   one MAC in the list of allowed MACs, the PE will not advertise any
   IP->MAC in EVPN until a local ARP/NA message or any other frame is
   received from the CE. Upon receiving traffic from the CE, the PE
   will check that the source MAC is included in the list of allowed
   MACs.
   Only in that case will the PE activate the IP->MAC and advertise it
   in EVPN.

   EVPN-learned entries MUST be learned from received valid EVPN MAC/IP
   Advertisement routes containing a MAC and IP address.

   Dynamic entries are learned in different ways depending on whether
   the entry contains an IPv4 or IPv6 address:

   a) Proxy-ARP dynamic entries:

      They SHOULD be learned by snooping any ARP packet (Ethertype
      0x0806) received from the CEs attached to the MAC-VRF. The
      Learning function will add the Sender MAC and Sender IP of the
      snooped ARP packet to the Proxy-ARP table. Note that MACs and IPs
      with value 0 SHOULD NOT be learned.

   b) Proxy-ND dynamic entries:

      They SHOULD be learned from the Target Address and TLLA
      information in NA messages (Ethertype 0x86DD, ICMPv6 type 136)
      received from the CEs attached to the MAC-VRF. A Proxy-ND
      implementation SHOULD NOT learn IP->MAC entries from NS messages,
      since they don't contain the R-bit Flag required by the Proxy-ND
      reply function. See section 4.1.1 for more information about the
      R-bit flag.

      Note that if the O-bit is zero in the received NA message, the
      IP->MAC SHOULD only be learned if IPv6 'anycast' is enabled in
      the EVI.

   The following procedure associated with the Learning sub-function is
   recommended:

   o When a new Proxy-ARP/ND EVPN or static active entry is learned (or
     provisioned), the PE SHOULD send an unsolicited GARP or NA message
     to the access CEs. The PE SHOULD send an unsolicited GARP/NA
     message for dynamic entries only if the ARP/NA message creating
     the entry was NOT flooded before. This unsolicited GARP/NA message
     makes sure the CE ARP/ND caches are updated even if the ARP/NS/NA
     messages from remote CEs are not flooded in the EVPN network.

   Note that if a Static entry is provisioned with the same IP as an
   existing EVPN-learned or Dynamic entry, the Static entry takes
   precedence.
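
   The learning rules of this section can be illustrated with a
   minimal Python sketch. It is not taken from this document: the
   names ProxyTable, ProxyEntry, EntryType, learn and learn_from_arp
   are hypothetical, and the sketch assumes that, apart from static
   precedence, the most recent update for an IP simply replaces the
   previous entry. It does not model lists of allowed MACs, NA flags
   or 'anycast'.

```python
from dataclasses import dataclass
from enum import Enum
from ipaddress import ip_address

ZERO_MAC = "00:00:00:00:00:00"


class EntryType(Enum):
    """The three Proxy-ARP/ND entry types named in this document."""
    DYNAMIC = "dyn"
    STATIC = "sta"
    EVPN = "EVPN"


@dataclass
class ProxyEntry:
    mac: str
    kind: EntryType


class ProxyTable:
    """Hypothetical per-MAC-VRF Proxy-ARP/ND table keyed by IP address."""

    def __init__(self):
        self.entries = {}

    def learn(self, ip, mac, kind):
        """Add or update IP->MAC; return True if the table changed."""
        addr = ip_address(ip)
        # MACs and IPs with value 0 are not learned.
        if int(addr) == 0 or mac.lower() == ZERO_MAC:
            return False
        current = self.entries.get(addr)
        # A static entry takes precedence over dynamic and EVPN-learned
        # entries for the same IP.
        if (current is not None and current.kind is EntryType.STATIC
                and kind is not EntryType.STATIC):
            return False
        new = ProxyEntry(mac.lower(), kind)
        if current == new:
            return False
        # Assumption: otherwise the latest update wins.
        self.entries[addr] = new
        return True

    def learn_from_arp(self, sender_ip, sender_mac):
        """Snooped ARP packet: learn Sender IP and Sender MAC as dynamic."""
        return self.learn(sender_ip, sender_mac, EntryType.DYNAMIC)
```

   With this sketch, a snooped ARP packet whose Sender IP matches an
   existing static entry leaves the table unchanged, and an ARP probe
   (Sender IP 0.0.0.0) is never learned.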

4.1.1. Proxy-ND and the NA Flags

   [RFC4861] describes the use of the R-bit flag in IPv6 Address
   Resolution:

   o Nodes capable of routing IPv6 packets must reply to NS messages
     with NA messages where the R-bit flag is set (R-bit=1).

   o Hosts that are not able to route IPv6 packets must indicate that
     inability by replying with NA messages that contain R-bit=0.

   The use of the R-bit flag in NA messages has an impact on how hosts
   select their default gateways when sending packets off-link:

   o Hosts build a Default Router List based on the received RAs and
     NAs with R-bit=1. Each cache entry has an IsRouter flag, which
     must be set based on the R-bit flag in the received NAs. A host
     can choose one or more Default Routers when sending packets
     off-link.

   o In those cases where the IsRouter flag changes from TRUE to FALSE
     as a result of an NA update, the node MUST remove that router from
     the Default Router List and update the Destination Cache entries
     for all destinations using that neighbor as a router, as specified
     in [RFC4861] section 7.3.3. This is needed to detect when a node
     that is used as a router stops forwarding packets due to being
     configured as a host.

   The R-bit and O-bit will be learned in the following ways:

   o Static entries SHOULD have the R-bit information added by the
     management interface. The O-bit information MAY also be added by
     the management interface.

   o Dynamic entries SHOULD learn the R-bit and MAY learn the O-bit
     from the snooped NA messages used to learn the IP->MAC itself.

   o EVPN-learned entries SHOULD learn the R-bit and MAY learn the
     O-bit from the ND Extended Community received from EVPN along with
     the RT2 used to learn the IP->MAC itself. Please refer to
     [EVPN-NA-FLAGS]. If no ND extended community is received, the PE
     will add the default R-bit/O-bit to the entry. The default R-bit
     SHOULD be an administrative choice.
     The default O-bit SHOULD be 1.

   Note that the O-bit SHOULD only be learned if 'anycast' is enabled
   in the EVI. If so, Duplicate IP Detection must be disabled so that
   the PE is able to learn the same IP mapped to different MACs in the
   same Proxy-ND table. If 'anycast' is disabled, NA messages with
   O-bit = 0 will not create a proxy-ND entry, hence no EVPN
   advertisement with ND extended community will be generated.

4.2. Reply Sub-Function

   This sub-function will reply to Address Resolution
   requests/solicitations upon successful lookup in the Proxy-ARP/ND
   table for a given IP address. The following considerations should be
   taken into account:

   a) When replying to ARP Request or NS messages, the PE SHOULD use
      the Proxy-ARP/ND entry MAC address as MAC SA. This is recommended
      so that the resolved MAC can be learned in the MAC FIB of
      potential Layer-2 switches sitting between the PE and the CE
      requesting the Address Resolution.

   b) A PE SHOULD NOT reply to a request/solicitation received on the
      same attachment circuit over which the IP->MAC is learned. In
      this case the requester and the requested IP are assumed to be
      connected to the same layer-2 switch/access network linked to the
      PE's attachment circuit, and therefore the requested IP owner
      will receive the request directly.

   c) A PE SHOULD reply to broadcast/multicast Address Resolution
      messages, that is, ARP-Request, NS messages as well as DAD NS
      messages. A PE SHOULD NOT reply to unicast Address Resolution
      requests (for instance, NUD NS messages).

   d) A PE SHOULD include the R-bit learned for the IP->MAC entry in
      the NA messages (see section 4.1.1). The S-bit will be set/unset
      as per [RFC4861]. The O-bit will be included if IPv6 'anycast' is
      enabled in the EVI and it is learned for the IP->MAC entry.
      If 'anycast' is enabled and there is more than one MAC for a
      given IP, the PE will reply to NS messages with as many NA
      responses as there are 'anycast' entries in the proxy-ND table.

   e) A PE SHOULD NOT reply to ARP probes received from the CEs. An ARP
      probe is an ARP request constructed with an all-zero sender IP
      address that may be used by hosts for IPv4 Address Conflict
      Detection [RFC5227].

   f) A PE SHOULD only reply to ARP-Request and NS messages with the
      format specified in [RFC0826] and [RFC4861] respectively.
      Received ARP-Requests and NS messages with unknown options SHOULD
      be either forwarded (as unicast packets) to the owner of the
      requested IP (assuming the MAC is known in the proxy-ARP/ND table
      and MAC-VRF) or discarded. An administrative option to control
      this behavior ('unicast-forward' or 'discard') SHOULD be
      supported. The 'unicast-forward' option is described in section
      4.3.

4.3. Unicast-forward Sub-Function

   As discussed in section 4.2, in some cases the operator may want to
   'unicast-forward' certain ARP-Request and NS messages instead of
   replying to them. The operator SHOULD be able to activate this
   option with one of the following parameters:

   a) unicast-forward always
   b) unicast-forward unknown-options

   If 'unicast-forward always' is enabled, the PE will perform a
   proxy-ARP/ND table lookup and in case of a hit, the PE will forward
   the packet to the owner of the MAC found in the proxy-ARP/ND table.
   This is irrespective of the options carried in the ARP/ND packet.
   This option provides total transparency in the EVI and yet reduces
   the amount of flooding significantly.

   If 'unicast-forward unknown-options' is enabled, upon a successful
   proxy-ARP/ND lookup, the PE will perform a 'unicast-forward' action
   only if the ARP-Request or NS messages carry unknown options, as
   explained in section 4.2.
   As an example, this would allow proxy-ND and Secure ND [RFC3971] to
   be enabled in the same EVI. The 'unicast-forward unknown-options'
   configuration allows the support of new applications using ARP/ND in
   the EVI while still reducing the flooding.

4.4. Maintenance Sub-Function

   The Proxy-ARP/ND tables SHOULD follow a number of maintenance
   procedures so that the dynamic IP->MAC entries are kept if the owner
   is active and flushed if the owner is no longer in the network. The
   following procedures are recommended:

   a) Age-time

      A dynamic Proxy-ARP/ND entry SHOULD be flushed out of the table
      if the IP->MAC has not been refreshed within a given age-time.
      The entry is refreshed if an ARP or NA message is received for
      the same IP->MAC entry. The age-time is an administrative option
      and its value should be carefully chosen depending on the
      specific use-case: in IXP networks (where the CE routers are
      fairly static) the age-time may normally be longer than in DC
      networks (where mobility is required).

   b) Send-refresh option

      The PE MAY send periodic refresh messages (ARP/ND "probes") to
      the owners of the dynamic Proxy-ARP/ND entries, so that the
      entries can be refreshed before they age out. The owner of the
      IP->MAC entry would reply to the ARP/ND probe and the
      corresponding entry age-time would be reset. The periodic
      send-refresh timer is an administrative option and is recommended
      to be a third of the age-time or a half of the age-time in scaled
      networks.

      An ARP refresh issued by the PE will be an ARP-Request message
      with the Sender's IP = 0 sent from the PE's MAC SA. If the PE has
      an IP address in the subnet, for instance on an IRB interface,
      then it MAY use it as a source for the ARP request (instead of
      Sender's IP = 0). An ND refresh will be an NS message issued from
      the PE's MAC SA and a Link Local Address associated with the PE's
      MAC.

      The refresh request messages should be sent only for dynamic
      entries and not for static or EVPN-learned entries. Even though
      the refresh request messages are broadcast or multicast, the PE
      SHOULD only send the message to the attachment circuit associated
      with the MAC in the IP->MAC entry.

   The age-time and send-refresh options are used in EVPN networks to
   avoid unnecessary EVPN RT2 withdrawals: if refresh messages are sent
   before the corresponding MAC-VRF FIB and Proxy-ARP/ND age-time for a
   given entry expires, inactive but existing hosts will reply,
   refreshing the entry and therefore avoiding unnecessary MAC and
   MAC-IP withdrawals in EVPN. Both entries (MAC in the MAC-VRF and
   IP->MAC in Proxy-ARP/ND) are reset when the owner replies to the
   ARP/ND probe. If there is no response to the ARP/ND probe, the MAC
   and IP->MAC entries will be legitimately flushed and the RT2s
   withdrawn.

4.5. Flooding (to Remote PEs) Reduction/Suppression

   The Proxy-ARP/ND function implicitly helps reduce the flooding of
   ARP Request and NS messages to remote PEs in an EVPN network.
   However, in certain use-cases, the flooding of ARP/NS/NA messages
   (and even the unknown unicast flooding) to remote PEs can be
   suppressed completely in an EVPN network.

   For instance, in an IXP network, since all the participant CEs are
   well known and will not move to a different PE, the IP->MAC entries
   may be all provisioned by a management system. Assuming the entries
   for the CEs are all provisioned on the local PE, a given
   Proxy-ARP/ND table will only contain static and EVPN-learned
   entries. In this case, the operator may choose to suppress the
   flooding of ARP/NS/NA to remote PEs completely.
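
   As a rough illustration of the per-message flooding decision
   discussed in this section, the following Python sketch models two
   administrative knobs of the kind this section describes (one for
   unknown ARP-Request/NS messages, one for GARP/unsolicited-NA
   messages). The names FloodOptions and floods_to_remote_pes, and the
   "request"/"unsolicited" labels, are hypothetical and only meant as
   a sketch under those assumptions.

```python
from dataclasses import dataclass


@dataclass
class FloodOptions:
    """Hypothetical per-MAC-VRF knobs; True means flood to remote PEs."""
    flood_unknown_requests: bool = True   # unknown ARP-Request / NS
    flood_unsolicited: bool = True        # GARP / unsolicited NA


def floods_to_remote_pes(kind, lookup_hit, opts):
    """Return True if the message is flooded to remote PEs.

    kind is "request" (ARP-Request or NS) or "unsolicited" (GARP or
    unsolicited NA). lookup_hit is the Proxy-ARP/ND lookup result and
    is only meaningful for requests.
    """
    if kind == "request":
        # A request that hits the proxy table is handled locally
        # (replied to or unicast-forwarded) and is not flooded.
        return (not lookup_hit) and opts.flood_unknown_requests
    if kind == "unsolicited":
        return opts.flood_unsolicited
    raise ValueError(f"unexpected message kind: {kind!r}")
```

   An all-static IXP deployment would correspond to both knobs being
   set to False in this sketch, while a DC deployment with fast
   mobility would leave both at True.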

   The flooding may also be suppressed completely in IXP networks with
   dynamic Proxy-ARP/ND entries assuming that all the CEs are directly
   connected to the PEs and they all advertise their presence with a
   GARP/unsolicited-NA when they connect to the network.

   In networks where fast mobility is expected (DC use-case), it is not
   recommended to suppress the flooding of unknown ARP-Requests/NS or
   GARPs/unsolicited-NAs. Unknown ARP-Requests/NS refer to those
   ARP-Request/NS messages for which the Proxy-ARP/ND lookups for the
   requested IPs do not succeed.

   In order to give the operator the choice to suppress/allow the
   flooding to remote PEs, a PE MAY support administrative options to
   individually suppress/allow the flooding of:

   o Unknown ARP-Request and NS messages.
   o GARP and unsolicited-NA messages.

   The operator will use these options based on the expected behavior
   in the CEs.

4.6. Duplicate IP Detection

   The Proxy-ARP/ND function SHOULD support duplicate IP detection so
   that ARP/ND-spoofing attacks or duplicate IPs due to human errors
   can be detected.

   ARP/ND spoofing is a technique whereby an attacker sends "fake"
   ARP/ND messages onto a broadcast domain. Generally the aim is to
   associate the attacker's MAC address with the IP address of another
   host causing any traffic meant for that IP address to be sent to the
   attacker instead.

   The distributed nature of EVPN and proxy-ARP/ND allows the easy
   detection of duplicated IPs in the network, in a similar way to the
   MAC duplication function supported by [RFC7432] for MAC addresses.
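
   The IP-move counting that the rest of this section describes (an
   M-second timer with a default of 180, and a threshold of N moves
   with a default of 5) can be sketched in Python as follows. The
   class and method names are hypothetical, timestamps are passed in
   by the caller to keep the sketch deterministic, and the sketch
   assumes that the move which starts the timer counts as the first of
   the N moves. CONFIRM messages, the HOLD-DOWN timer and the AS-MAC
   are not modeled.

```python
class DuplicateIpDetector:
    """Counts IP moves per IP inside an M-second window (sketch only)."""

    def __init__(self, window_s=180.0, max_moves=5):
        self.window_s = window_s    # M in this document
        self.max_moves = max_moves  # N in this document
        self._state = {}            # ip -> (timer start, moves counted)

    def record_move(self, ip, now):
        """Record one IP move at time `now` (in seconds).

        Returns True when N moves have been seen before the timer
        started by the first move has expired.
        """
        start, moves = self._state.get(ip, (now, 0))
        if now - start > self.window_s:
            # The previous timer expired: this move starts a new one.
            start, moves = now, 0
        moves += 1
        self._state[ip] = (start, moves)
        return moves >= self.max_moves
```

   A PE using such a detector would call record_move each time, for
   instance, IP1->MAC1 is replaced by IP1->MAC2 in the Proxy-ARP/ND
   table, and would treat a True result as a duplicate IP situation.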

   Duplicate IP detection monitors "IP-moves" in the Proxy-ARP/ND table
   in the following way:

   o When an existing active IP1->MAC1 entry is modified, a PE starts
     an M-second timer (default value of M=180), and if it detects N IP
     moves before the timer expires (default value of N=5), it
     concludes that a duplicate IP situation has occurred. An IP move
     is considered when, for instance, IP1->MAC1 is replaced by
     IP1->MAC2 in the Proxy-ARP/ND table.

   o In order to detect the duplicate IP faster, the PE MAY send a
     CONFIRM message to the former owner of the IP. A CONFIRM message
     is a unicast ARP-Request/NS message sent by the PE to the MAC
     address that previously owned the IP, when the MAC changes in the
     Proxy-ARP/ND table. The CONFIRM message uses a sender's IP 0.0.0.0
     in case of ARP (if the PE has an IP address in the subnet then it
     MAY use it) and an IPv6 Link Local Address in case of NS. If the
     PE does not receive an answer within a given time, the new entry
     will be confirmed and activated. In case of spoofing, for
     instance, if IP1->MAC1 moves to IP1->MAC2, the PE may send a
     unicast ARP-Request/NS message for IP1 with MAC DA = MAC1 and MAC
     SA = PE's MAC. This will force the legitimate owner to respond if
     the move to MAC2 was spoofed, and make the PE issue another
     CONFIRM message, this time to MAC DA = MAC2. If both the
     legitimate owner and the spoofer keep replying to the CONFIRM
     message, the PE will detect the duplicate IP within the M timer:

     - If the IP1->MAC1 pair was previously owned by the spoofer and
       the new IP1->MAC2 was from a valid CE, then the issued CONFIRM
       message would trigger a response from the spoofer.

     - If it were the other way around, that is, IP1->MAC1 was
       previously owned by a valid CE, the CONFIRM message would
       trigger a response from the CE.

     Either way, if this process continues, then duplicate detection
     will kick in.
682     o Upon detecting a duplicate IP situation:

684       a) The entry in duplicate detected state cannot be updated with new
685          dynamic or EVPN-learned entries for the same IP. The operator
686          MAY, however, override the entry with a static IP->MAC.

688       b) The PE SHOULD alert the operator and stop responding to
689          ARP-Requests/NS for the duplicate IP until a corrective action is taken.

691       c) Optionally the PE MAY associate an "anti-spoofing-mac" (AS-MAC)
692          to the duplicate IP. The PE will send a GARP/unsolicited-NA
693          message with IP1->AS-MAC to the local CEs as well as an RT2
694          (with IP1->AS-MAC) to the remote PEs. This will force all the
695          CEs in the EVI to use the AS-MAC as MAC DA for IP1, and prevent
696          the spoofer from attracting any traffic for IP1. Since the
697          AS-MAC is a managed MAC address known by all the PEs in the EVI,
698          all the PEs MAY apply filters to drop and/or log any frame with
699          MAC DA = AS-MAC. The advertisement of the AS-MAC as a "black-hole
700          MAC" that can be used directly in the MAC-VRF to drop frames is
701          for further study.

703     o The duplicate IP situation will be cleared when a corrective action
704       is taken by the operator, or alternatively after a HOLD-DOWN timer
705       (default value of 540 seconds).

707     The values of M, N and the HOLD-DOWN timer SHOULD be configurable
708     administrative options to allow for the required flexibility in
709     different scenarios.

711     For Proxy-ND, Duplicate IP Detection SHOULD only monitor IP moves for
712     IP->MACs learned from NA messages with O-bit=1. NA messages with
713     O-bit=0 would not override the ND cache entries for an existing IP.
714     Duplicate IP Detection for IPv6 SHOULD be disabled when IPv6
715     'anycast' is activated in a given EVI.

717  5. Solution Benefits

719     The solution described in this document provides the following
720     benefits:

722     a) The solution may completely suppress the flooding of the ARP/ND
723        and unknown-unicast messages in the EVPN network, in cases where
724        all the CE IP->MAC addresses local to the PEs are known and
725        provisioned on the PEs from a management system.

727     b) The solution significantly reduces the flooding of the ARP/ND
728        messages in the EVPN network, in cases where some or all the CE
729        IP->MAC addresses are learned on the data plane by snooping ARP/ND
730        messages issued by the CEs.

732     c) The solution reduces the control plane overhead and unnecessary
733        BGP MAC/IP Advertisements and Withdrawals in a network with active
734        CEs that do not send packets frequently.

736     d) The solution provides a mechanism to detect duplicate IP addresses
737        and avoid ARP/ND-spoof attacks or the effects of duplicate
738        addresses due to human errors.

740  6. Deployment Scenarios

742     Four deployment scenarios with different levels of ARP/ND control are
743     available to operators using this solution, depending on their
744     requirements to manage ARP/ND: all dynamic learning, all dynamic
745     learning with proxy-ARP/ND, hybrid dynamic learning and static
746     provisioning with proxy-ARP/ND, and all static provisioning with
747     proxy-ARP/ND.

749  6.1. All Dynamic Learning

751     In this scenario for minimum security and mitigation, EVPN is
752     deployed in the peering network with the proxy-ARP/ND function
753     shut down. PEs do not intercept ARP/ND requests and flood all
754     requests, as in a conventional layer-2 network. While no ARP/ND
755     mitigation is used in this scenario, the IXP can still take advantage
756     of EVPN features such as control plane learning and all-active
757     multihoming in the peering network. Existing mitigation solutions,
758     such as the ARP-Sponge daemon [ARP-Sponge], MAY also be used in this
759     scenario.
761     Although this option does not require any of the procedures described
762     in this document, it is included as the baseline/default option for
763     completeness. This option is equivalent to VPLS as far as ARP/ND is
764     concerned. The options described in Sections 6.2, 6.3 and 6.4 are
765     only possible in EVPN networks in combination with their Proxy-ARP/ND
766     capabilities.

768  6.2. Dynamic Learning with Proxy-ARP/ND

770     This scenario minimizes flooding while enabling dynamic learning of
771     IP->MAC entries. The Proxy-ARP/ND function is enabled in the MAC-VRFs
772     of the EVPN PEs, so that the PEs intercept and respond to CE
773     requests.

775     The solution MAY further reduce the flooding of the ARP/ND messages
776     in the EVPN network by snooping ARP/ND messages issued by the CEs.

778     PEs will flood requests if the entry is not in their Proxy table. Any
779     unknown source MAC->IP entries will be learned and advertised in EVPN,
780     and traffic to unknown entries is discarded at the ingress PE.

782  6.3. Hybrid Dynamic Learning and Static Provisioning with Proxy-ARP/ND

784     Some IXPs want to protect particular hosts on the peering network
785     while allowing dynamic learning of peering router addresses. For
786     example, an IXP may want to configure static MAC->IP entries for
787     management and infrastructure hosts that provide critical services.
788     In this scenario, static entries are provisioned from the management
789     plane for protected MAC->IP addresses, and dynamic learning with
790     Proxy-ARP/ND is enabled as described in section 6.2 on the peering
791     network.

793  6.4. All Static Provisioning with Proxy-ARP/ND

795     For a solution that maximizes security and eliminates flooding and
796     unknown unicast in the peering network, all MAC-IP entries are
797     provisioned from the management plane. The Proxy-ARP/ND function is
798     enabled in the MAC-VRFs of the EVPN PEs, so that the PEs intercept
799     and respond to CE requests. Dynamic learning and ARP/ND snooping are
800     disabled so that traffic to unknown entries is discarded at the
801     ingress PE. This scenario provides an IXP with the most control over
802     MAC->IP entries and allows an IXP to manage all entries from a
803     management system.

805  6.5. Deployment Scenarios in IXPs

807     Nowadays, almost all IXPs have installed some security rules in order
808     to protect the IXP-LAN. These rules are often called port security.
809     Port security comprises different operational steps that limit access
810     to the IXP-LAN and to the customer router, and that control the kind
811     of traffic that the routers are allowed to exchange (e.g., Ethernet,
812     IPv4, IPv6). Due to this, the deployment scenario as described in 6.4
813     "All Static Provisioning with Proxy-ARP/ND" is the predominant
814     scenario for IXPs.

816     In addition to the "All Static Provisioning" behavior, in IXP
817     networks it is recommended to configure the Reply Sub-Function to
818     'discard' ARP-Requests/NS messages with unrecognized options.

820     At IXPs, customers usually follow a certain operational life-cycle.
821     For each step of the operational life-cycle specific operational
822     procedures are executed.

824     The following describes the operational procedures that are needed to
825     guarantee port security throughout the life-cycle of a customer with
826     focus on EVPN features:

828     1. A new customer is connected for the first time to the IXP:

830        Before the connection between the customer router and the IXP-LAN
831        is activated, the MAC of the router is white-listed on the IXP's
832        switch port. All other MAC addresses are blocked. Pre-defined IPv4
833        and IPv6 addresses of the IXP's peering network space are
834        configured at the customer router. The IP->MAC static entries
835        (IPv4 and IPv6) are configured in the management system of the IXP
836        for the customer's port in order to support Proxy-ARP/ND.
838        In case a customer uses multiple ports aggregated to a single
839        logical port (LAG), some vendors randomly select the MAC address
840        of the LAG from the different MAC addresses assigned to the ports.
841        In this case, the static entry is associated with a list of
842        allowed MACs.

844     2. Replacement of customer router:

846        If a customer router is about to be replaced, the new MAC
847        address(es) must be installed in the management system alongside
848        the MAC address(es) of the currently connected router. This allows
849        the customer to replace the router without any active involvement
850        of the IXP operator. For this, static entries are also used. After
851        the replacement takes place, the MAC address(es) of the replaced
852        router can be removed.

854     3. Decommissioning a customer router:

856        If a customer router is decommissioned, the router is disconnected
857        from the IXP PE. Right after that, the MAC address(es) of the
858        router and IP->MAC bindings can be removed from the management
859        system.

861  6.6. Deployment Scenarios in DCs

863     DCs normally have different requirements than IXPs in terms of
864     Proxy-ARP/ND. Some differences are listed below:

866     a) The required mobility in virtualized DCs makes the "Dynamic
867        Learning" or "Hybrid Dynamic and Static Provisioning" models more
868        appropriate than the "All Static Provisioning" model.

870     b) IPv6 'anycast' may be required in DCs, while it is not a
871        requirement in IXP networks. Therefore, if the DC needs IPv6
872        'anycast', it will be explicitly enabled in the proxy-ND function,
873        and the proxy-ND sub-functions are modified accordingly. For
874        instance, if IPv6 'anycast' is enabled in the proxy-ND function,
875        Duplicate IP Detection must be disabled.

877     c) DCs may require special options on ARP/ND as opposed to the
878        Address Resolution function, which is the only one typically
879        required in IXPs. Based on that, the Reply Sub-function may be
880        modified to forward or discard unknown options.

882  7. Conventions Used in this Document
883     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
884     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
885     document are to be interpreted as described in RFC-2119 [RFC2119].

887     In this document, these words will appear with that interpretation
888     only when in ALL CAPS. Lower case uses of these words are not to be
889     interpreted as carrying RFC-2119 significance.

891     In this document, the characters ">>" preceding an indented line(s)
892     indicate a compliance requirement statement using the key words
893     listed above. This convention aids reviewers in quickly identifying
894     or finding the explicit compliance requirements of this RFC.

896  8. Security Considerations

898     When EVPN and its associated Proxy-ARP/ND function are used in IXP
899     networks, they only provide ARP/ND security and mitigation. IXPs MUST
900     still employ security mechanisms that protect the peering network and
901     SHOULD follow established BCPs such as the ones described in
902     [Euro-IX BCP].

904     For example, IXPs should disable all unneeded control protocols, and
905     block unwanted protocols from CEs so that only IPv4, ARP and IPv6
906     Ethertypes are permitted on the peering network. In addition, port
907     security features and ACLs can provide an additional level of
908     security.

910  9. IANA Considerations

912     No IANA considerations.

914  10. References

916  10.1. Normative References

918     [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
919     Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
920     VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015.

923     [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
924     "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, DOI
925     10.17487/RFC4861, September 2007.
928     [RFC0826] Plummer, D., "An Ethernet Address Resolution Protocol: Or
929     Converting Network Protocol Addresses to 48.bit Ethernet Address for
930     Transmission on Ethernet Hardware", STD 37, RFC 826, DOI
931     10.17487/RFC0826, November 1982.

934     [RFC6820] Narten, T., Karir, M., and I. Foo, "Address Resolution
935     Problems in Large Data Center Networks", RFC 6820, DOI
936     10.17487/RFC6820, January 2013.

939     [RFC7342] Dunbar, L., Kumari, W., and I. Gashinsky, "Practices for
940     Scaling ARP and Neighbor Discovery (ND) in Large Data Centers",
941     RFC 7342, DOI 10.17487/RFC7342, August 2014.

944     [RFC3971] Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
945     "SEcure Neighbor Discovery (SEND)", RFC 3971, DOI 10.17487/RFC3971,
946     March 2005.

953     [RFC5227] Cheshire, S., "IPv4 Address Conflict Detection", RFC 5227,
954     DOI 10.17487/RFC5227, July 2008.

957  10.2. Informative References

959     [ARP-Sponge] Wessel M. and Sijm N., Universiteit van Amsterdam,
960     "Effects of IPv4 and IPv6 address resolution on AMS-IX and the ARP
961     Sponge", July 2009.

963     [EVPN-ND-FLAGS] Sathappan S., Nagaraj K. and Rabadan J., "Propagation
964     of IPv6 Neighbor Advertisement Flags in EVPN",
965     draft-snr-bess-evpn-na-flags-04, Work in Progress, July 2016.

967     [Euro-IX BCP] https://www.euro-ix.net/pages/28/1/bcp_ixp.html

969  11. Acknowledgments

971     The authors want to thank Ranganathan Boovaraghavan, Sriram
972     Venkateswaran, Manish Krishnan, Seshagiri Venugopal, Tony Przygienda,
973     Robert Raszuk and Iftekhar Hussain for their review and
974     contributions. Thank you to Oliver Knapp as well, for his detailed
975     review.

977  Authors' Addresses

979     Jorge Rabadan (Editor)
980     Nokia
981     777 E. Middlefield Road
982     Mountain View, CA 94043 USA
983     Email: jorge.rabadan@nokia.com

985     Senthil Sathappan
986     Nokia
987     Email: senthil.sathappan@nokia.com

989     Kiran Nagaraj
990     Nokia
991     Email: kiran.nagaraj@nokia.com

993     Wim Henderickx
994     Nokia
995     Email: wim.henderickx@nokia.com

997     Greg Hankins
998     Nokia
999     Email: greg.hankins@nokia.com

1001    Thomas King
1002    DE-CIX Management GmbH
1003    Lichtstrasse 43i, Cologne 50825, Germany
1004    Email: thomas.king@de-cix.net

1006    Daniel Melzer
1007    DE-CIX Management GmbH
1008    Lichtstrasse 43i, Cologne 50825, Germany
1009    Email: daniel.melzer@de-cix.net

1011    Erik Nordmark
1012    Arista Networks
1013    Email: nordmark@arista.com