idnits 2.17.1 draft-gashinsky-6man-v6nd-enhance-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC4861]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (October 22, 2012) is 4205 days in the past. Is this intentional? 
Checking references for intended status: Informational
----------------------------------------------------------------------------

== Unused Reference: 'RFC2119' is defined on line 386, but no explicit reference was found in the text

== Unused Reference: 'RFC4398' is defined on line 389, but no explicit reference was found in the text

== Unused Reference: 'RFC4862' is defined on line 396, but no explicit reference was found in the text

== Unused Reference: 'RFC6164' is defined on line 399, but no explicit reference was found in the text

== Unused Reference: 'RFC4255' is defined on line 411, but no explicit reference was found in the text

== Outdated reference: A later version (-07) exists of draft-ietf-6man-impatient-nud-02

Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 1 comment (--).

Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

Network Working Group                                          W. Kumari
Internet-Draft                                                    Google
Intended status: Informational                              I. Gashinsky
Expires: April 25, 2013                                           Yahoo!
                                                              J. Jaeggli
                                                                   Zynga
                                                         K. Chittimaneni
                                                                  Google
                                                        October 22, 2012

Neighbor Discovery Enhancement for DOS mitigation
draft-gashinsky-6man-v6nd-enhance-02

Abstract

In IPv4, subnets are generally small, made just large enough to cover the actual number of machines on the subnet. In contrast, the default IPv6 subnet size is a /64, a number so large it covers billions of billions of addresses, the overwhelming majority of which will be unassigned. Consequently, simplistic implementations of Neighbor Discovery can be vulnerable to denial of service attacks whereby they attempt to perform address resolution for large numbers of unassigned addresses. Such denial of service attacks can be launched intentionally (by an attacker), or result from legitimate operational tools that scan networks for inventory and other purposes.
As a result of these 27 vulnerabilities, new devices may not be able to "join" a network, it 28 may be impossible to establish new IPv6 flows, and existing IPv6 29 transported flows may be interrupted. 31 This document describes a modification to the [RFC4861] neighbor 32 discovery protocol aimed at improving the resilience of the neighbor 33 discovery process. We call this process Gratuitous neighbor 34 discovery and it derives inspiration in part from analogous IPv4 35 gratuitous ARP implementation. 37 Status of this Memo 39 This Internet-Draft is submitted in full conformance with the 40 provisions of BCP 78 and BCP 79. 42 Internet-Drafts are working documents of the Internet Engineering 43 Task Force (IETF). Note that other groups may also distribute 44 working documents as Internet-Drafts. The list of current Internet- 45 Drafts is at http://datatracker.ietf.org/drafts/current/. 47 Internet-Drafts are draft documents valid for a maximum of six months 48 and may be updated, replaced, or obsoleted by other documents at any 49 time. It is inappropriate to use Internet-Drafts as reference 50 material or to cite them other than as "work in progress." 52 This Internet-Draft will expire on April 25, 2013. 54 Copyright Notice 56 Copyright (c) 2012 IETF Trust and the persons identified as the 57 document authors. All rights reserved. 59 This document is subject to BCP 78 and the IETF Trust's Legal 60 Provisions Relating to IETF Documents 61 (http://trustee.ietf.org/license-info) in effect on the date of 62 publication of this document. Please review these documents 63 carefully, as they describe your rights and restrictions with respect 64 to this document. Code Components extracted from this document must 65 include Simplified BSD License text as described in Section 4.e of 66 the Trust Legal Provisions and are provided without warranty as 67 described in the Simplified BSD License. 69 Table of Contents 71 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 
. . 4 72 1.1. Applicability . . . . . . . . . . . . . . . . . . . . . . 4 73 2. The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 4 74 2.1. Scenario 1 - DoS condition induced by default router 75 failure . . . . . . . . . . . . . . . . . . . . . . . . . 5 76 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 77 4. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 7 78 5. Neighbor Discovery Overview . . . . . . . . . . . . . . . . . 8 79 6. Proposed Solutions . . . . . . . . . . . . . . . . . . . . . . 8 80 6.1. NDP Protocol Gratuitous NA . . . . . . . . . . . . . . . . 9 81 6.2. User Configurable DELAY_FIRST_PROBE_TIME . . . . . . . . . 9 82 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 83 8. Security Considerations . . . . . . . . . . . . . . . . . . . 10 84 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10 85 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 86 10.1. Normative References . . . . . . . . . . . . . . . . . . . 10 87 10.2. Informative References . . . . . . . . . . . . . . . . . . 10 88 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 11 90 1. Introduction 92 This document describes modifications to the IPv6 Neighbor Discovery 93 protocol [RFC4861] in order to reduce exposure to vulnerabilities 94 when a network is scanned, either by an intruder, as part of a 95 deliberate DOS attempt, or through the use of scanning tools that 96 perform network inventory, security audits, etc. (e.g., "nmap"). In 97 some cases, DOS-like conditions can also be induced by legitimate 98 traffic in heavy traffic networks such as campuses or datacenters. 100 1.1. Applicability 102 This document is primarily intended for implementors of [RFC4861]. 104 This document is a companion to two additional documents. 
The first, "Operational Neighbor Discovery Problems" [RFC6583], addresses the problem in detail and describes operational and implementation mitigations within the framework of the existing protocol. The second, "Neighbor Unreachability Detection is too impatient" [I-D.ietf-6man-impatient-nud], proposes to alter Neighbor Unreachability Detection by relaxing its rules in an attempt to keep devices in the cache.

In this document we propose alterations that allow neighbor entries to be updated or installed without instigating a full [RFC4861] neighbor solicitation.

2. The Problem

In IPv4, subnets are generally small, made just large enough to cover the actual number of machines on the subnet. For example, an IPv4 /20 contains only 4096 addresses. In contrast, the default IPv6 subnet size is a /64, a number so large it covers literally billions of billions of addresses, the overwhelming majority of which will be unassigned. Consequently, simplistic implementations of Neighbor Discovery can be vulnerable to denial of service attacks whereby they perform address resolution for large numbers of unassigned addresses. Such denial of service attacks can be launched intentionally (by an attacker), or result from legitimate operational tools that scan networks for inventory and other purposes. As a result of these vulnerabilities, new devices may not be able to "join" a network, it may be impossible to establish new IPv6 flows, and existing IPv6 transport flows may be interrupted.

Network scans attempt to find and probe devices on a network. Typically, scans are performed on a range of target addresses, or all the addresses on a particular subnet.
When such probes are directed via a router, and the target addresses are on a directly attached network, the router will attempt to perform address resolution on a large number of destinations (i.e., some fraction of the 2^64 addresses on the subnet). The process of testing for the (non)existence of neighbors can induce a denial of service condition, where the number of Neighbor Discovery requests overwhelms the implementation's capacity to process them, exhausts available memory, replaces existing in-use mappings with incomplete entries that will never be completed, etc. The result can be network disruption, where existing traffic may be impacted, and devices that join the net find that address resolution fails.

In order to alleviate the risk associated with this DoS threat, some router implementations have taken steps to rate-limit the processing of Neighbor Solicitations (NS). While these mitigations do help, they do not fully address the issue and may introduce their own set of potential liabilities to the neighbor discovery process.

In some network environments, legitimate Neighbor Discovery traffic from a large number of connected hosts could induce a DoS condition even without the use of any scanning tools.

2.1. Scenario 1 - DoS condition induced by default router failure

Consider the following scenario: a pair of routers, R1 and R2, act as default routers for a campus Wi-Fi network that serves thousands of clients. These clients range from traditional laptops with common OSes such as Windows, Mac OS X, etc., to smart phones and tablets running a slew of mobile OSes. R1, R2 and all clients are configured with default ND parameters.

Under normal operating conditions, R1 acts as a default gateway for all client traffic and R2 is mostly acting as a standby.
R1 and R2 170 routinely send out Router Advertisements and all nodes perform 171 Neighbor Discovery as per the default timers configured. Clients 172 that are actively transmitting and receiving data will likely have a 173 Neighbor Cache entry for R1 as REACHABLE and R2 as STALE. 175 Now imagine that for some reason (power outage, hardware failure, 176 etc.) R1 goes down. When this happens, R2 begins various 177 housekeeping tasks such as reconverging its routing protocols (OSPF, 178 BGP, etc.), recalculating layer 2 topologies such as in STP and so 179 on. Typically, such reconvergence incidents are quite CPU intensive 180 depending on the size of the topology and are generally aggravated in 181 dual stack environments. Once clients determine that R1 is no longer 182 reachable, they would start using R2 as their default router. 184 At this point, the Neighbor Cache Entry for R2 is still marked as 185 STALE. As per RFC4861, a node will start sending packets to R2, mark 186 the neighbor cache entry for R2 as DELAY and set a timer to expire in 187 DELAY_FIRST_PROBE_TIME seconds. DELAY_FIRST_PROBE_TIME is a fixed 188 node constant with a value of 5 seconds. If the entry is still in 189 the DELAY state when the timer expires, the entry's state changes to 190 PROBE. Upon entering the PROBE state, a node sends a unicast 191 Neighbor Solicitation message to R2 using the cached link-layer 192 address. 194 Ordinarily, it is highly likely that the client will receive 195 reachability confirmation within the 5 seconds of 196 DELAY_FIRST_PROBE_TIME by virtue of hints from upper layer protocols. 197 However, in this scenario, given that R2 is busy doing other things, 198 it is possible that it will take a longer time for the client to 199 receive said reachability confirmation, forcing it to enter the PROBE 200 state and send out a unicast NS message. 
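The STALE, DELAY and PROBE transitions described above can be sketched as a small state machine. This is only an illustrative sketch, not an implementation of RFC 4861: the class and method names (NeighborEntry, on_send, on_tick, on_reachability_confirmed) are hypothetical, time is passed in explicitly as seconds, and retransmission of probes is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Node constant described in the text above (seconds).
DELAY_FIRST_PROBE_TIME = 5.0


class State(Enum):
    STALE = "STALE"
    DELAY = "DELAY"
    PROBE = "PROBE"
    REACHABLE = "REACHABLE"


@dataclass
class NeighborEntry:
    """Hypothetical per-neighbor cache entry, e.g. a client's entry for R2."""
    state: State = State.STALE
    delay_deadline: Optional[float] = None
    probes_sent: int = 0

    def on_send(self, now: float) -> None:
        # Sending a packet via a STALE entry moves it to DELAY and arms
        # the DELAY_FIRST_PROBE_TIME timer.
        if self.state is State.STALE:
            self.state = State.DELAY
            self.delay_deadline = now + DELAY_FIRST_PROBE_TIME

    def on_reachability_confirmed(self) -> None:
        # An upper-layer hint or a solicited NA confirms reachability.
        self.state = State.REACHABLE
        self.delay_deadline = None

    def on_tick(self, now: float) -> bool:
        # Returns True when a unicast NS should be sent: the entry is
        # still in DELAY when the timer expires, so it enters PROBE.
        if self.state is State.DELAY and now >= self.delay_deadline:
            self.state = State.PROBE
            self.probes_sent += 1
            return True
        return False
```

In the scenario above, a client whose confirmation arrives within the 5 seconds never calls for a probe, while a client that hears nothing from a busy R2 reaches PROBE and sends a unicast NS.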
With thousands of clients now sending out unicast NS messages to R2 in a short period of time, while it is busy dealing with other reconvergence-related calculations, you effectively end up in a DoS situation caused entirely by legitimate traffic.

3. Terminology

Address Resolution:  Address resolution is the process through which a node determines the link-layer address of a neighbor given only its IP address. In IPv6, address resolution is performed as part of Neighbor Discovery [RFC4861], p60.

Forwarding Plane:  That part of a router responsible for forwarding packets. In higher-end routers, the forwarding plane is typically implemented in specialized hardware optimized for performance. Forwarding steps include determining the correct outgoing interface for a packet, decrementing its Time To Live (TTL), verifying and updating the checksum, placing the correct link-layer header on the packet, and forwarding it.

Control Plane:  That part of the router implementation that maintains the data structures that determine where packets should be forwarded. The control plane is typically implemented as a "slower" software process running on a general purpose processor and is responsible for such functions as the routing protocols, performing management and resolving the correct link-layer address for adjacent neighbors. The control plane "controls" the forwarding plane by programming it with the information needed for packet forwarding.

Neighbor Cache:  As described in [RFC4861], the data structure that holds the cache of (amongst other things) IP address to link-layer address mappings for connected nodes. The forwarding plane accesses the Neighbor Cache on every forwarded packet. Thus it is usually implemented in an ASIC.
Neighbor Discovery Process:  The Neighbor Discovery Process (NDP) is that part of the control plane that implements the Neighbor Discovery protocol. NDP is responsible for performing address resolution and maintaining the Neighbor Cache. When forwarding packets, the forwarding plane accesses entries within the Neighbor Cache. Whenever the forwarding plane processes a packet for which the corresponding Neighbor Cache Entry is missing or incomplete, it notifies NDP to take appropriate action (typically via a shared queue). NDP picks up requests from the shared queue and performs any necessary actions. In many implementations it is also responsible for responding to router solicitation messages, Neighbor Unreachability Detection (NUD), etc.

4. Background

Modern router architectures separate the forwarding of packets (forwarding plane) from the decisions needed to decide where the packets should go (control plane). In order to deal with the high number of packets per second, the forwarding plane is generally implemented in hardware and is highly optimized for the task of forwarding packets. In contrast, the NDP control plane is mostly implemented in software processes running on a general purpose processor.

When a router needs to forward an IP packet, the forwarding plane logic performs the longest match lookup to determine where to send the packet and what outgoing interface to use. To deliver the packet to an adjacent node, it encapsulates the packet in a link-layer frame (which contains a header with the link-layer destination address). The forwarding plane logic checks the Neighbor Cache to see if it already has a suitable link-layer destination, and if not, places the request for the required information into a queue, and signals the control plane (i.e., NDP) that it needs the link-layer address resolved.
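As a rough illustration of the hand-off between the forwarding plane and NDP, the sketch below models the Neighbor Cache as a dictionary and the shared queue as a bounded FIFO. Everything here is a hypothetical simplification: the names ResolutionQueue and forward, the max_pending limit, and the drop-when-full policy are assumptions made for the example and do not describe any particular router.

```python
from collections import deque


class ResolutionQueue:
    """Hypothetical shared queue between the forwarding plane and NDP."""

    def __init__(self, max_pending: int) -> None:
        self.max_pending = max_pending  # assumed size limit
        self.pending = deque()
        self.dropped = 0

    def request(self, addr: str) -> bool:
        # Queue an address for resolution; when the queue is full the
        # request is dropped (one possible policy, assumed here).
        if len(self.pending) >= self.max_pending:
            self.dropped += 1
            return False
        self.pending.append(addr)
        return True


def forward(dest: str, neighbor_cache: dict, queue: ResolutionQueue):
    """Return the link-layer address for dest, or None if it must be resolved."""
    link_addr = neighbor_cache.get(dest)
    if link_addr is not None:
        return link_addr      # cache hit: frame can be built immediately
    queue.request(dest)       # cache miss: hand the address to NDP
    return None
```

With a small queue, a burst of packets to unassigned addresses fills every slot, and a later request for a real neighbor is the one that gets dropped, which is the crowding-out effect this section goes on to describe.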
In order to protect NDP specifically and the control plane generally from being overwhelmed with these requests, appropriate steps must be taken. For example, the size and rate of the queue might be limited. NDP running in the control plane of the router dequeues requests and performs the address resolution function (by performing a neighbor solicitation and listening for a neighbor advertisement). This process is usually also responsible for other activities needed to maintain link-layer information, such as Neighbor Unreachability Detection (NUD).

An attacker sending the appropriate packets to addresses on a given subnet can cause the router to queue attempts to resolve so many addresses that it crowds out attempts to resolve "legitimate" addresses (and in many cases becomes unable to perform maintenance of existing entries in the neighbor cache, and unable to answer Neighbor Solicitations). This condition can result in the inability to resolve new neighbors and loss of reachability to neighbors with existing ND cache entries. During testing it was concluded that 4 simultaneous nmap sessions from a low-end computer were sufficient to overwhelm a router's neighbor discovery process and therefore make forwarding unusable.

This behavior has been observed across multiple platforms and implementations.

5. Neighbor Discovery Overview

When a packet arrives at (or is generated by) a router for a destination on an attached link, the router needs to determine the correct link-layer address to send the packet to. The router checks the Neighbor Cache for an existing Neighbor Cache Entry for the neighbor, and if none exists, invokes the address resolution portions of the IPv6 Neighbor Discovery [RFC4861] protocol to determine the link-layer address.

RFC4861 Section 5.2 (Conceptual Sending Algorithm) outlines how this process works.
A very high level summary is that the device creates a new Neighbor Cache Entry for the neighbor, sets the state to INCOMPLETE, queues the packet and initiates the actual address resolution process. The device then sends out one or more Neighbor Solicitations, and when it receives a corresponding Neighbor Advertisement, completes the Neighbor Cache Entry and sends the queued packet.

6. Proposed Solutions

Let us examine a few possible solutions that could alleviate the issues discussed in Section 2 ("The Problem").

6.1. NDP Protocol Gratuitous NA

Sections 7.2.5 and 7.2.6 of [RFC4861] require that unsolicited neighbor advertisements result in the receiver setting its neighbor cache entry to STALE, kicking off the resolution of the neighbor using neighbor solicitation. If the link-layer address in an unsolicited neighbor advertisement matches that of the existing ND cache entry, routers SHOULD retain the existing entry, updating its status with regard to the LRU retention policy.

Hosts MAY be configured to send unsolicited Neighbor Advertisements at a rate set at the discretion of the operators. The rate SHOULD be appropriate to the sizing of ND cache parameters and the host count on the subnet. An unsolicited NA rate parameter MUST NOT be enabled by default. Hosts must apply jitter to the configured interval between transmissions. Hosts receiving neighbor solicitation requests from a router following each of three subsequent gratuitous NA intervals MUST revert to RFC 4861 behavior.

Implementation of new behavior for unsolicited neighbor advertisements would make it possible under appropriate circumstances to greatly reduce the dependence on the neighbor solicitation process for retaining existing ND cache entries.

This may impact the detection of one-way reachability.

6.2.
User Configurable DELAY_FIRST_PROBE_TIME 351 A very simple solution for Scenario 1 could be to have a user 352 configurable DELAY_FIRST_PROBE_TIME that could be set to a higher 353 value than the current constant of 5 seconds. This would allow 354 clients to keep sending traffic in the DELAY state, while giving more 355 time for R2 to stabilize before it has to process the barrage of ND 356 messages. It will be up to Network administrators to determine what 357 this value should be based upon unique characteristics of their 358 setup. Having a longer DELAY_FIRST_PROBE_TIME does run the risk of 359 clients sending traffic without ever knowing that they have forward 360 reachability. However, in most cases, the router's forwarding plane 361 remains unaffected during high CPU events and therefore the 362 likelihood of the traffic making it to the destination is high. 364 7. IANA Considerations 366 No IANA resources or consideration are requested in this draft. 368 8. Security Considerations 370 This technique has potential impact on neighbor detection and in 371 particular the discovery of unidirectional forwarding problems. 373 9. Acknowledgements 375 The authors would like to thank Ron Bonica, Troy Bonin, John Jason 376 Brzozowski, Randy Bush, Vint Cerf, Jason Fesler Erik Kline, Jared 377 Mauch, Chris Morrow and Suran De Silva. Special thanks to Thomas 378 Narten for detailed review and (even more so) for providing text! 380 Apologies for anyone we may have missed; it was not intentional. 382 10. References 384 10.1. Normative References 386 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 387 Requirement Levels", BCP 14, RFC 2119, March 1997. 389 [RFC4398] Josefsson, S., "Storing Certificates in the Domain Name 390 System (DNS)", RFC 4398, March 2006. 392 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 393 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 394 September 2007. 396 [RFC4862] Thomson, S., Narten, T., and T. 
Jinmei, "IPv6 Stateless 397 Address Autoconfiguration", RFC 4862, September 2007. 399 [RFC6164] Kohno, M., Nitzan, B., Bush, R., Matsuzaki, Y., Colitti, 400 L., and T. Narten, "Using 127-Bit IPv6 Prefixes on Inter- 401 Router Links", RFC 6164, April 2011. 403 10.2. Informative References 405 [I-D.ietf-6man-impatient-nud] 406 Nordmark, E. and I. Gashinsky, "Neighbor Unreachability 407 Detection is too impatient", 408 draft-ietf-6man-impatient-nud-02 (work in progress), 409 July 2012. 411 [RFC4255] Schlyter, J. and W. Griffin, "Using DNS to Securely 412 Publish Secure Shell (SSH) Key Fingerprints", RFC 4255, 413 January 2006. 415 [RFC6583] Gashinsky, I., Jaeggli, J., and W. Kumari, "Operational 416 Neighbor Discovery Problems", RFC 6583, March 2012. 418 Authors' Addresses 420 Warren Kumari 421 Google 423 Email: warren@kumari.net 425 Igor 426 Yahoo! 427 45 W 18th St 428 New York, NY 429 USA 431 Email: igor@yahoo-inc.com 433 Joel 434 Zynga 435 111 Evelyn 436 Sunnyvale, CA 437 USA 439 Email: jjaeggli@zynga.com 441 Kiran 442 Google 443 1600 Amphitheater Pkwy 444 Mountain View, CA 445 USA 447 Email: kk@google.com