idnits 2.17.1 draft-ietf-v6ops-v6nd-problems-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 03, 2012) is 4408 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC2119' is defined on line 496, but no explicit reference was found in the text -- Obsolete informational reference (is this intentional?): RFC 5157 (Obsoleted by RFC 7707) Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 v6ops I. Gashinsky 3 Internet-Draft Yahoo! 4 Intended status: Informational J. Jaeggli 5 Expires: September 4, 2012 Zynga 6 W. Kumari 7 Google Inc 8 March 03, 2012 10 Operational Neighbor Discovery Problems 11 draft-ietf-v6ops-v6nd-problems-05 13 Abstract 15 In IPv4, subnets are generally small, made just large enough to cover 16 the actual number of machines on the subnet. 
In contrast, the 17 default IPv6 subnet size is a /64, a number so large it covers 18 billions of billions of addresses, the overwhelming majority of which will be 19 unassigned. Consequently, simplistic implementations of Neighbor 20 Discovery (ND) can be vulnerable to deliberate or accidental denial 21 of service, whereby they attempt to perform address resolution for 22 large numbers of unassigned addresses. Such denial-of-service conditions can be 23 triggered intentionally (by an attacker), or result from legitimate 24 operational tools or accidental conditions. As a result of these 25 vulnerabilities, new devices may not be able to "join" a network, it 26 may be impossible to establish new IPv6 flows, and existing IPv6 27 transport flows may be interrupted. 29 This document describes the potential for DoS in detail and suggests 30 possible implementation improvements as well as operational 31 mitigation techniques that can in some cases be used to protect 32 against or at least alleviate the impact of such attacks. 34 Status of this Memo 36 This Internet-Draft is submitted in full conformance with the 37 provisions of BCP 78 and BCP 79. 39 Internet-Drafts are working documents of the Internet Engineering 40 Task Force (IETF). Note that other groups may also distribute 41 working documents as Internet-Drafts. The list of current 42 Internet-Drafts is at http://datatracker.ietf.org/drafts/current/. 44 Internet-Drafts are draft documents valid for a maximum of six months 45 and may be updated, replaced, or obsoleted by other documents at any 46 time. It is inappropriate to use Internet-Drafts as reference 47 material or to cite them other than as "work in progress." 48 This Internet-Draft will expire on September 4, 2012. 50 Copyright Notice 52 Copyright (c) 2012 IETF Trust and the persons identified as the 53 document authors. All rights reserved. 
55 This document is subject to BCP 78 and the IETF Trust's Legal 56 Provisions Relating to IETF Documents 57 (http://trustee.ietf.org/license-info) in effect on the date of 58 publication of this document. Please review these documents 59 carefully, as they describe your rights and restrictions with respect 60 to this document. Code Components extracted from this document must 61 include Simplified BSD License text as described in Section 4.e of 62 the Trust Legal Provisions and are provided without warranty as 63 described in the Simplified BSD License. 65 Table of Contents 67 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 68 1.1. Applicability . . . . . . . . . . . . . . . . . . . . . . 4 69 2. The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 4 70 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 71 4. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 6 72 5. Neighbor Discovery Overview . . . . . . . . . . . . . . . . . 7 73 6. Operational Mitigation Options . . . . . . . . . . . . . . . . 8 74 6.1. Filtering of unused address space. . . . . . . . . . . . . 8 75 6.2. Minimal Subnet Sizing. . . . . . . . . . . . . . . . . . . 8 76 6.3. Routing Mitigation. . . . . . . . . . . . . . . . . . . . 9 77 6.4. Tuning of the NDP Queue Rate Limit. . . . . . . . . . . . 9 78 7. Recommendations for Implementors. . . . . . . . . . . . . . . 9 79 7.1. Prioritize NDP Activities . . . . . . . . . . . . . . . . 10 80 7.2. Queue Tuning. . . . . . . . . . . . . . . . . . . . . . . 11 81 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 82 9. Security Considerations . . . . . . . . . . . . . . . . . . . 12 83 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 12 84 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 12 85 11.1. Normative References . . . . . . . . . . . . . . . . . . . 12 86 11.2. Informative References . . . . . . . . . . . . . . . . . . 
13 87 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 13 89 1. Introduction 91 This document describes implementation issues with IPv6's Neighbor 92 Discovery protocol that can result in vulnerabilities when a network 93 is scanned, either by an intruder or through the use of scanning 94 tools that perform network inventory, security audits, etc. (e.g., 95 "nmap"). 97 This document describes the problem in detail and suggests possible 98 implementation improvements, as well as operational mitigation 99 techniques, that can in some cases protect against such attacks. 101 The RFC series documents generally describe the behavior of 102 protocols, that is, "what" is to be done by a protocol, but not 103 exactly "how" it is to be implemented. The exact details of how best 104 to implement a protocol will depend on the overall hardware and 105 software architecture of a particular device. The actual "how" 106 decisions are (correctly) left in the hands of implementers, so long 107 as the differing implementations produce proper 108 on-the-wire behavior. 110 While reading this document, it is important to keep in mind that 111 discussions of how things have been implemented beyond basic 112 compliance with the specification are not within the scope of the 113 neighbor discovery RFCs. 115 1.1. Applicability 117 This document is primarily intended for operators of IPv6 networks 118 and implementors of [RFC4861]. This document provides some 119 operational considerations as well as recommendations to increase the 120 resilience of the Neighbor Discovery protocol. 122 2. The Problem 124 In IPv4, subnets are generally small, made just large enough to cover 125 the actual number of machines on the subnet. For example, an IPv4 126 /20 contains only 4096 addresses. In contrast, the default IPv6 subnet 127 size is a /64, a number so large it covers literally billions of 128 billions of addresses, the overwhelming majority of which will be 129 unassigned. 
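As a rough illustration of the size difference just described, the following sketch uses Python's standard ipaddress module to count the addresses in an IPv4 /20 and in an IPv6 /64. The two prefixes are assumptions chosen for illustration; they do not appear in this document.

```python
import ipaddress

# Illustrative prefixes only (assumptions, not taken from this document).
v4_subnet = ipaddress.ip_network("10.0.0.0/20")
v6_subnet = ipaddress.ip_network("2001:db8:1:2::/64")

v4_count = v4_subnet.num_addresses   # 2**12 addresses in a /20
v6_count = v6_subnet.num_addresses   # 2**64 addresses in a /64

print("IPv4 /20 addresses:", v4_count)
print("IPv6 /64 addresses:", v6_count)
# How many IPv4 /20 subnets' worth of addresses fit into a single /64.
print("ratio:", v6_count // v4_count)
```

Even a subnet with thousands of hosts therefore occupies a vanishingly small fraction of a /64, which is why nearly every probed address is unassigned.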
Consequently, simplistic implementations of Neighbor 130 Discovery may fail to perform as desired when they perform address 131 resolution of large numbers of unassigned addresses. Such failures 132 can be triggered either intentionally by an attacker launching a 133 Denial of Service attack (DoS) [RFC4732] to exploit this 134 vulnerability, or unintentionally due to the use of legitimate 135 operational tools that scan networks for inventory and other 136 purposes. As a result of these failures, new devices may not be able 137 to "join" a network, it may be impossible to establish new IPv6 138 flows, and existing IPv6 transport flows may be interrupted. 140 Network scans attempt to find and probe devices on a network. 141 Typically, scans are performed on a range of target addresses, or all 142 the addresses on a particular subnet. When such probes are directed 143 via a router, and the target addresses are on a directly attached 144 network, the router will attempt to perform address resolution on a 145 large number of destinations (i.e., some fraction of the 2^64 146 addresses on the subnet). The router's process of testing for the 147 (non)existence of neighbors can induce a denial of service condition, 148 where the number of necessary Neighbor Discovery requests overwhelms 149 the implementation's capacity to process them, exhausts available 150 memory, and replaces existing in-use mappings with incomplete entries 151 that will never be completed. A directed DoS attack may seek to 152 intentionally create conditions similar to those created 153 unintentionally by a network scan. The resulting network disruption 154 may impact existing traffic, and devices that join the network may 155 find that address resolution attempts fail. 
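The replacement of in-use mappings by incomplete entries can be illustrated with a toy model. The sketch below is not how any particular router behaves: the fixed capacity, the evict-oldest policy, the class name ToyNeighborCache, and the addresses are all assumptions made for illustration.

```python
from collections import OrderedDict


class ToyNeighborCache:
    """Fixed-size cache that evicts the oldest entry when full (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # IPv6 address -> entry state

    def add(self, address, state):
        if address in self.entries:
            self.entries.move_to_end(address)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        self.entries[address] = state

    def count(self, state):
        return sum(1 for s in self.entries.values() if s == state)


cache = ToyNeighborCache(capacity=100)

# 50 real hosts whose link-layer addresses have been resolved.
for host in range(50):
    cache.add(f"2001:db8:1:2::{host + 1:x}", "REACHABLE")

# A scan of 1000 unassigned addresses; each probe creates an INCOMPLETE entry.
for probe in range(1000):
    cache.add(f"2001:db8:1:2::ffff:{probe:x}", "INCOMPLETE")

print("reachable:", cache.count("REACHABLE"))
print("incomplete:", cache.count("INCOMPLETE"))
```

In this model the scan pushes every resolved host out of the cache, leaving only entries that will never complete.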
The DoS condition that results from 156 network scanning was previously described in [RFC5157]. 158 In order to mitigate risk associated with this DoS threat, some 159 router implementations have taken steps to rate-limit the 160 processing of Neighbor Solicitations (NS). While these mitigations do 161 help, they do not fully address the issue and may introduce their own 162 set of issues to the neighbor discovery process. 164 3. Terminology 166 Address Resolution Address resolution is the process through which a 167 node determines the link-layer address of a neighbor given only 168 its IP address. In IPv6, address resolution is performed as part 169 of Neighbor Discovery [RFC4861], p60. 171 Forwarding Plane That part of a router responsible for forwarding 172 packets. In higher-end routers, the forwarding plane is typically 173 implemented in specialized hardware optimized for performance. 174 Steps in the forwarding process include determining the correct 175 outgoing interface for a packet, decrementing its Time To Live 176 (TTL), verifying and updating the checksum, placing the correct 177 link-layer header on the packet, and forwarding it. 179 Control Plane That part of the router implementation that maintains 180 the data structures that determine where packets should be 181 forwarded. The control plane is typically implemented as a 182 "slower" software process running on a general purpose processor 183 and is responsible for such functions as communicating network 184 status changes via routing protocols, maintaining the forwarding 185 table, performing management, and resolving the correct link-layer 186 address for adjacent neighbors. The control plane "controls" the 187 forwarding plane by programming it with the information needed for 188 packet forwarding. 190 Neighbor Cache As described in [RFC4861], the data structure that 191 holds the cache of (amongst other things) IP address to link-layer 192 address mappings for connected nodes. 
As the information in the 193 Neighbor Cache is needed by the forwarding plane every time it 194 forwards a packet, it is usually implemented in an ASIC. 196 Neighbor Discovery Process The Neighbor Discovery Process (NDP) is 197 that part of the control plane that implements the Neighbor 198 Discovery protocol. NDP is responsible for performing address 199 resolution and maintaining the Neighbor Cache. When forwarding 200 packets, the forwarding plane accesses entries within the Neighbor 201 Cache. When the forwarding plane processes a packet for which the 202 corresponding Neighbor Cache Entry is missing or incomplete, it 203 notifies NDP to take appropriate action (typically via a shared 204 queue). NDP picks up requests from the shared queue and performs 205 any necessary discovery action. In many implementations the NDP 206 is also responsible for responding to router solicitation 207 messages, Neighbor Unreachability Detection (NUD), etc. 209 4. Background 211 Modern router architectures separate the forwarding of packets 212 (forwarding plane) from the decisions needed to decide where the 213 packets should go (control plane). In order to deal with the high 214 number of packets per second, the forwarding plane is generally 215 implemented in hardware and is highly optimized for the task of 216 forwarding packets. In contrast, the NDP control plane is mostly 217 implemented in software processes running on a general purpose 218 processor. 220 When a router needs to forward an IP packet, the forwarding plane 221 logic performs the longest match lookup to determine where to send 222 the packet and what outgoing interface to use. To deliver the packet 223 to an adjacent node, the forwarding plane encapsulates the packet in 224 a link-layer frame (which contains a header with the link-layer 225 destination address). 
The forwarding plane logic checks the Neighbor 226 Cache to see if it already has a suitable link-layer destination, and 227 if not, places the request for the required information into a queue, 228 and signals the control plane (i.e., NDP) that it needs the 229 link-layer address resolved. 231 In order to protect NDP specifically and the control plane generally 232 from being overwhelmed with these requests, appropriate steps must be 233 taken. For example, the size and fill rate of the queue might be 234 limited. NDP running in the control plane of the router dequeues 235 requests and performs the address resolution function (by performing 236 a neighbor solicitation and listening for a neighbor advertisement). 237 This process is usually also responsible for other activities needed 238 to maintain link-layer information, such as Neighbor Unreachability 239 Detection (NUD). 241 By sending appropriate packets to addresses on a given subnet, an 242 attacker can cause the router to queue attempts to resolve so many 243 addresses that it crowds out attempts to resolve "legitimate" 244 addresses (and in many cases becomes unable to perform maintenance of 245 existing entries in the neighbor cache, and unable to answer Neighbor 246 Solicitations). This condition can result in the inability to resolve 247 new neighbors and loss of reachability to neighbors with existing 248 ND-Cache entries. During testing it was concluded that 4 simultaneous 249 nmap sessions from a low-end computer were sufficient to make a 250 router's neighbor discovery process unusable, so that forwarding 251 became unavailable to the destination subnets. 253 The failure to maintain proper NDP behavior whilst under attack has 254 been observed across multiple platforms and implementations, 255 including the largest modern router platforms available (at the 256 inception of work on this document). 258 5. 
Neighbor Discovery Overview 260 When a packet arrives at (or is generated by) a router for a 261 destination on an attached link, the router needs to determine the 262 correct link-layer address to use in the destination field of the 263 layer 2 encapsulation. The router checks the Neighbor Cache for an 264 existing Neighbor Cache Entry for the neighbor, and if none exists, 265 invokes the address resolution portions of the IPv6 Neighbor 266 Discovery [RFC4861] protocol to determine the link-layer address of 267 the neighbor. 269 [RFC4861] Section 5.2 (Conceptual Sending Algorithm) outlines how 270 this process works. A very high level summary is that the device 271 creates a new Neighbor Cache Entry for the neighbor, sets the state 272 to INCOMPLETE, queues the packet and initiates the actual address 273 resolution process. The device then sends out one or more Neighbor 274 Solicitations, and when it receives a corresponding Neighbor 275 Advertisement, completes the Neighbor Cache Entry and sends the 276 queued packet. 278 6. Operational Mitigation Options 280 This section provides some feasible mitigation options that can be 281 employed today by network operators in order to protect network 282 availability while vendors implement more effective protection 283 measures. It can be stated that some of these options are "kludges", 284 and can be operationally difficult to manage. They are presented, as 285 they represent options we currently have. It is each operator's 286 responsibility to evaluate and understand the impact of changes to 287 their network due to these measures. 289 6.1. Filtering of unused address space. 291 The DoS condition is induced by making a router try to resolve 292 addresses on the subnet at a high rate. 
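To make the "resolve addresses at a high rate" condition concrete, the per-destination behavior summarized in Section 5 can be sketched as below. This is a heavily simplified model under stated assumptions: the names ToyRouter and NeighborCacheEntry are hypothetical, retransmission, timers, and advertisement flags are omitted, and every advertisement is treated as a solicited one that moves the entry to REACHABLE.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class NeighborCacheEntry:
    state: str = "INCOMPLETE"
    link_layer_address: Optional[str] = None
    queued_packets: List[str] = field(default_factory=list)


class ToyRouter:
    def __init__(self):
        self.neighbor_cache: Dict[str, NeighborCacheEntry] = {}
        self.solicitations_sent: List[str] = []
        self.delivered: List[Tuple[str, str]] = []

    def forward(self, destination, packet):
        entry = self.neighbor_cache.get(destination)
        if entry is None:
            # No entry: create one in INCOMPLETE state, queue the packet,
            # and start address resolution with a Neighbor Solicitation.
            entry = NeighborCacheEntry()
            self.neighbor_cache[destination] = entry
            entry.queued_packets.append(packet)
            self.solicitations_sent.append(destination)
        elif entry.state == "INCOMPLETE":
            entry.queued_packets.append(packet)  # resolution still pending
        else:
            self.delivered.append((entry.link_layer_address, packet))

    def receive_advertisement(self, target, link_layer_address):
        entry = self.neighbor_cache.get(target)
        if entry is None:
            return  # advertisements for unknown targets are ignored here
        entry.link_layer_address = link_layer_address
        entry.state = "REACHABLE"
        while entry.queued_packets:
            self.delivered.append((link_layer_address, entry.queued_packets.pop(0)))


router = ToyRouter()
router.forward("2001:db8:1:2::10", "pkt-1")    # assigned address
router.forward("2001:db8:1:2::dead", "pkt-2")  # unassigned address
router.receive_advertisement("2001:db8:1:2::10", "00:00:5e:00:53:01")

print(router.delivered)
print(router.neighbor_cache["2001:db8:1:2::dead"].state)
```

The entry for the unassigned address never receives an advertisement, so it stays INCOMPLETE while still consuming a cache slot, a queued packet, and a solicitation.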
By carefully addressing 293 machines into a small portion of a subnet (such as the lowest 294 numbered addresses), it is possible to filter access to addresses not 295 in that assigned portion of address space using Access Control Lists 296 (ACLs), or by null routing, features which are available on most 297 existing platforms. This will prevent the attacker from making the 298 router attempt to resolve unused addresses. For example, if there are 299 only 50 hosts connected to an interface, you may be able to filter 300 any address above the first 64 addresses of that subnet by 301 null-routing the subnet while carrying a more specific /122 route, or by applying 302 ACLs on the WAN link to prevent the attack traffic reaching the 303 vulnerable device. 305 As mentioned at the beginning of this section, it is fully understood 306 that this is ugly (and difficult to manage); but failing other 307 options, it may be a useful technique especially when responding to 308 an attack. 310 This solution requires that the hosts be statically or statefully 311 addressed (as is often done in a datacenter) and may not interact 312 well with networks using stateless address autoconfiguration [RFC4862]. 314 6.2. Minimal Subnet Sizing. 316 By sizing subnets to reflect the number of addresses actually in use, 317 the problem can be avoided. For example, [RFC6164] recommends sizing 318 the subnets for inter-router links to only have 2 addresses (a /127). 319 It is worth noting that this practice is common in IPv4 networks, in 320 part to protect against the harmful effects of ARP request flooding. 322 Subnet prefixes longer than a /64 are not able to use stateless 323 autoconfiguration [RFC4862], so this approach is not suitable for use with 324 hosts that are not statically configured. 326 6.3. Routing Mitigation. 
328 One very effective technique is to route the subnet to a discard 329 interface (most modern router platforms can discard traffic in 330 hardware / the forwarding plane) and then have individual hosts 331 announce routes for their IP addresses into the network (or use some 332 method to inject much more specific addresses into the local routing 333 domain). For example, the network 2001:db8:1:2:3::/64 could be routed 334 to a discard interface on "border" routers, and then individual hosts 335 could announce 2001:db8:1:2:3::10/128, 2001:db8:1:2:3::66/128 into 336 the IGP. This is typically done by having the IP address bound to a 337 virtual interface on the host (for example the loopback interface), 338 enabling IP forwarding on the host and having it run a routing 339 daemon. For obvious reasons, host participation in the IGP makes 340 many operators uncomfortable, but can be a very powerful technique if 341 used in a disciplined and controlled manner. One method to help 342 address these concerns is to have the hosts participate in a 343 different IGP (or different instance of the same IGP) and carefully 344 redistribute into the main IGP. 346 6.4. Tuning of the NDP Queue Rate Limit. 348 Many implementations provide a means to control the rate of 349 resolution of unknown addresses. By tuning this rate, it may be 350 possible to ameliorate the issue. However, as with most tuning knobs 351 (especially those that deal with rate limiting), a lower threshold may 352 simply allow the attack to succeed more quickly. By excessively 353 lowering this rate you may negatively impact how long the device 354 takes to learn new addresses under normal conditions (for example, 355 after clearing the neighbor cache or when the router first boots). 356 Under attack conditions, a lower limit may leave you unable to resolve "legitimate" 357 addresses sooner than if you had just left the parameter untouched. 
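The trade-off described above can be illustrated with a simple back-of-the-envelope model. The sketch below assumes, purely for illustration, that a single shared rate limiter admits requests in proportion to their share of the arriving load; real implementations may drop requests differently, and the function name and all numbers are hypothetical.

```python
import math


def legitimate_resolutions_per_second(rate_limit, legit_rate, attack_rate):
    """Expected legitimate requests admitted per second by a shared limiter.

    Assumes admission is proportional to each source's share of arrivals.
    """
    offered = legit_rate + attack_rate
    if offered <= rate_limit:
        return float(legit_rate)  # limiter is not saturated
    return rate_limit * legit_rate / offered


# 10 legitimate requests/s competing with 100,000 attack requests/s.
for limit in (1000, 100):
    admitted = legitimate_resolutions_per_second(limit, 10, 100_000)
    print(f"limit={limit}/s -> about {admitted:.4f} legitimate resolutions/s")
```

Under this assumed model, cutting the limit by a factor of ten also cuts the rate of legitimate resolutions during an attack by a factor of ten, which is consistent with the caution given above.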
359 It is worth noting that this technique is worth investigating only if 360 the device has separate queues for resolution of unknown addresses 361 and the maintenance of existing entries. 363 7. Recommendations for Implementors. 365 This section provides some recommendations to implementors of IPv6 366 Neighbor Discovery. 368 At a high level, implementors should program defensively. That is, 369 they should assume that attackers will attempt to exploit 370 implementation weaknesses, and should ensure that implementations are 371 robust against various attacks. In the case of Neighbor Discovery, the 372 following general considerations apply: 374 Manage Resources Explicitly Resources such as processor cycles, 375 memory, etc. are never infinite, yet with IPv6's large subnets it 376 is easy to cause NDP to generate large numbers of address 377 resolution requests for non-existent destinations. 378 Implementations need to limit resources devoted to processing 379 Neighbor Discovery requests in a thoughtful manner. 381 Prioritize Some NDP requests are more important than others. For 382 example, when resources are limited, responding to Neighbor 383 Solicitations for one's own address is more important than 384 initiating address resolution requests that create new entries. 385 Likewise, performing Neighbor Unreachability Detection, which by 386 definition is only invoked on destinations that are actively being 387 used, is more important than creating new entries for possibly 388 non-existent neighbors. 390 7.1. Prioritize NDP Activities 392 Not all Neighbor Discovery activities are equally important. 393 Specifically, requests to perform large numbers of address 394 resolutions on non-existent Neighbor Cache Entries should not come at 395 the expense of servicing requests related to keeping existing, in-use 396 entries properly up-to-date. Thus, implementations should divide 397 work activities into categories having different priorities. 
The 398 following gives examples of different activities and their importance 399 in rough priority order. If implemented, the operation and priority 400 of these should be configurable by the operator. 402 1. It is critical to respond to Neighbor Solicitations for one's own 403 address, especially for a router. Whether for address resolution or 404 Neighbor Unreachability Detection, failure to respond to Neighbor 405 Solicitations results in immediate problems. Failure to respond to 406 NS requests that are part of NUD can cause neighbors to delete the 407 NCE for that address, and will result in follow-up NS messages using 408 multicast. Once an entry has been flushed, existing traffic for 409 destinations using that entry can no longer be forwarded until 410 address resolution completes successfully. In other words, not 411 responding to NS messages further increases the NDP load, and causes 412 on-going communication to fail. 414 2. It is critical to revalidate one's own existing NCEs in need of 415 refresh. As part of NUD, ND is required to frequently revalidate 416 existing, in-use entries. Failure to do so can result in the entry 417 being discarded. For in-use entries, discarding the entry will 418 almost certainly result in a subsequent request to perform address 419 resolution on the entry, but this time using multicast. As above, 420 once the entry has been flushed, existing traffic for destinations 421 using that entry can no longer be forwarded until address resolution 422 completes successfully. 424 3. To maintain the stability of the control plane, Neighbor 425 Discovery activity related to traffic sourced by the router (as 426 opposed to traffic being forwarded by the router) should be given 427 high priority. Whenever network problems occur, debugging and making 428 other operational changes requires being able to query and access the 429 router. 
In addition, routing protocols dependent on Neighbor 430 Discovery for connectivity may begin to react (negatively) to 431 perceived connectivity problems, causing additional undesirable 432 ripple effects. 434 4. Traffic to unknown addresses should be given lowest priority. 435 Indeed, it may be useful to distinguish between "never seen" 436 addresses and those that have been seen before, but that do not have 437 a corresponding NCE. Specifically, the conceptual processing 438 algorithm in IPv6 Neighbor Discovery [RFC4861] calls for deleting 439 NCEs under certain conditions. Rather than delete them completely, 440 however, it might be useful to at least keep track of the fact that 441 an entry at one time existed, in order to prioritize address 442 resolution requests for such neighbors compared with neighbors that 443 have never been seen before. 445 7.2. Queue Tuning. 447 On implementations in which requests to NDP are submitted via a 448 single queue, router vendors should provide operators with means to 449 control both the rate of link-layer address resolution requests 450 placed into the queue and the size of the queue. This will allow 451 operators to tune Neighbor Discovery for their specific environment. 452 The ability to set per-interface or per-prefix queue limits 453 at a rate below that of the global queue limit might confine the damage 454 to neighbor discovery processing to the network targeted by the 455 attack. 457 Setting those values must be a very careful balancing act: the lower 458 the rate of entry into the queue, the less load there will be on the 459 ND process; however, it will take the router longer to learn 460 legitimate destinations as a result. In a datacenter with 6,000 461 hosts attached to a single router, setting that value to be under 462 1000 (requests per second) would mean that resolving all of the addresses from an initial 463 state (or something that invalidates the address cache, such as a STP 464 TCN) may take over 6 seconds. 
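The 6-second figure follows from dividing the number of hosts by the admitted resolution rate. A minimal sketch of that arithmetic is given below; it assumes the limit is expressed in requests per second and that every request succeeds on the first attempt, and the function name is hypothetical.

```python
def cold_start_resolution_seconds(hosts, requests_per_second):
    """Lower bound on the time to resolve every host once from an empty cache."""
    return hosts / requests_per_second


for rate in (1000, 500, 100):
    seconds = cold_start_resolution_seconds(6000, rate)
    print(f"{rate} requests/s -> at least {seconds:.0f} s to relearn 6000 hosts")
```

Any value under 1000 requests per second therefore pushes the relearning time for 6,000 hosts above 6 seconds, before any retransmissions or competing attack traffic are taken into account.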
Similarly, the smaller the 465 queue, the higher the likelihood of an attack being able to knock out 466 legitimate traffic (but the lower the memory utilization on the router). 468 8. IANA Considerations 470 No IANA resources or considerations are requested in this draft. 472 9. Security Considerations 474 This document outlines mitigation options that operators can use to 475 protect themselves from Denial of Service attacks. Implementation 476 advice to router vendors aimed at ameliorating known problems carries 477 the risk of previously unforeseen consequences. It is not believed 478 that these mitigation techniques or the implementation of 479 finer-grained queuing of NDP activity create additional security risks or 480 DoS exposure. 482 10. Acknowledgements 484 The authors would like to thank Ron Bonica, Troy Bonin, John Jason 485 Brzozowski, Randy Bush, Vint Cerf, Tassos Chatzithomaoglou, Jason 486 Fesler, Wes George, Erik Kline, Jared Mauch, Chris Morrow and Suran 487 De Silva. Special thanks to Thomas Narten and Ray Hunter for 488 detailed review and (even more so) for providing text! 490 Apologies to anyone we may have missed; it was not intentional. 492 11. References 494 11.1. Normative References 496 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 497 Requirement Levels", BCP 14, RFC 2119, March 1997. 499 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 500 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 501 September 2007. 503 [RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless 504 Address Autoconfiguration", RFC 4862, September 2007. 506 [RFC6164] Kohno, M., Nitzan, B., Bush, R., Matsuzaki, Y., Colitti, 507 L., and T. Narten, "Using 127-Bit IPv6 Prefixes on 508 Inter-Router Links", RFC 6164, April 2011. 510 11.2. Informative References 512 [RFC4732] Handley, M., Rescorla, E., and IAB, "Internet 513 Denial-of-Service Considerations", RFC 4732, December 2006. 
515 [RFC5157] Chown, T., "IPv6 Implications for Network Scanning", 516 RFC 5157, March 2008. 518 Authors' Addresses 520 Igor Gashinsky 521 Yahoo! 522 45 W 18th St 523 New York, NY 524 USA 526 Email: igor@yahoo-inc.com 528 Joel Jaeggli 529 Zynga 530 111 Evelyn 531 Sunnyvale, CA 532 USA 534 Email: jjaeggli@zynga.com 536 Warren Kumari 537 Google Inc 538 1600 Amphitheatre Parkway 539 Mountain View, CA 540 USA 542 Email: warren@kumari.net